Table Of ContentAD-A261 002
i t
of Souther
California
OTIC
03627
93a
EIijg
mJ
INFORMATION
SCIENCES
INSTITUTE
476 Admirbun WaylMarina del ReylCalifornia 90292_6695
98 2 ] i
ISI Research Report
ISI/RR-93-302
January 1993
A Design of a Fast and Area Efficient
Multi-input Muller C-element
Tzyh-Yung Wuu, USC-ISI
Sarma B. K. Vrudhula, UA-ECE
ISI/RR-93-302
January 1993
This research was sponsored in part by the Defense Advanced Research Projects Agency under contract number
MDA903-92-D-0020 and in part by a grant from the National Science Foundation under award number MIP-9111206.
Views and conclusions contained in this report are the authors' and should not be interpreted as representing the
official opinion or policy of DARPA, NSF, the U.S. Governmenit, or any person or agency connected with them.
REPORT DOCUMENTATION PAGE Fn
isIII I lip ft a o" Nomat o "
""*NNWt Ic OwgmmWWO4~amof
1. AGENCY USE ONLY (Loo &1&*) 2. REPORT DATE J. RE1PORT TMP AND DATE COVERED
IResearch
17 9g,9 I Ta n Renar
4. TITLE AND SUBIUI. UNNLNM@
A Design of a Fast and Area Efficient Multi-input MDA903-92-DO020
Muller C-element MIP-9111206
4.A UITHOWS)W
Wuu, Tzyh-Yung, Vrudhula, Sarrna B.K.
7. PEWOMUISI ORGBANIATION NAMEES AND AOOUSS4ES)
USC/Informat- ion Sciences Institute I IO NME
4676 Admiralty Way ISI/ RR-93-302
Marina del Rey, CA 90292-6695
I.S PONSORINSIMONITORIUS AGENC" NAUE4S) AND AOOIESS(ES) 10. SPONSOMINGIMONSTOSa
AGENCY REOMT EMKIE
DARPA
3701 N. Fairfax Drive
Arlington, VA 22203-1714
11. SUPPLEMENTARY NOTES
128.A 5TRinUTsONLAVALAWfIUTY STATEMENT 12b. DISTUTlON CONE
This document is approved for public release;
distribution is unlimited.
13. ASITRACT (Maaxmum .00wam
A multi-input Muller C-element has frequently been used for joining signal
transitions or completion time detection in self-timed circuits. This paper
presents an n-input Muller C-element design which uses the multi-level login
design technique and has a symmetric format for any integer n > 2. In comparison
!IS. UPO(cid:127)MITRT NOTES~IAC
with series-parallel MOS structure implementations and C-element tree implementations,
our design has fewer restrictions in terms of n, less path delay, less delay
variance from inputs to output, and less area consumption. Experimental validation
in this paper is based on an industrial standard cell library.
IT SIW Tana IS. N4MMBOI
Muller C-element, MOS, standard cells, self-timed circuits, 11
signal transitions, path delay, cell area I1IL PUNCE CM
Arlinton, a3 t1-31o.7- 1 AIEI4OU NWICAIO 1. SUA V20 2 2 T104 OP ANUACT
Uclassified isUnclassifiUendc lassified Ua
N.N 7ABSTRI.AST ¢ FM 2:n" w 10
ftm Wamy "dAre
MWisig
INSTUMTONS FOR COMPLETING SIP 293j
.BL
The Report MocumentainPg (110M is used in announi ng and cataloging report. it is important
that this information beconsistent with the rest of the report. particularly the cover and title page.
Instructions for filling in each block of the form follow. It is important to stay wl"hh thme lines to meet
Block 1. Aaemt Use Only (Lemv blank) Block 12e. DistributionAvailabilitv ttmn
BloDc& 2. .RoFol l ublcaton ateDenotes public availability or limitations. Cite any
Bnlock2n. y monthg. anll puliation dviatle gI availability to the public. Enter additional
inmcoldtihn, d anyd ear if vaiable(e~. ~ limitations or special markings in all capitals (e.g.
Jan 88). Must cite at least the year. NOFORN. REL. ITAR).
Block 3. Type of Renor arid Date Covered DOD -See DoDD 5230.24,. Distribution
State whether report is interim, final. etc. IfStemnsoTchia
applicable, enter inclusive r eport dates (e.g. 10 Statulementso ehia
J8u4n.D O7E - 0 Ju - See authorities.
Block 4. Tile and Subtitl A tifte is taken from NASA - See Handbook NHB 2200.2.
the part of the. report that provides the most NTIS - Leav blank.
meaningful and complete information. When a0
report isp repared in more thani one volume. lc b isrbto oe
repeat the pnmarytitle. add volume number,. and Blclb.DsrutoCd.
include subtilea for the specific~volume. On 00-Lmbak
classified documents enter the tifte classification DOE - LEaver blank.iutoncteore
in paenthses.from the Standard Distribution for
Block 5. Fu~n Nmm To include contract Unclassified Scientific and Technical
and grant numbers; may include program Reports.
ement number(s). project numberds), task NASA - Loom blank.
"umber(s), and work unit number(s). Use the NTIS - Leav blank.
following labels:
C -contract PR - PrOect Block 13. Include a brief (Mlaximum
G -Grant TA Task 200 words) factual summary of the most
-
PE Program WU - Work Unit significant informatior contained in the report.
Element Accession No.
Block S. ~Authors) Name(s) of person(s) Block 14. SubjsioctTems Keywords or. phrases
responsible for writing the report pwiforming identifying major subjects in the report.
the research. or credited with the content of the
report. if editor or compiler, this should follow
the name~s). Block I15. Number of Pg=e. Enter the total
number of pages.
Block 7. Performina Oroaniuation Namer(s) and
Adrsfs Self-explanatoty. Block'16. PrisgCgdg. Enter appropriateprc
Iloick 2. Petri raiainRpn code (NTI only).
~.Enter the unique alphanumeric report
numbrs) assgned by the organization Bkk 17. -19. Security Classifications. Self-
performing the rpot explanatory. Enter U.S. Security Classification in
Blc 9 pnsrnafoitrnaamNaes accordance with U.S. Security Regulations 0(i.e.
and dedlfressxlep). lnatry.UNCLASSIFIED). If form contains classified
information, stamp classification on the top and
Block 10. SoomodefifahdeAaec bottom of the page.
Reor (if known)
Block 11. Sunse GEtrBlock 20. Limitation- of Atac.This block must0
infanatone not' dncuded esenwhere such as: be completed to assign a limitation to the
Prep@Med in cooperatdi on with ...; Trans. of ...; To be abstract. Enter eitherUL (unlimnite) or SAR (same
publuidmed In.... Whla repor isr evised. include- as report). An en"y in fths block ft -necessarri f
as tatumen whethewter ew report supersedes the abstract is to be limitd If blank, the abstract
or supplemients the older .epamt isa ginmed o beunlImited
StM1a1r- 1 1F 011n1 M 11t W, 249)
A Design of a Fast and Area Efficient Multi-input
Muller C-element
Tzyh-Yung Wuu Sarma B. K. Vrudhula
(a.k.a. Sarma Sastry)
Information Science Institute ECE Dept.
Univ. of Southern California Univ. of Arizona
[email protected] [email protected]
Abstract
A multi-input Muller C-element has frequently been used for joining signal transitions or
completion time detection in self-timed circuits. This paper presents an n-input Muller C-
element design which uses the multi-level login design technique and has a symmetric format
for any integer n > 2. In comparison with series-parallel MOS structure implementations
and C-element tree implementations, our design has fewer restrictions in terms of n. less path
delay, less delay variance from inputs to output, and less area consumption. Experimental
validation in this paper is based on an industrial standard cell library.
1 Introduction
A Muller C-element (5] is used as a basic component in the design of speed-independent
circuits. A C-element is functionally equivalent to an SR latch. Under the assumption of
unbounded gate delays it is not possible to guarantee that S and R will not be 1 simul-
taneously. This problem does not arise with -. C-element [4]. The output of a two input
C-element will equal the value of the inputs after both inputs have reached the same value;
otherwise the output remains unchanged. That is, if il and i are the two inputs and 0 is
2
the output, then the defining equation of the C-element is 0 = ii • i + 0- i + 0 - [5]. A
2 1 2
two input C-element can be viewed as a logical and of two events, where an evw nt can be a
0-1 or 1-0 transition (7]. This behaviour is shown in Figure 1.
A C-element is commonly used for joining signal transitions to signal the completion of
an operation [1, 2, 3, 4, 6]. For example, the 16 output computational block in Figure 2 will
require a 16 input C-element to join all 16 completion signals in order to generate one signal
transition to indicate the completion of the block.
The output of an n-input C-element is 1 if all the inputs are 1 and it is 0 if all the inputs
are 0; otherwise its value remains unchanged [6]. The state diagrams for two-input and
three-input C-elements are shown in Figure 3, where the state is labeled "inputs/output".
The initial state of a C-element is having all inputs and its output zero. This is denoted by
the double circle in the state diagram.
US
0
i/
Figure 1: Timing diagrams for 2-input Muller C-elements
Self-timed i dOQ,. na - '•
Block " d15 d. >"
(a) A typical self-timed computational block
--- La- of all remtS igils
* ~L"s of nil COMm)OdSiraU-si
* a
(b) rting diagsrm of the coWpletinrse signals
Figure 2: Join of completion/reset signals
2
0000/0
10/6 1/
11/0
11/1 1/ 10/ 11 1
00/1T/
(a) Two-input C-element (b) Three-input C-element
Figure 3: State diagrams for Muller C-elements
3C
0
(a) Series-parallel MOS structure (b) C-tree structure 0
Figure 4: Three-input series-parallel C-element designs
I Availability Codes
MWYC QUALIM INSPEMCTD 3 vat /
3special
3A
,(I
_ _I- - _
C-elements with large numbers of inputs are very useful and an efficient implementation
in terms of area and speed is needed. Two designs of multi-input C-elements have been used;
a series-parallel MOS structure [6] shown in Figure 4(a) and the C-element tree [4, 6] shown
in Figure 4(b). Because the C-element is associative, the tree implementation uses (n - 1)
two-input C-elements to form an n-input C-element. The input-output delay of the series-,
parallel MOS structure is less than that of the tree structure. However, the series-parallel
implementation is not feasible for large n. For this reason, most existing designs employ the
tree structure. The main disadvantages of the tree implementation are that it is very slow
and the variance in the delays over different input-output paths is very large.
In this paper we present an efficient design of a multi-input C-element. Our design is
symmetric and the variance in delay over different input-output paths is very small. In
Section 2 we derive a symmetric form of an n-input C-element and provide estimates of
delay and area. In Section 3 we demonstrate the advantages of our design by presenting
experimental results for C-elements with inputs ranging from 2 to 128.
2 Design of a Multi-input C-element
The target technology of our design is CMOS. Since inverted logic is faster than non-inverted
logic in CMOS, we will use and-or-invert (AOI) logic and inverters instead of and-or logic.
The defining equation of an n-input C-element with a reset is given by
OUT = ((Ih * I .... In) + (I + I +... + In)oOUT)o RESET (1)
2 2
where Ii, for i = 1,...,n, are inputs of the C-element, and OUT is the output of the
C-element. Using DeMorgan's law we transform Equation (1) as follows:
OUT = ((1, *12 .... I) + (h + 12 +... + I). OUT) 9 RESET (2)
= ((h.oI ... *In)+(I,+ +...+In)oOUT).RESET
2 2
= ((I .I .... o In) +O(I +.In).O UT) + RESET
1 2 1 2
= (h.Ioo...9In,)o(I,+1 +...+I,,)oOUT+ RESET
2 2
= (, he 1 *... e In) ((I + . +I, ) + -U-T) + RESET (3)
2
Equation (3) can be further decomposed to following equations. 0
NANDTREE = (1,* 12 ... In) (4)
NOR.TREE = (+I2 +... + I) (5)
OUT = NANDTREE * (NOR-TREE + OUT) + RESET (6)
4
1w NOR-TREE
-
12
In Q
o OAO_PARTOU
NAVDTREE
Figure 5: C-elements design
0 0
19 19
1i14
(a) Two-level NANDTREE implementation
12 B12B
a OUT 0 OUT>
0CID_.. 0 00F__ 0
1.IL3
In
(b) Three-level NANDTREE implementation
Figure 6: Multi-level NANDTREE implementation
5
The above decomposition is very useful when it is mapped to a CMOS cell-based imple-
mentation. In Figure 5 we show a C-element design consisting of 3 parts: a NANDTREE, a
NOR-TREE, and a OAOPART (or-and-or), each implemented separately. The OAOPART
shown in Figure 5 remains the same for all values of n.
Therefore, we only need to find a proper NANDTREE/NORTREE implementation. In
the HP C34000 library [81, there are n-input NAND and NOR gates for 2 < n < 8. For
larger n, NANDTREE (NOR-TREE) can be further decomposed into two-level NAND-OR
tree (NOR-AND tree). When a two-level structure is not sufficient, we can use more levels,
i.e., use a nand-nor-nand three level structure to implement a NANDTREE or a nor-nand-
nor structure to implement a NORLTREE. Figures 6 (a) and (b) show the two-level and
the three-level implementations for NAND.TREE. In order to minimize the variance of the
input-output delays, the structure of the NANDTREE implementation is identical to the
structure of the NOR. TREE implementation.
We now provide estimates of delay and area for our design and compare them with the
tree implementation of a C-element. A comparison with a series-parallel MOS structure is
unnecessary since such an implementation is not feasible for large numbers of inputs.
Delay: For an n-input C-element, the input-output delay of a C-element tree implemen-
tation is equal to the number of levels in the structure multiplied by the input-output delay
of a 2-input C-element, where the number of levels is [log n], i.e.,
2
Dee,.(n) = [log n] * ( 2-input Muller C-element delay ) (7)
2
;t [log2n] * ( 2-input NAND/NOR delay + OAOI delay) (8)
; flog2n] * OAOI delay + [log2n] * 2-input NAND/NOR delay
The input-output of our design is .
Dmuti(n) = OAOI delay + n-input NANDTREE/NORTREE delay (9)
The n-input NAND.TREE/NOR.TREE can be implemented by [log n] stages of a NAND2-
2
NOR2 tree, and can be made faster by using [logn] stages of NANDm-NORm tree. There-
fore,
Dmati(n) <_O AOI delay + [log n] * 2-input NAND/NOR delay (10)
2
ehviously, our design is much faster than C-element tree implementation for n > 2, although
both Dt...(n) and DmIti(n) are O(log(n)).
Delay Variance: In order to have less delay variance among the input-output paths in
our design, two sufficient conditions need to be met.
1. Transistors in the OAOI element of 1he OAO.PART must be sized so that the delay
from one OR gate input to the output and the delay from one AND gate input to the •
output is the same.
6
S.
.. .