Sense Amplifier-Based Pass Transistor Logic
Louis Poblete Alarcon
Jan M. Rabaey
Electrical Engineering and Computer Sciences
University of California at Berkeley
Technical Report No. UCB/EECS-2010-173
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-173.html
December 19, 2010
Copyright © 2010, by the author(s).
All rights reserved.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission.
Sense Amplifier-Based Pass Transistor Logic
by
Louis Poblete Alarcón
A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Engineering - Electrical Engineering and Computer Science
in the
Graduate Division
of the
University of California, Berkeley
Committee in charge:
Professor Jan M. Rabaey, Chair
Professor Elad Alon
Professor Paul K. Wright
Fall 2010
Sense Amplifier-Based Pass Transistor Logic
Copyright 2010
by
Louis Poblete Alarcón
Abstract
Sense Amplifier-Based Pass Transistor Logic
by
Louis Poblete Alarcón
Doctor of Philosophy in Engineering - Electrical Engineering and Computer Science
University of California, Berkeley
Professor Jan M. Rabaey, Chair
Reducing the energy required per operation is the key to building ultra-low energy systems,
and the most effective way of achieving this is to reduce the supply voltage. However,
operating CMOS circuits at low supply voltages increases circuit delay, leading to lower
circuit performance. In this region, the sub-threshold leakage energy component becomes more
pronounced and can even dominate the total circuit energy. Increasing threshold voltages
reduces the amount of leakage, but this forces operation in the sub-threshold region where
performance and variability become exponentially worse.
The use of the sense amplifier-based pass transistor logic (SAPTL) topology is one approach
to reducing the energy per operation. It uses an inverted pass transistor logic (PTL) tree,
which inherently has no gain, and hence no power supply connections, eliminating the sources
of sub-threshold leakage current. Reducing the threshold voltages of the PTL transistors
improves performance, without the leakage current increase associated with conventional static
CMOS logic. This reduced threshold voltage also allows the PTL transistors to operate in
the super-threshold region, even for very low supply voltages, avoiding the increased delay
and variability associated with the sub-threshold operating regime.
Gain is introduced by using drivers and sense amplifiers (SAs) that restore the output voltage
swing and provide the appropriate output current to drive the fan-out capacitances. These
drivers and SAs are the primary source of sub-threshold leakage, which can be amortized by
making the PTL networks complex, and by applying various leakage reduction techniques.
SAPTL-based 90nm test circuits using both synchronous and asynchronous timing schemes
have been designed, fabricated and tested. These circuits show leakage and energy
characteristics better than the equivalent static CMOS circuits. These test chips also demonstrate
rudimentary SAPTL-based design flows using commercially available CAD tools.
Simulation and measurement results of basic synchronous SAPTL building blocks show a
40X-50X reduction in standby current and a 6X reduction in energy when compared to an
equivalent CMOS logic block, at the expense of a 10X-30X increase in delay. Operating the
SAPTL asynchronously reduces the average delay by 89%. However, adding the necessary
handshaking circuitry increases the energy by 31%.
These SAPTL building blocks are used to create a parallel 64-byte asynchronous SAPTL-
based CRC generator with a minimum energy point that is 25% lower than that of the
static CMOS equivalent, with a 6X delay penalty. Also, due to the nature of the PTL tree,
forward-biasing the body of the PTL transistors results in a 10% reduction in delay with no
energy penalty.
The advantages of the SAPTL over conventional static CMOS are expected to become more
significant as technology continues to scale, where subthreshold leakage continues to prevent
supply voltages from being aggressively scaled.
Contents
List of Figures iv
List of Tables viii
1 Introduction 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Thesis Flow and Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Ultra-Low Energy Design 3
2.1 Characteristics of CMOS Logic . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.1 The CMOS Inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.4 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 A Sampling of Low Energy Design Techniques . . . . . . . . . . . . . . . . . 9
2.3 Fundamental Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Subthreshold Leakage . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Leakage and Gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Pass Transistor Logic 16
3.1 Basic PTL Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 PTL Operation and Circuit Gain . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.1 Leakage in PTL Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 The Effect of Adding Gain . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.3 Decoupled Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3 Alternative Gain Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3.1 Sense Amplifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3.2 Regenerative Elements . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4 The SAPTL Organization 25
4.1 The SAPTL logic block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 The Inverted PTN Multiplexer Tree . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 The Full Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 The Simple Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.4.1 A First Order Delay Model . . . . . . . . . . . . . . . . . . . . . . . 28
4.4.2 Simple Stack Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.5 Analysis of The Differential Full Stack . . . . . . . . . . . . . . . . . . . . . 32
4.6 Stack Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.6.1 The XOR Logic Function . . . . . . . . . . . . . . . . . . . . . . . . 36
4.6.2 A Full-Adder Implementation . . . . . . . . . . . . . . . . . . . . . . 39
4.7 Stack Implementation and Variability . . . . . . . . . . . . . . . . . . . . . . 40
4.8 Differential Stack Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.8.1 The Stack Output Voltage . . . . . . . . . . . . . . . . . . . . . . . . 41
4.8.2 The Weak Output Latch . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.9 The Sense Amplifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.9.1 Sense Amplifier Design . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.9.2 Delay and Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.10 Decoupling Functionality and Gain . . . . . . . . . . . . . . . . . . . . . . . 51
4.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5 SAPTL Timing 54
5.1 Synchronous SAPTL Operation . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.1.1 Two-Phase Clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.2 Synchronous SAPTL Design . . . . . . . . . . . . . . . . . . . . . . . 57
5.2 Asynchronous Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2.1 The Delay Line-based Asynchronous SAPTL . . . . . . . . . . . . . 59
5.2.2 The Asynchronous SAPTL Operation without a Delay Line . . . . . 62
5.2.3 The Cost of Asynchrony . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3 An Example: The SAPTL XOR gate . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6 Case Studies 70
6.1 The Synchronous SAPTL Test Chip . . . . . . . . . . . . . . . . . . . . . . . 70
6.1.1 Basic Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.1.2 Chip Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.2 The Asynchronous SAPTL Test Chip . . . . . . . . . . . . . . . . . . . . . . 78
6.2.1 Design Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2.2 Test Chip Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.3 The SAPTL CRC Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.3.1 The CRC Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.3.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.3.3 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3.4 Measurement Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7 Summary and Conclusions 95
7.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2 Limitations of SAPTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8 Recommendations and Future Work 100
Bibliography 102
A Exploring the V_TH Space: A Simulation Technique 110
List of Figures
2.1 A generalized complementary CMOS logic gate. . . . . . . . . . . . . . . . . 4
2.2 A CMOS inverter driving a capacitive load. . . . . . . . . . . . . . . . . . . 5
2.3 CMOS inverter delay as a function of V_DD. . . . . . . . . . . . . . . . . . . 6
2.4 CMOS inverter power as a function of V_DD. . . . . . . . . . . . . . . . . . . 8
2.5 CMOS inverter energy per operation as a function of V_DD. . . . . . . . . . . 9
2.6 The energy-delay plot of the 65nm CMOS inverters. . . . . . . . . . . . . . 10
2.7 Theoretical energy and voltage limits. . . . . . . . . . . . . . . . . . . . . . . 11
2.8 A NAND4 gate implemented using (a) a single gate and (b) multiple gates. . 13
2.9 The CMOS NAND4 gate: (a) energy-delay characteristics and (b) details at
high supply voltages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1 OR/NOR implementations in (a) CMOS, (b) CPL, (c) DPL and (d) DVL. . 17
3.2 PTL 2-input OR/NOR energy-delay characteristics: (a) α = 0.1 and loaded
with a 1X INV, (b) α = 0.01 and loaded with a 1X INV, (c) α = 0.1 and
loaded with a 4X INV and (d) α = 0.01 and loaded with a 4X INV. . . . . . 18
3.3 The pass transistor multiplexer tree (a) with N_depth = 2 and (b) a sneak path. 19
3.4 A pass transistor network (a) with no gain elements and (b) with gain at the
output and internal nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.5 The PTL energy and delay as a function of gain. . . . . . . . . . . . . . . . . 21
3.6 Alternative gain elements: (a) a sense amplifier and (b) a regenerative element. 22
3.7 A generalized PTL model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.1 The SAPTL logic block. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2 The inverted pass transistor tree network (a) with two inputs and (b) config-
ured as a two-input OR/NOR stack. . . . . . . . . . . . . . . . . . . . . . . 26
4.3 The full stack and driver showing (a) the logic paths and (b) switches for
reconfigurability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 The simple stack (a) schematic and (b) rising transient model. . . . . . . . . 29
4.5 The simple stack delay model for (a) N_depth = 5 and (b) N_depth = 16. . . . . . 30
4.6 The simple stack energy model for (a) N_depth = 5 and (b) N_depth = 16. . . . . 31
4.7 The simple stack energy-delay model for (a) N_depth = 5 and (b) N_depth = 16. . 32
4.8 The full stack rising transient model. . . . . . . . . . . . . . . . . . . . . . . 33
4.9 The full stack delay model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34