ECE 307 (ΗΜΥ 307): Digital...
Date posted: 01-Sep-2018
ECE 307, 2018
Lectures 14-15: Arithmetic and Logic Circuits
(ttheocharides@ucy.ac.cy) (ack: Prof. Mary Jane Irwin and Vijay Narayanan)
[Rabaey's Digital Integrated Circuits, 2002, J. Rabaey et al.]
307 14-15 Arithmetic and Logic Circuits .2 , , 2018
Review: Basic Building Blocks
• Datapath
  - Execution units: adder, multiplier, divider, shifter, etc. (today)
  - Register file and pipeline registers (memory, see below)
  - Multiplexers, decoders, etc. (this lecture and Lecture 15)
• Control: finite state machines (PLA, ROM; Lecture 17)
• Interconnect: switches, arbiters, buses (Lecture 16)
• Memory: caches (SRAMs), TLBs, DRAMs, buffers (Lecture 17)
The 1-bit Binary Adder
[Figure: 1-bit full adder (FA) block with inputs A, B, Cin and outputs S, Cout]
S = A ⊕ B ⊕ Cin
Cout = A&B | A&Cin | B&Cin (majority function)
• How can we use it to build a 64-bit adder?
• How can we modify it easily to build an adder/subtractor?
• How can we make it better (faster, lower power, smaller)?
A  B  Cin | Cout  S | carry status
0  0  0   | 0     0 | kill
0  0  1   | 0     1 | kill
0  1  0   | 0     1 | propagate
0  1  1   | 1     0 | propagate
1  0  0   | 0     1 | propagate
1  0  1   | 1     0 | propagate
1  1  0   | 1     0 | generate
1  1  1   | 1     1 | generate
G = A&B
P = A ⊕ B
K = !A & !B
S = P ⊕ Cin
Cout = G | P&Cin
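The full-adder equations and the carry-status column can be checked exhaustively with a short script. This is a minimal sketch; the function names `full_adder` and `carry_status` are illustrative and not part of the slides.

```python
from itertools import product

def full_adder(a, b, cin):
    """Return (s, cout) for one-bit inputs, using the G/P form of the equations."""
    g = a & b              # generate
    p = a ^ b              # propagate
    s = p ^ cin            # S = P xor Cin
    cout = g | (p & cin)   # Cout = G | P&Cin
    return s, cout

def carry_status(a, b):
    """Classify an input pair as generate, propagate, or kill."""
    if a & b:
        return "generate"
    if a ^ b:
        return "propagate"
    return "kill"

for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    # The two output bits must encode the arithmetic sum of the three inputs.
    assert 2 * cout + s == a + b + cin
    # The G/P form must agree with the majority function.
    assert cout == (a & b) | (a & cin) | (b & cin)
    print(a, b, cin, cout, s, carry_status(a, b))
```

The loop prints the eight rows in the same column order as the truth table (A, B, Cin, Cout, S, carry status).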
FA Gate Level Implementations
[Figure: two gate-level full-adder implementations with inputs A, B, Cin, outputs S, Cout, and internal nodes t0, t1, t2]
• The way you learned to design in ECE 210 and ECE 211
Review: XOR FA
[Figure: XOR-based full adder with inputs A, B, Cin and outputs S, Cout]
16 transistors
Review: CPL FA
[Figure: complementary pass-transistor logic (CPL) full adder with dual-rail inputs A, !A, B, !B, Cin, !Cin and dual-rail outputs S, !S, Cout, !Cout]
20+8 transistors, dual rail; beware of threshold drops
Review: Mirror Adder
[Figure: mirror adder schematic with inputs A, B, Cin and outputs !Cout, !S; transistor groups are labelled kill, generate, 0-propagate, and 1-propagate]
24+4 transistors
Cout = A&B | B&Cin | A&Cin
SUM = A&B&Cin | !Cout&(A | B | Cin)
[Figure: mirror adder schematic annotated with transistor sizes]
Sizing: Each input in the carry circuit has a logical effort of 2, so the optimal fan-out for each is also 2. Since !Cout drives two internal transistor gates and two inverter transistor gates (to form Cin for the next more significant (nms) bit adder), the carry circuit should be oversized. PMOS/NMOS ratio of 2.
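The mirror adder builds its sum from the complemented carry, SUM = A&B&Cin | !Cout&(A | B | Cin), which is why the sum stage can sit behind the carry stage. The sketch below checks that expression against plain binary addition at the logic level only; it says nothing about the transistor-level circuit, and the name `mirror_adder_logic` is illustrative.

```python
from itertools import product

def mirror_adder_logic(a, b, cin):
    """Return (s, cout) using the mirror adder's logic equations."""
    cout = (a & b) | (b & cin) | (a & cin)
    not_cout = 1 - cout                       # the circuit produces !Cout directly
    s = (a & b & cin) | (not_cout & (a | b | cin))
    return s, cout

for a, b, cin in product((0, 1), repeat=3):
    s, cout = mirror_adder_logic(a, b, cin)
    # Both outputs together must equal the arithmetic sum of the inputs.
    assert 2 * cout + s == a + b + cin
```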
Mirror Adder Features
• The NMOS and PMOS chains are completely symmetrical, with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.
• When laying out the cell, the most critical issue is minimizing the capacitance at node !Cout (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.
• The transistors connected to Cin are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.
A 64-bit Adder/Subtractor
[Figure: 64-bit ripple-carry adder/subtractor: a chain of 1-bit FA cells producing S0 through S63, with carries C0 = Cin, C1, C2, C3, ..., C63, C64 = Cout]
• Ripple Carry Adder (RCA) built out of 64 FAs
• Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in to 1
• RCA
  - advantage: simple logic, small (low cost)
  - disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
[Figure labels: operand inputs A0..A63 and B0..B63, control input add/subt]
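A bit-level model of the ripple-carry adder/subtractor makes the subtraction trick concrete: each subtrahend bit passes through an XOR gate controlled by add/subt, and the same control sets the low-order carry-in. This is a behavioural sketch for non-negative operands; the name `ripple_add_sub` and its parameters are illustrative.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (s, cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_add_sub(a, b, subtract=False, n=64):
    """Return (result, carry_out) of an n-bit ripple-carry add or subtract."""
    sub = 1 if subtract else 0
    carry = sub                        # low-order carry-in is 1 when subtracting
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub      # XOR gate complements the subtrahend bit
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    return result, carry

print(ripple_add_sub(5, 3))                 # (8, 0)
print(ripple_add_sub(5, 3, subtract=True))  # (2, 1)
```

The carry must pass through every one of the n cells in turn, which is the O(N) delay the slide lists as the disadvantage.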
Ripple Carry Adder (RCA)
[Figure: 4-bit ripple-carry adder: four FA cells with inputs A0..A3, B0..B3, carry-in C0 = Cin, sums S0..S3, and carry-out Cout = C4]
T = O(N) worst case delay
Tadder ≈ TFA(A,B→Cout) + (N-2)·TFA(Cin→Cout) + TFA(Cin→S)
Real Goal: Make the fastest possible carry path
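As a worked example of the delay expression, the sketch below evaluates it for a few word lengths. The three per-cell delays (2.0, 1.0, and 1.5 time units) are made-up values chosen only for illustration; they do not come from the slides.

```python
def rca_worst_case_delay(n, t_ab_to_cout, t_cin_to_cout, t_cin_to_sum):
    """Worst-case delay of an n-bit ripple-carry adder (n >= 2).

    First cell: A,B -> Cout; n-2 middle cells: Cin -> Cout; last cell: Cin -> S.
    """
    return t_ab_to_cout + (n - 2) * t_cin_to_cout + t_cin_to_sum

# Hypothetical per-cell delays in arbitrary time units.
for n in (4, 16, 64):
    print(n, rca_worst_case_delay(n, 2.0, 1.0, 1.5))
```

The Cin→Cout term is multiplied by N-2, so it dominates for large N, which is why the carry path is the one worth optimizing.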
Inversion Property
[Figure: an FA with inputs A, B, Cin and outputs S, Cout, next to the same FA with every input and output inverted]
!S(A, B, Cin) = S(!A, !B, !Cin)
!Cout(A, B, Cin) = Cout(!A, !B, !Cin)
• Inverting all inputs to an FA results in inverted values for all outputs
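The inversion property can be confirmed by brute force over the eight input combinations. A minimal sketch, with illustrative function names:

```python
from itertools import product

def full_adder(a, b, cin):
    """1-bit full adder: returns (s, cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def inversion_property_holds():
    """Check !S(A,B,Cin) == S(!A,!B,!Cin) and the same for Cout, for all inputs."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        s_inv, cout_inv = full_adder(1 - a, 1 - b, 1 - cin)
        if s_inv != 1 - s or cout_inv != 1 - cout:
            return False
    return True

print(inversion_property_holds())
```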
Exploiting the Inversion Property
[Figure: 4-bit ripple-carry adder (inputs A0..A3, B0..B3, C0 = Cin; outputs S0..S3, Cout = C4) built from alternating regular and inverted FA cells]
Now need two flavors of FAs: a regular cell and an inverted cell
• Minimizes the critical path (the carry chain) by eliminating inverters between the FAs (will need to increase the transistor sizing on the carry chain portion of the mirror adder).
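The sketch below models one way to exploit the inversion property. It assumes every cell is an inverting cell that outputs !S and !Cout (as the mirror adder does), that even-numbered cells receive true inputs, and that odd-numbered cells receive inverted inputs. Which polarity the slide calls "regular" and which "inverted" is not recoverable from the text, so that mapping is an assumption, as are the function names.

```python
def inverting_fa(a, b, cin):
    """Inverting full-adder cell: returns (!S, !Cout) of its inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return 1 - s, 1 - cout

def rca_alternating(a, b, cin=0, n=4):
    """n-bit ripple adder whose carry chain contains no inverters."""
    result = 0
    carry = cin                       # true polarity entering bit 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if i % 2 == 0:
            # Even cell: true inputs in, inverted outputs out.
            not_s, carry = inverting_fa(ai, bi, carry)   # carry now holds !C(i+1)
            s = 1 - not_s             # sum inverter, off the carry path
        else:
            # Odd cell: inverted inputs and inverted carry in, true outputs out.
            s, carry = inverting_fa(1 - ai, 1 - bi, carry)  # carry now holds C(i+1)
        result |= s << i
    # After an odd number of cells the final carry is still inverted.
    cout = carry if n % 2 == 0 else 1 - carry
    return result, cout

print(rca_alternating(9, 8))   # (1, 1), since 9 + 8 = 17 = 0b10001
```

The carry is handed from cell to cell without any inversion in between; only the operand bits and some sum bits need inverters, and those are not on the critical path.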
Fast Carry Chain Design
• The key to fast addition is a low-latency carry network
• What matters is whether in a given position a carry is
  - generated: Gi = Ai & Bi = AiBi
  - propagated: Pi = Ai ⊕ Bi (sometimes use Ai | Bi)
  - annihilated (killed): Ki = !Ai & !Bi
• Giving a carry recurrence of Ci+1 = Gi | PiCi
  C1 = G0 | P0C0
  C2 = G1 | P1G0 | P1P0C0
  C3 = G2 | P2G1 | P2P1G0 | P2P1P0C0
  C4 = G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0C0
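The unrolled expressions for C1 through C4 follow mechanically from the recurrence. The sketch below computes the carries both ways and compares them; the helper names are illustrative, and the bit lists are ordered least significant bit first.

```python
def gp_signals(a, b, n):
    """Per-bit generate and propagate lists (LSB first) for n-bit operands."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p

def carries_recurrence(g, p, c0):
    """Carries C0..CN from the recurrence Ci+1 = Gi | Pi&Ci."""
    c = [c0]
    for gi, pi in zip(g, p):
        c.append(gi | (pi & c[-1]))
    return c

def carries_expanded(g, p, c0):
    """Carries C0..CN from the unrolled sum-of-products form."""
    c = [c0]
    for i in range(1, len(g) + 1):
        ci = 0
        # A carry generated at position j reaches Ci if positions j+1..i-1 all propagate.
        for j in range(i):
            term = g[j]
            for k in range(j + 1, i):
                term &= p[k]
            ci |= term
        # The incoming carry C0 reaches Ci if positions 0..i-1 all propagate.
        term = c0
        for k in range(i):
            term &= p[k]
        ci |= term
        c.append(ci)
    return c

g, p = gp_signals(0b1011, 0b0110, 4)
print(carries_recurrence(g, p, 0), carries_expanded(g, p, 0))
```

In the unrolled form every carry depends only on the G, P, and C0 inputs, so all carries can be formed in parallel, at the cost of product terms that grow with the bit position.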
Manchester Carry Chain
• Switches controlled by Gi and Pi
• Total delay is the sum of
  - the time to form the switch control signals Gi and Pi
  - the setup time for the switches
  - the signal propagation delay through N switches in the worst case
[Figure: Manchester carry chain stage with signals Gi, Pi, !Ci, !Ci+1, and clk]
4-bit Sliced MCC Adder
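[Figure: 4-bit sliced Manchester carry chain adder: inputs A0 B0 through A3 B3 feed per-bit G/P blocks; the clocked (clk) chain runs from !C0 through !C1, !C2, !C3 to !C4; outputs S0..S3]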
Domino Manchester Carry Chain Circuit
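[Figure: domino Manchester carry chain with carry-in Ci,0, clock clk, control signals G0..G3 and P0..P3, carry-out Ci,4, and transistor sizing annotations. The chain nodes evaluate to:]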
!(G0 | P0 Ci,0)
!(G1 | P1G0 | P1P0 Ci,0)
!(G2 | P2G1 | P2P1G0 | P2P1P0 Ci,0)
!(G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0 Ci,0)
Binary Adder Landscape
synchronous word parallel adders
  - ripple carry adders (RCA)
  - carry prop min adders
    - signed-digit adders
    - fast carry prop adders
      - Manchester carry chain
      - carry select
      - parallel prefix
      - conditional sum
      - carry skip
    - residue adders
[Figure annotations, in extraction order: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(N), A = O(N); T = O(N), A = O(N)]
Carry-Skip (Carry-Bypass) Adder
If (P0 & P1 & P2 & P3 = 1) then Co,3 = Ci,0; otherwise the block itself kills or generates the carry internally
[Figure: 4-bit block of FAs with inputs A0..A3, B0..B3, carry-in Ci,0, sums S0..S3, and carry-out Co,3, shown with the BP bypass]
BP = P0 & P1 & P2 & P3 (block propagate)
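A behavioural sketch of one carry-skip block and of an adder built from several blocks is shown below. It models only the logic: when BP = 1 the block carry-out is taken straight from the carry-in, which gives the same value as rippling but, in hardware, arrives sooner. The function names and the default sizes (16 bits in blocks of 4) are illustrative assumptions.

```python
def carry_skip_block(a_bits, b_bits, cin):
    """One carry-skip block; bit lists are LSB first. Returns (sum_bits, cout, bp)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    bp = int(all(p))                        # block propagate: AND of all Pi
    carry = cin
    sums = []
    for a, b, pi in zip(a_bits, b_bits, p):
        sums.append(pi ^ carry)             # Si = Pi xor Ci
        carry = (a & b) | (pi & carry)      # ripple inside the block
    cout = cin if bp else carry             # bypass multiplexer
    return sums, cout, bp

def carry_skip_add(a, b, cin=0, n=16, block=4):
    """n-bit carry-skip adder; n is assumed to be a multiple of block."""
    result, carry = 0, cin
    for lo in range(0, n, block):
        a_bits = [(a >> i) & 1 for i in range(lo, lo + block)]
        b_bits = [(b >> i) & 1 for i in range(lo, lo + block)]
        sums, carry, _ = carry_skip_block(a_bits, b_bits, carry)
        for k, s in enumerate(sums):
            result |= s << (lo + k)
    return result, carry

print(carry_skip_add(1234, 4321))    # (5555, 0)
print(carry_skip_add(0xFFFF, 1))     # (0, 1)
```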
Carry-Skip Chain Implementation
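[Figure: carry-skip chain implementation: carry chain with Cin, G0..G3, P0..P3, and !Cout; BP selects between the block carry-in and the block carry-out to form the carry-out]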
4-bit Block Carry-Skip Adder
Worst-case delay, for a carry from bit 0 to bit 15: the carry is generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups, and ripples in the last group from bit 12 to bit 15 (B is the group size in bits).
[Figure: 16-bit carry-skip adder built from four 4-bit blocks (bits 0 to 3, 4 to 7, 8 to 11, 12 to 15), each with Setup, Carry Propagation, and Sum stages; carry-in Ci,0]
Tadd = tsetup + B·tcarry + ((N/B) - 1)·tskip + B·tcarry + tsum
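To see the delay expression at work, the sketch below evaluates it term by term for the 16-bit, 4-bit-block case. All component delays are set to one time unit purely for illustration; real values depend on the circuit and are not given in the slides.

```python
def carry_skip_delay(n, b, t_setup, t_carry, t_skip, t_sum):
    """Evaluate Tadd = tsetup + B*tcarry + (N/B - 1)*tskip + B*tcarry + tsum."""
    ripple_first = b * t_carry          # ripple in the first block
    skips = (n / b - 1) * t_skip        # skip stages, as counted by the expression
    ripple_last = b * t_carry           # ripple in the last block
    return t_setup + ripple_first + skips + ripple_last + t_sum

# N = 16, B = 4, every component delay equal to 1 unit (assumed values).
print(carry_skip_delay(16, 4, 1, 1, 1, 1))   # 1 + 4 + 3 + 4 + 1 = 13.0
```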
Optimal Block Size and Time
• Assuming one stage of ripple (tcarry) has the same delay as one skip logic stage (tskip), and both are 1:
  TCSkA = 1 + B + (N/B - 1) + B + 1
          (tsetup, ripple in block 0, skips, ripple in last block, tsum)
        = 2B + N/B + 1
• So the optimal block size, B, is
  dTCSkA/dB = 0  ⇒  Bopt = √(N/2)
• And the optimal time is
  Optimal TCSkA = 2√(2N) + 1
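Setting the derivative 2 - N/B² to zero gives Bopt = √(N/2), and substituting back gives 2√(2N) + 1. The sketch below checks both numerically and also searches the integer block sizes, since a real block must contain a whole number of bits. The function names are illustrative.

```python
import math

def t_cska(n, b):
    """Unit-delay carry-skip adder time: 2B + N/B + 1."""
    return 2 * b + n / b + 1

def optimal_block_size(n):
    """Continuous optimum from dT/dB = 2 - N/B**2 = 0."""
    return math.sqrt(n / 2)

def best_integer_block(n):
    """Integer block size in 1..n that minimizes the unit-delay time."""
    return min(range(1, n + 1), key=lambda b: t_cska(n, b))

for n in (32, 64):
    b_opt = optimal_block_size(n)
    print(n, b_opt, t_cska(n, b_opt), 2 * math.sqrt(2 * n) + 1, best_integer_block(n))
```

For N = 32 the optimum is exactly B = 4 with a time of 17 units. For N = 64 the continuous optimum is about 5.66, and the best whole-number block size under this model is 6.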
Carry-Skip Adder Extensions
• Variable block sizes
  - A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be bigger without increasing the overall delay
[Figure: carry-skip adder with variable block sizes, from Cin to Cout]
• Multiple levels of skip logic
[Figure: carry-skip adder with skip level 1 and skip level 2, from Cin to Cout]
AN