• Date posted: 01-Sep-2018

### Transcript of ΗΜΥ 307 ΨΗΦΙΑΚΑ...

• ΗΜΥ 307, 2018, Lectures 14-15: Arithmetic and Logic Circuits

(ttheocharides@ucy.ac.cy) (ack: Prof. Mary Jane Irwin and Vijay Narayanan)

[Rabaey's Digital Integrated Circuits, 2002, J. Rabaey et al.]

• 307 14-15 Arithmetic and Logic Circuits .2 , , 2018

Review: Basic Building Blocks

- Datapath
  - Execution units: adder, multiplier, divider, shifter, etc. (today)
  - Register file and pipeline registers (memory; see below)
  - Multiplexers, decoders, etc. (this lecture and Lecture 15)
- Control: finite state machines (PLA, ROM) (Lecture 17)
- Interconnect: switches, arbiters, buses (Lecture 16)
- Memory: caches (SRAMs), TLBs, DRAMs, buffers (Lecture 17)

[Figure: 1-bit full adder (FA) with inputs A, B, Cin and outputs S, Cout]

S = A ⊕ B ⊕ Cin

Cout = A&B | A&Cin | B&Cin (majority function)

- How can we use it to build a 64-bit adder?

- How can we modify it easily to build an adder/subtractor?

- How can we make it better (faster, lower power, smaller)?

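| A | B | Cin | Cout | S | carry status |
|---|---|-----|------|---|--------------|
| 0 | 0 | 0 | 0 | 0 | kill |
| 0 | 0 | 1 | 0 | 1 | kill |
| 0 | 1 | 0 | 0 | 1 | propagate |
| 0 | 1 | 1 | 1 | 0 | propagate |
| 1 | 0 | 0 | 0 | 1 | propagate |
| 1 | 0 | 1 | 1 | 0 | propagate |
| 1 | 1 | 0 | 1 | 0 | generate |
| 1 | 1 | 1 | 1 | 1 | generate |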

G = A&B

P = A ⊕ B

K = !A & !B

S = P ⊕ Cin

Cout = G | P&Cin
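The full adder equations and the generate/propagate/kill classification can be checked exhaustively. The following is a minimal Python sketch; the function names `full_adder` and `carry_status` are illustrative and do not come from the slides.

```python
from itertools import product

def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)  # majority function
    return s, cout

def carry_status(a: int, b: int) -> str:
    """Classify an input pair as 'generate', 'propagate', or 'kill'."""
    if a & b:
        return "generate"
    if a ^ b:
        return "propagate"
    return "kill"

for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    g, p = a & b, a ^ b
    # The G/P form must agree with the direct form on every input row.
    assert s == p ^ cin
    assert cout == g | (p & cin)
    print(a, b, cin, cout, s, carry_status(a, b))
```

The loop prints the eight truth-table rows in the same column order as the table (A, B, Cin, Cout, S, carry status).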

FA Gate Level Implementations

[Figure: two gate-level full adder schematics with inputs A, B, Cin, outputs S, Cout, and internal nodes t0, t1, t2]

- The way you learned to design in ECE 210 and ECE 211

Review: XOR FA

[Figure: XOR-based full adder circuit with inputs A, B, Cin and outputs S, Cout]

16 transistors

Review: CPL FA

[Figure: complementary pass-transistor logic (CPL) full adder with dual-rail inputs A, !A, B, !B, Cin, !Cin and dual-rail outputs S, !S, Cout, !Cout]

20+8 transistors, dual rail; beware of threshold drops

[Figure: mirror adder schematic with kill, generate, 0-propagate, and 1-propagate paths; inputs A, B, Cin; outputs !Cout and !S]

24+4 transistors

Cout = A&B | B&Cin | A&Cin

SUM = A&B&Cin | !Cout&(A | B | Cin)
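The sum expression reuses the carry, and it only gives the right answer when the carry enters in inverted form (!Cout). A short exhaustive check, written as a sketch with the illustrative name `mirror_adder`:

```python
from itertools import product

def mirror_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out), deriving the sum from the inverted carry."""
    cout = (a & b) | (b & cin) | (a & cin)
    n_cout = cout ^ 1  # !Cout, the node the carry stage produces first
    s = (a & b & cin) | (n_cout & (a | b | cin))
    return s, cout

for a, b, cin in product((0, 1), repeat=3):
    s, cout = mirror_adder(a, b, cin)
    # The two output bits together must encode the arithmetic sum.
    assert 2 * cout + s == a + b + cin
```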

Sizing: Each input in the carry circuit has a logical effort of 2, so the optimal fan-out for each is also 2. Since !Cout drives 2 internal and 2 inverter transistor gates (to form Cin for the next most significant bit adder), the carry circuit should be oversized. Use a PMOS/NMOS ratio of 2.

Mirror Adder Features

- The NMOS and PMOS chains are completely symmetrical with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.

- When laying out the cell, the most critical issue is the minimization of the capacitances at node !Cout (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.

- The transistors connected to Cin are placed closest to the output.

- Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.

[Figure: 64-bit ripple carry adder: a chain of 1-bit FAs with inputs A0 B0 through A63 B63, sums S0 through S63, and carries C0=Cin through C64=Cout]

- Ripple Carry Adder (RCA) built out of 64 FAs

- Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in

- RCA
  - advantage: simple logic, small (low cost)
  - disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
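The ripple carry adder and the subtraction trick can be sketched behaviorally in a few lines of Python. The function name `ripple_add` and its parameters are illustrative assumptions, not part of the slides; the loop body is the 1-bit FA applied once per bit position.

```python
def ripple_add(a: int, b: int, n: int = 64, subtract: bool = False) -> tuple[int, int]:
    """Add (or subtract) two n-bit values with a chain of 1-bit full adders.

    Returns (result, carry_out), with result truncated to n bits.
    """
    carry = 1 if subtract else 0  # low-order carry-in is set for subtraction
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ int(subtract)  # XOR gate complements the subtrahend bit
        s = ai ^ bi ^ carry                  # sum uses the incoming carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry
```

For example, `ripple_add(5, 3, n=8)` returns `(8, 0)` and `ripple_add(5, 3, n=8, subtract=True)` returns `(2, 1)`. Each iteration depends on the carry from the previous one, which is the source of the O(N) delay.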

[Figure: 4-bit ripple carry adder built from four FAs, with inputs A0 B0 through A3 B3, sums S0 through S3, carry-in C0=Cin, and carry-out Cout=C4]

T = O(N) worst case delay

Tadder ≈ TFA(A,B→Cout) + (N-2)·TFA(Cin→Cout) + TFA(Cin→S)

Real Goal: Make the fastest possible carry path

Inversion Property

[Figure: two FA symbols, one with true inputs and outputs and one with all inputs and outputs inverted]

!S (A, B, Cin) = S(!A, !B, !Cin)

!Cout (A, B, Cin) = Cout (!A, !B, !Cin)

- Inverting all inputs to a FA results in inverted values for all outputs
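The inversion property is easy to confirm over all eight input combinations. A minimal sketch, with illustrative function names:

```python
from itertools import product

def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)

def inversion_property_holds() -> bool:
    """Check !S(A,B,Cin) = S(!A,!B,!Cin) and the same for Cout."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        s_inv, cout_inv = full_adder(a ^ 1, b ^ 1, cin ^ 1)
        if s_inv != s ^ 1 or cout_inv != cout ^ 1:
            return False
    return True
```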

Exploiting the Inversion Property

[Figure: 4-bit ripple carry adder (inputs A0 B0 through A3 B3, sums S0 through S3, C0=Cin, Cout=C4) built from alternating regular and inverted FA cells]

Now need two flavors of FAs: regular cell and inverted cell

- Minimizes the critical path (the carry chain) by eliminating inverters between the FAs (will need to increase the transistor sizing on the carry chain portion of the mirror adder).
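One way to see why alternating cells work is a behavioral model in which every cell is the same inverting core and only the polarity of its inputs changes. This is a sketch under that assumption; `mirror_cell` and `alternating_ripple_add` are illustrative names, and the model ignores timing entirely.

```python
def mirror_cell(a: int, b: int, cin: int) -> tuple[int, int]:
    """Inverting FA core: returns (!S, !Cout) for the inputs it is given."""
    cout = (a & b) | (a & cin) | (b & cin)
    s = a ^ b ^ cin
    return s ^ 1, cout ^ 1

def alternating_ripple_add(a: int, b: int, cin: int = 0, n: int = 4) -> tuple[int, int]:
    """Ripple adder with no inverters in the carry chain.

    Even cells see true inputs and emit an inverted carry; odd cells see
    inverted inputs and, by the inversion property, emit a true carry.
    """
    result = 0
    chain = cin  # value on the carry wire; its polarity alternates per cell
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if i % 2 == 0:
            # Regular cell: true inputs in, (!S, !Cout) out.
            n_s, chain = mirror_cell(ai, bi, chain)
            s = n_s ^ 1  # sum inverter, off the critical path
        else:
            # Inverted cell: inverted inputs in, so outputs are true S and Cout.
            s, chain = mirror_cell(ai ^ 1, bi ^ 1, chain)
        result |= s << i
    # After an even number of cells the chain holds the true carry.
    cout = chain if n % 2 == 0 else chain ^ 1
    return result, cout
```

The carry wire is never inverted between cells; only the sum outputs and the A/B inputs of odd cells need inverters, and those sit off the carry path.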

Fast Carry Chain Design

- The key to fast addition is a low latency carry network
- What matters is whether in a given position a carry is
  - generated: Gi = Ai & Bi = AiBi
  - propagated: Pi = Ai ⊕ Bi (sometimes use Ai | Bi)
  - annihilated (killed): Ki = !Ai & !Bi
- Giving a carry recurrence of Ci+1 = Gi | PiCi

C1 = G0 | P0C0

C2 = G1 | P1G0 | P1P0C0

C3 = G2 | P2G1 | P2P1G0 | P2P1P0C0

C4 = G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0C0
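The expanded carry equations can be generated for any bit position and compared against the recurrence. The sketch below uses illustrative names (`lookahead_carries`, `recurrence_carries`) and assumes Pi = Ai ⊕ Bi.

```python
def lookahead_carries(a: int, b: int, c0: int = 0, n: int = 4) -> list[int]:
    """Return [C0, C1, ..., Cn], each carry computed from G, P, and C0 only."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    carries = [c0]
    for i in range(n):
        # C(i+1) = Gi | Pi G(i-1) | ... | Pi..P1 G0 | Pi..P0 C0
        c = 0
        prefix = 1  # running product Pi & P(i-1) & ... & P(j+1)
        for j in range(i, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & c0
        carries.append(c)
    return carries

def recurrence_carries(a: int, b: int, c0: int = 0, n: int = 4) -> list[int]:
    """Reference: the ripple recurrence C(i+1) = Gi | Pi Ci."""
    carries = [c0]
    for i in range(n):
        gi = (a >> i) & (b >> i) & 1
        pi = ((a >> i) ^ (b >> i)) & 1
        carries.append(gi | (pi & carries[-1]))
    return carries
```

Both functions return the same list for every input; the difference is that `lookahead_carries` never reads an intermediate carry, which is what allows hardware to compute all carries in parallel at the cost of wider gates.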

Manchester Carry Chain

- Switches controlled by Gi and Pi
- Total delay of
  - time to form the switch control signals Gi and Pi
  - setup time for the switches
  - signal propagation delay through N switches in the worst case

[Figure: one Manchester carry chain stage between nodes !Ci and !Ci+1, with switches controlled by Gi, Pi, and clk]

[Figure: 4-bit adder built from Manchester carry chain stages: inputs A0 B0 through A3 B3, G/P generation, clk, inverted carries !C0 through !C4, and sums S0 through S3]

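Domino Manchester Carry Chain Circuit

[Figure: domino Manchester carry chain with clk-controlled precharge and evaluate transistors, pass transistors controlled by P0 to P3, pull-downs controlled by G0 to G3, carry input Ci,0, carry output Ci,4, and transistor sizing annotations; the chain nodes carry the values listed below]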

!(G0 | P0 Ci,0)

!(G1 | P1G0 | P1P0 Ci,0)

!(G2 | P2G1 | P2P1G0 | P2P1P0 Ci,0)

!(G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0 Ci,0)
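A purely behavioral model can reproduce these node values. The sketch below assumes the usual domino operation (all chain nodes precharged high, then conditionally discharged during evaluation) and treats a carry-in of 1 as a low level at the chain input. The function name `domino_manchester` and this input convention are assumptions made for illustration; the model says nothing about transistor sizing or delay.

```python
def domino_manchester(g: list[int], p: list[int], c_in: int) -> list[int]:
    """Return the chain node values !C1 .. !Cn after the evaluate phase."""
    # Precharge phase: every chain node is pulled high.
    nodes = [1] * len(g)
    # Evaluate phase: a node discharges through its Gi pull-down, or through
    # the Pi pass transistor if the previous node is already low.
    prev_low = bool(c_in)  # assumption: carry-in of 1 appears as a low input
    for i in range(len(g)):
        if g[i] or (p[i] and prev_low):
            nodes[i] = 0
        prev_low = nodes[i] == 0
    return nodes
```

With all Pi = 1, all Gi = 0, and a carry-in of 1, every node discharges in sequence, which is the worst-case path through N switches.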

[Figure: taxonomy of carry chain designs (Manchester carry chain, carry select, parallel prefix, conditional sum, carry skip), annotated with delay T and area A complexities: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(N), A = O(N); T = O(N), A = O(N)]

If (P0 & P1 & P2 & P3 = 1) then Co,3 = Ci,0; otherwise the block itself kills or generates the carry internally.

[Figure: 4-bit block of FAs (inputs A0 B0 through A3 B3, sums S0 through S3, carry-in Ci,0) with a bypass path controlled by BP that produces Co,3]

BP = P0 & P1 & P2 & P3 (block propagate)
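The bypass is safe because the skipped value always equals what the ripple chain would have produced. A short exhaustive check over one 4-bit block, assuming Pi = Ai ⊕ Bi (function names are illustrative):

```python
from itertools import product

def block_ripple(a_bits, b_bits, c_in):
    """Ripple the carry through one block; return (carry_out, block_propagate)."""
    carry, bp = c_in, 1
    for a, b in zip(a_bits, b_bits):
        p = a ^ b
        carry = (a & b) | (p & carry)
        bp &= p
    return carry, bp

def skip_never_changes_result() -> bool:
    """Check that BP = 1 implies Co,3 = Ci,0 for every 4-bit block input."""
    for bits in product((0, 1), repeat=9):
        a_bits, b_bits, c_in = bits[0:4], bits[4:8], bits[8]
        c_out, bp = block_ripple(a_bits, b_bits, c_in)
        if bp and c_out != c_in:
            return False
    return True
```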

Carry-Skip Chain Implementation

[Figure: carry chain with inputs Cin, P0 to P3, and G0 to G3 producing !Cout, plus a bypass controlled by BP that selects between the block carry-in and the block carry-out to form the carry-out]

Worst-case delay: a carry from bit 0 to bit 15. The carry is generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups (B is the group size in bits), and ripples in the last group from bit 12 to bit 15.

[Figure: 16-bit carry-skip adder with four 4-bit groups (bits 0 to 3, bits 4 to 7, bits 8 to 11, bits 12 to 15), each with setup, carry propagation, and sum stages; carry-in Ci,0]

Tadd = tsetup + B tcarry + ((N/B) - 1) tskip + B tcarry + tsum
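The delay formula can be evaluated directly to compare block sizes. The function below is a sketch; its name, parameter names, and the unit default delays are illustrative assumptions, and it requires B to divide N.

```python
def carry_skip_delay(n: int, b: int, t_setup: float = 1.0, t_carry: float = 1.0,
                     t_skip: float = 1.0, t_sum: float = 1.0) -> float:
    """Worst-case delay per the formula above, for N bits in groups of B bits."""
    if n % b != 0:
        raise ValueError("this sketch assumes B divides N evenly")
    groups = n // b
    # setup + ripple in first block + skips + ripple in last block + sum
    return t_setup + b * t_carry + (groups - 1) * t_skip + b * t_carry + t_sum
```

With unit delays, `carry_skip_delay(16, 4)` evaluates to 13.0 (1 + 4 + 3 + 4 + 1).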

Optimal Block Size and Time

- Assuming one stage of ripple (tcarry) has the same delay as one skip logic stage (tskip) and both are 1:

  TCSkA = 1 + B + (N/B - 1) + B + 1 = 2B + N/B + 1

  (terms in order: tsetup, ripple in block 0, skips, ripple in last block, tsum)

- So the optimal block size, B, is

  dTCSkA/dB = 0 ⇒ Bopt = √(N/2)

- And the optimal time is

  Optimal TCSkA = 2√(2N) + 1
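The intermediate steps of this optimization, written out as a worked derivation under the same unit-delay assumption:

```latex
\begin{align*}
T_{CSkA}(B) &= 2B + \frac{N}{B} + 1 \\
\frac{dT_{CSkA}}{dB} &= 2 - \frac{N}{B^{2}} = 0
  \;\Longrightarrow\; B_{opt} = \sqrt{N/2} \\
\frac{d^{2}T_{CSkA}}{dB^{2}} &= \frac{2N}{B^{3}} > 0
  \quad\text{(so this is a minimum)} \\
T_{CSkA}(B_{opt}) &= 2\sqrt{N/2} + \frac{N}{\sqrt{N/2}} + 1
  = \sqrt{2N} + \sqrt{2N} + 1 = 2\sqrt{2N} + 1
\end{align*}
```

For example, N = 32 gives Bopt = 4 and an optimal time of 2·8 + 1 = 17 unit delays, which matches 2B + N/B + 1 = 8 + 8 + 1.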

Carry-Skip Adder Extensions

- Variable block sizes
  - A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be bigger without increasing the overall delay

[Figure: carry-skip adder with variable block sizes, from Cin to Cout]

- Multiple levels of skip logic

[Figure: carry-skip adder with two levels of skip logic (skip level 1 and skip level 2), from Cin to Cout]

AN