Download - ΗΜΥ 307 ΨΗΦΙΑΚΑ ΟΛΟΚΛΗΡΩΜΕΝΑ ΚΥΚΛΩΜΑΤΑ … 14-15-ECE... · Mirror Adder Features ... qInverting all inputs to a FA results in inverted values for all


ΗΜΥ 307 (ECE 307): Digital Integrated Circuits

Spring Semester 2018

Lectures 14-15: Arithmetic and Logic Circuits

Χάρης Θεοχαρίδης ([email protected]) (ack: Prof. Mary Jane Irwin and Vijay Narayanan)

[Adapted from “Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.”]


ΗΜΥ307 Δ14-15 Arithmetic and Logic Circuits .2 © Θεοχαρίδης, ΗΜΥ, 2018

Review: Basic Building Blocks

- Datapath
  - Execution units
    - Adder, multiplier, divider, shifter, etc. – Today!
  - Register file and pipeline registers – MEMORY – See below!
  - Multiplexers, decoders, etc. – THIS lecture (and L.15)
- Control
  - Finite state machines (PLA, ROM – Lecture 17)
- Interconnect
  - Switches, arbiters, buses – Lecture 16
- Memory
  - Caches (SRAMs), TLBs, DRAMs, buffers
  - Lecture 17


The 1-bit Binary Adder

[Figure: 1-bit full adder (FA) block with inputs A, B, Cin and outputs S, Cout]

S = A ⊕ B ⊕ Cin = P ⊕ Cin
Cout = A&B | A&Cin | B&Cin (majority function) = G | P&Cin

where G = A&B, P = A ⊕ B, K = !A & !B

A  B  Cin | Cout  S | carry status
0  0  0   | 0     0 | kill
0  0  1   | 0     1 | kill
0  1  0   | 0     1 | propagate
0  1  1   | 1     0 | propagate
1  0  0   | 0     1 | propagate
1  0  1   | 1     0 | propagate
1  1  0   | 1     0 | generate
1  1  1   | 1     1 | generate

- How can we use it to build a 64-bit adder?
- How can we modify it easily to build an adder/subtractor?
- How can we make it better (faster, lower power, smaller)?
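The truth table and the G/P/K definitions can be checked mechanically. The following is a minimal Python sketch; the function names `full_adder` and `carry_status` are illustrative and do not come from the slides.

```python
from itertools import product

def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (S, Cout) for one-bit inputs, using the generate/propagate form."""
    g = a & b              # generate
    p = a ^ b              # propagate
    s = p ^ cin            # S = P xor Cin
    cout = g | (p & cin)   # Cout = G | P&Cin
    return s, cout

def carry_status(a: int, b: int) -> str:
    """Classify an input pair as kill, propagate, or generate."""
    if a & b:
        return "generate"
    if a ^ b:
        return "propagate"
    return "kill"

for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    majority = (a & b) | (a & cin) | (b & cin)
    assert cout == majority                 # both Cout forms agree
    assert 2 * cout + s == a + b + cin      # the pair (Cout, S) encodes the sum
    print(a, b, cin, cout, s, carry_status(a, b))
```

Each printed row should match the corresponding row of the truth table.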


FA Gate Level Implementations

[Figure: two gate-level FA implementations, each with inputs A, B, Cin and outputs S, Cout; internal nodes are labeled t0, t1, t2]

- The way you learned to design in ECE 210 and ECE 211


Review: XOR FA

[Figure: XOR-based FA schematic with inputs A, B, Cin and outputs S, Cout]

16 transistors


Review: CPL FA

[Figure: complementary pass-transistor logic (CPL) FA with dual-rail inputs A/!A, B/!B, Cin/!Cin and dual-rail outputs S/!S, Cout/!Cout]

20+8 transistors, dual rail – beware of threshold drops


Review: Mirror Adder

[Figure: mirror adder schematic with inputs A, B, Cin and outputs !Cout, !S; the carry-stage paths are labeled kill, generate, 0-propagate, and 1-propagate, and each transistor is annotated with its width]

24+4 transistors

Cout = A&B | B&Cin | A&Cin
SUM = A&B&Cin | !Cout&(A | B | Cin)

Sizing: Each input in the carry circuit has a logical effort of 2, so the optimal fan-out for each is also 2. Since !Cout drives 2 internal and 2 inverter transistor gates (to form Cin for the next most significant (nms) bit adder), the carry circuit should be oversized. PMOS/NMOS ratio of 2.
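As a sanity check on the mirror adder equations, this small Python sketch evaluates them with the complemented carry (!Cout) in the sum term and compares the result against ordinary addition. The name `mirror_adder` is hypothetical.

```python
from itertools import product

def mirror_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Evaluate the mirror adder logic equations; returns (S, Cout)."""
    cout = (a & b) | (b & cin) | (a & cin)
    not_cout = cout ^ 1
    # The sum reuses the complemented carry: S = A&B&Cin | !Cout&(A|B|Cin)
    s = (a & b & cin) | (not_cout & (a | b | cin))
    return s, cout

for a, b, cin in product((0, 1), repeat=3):
    s, cout = mirror_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin
```

Using the uncomplemented carry in the sum term fails this check, for example at A=1, B=0, Cin=0.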


Mirror Adder Features

- The NMOS and PMOS chains are completely symmetrical with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.
- When laying out the cell, the most critical issue is the minimization of the capacitances at node !Cout (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.
- The transistors connected to Cin are placed closest to the output.
- Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.


A 64-bit Adder/Subtractor

[Figure: 64-bit ripple carry adder/subtractor: 64 1-bit FAs with inputs A0 B0 to A63 B63, outputs S0 to S63, carries C0 = Cin through C64 = Cout, and an add/subt control line]

- Ripple Carry Adder (RCA) built out of 64 FAs
- Subtraction – complement all subtrahend bits (XOR gates) and set the low order carry-in
- RCA
  - advantage: simple logic, small (low cost)
  - disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
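The adder/subtractor can be modeled bit by bit. The sketch below is a behavioral Python model, not a circuit description; the function name `ripple_add_sub` and its parameters are assumptions made for illustration. A single control bit drives both the XOR gates on the B inputs and the low order carry-in.

```python
def ripple_add_sub(a: int, b: int, subtract: bool = False, n: int = 64) -> tuple[int, int]:
    """N-bit ripple-carry add or subtract; returns (result, carry_out)."""
    ctl = 1 if subtract else 0   # add/subt control line
    carry = ctl                  # low order carry-in is set for subtraction
    result = 0
    for i in range(n):           # one full adder per bit, carry ripples upward
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ ctl            # XOR gate complements B when subtracting
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry

print(ripple_add_sub(5, 3))                      # addition
print(ripple_add_sub(5, 3, subtract=True))       # 5 - 3
print(ripple_add_sub(3, 5, subtract=True, n=8))  # 3 - 5 in 8-bit two's complement
```

The loop makes the O(N) delay visible: each iteration needs the carry produced by the previous one.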


Ripple Carry Adder (RCA)

[Figure: 4-bit ripple carry adder: four FAs with inputs A0 B0 to A3 B3, outputs S0 to S3, C0 = Cin, and Cout = C4]

T = O(N) worst case delay

Tadder ≈ TFA(A,B→Cout) + (N-2) TFA(Cin→Cout) + TFA(Cin→S)

Real Goal: Make the fastest possible carry path


Inversion Property

[Figure: an FA with inputs A, B, Cin and outputs S, Cout is equivalent (≡) to an FA with all inputs and all outputs inverted]

!S (A, B, Cin) = S(!A, !B, !Cin)
!Cout (A, B, Cin) = Cout (!A, !B, !Cin)

- Inverting all inputs to a FA results in inverted values for all outputs
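The inversion property is easy to confirm exhaustively, since there are only eight input combinations. A minimal Python sketch, with illustrative names:

```python
from itertools import product

def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (S, Cout) of a 1-bit full adder."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def inversion_property_holds() -> bool:
    """Check !S(A,B,Cin) = S(!A,!B,!Cin) and the same for Cout on all inputs."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        s_inv, cout_inv = full_adder(a ^ 1, b ^ 1, cin ^ 1)
        if (s_inv, cout_inv) != (s ^ 1, cout ^ 1):
            return False
    return True

print(inversion_property_holds())
```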


Exploiting the Inversion Property

[Figure: 4-bit ripple carry adder built from FA’ cells with inputs A0 B0 to A3 B3, outputs S0 to S3, C0 = Cin, and Cout = C4; the legend distinguishes a regular cell from an inverted cell]

Now need two “flavors” of FAs: regular cell and inverted cell

- Minimizes the critical path (the carry chain) by eliminating inverters between the FAs (will need to increase the transistor sizing on the carry chain portion of the mirror adder).


Fast Carry Chain Design

- The key to fast addition is a low latency carry network
- What matters is whether in a given position a carry is
  - generated: Gi = Ai & Bi = AiBi
  - propagated: Pi = Ai ⊕ Bi (sometimes use Ai | Bi)
  - annihilated (killed): Ki = !Ai & !Bi
- Giving a carry recurrence of
  Ci+1 = Gi | PiCi

C1 = G0 | P0C0

C2 = G1 | P1G0 | P1P0 C0

C3 = G2 | P2G1 | P2P1G0 | P2P1P0 C0

C4 = G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0 C0
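The expanded expressions for C1 to C4 follow from unrolling the recurrence. The Python sketch below compares the two forms for every 4-bit operand pair; all function names are hypothetical helpers.

```python
from itertools import product

def gp(a: int, b: int, n: int = 4):
    """Per-bit generate and propagate lists (index 0 = least significant bit)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p

def carries_by_recurrence(g, p, c0):
    """C(i+1) = Gi | Pi&Ci, evaluated serially; returns [C1, ..., Cn]."""
    c = [c0]
    for gi, pi in zip(g, p):
        c.append(gi | (pi & c[-1]))
    return c[1:]

def carries_by_lookahead(g, p, c0):
    """The unrolled expressions for C1..C4, each a function of G, P, and C0 only."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]

for a, b, c0 in product(range(16), range(16), (0, 1)):
    g, p = gp(a, b)
    assert carries_by_recurrence(g, p, c0) == carries_by_lookahead(g, p, c0)
```

The unrolled form removes the serial dependence on the previous carry, at the cost of wider gates for the higher bit positions.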


Manchester Carry Chain

- Switches controlled by Gi and Pi
- Total delay of
  - time to form the switch control signals Gi and Pi
  - setup time for the switches
  - signal propagation delay through N switches in the worst case

[Figure: one Manchester carry chain stage, clocked by clk, with switches controlled by Gi and Pi between !Ci and !Ci+1]


4-bit Sliced MCC Adder

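[Figure: 4-bit sliced Manchester carry chain adder. Each slice forms G (&) and P (⊕) from Ai and Bi, drives one chain stage clocked by clk, and forms Si with a further ⊕; the chain runs from !C0 through !C1, !C2, !C3 to !C4]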


Domino Manchester Carry Chain Circuit

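[Figure: domino Manchester carry chain circuit, precharged by clk, with carry-in Ci,0, switches controlled by P0 to P3 and G0 to G3, and carry-out Ci,4; nodes and transistors carry numeric annotations]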

!(G0 | P0 Ci,0)

!(G1 | P1G0 | P1P0 Ci,0)

!(G2 | P2G1 | P2P1G0 | P2P1P0 Ci,0)

!(G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0 Ci,0)


Binary Adder Landscape

synchronous word parallel adders
- ripple carry adders (RCA)
- carry prop min adders
  - signed-digit adders
  - fast carry prop adders
    - Manchester carry chain
    - carry select
    - parallel prefix
    - conditional sum
    - carry skip
  - residue adders

Complexity annotations in the figure (T = time, A = area): T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(√N), A = O(N); T = O(N), A = O(N)


Carry-Skip (Carry-Bypass) Adder

If (P0 & P1 & P2 & P3 = 1) then Co,3 = Ci,0 otherwise the block itself kills or generates the carry internally

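[Figure: 4-bit carry-skip block: four FAs with inputs A0 B0 to A3 B3, outputs S0 to S3, carry-in Ci,0, and a bypass multiplexer that selects between the rippled carry Co,3 and Ci,0 under control of BP]

BP = P0 P1 P2 P3 “Block Propagate”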
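A behavioral model makes the bypass rule concrete. The Python sketch below is a functional model only (it does not model delay); the name `carry_skip_add` and its parameters are assumptions chosen for illustration.

```python
def carry_skip_add(a: int, b: int, cin: int = 0, n: int = 16, block: int = 4) -> tuple[int, int]:
    """Add two n-bit numbers block by block with a carry-bypass mux per block."""
    carry = cin
    result = 0
    for lo in range(0, n, block):
        block_cin = carry
        bp = 1                                   # block propagate = AND of all Pi
        for i in range(lo, min(lo + block, n)):
            ai, bi = (a >> i) & 1, (b >> i) & 1
            p = ai ^ bi
            bp &= p
            result |= (p ^ carry) << i           # Si = Pi xor Ci
            carry = (ai & bi) | (p & carry)      # ripple inside the block
        # Bypass mux: if every bit propagates, the block carry-out is simply
        # the block carry-in; otherwise use the internally rippled carry.
        carry = block_cin if bp else carry
    return result, carry

print(carry_skip_add(0xFFFF, 1))
print(carry_skip_add(0x00F0, 0x0010))
```

Functionally the mux never changes the answer, because a fully propagating block ripples its carry-in to its carry-out anyway. The benefit in hardware is that the carry-out becomes available without waiting for the ripple.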


Carry-Skip Chain Implementation

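[Figure: carry-skip chain implementation: a carry chain with switches controlled by P0 to P3 and G0 to G3 from Cin to !Cout, plus a bypass path controlled by BP that connects the block carry-in to the block carry-out]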


4-bit Block Carry-Skip Adder

Worst-case delay → carry from bit 0 to bit 15 = carry generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups (B is the group size in bits), ripples in the last group from bit 12 to bit 15

[Figure: 16-bit carry-skip adder in four 4-bit blocks (bits 0 to 3, 4 to 7, 8 to 11, 12 to 15), each with Setup, Carry Propagation, and Sum stages; carry-in Ci,0]

Tadd = tsetup + B tcarry + ((N/B) - 1) tskip + B tcarry + tsum


Optimal Block Size and Time

- Assuming one stage of ripple (tcarry) has the same delay as one skip logic stage (tskip) and both are 1

  TCSkA = 1 + B + (N/B - 1) + B + 1
        = 2B + N/B + 1

  (the five terms are, in order: tsetup, ripple in block 0, skips, ripple in last block, tsum)

- So the optimal block size, B, is

  dTCSkA/dB = 0 ⇒ Bopt = √(N/2)

- And the optimal time is

  Optimal TCSkA = 2√(2N) + 1
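The derivation can be checked numerically. This Python sketch evaluates the unit-delay model for a few word sizes and compares the closed-form optimum with a brute-force search over integer block sizes; the function names are illustrative.

```python
import math

def t_cska(n: int, b: float) -> float:
    """Unit-delay carry-skip model: setup + ripple + skips + ripple + sum."""
    return 1 + b + (n / b - 1) + b + 1

def optimal_block(n: int) -> float:
    """Block size where dT/dB = 2 - N/B^2 = 0."""
    return math.sqrt(n / 2)

def best_integer_block(n: int) -> int:
    """Brute-force minimum over integer block sizes 1..n."""
    return min(range(1, n + 1), key=lambda b: t_cska(n, b))

for n in (8, 32, 128):
    b = optimal_block(n)
    print(n, b, t_cska(n, b), 2 * math.sqrt(2 * n) + 1, best_integer_block(n))
```

For these word sizes N/2 is a perfect square, so the closed-form block size is an integer and matches the brute-force result. For other N the real-valued optimum has to be rounded to a nearby integer.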


Carry-Skip Adder Extensions

- Variable block sizes
  - A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be bigger without increasing the overall delay

[Figure: carry-skip adder with variable block sizes between Cin and Cout]

- Multiple levels of skip logic

[Figure: two-level skip logic between Cin and Cout; skip level 2 is controlled by the AND of the first level skip signals (BP’s)]


Carry-Skip Adder Comparisons

[Plot: delay (axis from 0 to 70) versus adder width (8, 16, 32, 48, 64 bits) for RCA, CSkA, and VSkA; points are labeled with block sizes B=2, B=3, B=4, B=5, B=6]


Parallel Prefix Adders (PPAs)

- Define carry operator € on (G,P) signal pairs

  (G,P) = (G’’,P’’) € (G’,P’)

  where G = G’’ | P’’G’ and P = P’’P’

- € is associative, i.e.,
  [(g’’’,p’’’) € (g’’,p’’)] € (g’,p’) = (g’’’,p’’’) € [(g’’,p’’) € (g’,p’)]

[Figure: € cell combining (G’’,P’’) and (G’,P’) into (G,P), with a circuit that forms !G from G’, G’’, and P’’]
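The algebraic properties of the carry operator can be verified by enumeration, since each operand is one of only four (G,P) pairs. In this Python sketch the function name `carry_op` is a hypothetical stand-in for €, with the first argument playing the role of (G’’,P’’).

```python
from itertools import product

def carry_op(hi: tuple[int, int], lo: tuple[int, int]) -> tuple[int, int]:
    """(G,P) = (G'',P'') op (G',P') with G = G'' | P''&G' and P = P''&P'."""
    g2, p2 = hi
    g1, p1 = lo
    return (g2 | (p2 & g1), p2 & p1)

pairs = list(product((0, 1), repeat=2))   # all four possible (G,P) values

associative = all(
    carry_op(carry_op(x, y), z) == carry_op(x, carry_op(y, z))
    for x in pairs for y in pairs for z in pairs
)
commutative = all(carry_op(x, y) == carry_op(y, x) for x in pairs for y in pairs)

print("associative:", associative)
print("commutative:", commutative)
```

Associativity is what allows the prefix terms to be grouped into a tree. The operand order still matters, for example (1,0) and (0,0) give different results when swapped.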


PPA General Structure

- Given P and G terms for each bit position, computing all the carries is equal to finding all the prefixes in parallel
  (G0,P0) € (G1,P1) € (G2,P2) € … € (GN-2,PN-2) € (GN-1,PN-1)
- Since € is associative, we can group them in any order
  - but note that it is not commutative
- Measures to consider
  - number of € cells
  - tree cell depth (time)
  - tree cell area
  - cell fan-in and fan-out
  - max wiring length
  - wiring congestion
  - delay path variation (glitching)

[Figure: three-stage PPA structure: Pi, Gi logic (1 unit delay); Ci parallel prefix logic tree (1 unit delay per level); Si logic (1 unit delay)]


Brent-Kung PPA

[Figure: 16-bit Brent-Kung parallel prefix computation. Inputs (G0,P0) to (G15,P15) and Cin feed a tree of € cells that produces C1 to C16. Annotations in the figure: T = log2 N, T = log2 N - 2, A = 2 log2 N, A = N/2]


Kogge-Stone PPF Adder

[Figure: 16-bit Kogge-Stone parallel prefix computation. Inputs (G0,P0) to (G15,P15) and Cin feed rows of € cells that produce C1 to C16. Annotations in the figure: T = log2 N, A = log2 N, A = N]

Tadd = tsetup + (log2 N) t€ + tsum
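The logarithmic depth can be seen in a behavioral model. The following Python sketch of a Kogge-Stone style prefix network is an illustration under stated assumptions: the function name is hypothetical, and the carry-in is folded into bit 0's generate term, which is one common way to handle Cin and is not necessarily how the figure wires it.

```python
def kogge_stone_add(a: int, b: int, cin: int = 0, n: int = 16) -> tuple[int, int, int]:
    """Return (sum, carry_out, prefix_levels) for an n-bit addition."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]     # setup: Gi
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # setup: Pi
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & cin)      # assumption: fold Cin into bit 0
    levels, d = 0, 1
    while d < n:                    # one row of carry-operator cells per level
        new_G, new_P = G[:], P[:]
        for i in range(d, n):       # combine span ending at i with span ending at i-d
            new_G[i] = G[i] | (P[i] & G[i - d])
            new_P[i] = P[i] & P[i - d]
        G, P = new_G, new_P
        d *= 2
        levels += 1
    carries = [cin] + G             # carries[i] is the carry into bit i
    total = 0
    for i in range(n):              # sum logic: Si = Pi xor Ci
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n], levels

print(kogge_stone_add(1234, 4321))
print(kogge_stone_add(0xFFFF, 1))
```

For n = 16 the loop runs four times, matching the log2 N term in the delay expression.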


Normalized Delay - Adder Comparisons

[Plot: normalized delay (axis from 0 to 70) versus adder width (8, 16, 32, 48, 64 bits) for RCA, CSkA, VSkA, and KS PPA]


Multiply Operation

- Multiplication as repeated additions

[Figure: an N-bit multiplicand times an N-bit multiplier gives a partial product array (N rows, can be formed in parallel) that is summed into a 2N-bit double precision product]


Shift & Add Multiplication

- Right shift and add
  - Partial product array rows are accumulated from top to bottom on an N-bit adder
  - After each addition, right shift (by one bit) the accumulated partial product to align it with the next row to add
  - Time for N bits: Tserial_mult = O(N Tadder) = O(N^2) for a RCA
- Making it faster
  - Use a faster adder
  - Use higher radix (e.g., base 4) multiplication
    - Use multiplier recoding to simplify multiple formation
  - Form partial product array in parallel and add it in parallel
- Making it smaller (i.e., slower)
  - Use an array multiplier
    - Very regular structure with only short wires to nearest neighbor cells. Thus, very simple and efficient layout in VLSI
    - Can be easily and efficiently pipelined
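The right-shift-and-add procedure can be sketched in a few lines of Python. This is a behavioral model for unsigned operands; the register names `acc` and `low` and the function name are assumptions made for illustration.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int = 8) -> int:
    """Unsigned n-bit by n-bit multiply using n add-and-right-shift steps."""
    mask = (1 << n) - 1
    acc = 0                    # upper half of the running product (adder output)
    low = multiplier & mask    # lower half: multiplier bits out, product bits in
    for _ in range(n):
        if low & 1:                          # current multiplier bit selects the row
            acc += multiplicand & mask       # N-bit add; carry-out kept in bit n
        # Right shift the combined {acc, low} register by one bit.
        low = (low >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    return (acc << n) | low    # 2N-bit double precision product

print(shift_add_multiply(13, 11))
print(shift_add_multiply(255, 255))
```

Each of the N iterations waits for one full addition, which is where the O(N Tadder) time comes from.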


Tree Multiplier Structure

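[Figure: tree multiplier structure. Multiple forming circuits select between D (‘icand) and 0 under control of Q (‘ier); their outputs feed a partial product array reduction tree, followed by a fast carry propagate adder (CPA) that produces P (product)]

Delay: mux + reduction tree (log N) + CPA (log N)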


(4,2) Counter

- Built out of two (3,2) counters (just FA’s!)
  - all of the inputs (4 external plus one internal) have the same weight (i.e., are in the same bit position)
  - the internal output is carried to the next higher weight position (indicated by the )

[Figure: (4,2) counter built from two stacked (3,2) counters]

Note: Two carry outs - one “internal” and one “external”
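The counting behavior of the (4,2) counter can be confirmed by enumeration. In this Python sketch the function and signal names are illustrative; the internal carry-out comes from the first FA and the internal carry-in enters the second FA.

```python
from itertools import product

def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """A (3,2) counter: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def counter_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """Return (sum, external_carry, internal_carry_out)."""
    s1, internal_cout = full_adder(x1, x2, x3)   # first (3,2) counter
    s, carry = full_adder(s1, x4, cin)           # second (3,2) counter
    return s, carry, internal_cout

for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
    s, carry, cout = counter_4_2(x1, x2, x3, x4, cin)
    # Both carries have twice the weight of the inputs.
    assert x1 + x2 + x3 + x4 + cin == s + 2 * (carry + cout)
    # The internal carry-out does not depend on the internal carry-in,
    # so tiled counters do not form a ripple chain.
    assert cout == counter_4_2(x1, x2, x3, x4, cin ^ 1)[2]
```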


Tiling (4,2) Counters

- Reduces columns four high to columns only two high
  - Tiles with neighboring (4,2) counters
  - Internal carry in at same “level” (i.e., bit position weight) as the internal carry out

[Figure: three neighboring (4,2) counters, each built from two (3,2) counters, tiled side by side]


4x4 Partial Product Array Reduction

[Figure: 4x4 multiplication showing the multiplicand, the multiplier, the partial product array, the reduced pp array (to CPA), and the double precision product]

- Fast 4x4 multiplication using (4,2) counters


8x8 Partial Product Array Reduction

[Figure: 8x8 multiplication (‘icand times ‘ier) showing the partial product array and the reduced partial product array]

How many (4,2) counters minimum are needed to reduce it to 2 rows?

Answer: 24


Alternate 8x8 Partial Product Array Reduction

[Figure: 8x8 multiplication (‘icand times ‘ier) showing the partial product array and an alternate reduced partial product array]

More (4,2) counters, so what is the advantage?


Array Reduction Layout Approach

[Figure: array reduction layout: multiple generators (fed by the multiplicand and by multiple selection signals from the ‘ier) on top of a stack of (4,2) counter slices, with the CPA at the bottom]


Parallel Programmable Shifters

[Figure: shifter block with Data In, Data Out, and Control inputs]

Control = shift amount, shift direction, shift type (logical, arith, circular)

Shifters used in multipliers, floating point units

Consume lots of area if done in random logic gates


A Programmable Binary Shifter

[Figure: programmable shifter cell with inputs Ai, Ai-1, control lines rgt, nop, left, and outputs Bi, Bi-1]

Ai  Ai-1 | rgt  nop  left | Bi  Bi-1
A1  A0   | 0    1    0    | A1  A0
A1  A0   | 1    0    0    | 0   A1
A1  A0   | 0    0    1    | A0  0


4-bit Barrel Shifter

[Figure: 4-bit barrel shifter array with inputs A0 to A3, outputs B0 to B3, and control lines Sh0 to Sh3]

Example:
Sh0 = 1: B3B2B1B0 = A3A2A1A0
Sh1 = 1: B3B2B1B0 = A3A3A2A1
Sh2 = 1: B3B2B1B0 = A3A3A3A2
Sh3 = 1: B3B2B1B0 = A3A3A3A3

Area dominated by wiring
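The example rows describe a right shift that replicates A3 into the vacated positions. A short Python sketch reproduces them symbolically; the function name and the list-based encoding of the one-hot control are assumptions for illustration.

```python
def barrel_shift(a_bits: list, sh: list) -> list:
    """a_bits = [A0, A1, A2, A3]; sh = one-hot [Sh0, Sh1, Sh2, Sh3].

    Returns [B0, B1, B2, B3] for a right shift that repeats the top bit.
    """
    if sum(sh) != 1:
        raise ValueError("exactly one Sh line must be active")
    n = len(a_bits)
    k = sh.index(1)                                   # shift distance
    return [a_bits[min(i + k, n - 1)] for i in range(n)]

a = ["A0", "A1", "A2", "A3"]
for k in range(4):
    sh = [1 if j == k else 0 for j in range(4)]
    b = barrel_shift(a, sh)
    # Print most significant bit first, as in the example rows.
    print(f"Sh{k} = 1: B3B2B1B0 = " + "".join(reversed(b)))
```

Because only one control line is active at a time, every output is connected to its source through a single pass transistor regardless of the shift distance.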


4-bit Barrel Shifter Layout

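[Figure: 4-bit barrel shifter layout with inputs A0 to A3, control lines Sh0 to Sh3, and an output buffer; Width_barrel marks the width of the array]

Width_barrel ~ 2 pm N, where N = max shift distance and pm = metal pitch

Delay ~ 1 fet + N diff caps

Only one Sh# active at a time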


8-bit Logarithmic Shifter

[Figure: logarithmic shifter with inputs A0 to A3, outputs B0 to B3, and three stages controlled by Sh1/!Sh1, Sh2/!Sh2, and Sh3/!Sh3]


8-bit Logarithmic Shifter Layout Slice

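Width_log ~ pm(2K + (1 + 2 + … + 2^(K-1))) = pm(2K + 2^K - 1), where K = log2 N

Delay ~ K fets + 2 diff caps

[Figure: logarithmic shifter layout slice with inputs A0 to A3, outputs B0 to B3, and stages that shift by 1, 2, and 4]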


Shifter Implementation Comparisons

N   K | Barrel width | Barrel speed | Logarithmic width | Logarithmic speed
      | 2 N pm       | 1 + N diffs  | pm(2K + 2^K - 1)  | K + 2 diffs
8   3 | 16 pm        | 1 + 8        | 13 pm             | 3 + 2
16  4 | 32 pm        | 1 + 16       | 23 pm             | 4 + 2
32  5 | 64 pm        | 1 + 32       | 41 pm             | 5 + 2
64  6 | 128 pm       | 1 + 64       | 75 pm             | 6 + 2
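The table entries follow directly from the two width expressions. This Python sketch recomputes them in units of the metal pitch pm; the function names are illustrative, and N is assumed to be a power of two.

```python
def barrel_width(n: int) -> int:
    """Barrel shifter width in metal pitches: 2 N."""
    return 2 * n

def log_width(n: int) -> int:
    """Logarithmic shifter width in metal pitches: 2K + 2^K - 1, K = log2 N."""
    k = n.bit_length() - 1      # exact log2 for powers of two
    return 2 * k + 2 ** k - 1

for n in (8, 16, 32, 64):
    k = n.bit_length() - 1
    print(f"N={n:2d} K={k}  barrel: {barrel_width(n):3d} pm, 1 + {n} diffs   "
          f"log: {log_width(n):2d} pm, {k} + 2 diffs")
```

The barrel width grows linearly in N, while the logarithmic width is dominated by the 2^K - 1 = N - 1 term, so it is roughly half the barrel width for larger N.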


Decoders

- Decodes inputs to activate one of many outputs
  - two inverters, four 2-input nand gates, four inverters plus enable logic
  - how about for a 3-to-8, 4-to-16, etc. decoder?

[Figure: 2x4 decoder block with inputs In0, In1, and Enable]

Out0 = !In1 & !In0
Out1 = !In1 & In0
Out2 = In1 & !In0
Out3 = In1 & In0
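The four output equations translate directly into code. The Python sketch below assumes an active-high enable that gates every output, which the slide does not specify; `decode` is a hypothetical generalization to wider decoders.

```python
def decode_2to4(in1: int, in0: int, enable: int = 1) -> list[int]:
    """Return [Out0, Out1, Out2, Out3]; enable is assumed active high."""
    n1, n0 = in1 ^ 1, in0 ^ 1          # the two input inverters
    return [
        enable & n1 & n0,              # Out0 = !In1 & !In0
        enable & n1 & in0,             # Out1 = !In1 &  In0
        enable & in1 & n0,             # Out2 =  In1 & !In0
        enable & in1 & in0,            # Out3 =  In1 &  In0
    ]

def decode(value: int, width: int, enable: int = 1) -> list[int]:
    """Behavioral width-to-2^width decoder: one-hot output at index `value`."""
    return [enable & int(value == i) for i in range(1 << width)]

print(decode_2to4(1, 0))
print(decode(5, 3))
```

A flat k-to-2^k decoder needs one k-input gate per output, which is one reason larger decoders are usually composed from smaller ones.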


Dynamic NOR Decoder

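[Figure: dynamic NOR decoder schematic with precharge devices to Vdd, pull-down paths to GND, address lines A0, !A0, A1, !A1, and outputs B0 to B3]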


Dynamic NAND Decoder

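[Figure: dynamic NAND decoder schematic with precharge devices, series pull-down paths to GND, address lines A0, !A0, A1, !A1, and outputs B0 to B3]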


Building Big Decoders from Small

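[Figure: hierarchical decoder built from a 1x2 decoder (A4, with enable), 2x4 decoders for A3 A2, and 2x4 decoders for A1 A0; labels: active low enable, active low output; example annotations: 0 0 0 0 1 and 1 → 0 → 1]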


Multiplexers

- Selects one of several inputs to gate to the single output
  - two inverters, four 3-input nands, one 4-input nand
  - how about for an 8x1, 16x1, etc. mux?

[Figure: 4x1 multiplexer block with inputs In0 to In3 and selects S1, S0]

Out = In0 & !S1 & !S0 | In1 & !S1 & S0 | In2 & S1 & !S0 | In3 & S1 & S0


Review: TG 2x1 Multiplexer

[Figure: transmission gate (TG) 2x1 multiplexer, schematic and layout, with inputs In1, In2, select S/!S, output F, and VDD/GND rails]

F = !((In1 & S) | (In2 & !S))


Building Big Muxes from Small

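[Figure: 4x1 multiplexer built from three 2x1 multiplexers: S0 selects within the pairs (A0, A1) and (A2, A3), and S1 selects between the two results to drive Out]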
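A 4x1 multiplexer built as a tree of 2x1 multiplexers should match the flat sum-of-products form of a 4x1 mux. The Python sketch below checks this for all inputs; the function names are hypothetical, and the 2x1 cell is modeled as non-inverting with select 0 choosing its first input.

```python
from itertools import product

def mux4_flat(in0: int, in1: int, in2: int, in3: int, s1: int, s0: int) -> int:
    """Flat 4x1 mux: OR of four select-gated AND terms."""
    ns1, ns0 = s1 ^ 1, s0 ^ 1
    return ((in0 & ns1 & ns0) | (in1 & ns1 & s0)
            | (in2 & s1 & ns0) | (in3 & s1 & s0))

def mux2(a: int, b: int, s: int) -> int:
    """Non-inverting 2x1 mux: returns a when s = 0, b when s = 1."""
    return (a & (s ^ 1)) | (b & s)

def mux4_tree(in0: int, in1: int, in2: int, in3: int, s1: int, s0: int) -> int:
    """4x1 mux from three 2x1 muxes: S0 picks within pairs, S1 picks the pair."""
    return mux2(mux2(in0, in1, s0), mux2(in2, in3, s0), s1)

for bits in product((0, 1), repeat=6):
    assert mux4_flat(*bits) == mux4_tree(*bits)
```

The tree trades gate fan-in for depth: an N-input mux built this way has log2 N levels of 2x1 cells.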


Review: Datapath Bit-Sliced Organization

[Figure: bit-sliced datapath with slices Bit 0 to Bit 3; arrows mark the Control Flow and Data Flow directions. Blocks shown include the Register File (with decoder), Pipeline Registers, Adder, Shifter, and Multiplexers; connections are labeled From I$ and To/From D$]

Tile identical bit-slice elements


Layout of Bit-Sliced Datapaths


Layout of Bit-sliced Datapaths

Without feedthroughs or pitch matching (4.2 µm²)

With feedthroughs (3.2 µm²)

With feedthroughs and pitch matching (2.2 µm²)


Alpha 21264 Integer Unit Datapath

[Figure: annotated layout of the Alpha 21264 integer unit datapath. Labeled blocks: Multimedia engine, Shifter, Intercluster bypass, Adder, Logic box, Register file, Register file decoder, Logic box, Adder, Intercluster bypass, Load bypass, Store FIFO, Address drivers, tristate bus driver, bus driver. Labeled signals: RC1_0, RC1_1, RC2_0, RC2_1, LSD_1, LSD_0, to D$]