ΗΜΥ 307 (ECE 307): Digital...

  • date post

    01-Sep-2018

  • ΗΜΥ 307, 2018

    Lectures 14-15: Arithmetic and Logic Circuits

    (ttheocharides@ucy.ac.cy) (ack: Prof. Mary Jane Irwin and Vijay Narayanan)

    [Rabaey's Digital Integrated Circuits, 2002, J. Rabaey et al.]

  • 307 14-15 Arithmetic and Logic Circuits .2 , , 2018

    Review: Basic Building Blocks

    • Datapath
        • Execution units: adder, multiplier, divider, shifter, etc. (Today!)
        • Register file and pipeline registers (MEMORY: see below!)
        • Multiplexers, decoders, etc. (THIS lecture and L.15)

    • Control: finite state machines (PLA, ROM; Lecture 17)

    • Interconnect: switches, arbiters, buses (Lecture 16)

    • Memory: caches (SRAMs), TLBs, DRAMs, buffers (Lecture 17)

  • Slide 3

    The 1-bit Binary Adder

    [Figure: 1-bit full adder (FA) block with inputs A, B, Cin and outputs S, Cout]

    S = A ⊕ B ⊕ Cin
    Cout = A&B | A&Cin | B&Cin (majority function)

    • How can we use it to build a 64-bit adder?
    • How can we modify it easily to build an adder/subtractor?
    • How can we make it better (faster, lower power, smaller)?

    A  B  Cin | Cout  S | carry status
    0  0  0   | 0     0 | kill
    0  0  1   | 0     1 | kill
    0  1  0   | 0     1 | propagate
    0  1  1   | 1     0 | propagate
    1  0  0   | 0     1 | propagate
    1  0  1   | 1     0 | propagate
    1  1  0   | 1     0 | generate
    1  1  1   | 1     1 | generate

    G = A&B
    P = A ⊕ B
    K = !A & !B

    S = P ⊕ Cin
    Cout = G | P&Cin
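The two formulations above can be checked against each other exhaustively. The following is a minimal Python sketch, not part of the original slides; the function names `full_adder` and `full_adder_gpk` are illustrative choices.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (S, Cout) for one-bit inputs, using the direct equations."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)  # majority function
    return s, cout


def full_adder_gpk(a: int, b: int, cin: int) -> tuple[int, int, str]:
    """Return (S, Cout, carry status) using generate/propagate/kill signals."""
    g = a & b
    p = a ^ b
    k = (1 - a) & (1 - b)
    assert g + p + k == 1  # exactly one of G, P, K is set for any A, B
    status = "generate" if g else "propagate" if p else "kill"
    return p ^ cin, g | (p & cin), status


# Walk the whole truth table and confirm both forms agree with arithmetic.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    s2, cout2, status = full_adder_gpk(a, b, cin)
    assert (s, cout) == (s2, cout2)
    assert a + b + cin == 2 * cout + s
    print(a, b, cin, cout, s, status)
```

The printed rows follow the same column order as the truth table (A, B, Cin, Cout, S, carry status).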

  • Slide 4

    FA Gate Level Implementations

    [Figure: two gate-level FA schematics with inputs A, B, Cin, outputs S, Cout, and internal nodes t0, t1, t2]

    • The way you learned to design in ECE 210 and ECE 211

  • Slide 5

    Review: XOR FA

    [Figure: XOR-based full adder circuit with inputs A, B, Cin and outputs S, Cout]

    16 transistors

  • Slide 6

    Review: CPL FA

    [Figure: complementary pass-transistor logic (CPL) full adder with dual-rail inputs A/!A, B/!B, Cin/!Cin and dual-rail outputs S/!S, Cout/!Cout]

    20+8 transistors, dual rail; beware of threshold drops

  • Slide 8

    Review: Mirror Adder

    [Figure: mirror adder transistor schematic producing !Cout and !S from A, B, Cin; the carry-stage networks are labeled kill, generate, 0-propagate, and 1-propagate, and the transistors are annotated with their sizes]

    24+4 transistors

    Cout = A&B | B&Cin | A&Cin
    SUM = A&B&Cin | !Cout&(A | B | Cin)

    Sizing: Each input in the carry circuit has a logical effort of 2, so the optimal fan-out for each is also 2. Since !Cout drives 2 internal and 2 inverter transistor gates (to form Cin for the next most significant (nms) bit adder), the carry circuit should be oversized. PMOS/NMOS ratio of 2.
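The sum expression of the mirror adder reuses the carry output in complemented form, !Cout (with the uncomplemented carry, A=1, B=0, Cin=0 would give SUM = 0). A small logic-level Python sketch, written for this transcript and ignoring all transistor-level behavior, confirms that the pair of equations is a correct full adder; `mirror_adder` is a hypothetical name.

```python
from itertools import product


def mirror_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Logic-level model of the mirror adder equations; returns (SUM, Cout)."""
    cout = (a & b) | (b & cin) | (a & cin)
    not_cout = 1 - cout
    # The sum stage reuses the already computed !Cout.
    s = (a & b & cin) | (not_cout & (a | b | cin))
    return s, cout


# Exhaustive check against integer addition.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = mirror_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin
```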

  • Slide 9

    Mirror Adder Features

    • The NMOS and PMOS chains are completely symmetrical, with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.

    • When laying out the cell, the most critical issue is the minimization of the capacitances at node !Cout (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.

    • The transistors connected to Cin are placed closest to the output.

    • Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.

  • Slide 10

    A 64-bit Adder/Subtractor

    [Figure: ripple chain of 64 1-bit FAs with inputs A0..A63 and B0..B63, sums S0..S63, carries C0 = Cin through C64 = Cout, and an add/subt control]

    • Ripple Carry Adder (RCA) built out of 64 FAs

    • Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in

    • RCA
        • advantage: simple logic, small (low cost)
        • disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
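The adder/subtractor described on this slide can be sketched at the bit level as follows. This is an illustrative Python model rather than the lecture's own code; the names `add_sub` and `subtract` are assumptions, and the word width `n` is a parameter so that small cases are easy to check.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (S, Cout) for one-bit inputs."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def add_sub(a: int, b: int, subtract: bool, n: int = 64) -> tuple[int, int]:
    """Ripple-carry add/subtract on n-bit words; returns (result, carry_out)."""
    ctl = 1 if subtract else 0
    carry = ctl                      # low-order carry-in is set for subtraction
    result = 0
    for i in range(n):               # bit 0 first, carry ripples upward
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ ctl    # XOR gate complements the subtrahend bit
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    return result, carry


print(add_sub(5, 3, subtract=False, n=8))
print(add_sub(5, 3, subtract=True, n=8))
```

Subtraction here is two's-complement: A - B is computed as A + !B + 1, so the result wraps modulo 2^n when B is larger than A.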

  • Slide 11

    Ripple Carry Adder (RCA)

    [Figure: 4-bit ripple chain of FAs with inputs A0..A3 and B0..B3, sums S0..S3, C0 = Cin, and Cout = C4]

    T = O(N) worst-case delay

    Tadder ≈ TFA(A,B→Cout) + (N-2)·TFA(Cin→Cout) + TFA(Cin→S)

    Real goal: make the fastest possible carry path
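To see how the carry term dominates, the worst-case delay expression can be evaluated for a few word sizes. The sketch below is a direct transcription of the formula into Python; the function name and the normalized delay values used in the example are hypothetical, not measured data.

```python
def rca_delay(n: int, t_ab_cout: float, t_cin_cout: float, t_cin_s: float) -> float:
    """Worst-case ripple-carry delay for n >= 2 bits.

    t_ab_cout:  FA delay from A, B to Cout (first stage)
    t_cin_cout: FA delay from Cin to Cout (each of the n-2 middle stages)
    t_cin_s:    FA delay from Cin to S (last stage)
    """
    if n < 2:
        raise ValueError("the formula assumes at least two full adders")
    return t_ab_cout + (n - 2) * t_cin_cout + t_cin_s


# Hypothetical normalized delays (all equal to 1), for illustration only.
for n in (4, 16, 64):
    print(n, rca_delay(n, 1.0, 1.0, 1.0))
```

With every stage delay set to 1, the total equals N, which is the O(N) behavior stated above; only the Cin-to-Cout term is multiplied by the word length.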

  • Slide 12

    Inversion Property

    [Figure: two FA blocks with inputs A, B, Cin and outputs S, Cout illustrating the property]

    !S(A, B, Cin) = S(!A, !B, !Cin)
    !Cout(A, B, Cin) = Cout(!A, !B, !Cin)

    • Inverting all inputs to a FA results in inverted values for all outputs
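Because a full adder has only eight input combinations, the inversion property can be confirmed by enumeration. A minimal Python check (the helper names are illustrative):

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (S, Cout) for one-bit inputs."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def inversion_property_holds() -> bool:
    """True if inverting every input inverts both outputs, for all inputs."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        s_inv, cout_inv = full_adder(1 - a, 1 - b, 1 - cin)
        if (s_inv, cout_inv) != (1 - s, 1 - cout):
            return False
    return True


print(inversion_property_holds())
```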

  • Slide 13

    Exploiting the Inversion Property

    [Figure: 4-bit ripple chain (A0..A3, B0..B3, S0..S3, C0 = Cin, Cout = C4) alternating regular and inverted FA cells]

    Now need two flavors of FAs: regular cell and inverted cell

    • Minimizes the critical path (the carry chain) by eliminating inverters between the FAs (will need to increase the transistor sizing on the carry chain portion of the mirror adder).
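One way to picture the alternating-cell chain is the logic-level sketch below. It assumes, as an interpretation of the slide rather than something it states, that the FA core is inverting (it produces !S and !Cout, like the mirror adder), that even-numbered cells receive true inputs, and that odd-numbered cells receive inverted A and B together with the inverted carry from the previous cell. The names `mirror_core` and `alternating_rca` are hypothetical.

```python
def mirror_core(a: int, b: int, cin: int) -> tuple[int, int]:
    """Inverting FA core: returns (!S, !Cout) for the values on its inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return 1 - s, 1 - cout


def alternating_rca(a: int, b: int, cin: int, n: int) -> tuple[int, int]:
    """n-bit ripple adder with no inverter between cells on the carry chain."""
    result = 0
    carry_wire = cin  # true polarity entering even cells, inverted entering odd cells
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if i % 2 == 0:
            # Even cell: true inputs, so the core yields !S and !Cout.
            not_s, carry_wire = mirror_core(ai, bi, carry_wire)
            s = 1 - not_s  # inverter on the sum output, off the carry path
        else:
            # Odd cell: inverted inputs, so by the inversion property the
            # core yields S and Cout in true polarity.
            s, carry_wire = mirror_core(1 - ai, 1 - bi, carry_wire)
        result |= s << i
    # After an odd number of cells the carry wire still holds !Cout.
    cout = carry_wire if n % 2 == 0 else 1 - carry_wire
    return result, cout


print(alternating_rca(9, 7, 0, 4))
```

The carry wire is passed from cell to cell without any inversion step in the loop, which is the point of the technique: the only inverters left are on the sum outputs and the operand inputs.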

  • Slide 15

    Fast Carry Chain Design

    • The key to fast addition is a low-latency carry network

    • What matters is whether, in a given position, a carry is
        • generated: Gi = Ai & Bi = AiBi
        • propagated: Pi = Ai ⊕ Bi (sometimes use Ai | Bi)
        • annihilated (killed): Ki = !Ai & !Bi

    • Giving a carry recurrence of Ci+1 = Gi | PiCi

    C1 = G0 | P0C0
    C2 = G1 | P1G0 | P1P0C0
    C3 = G2 | P2G1 | P2P1G0 | P2P1P0C0
    C4 = G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0C0
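The expanded expressions are just the recurrence unrolled. The following Python sketch, with illustrative function names, computes the carries both ways so the two forms can be compared for any operand pair.

```python
def gp_signals(a: int, b: int, n: int) -> tuple[list[int], list[int]]:
    """Per-bit generate (Ai & Bi) and propagate (Ai ^ Bi) signals, bit 0 first."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def all_ones(bits: list[int]) -> int:
    """AND of a list of bits; the empty product is 1."""
    return int(all(bits))


def carries_recurrence(g: list[int], p: list[int], c0: int) -> list[int]:
    """Return [C1, ..., CN] from Ci+1 = Gi | Pi & Ci."""
    carries, c = [], c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        carries.append(c)
    return carries


def carries_expanded(g: list[int], p: list[int], c0: int) -> list[int]:
    """Return [C1, ..., CN] from the flattened sum-of-products form."""
    carries = []
    for i in range(len(g)):
        # Term j: a carry generated at bit j and propagated through bits j+1..i.
        terms = [g[j] & all_ones(p[j + 1 : i + 1]) for j in range(i + 1)]
        # Final term: C0 propagated through bits 0..i.
        terms.append(all_ones(p[: i + 1]) & c0)
        carries.append(int(any(terms)))
    return carries


g, p = gp_signals(0b1011, 0b0110, 4)
print(carries_recurrence(g, p, 0), carries_expanded(g, p, 0))
```

The recurrence needs N sequential steps, while each expanded expression is a two-level function of G, P, and C0 whose size grows with the bit position; that trade-off is what the fast carry chain designs on the following slides address.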

  • Slide 16

    Manchester Carry Chain

    • Switches controlled by Gi and Pi

    • Total delay of
        • time to form the switch control signals Gi and Pi
        • setup time for the switches
        • signal propagation delay through N switches in the worst case

    [Figure: one Manchester carry chain stage between nodes !Ci and !Ci+1, with switches controlled by Gi, Pi, and clk]

  • Slide 17

    4-bit Sliced MCC Adder

    [Figure: four G/P generation slices fed by A0 B0 through A3 B3 drive a clocked Manchester carry chain from !C0 to !C4 (intermediate nodes !C1, !C2, !C3), producing sums S0 through S3]

  • Slide 18

    Domino Manchester Carry Chain Circuit

    [Figure: clocked (domino) Manchester carry chain from Ci,0 to Ci,4 with devices controlled by P0..P3, G0..G3, and clk, carrying numbered annotations (1 to 6)]

    Values on the four chain nodes:

    !(G0 | P0 Ci,0)
    !(G1 | P1G0 | P1P0 Ci,0)
    !(G2 | P2G1 | P2P1G0 | P2P1P0 Ci,0)
    !(G3 | P3G2 | P3P2G1 | P3P2P1G0 | P3P2P1P0 Ci,0)
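The node expressions above can be reproduced with a very coarse switch-level model. The Python sketch below is an abstraction written for this transcript, not a circuit simulation: it assumes every chain node is precharged high, that Gi discharges node i directly, that Pi passes a low level from the previous node, and that the carry-in reaches the chain as an active-low level. Charge sharing, timing, and device sizing are ignored, and `domino_mcc` is a hypothetical name.

```python
def domino_mcc(g: list[int], p: list[int], ci0: int) -> list[int]:
    """Evaluate-phase values of the chain nodes (node i holds !C(i+1))."""
    nodes = [1] * len(g)   # precharge phase: every node is pulled high
    prev = 1 - ci0         # assumed: carry-in enters the chain as !Ci,0
    for i, (gi, pi) in enumerate(zip(g, p)):
        # Evaluate phase: a node can only be discharged, never recharged.
        if gi or (pi and prev == 0):
            nodes[i] = 0
        prev = nodes[i]
    return nodes


def expected_nodes(g: list[int], p: list[int], ci0: int) -> list[int]:
    """Reference values !(Gi | Pi & Ci) from the carry recurrence."""
    out, c = [], ci0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        out.append(1 - c)
    return out


print(domino_mcc([0, 0, 1, 0], [1, 1, 0, 1], 1))
```

In this model a low level on the first node can travel through up to four series switches before reaching the last node, which is the worst-case propagation path mentioned for the Manchester chain.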

  • Slide 19

    Binary Adder Landscape

    synchronous word parallel adders
        • ripple carry adders (RCA)
        • carry prop min adders
            • signed-digit adders
            • fast carry prop adders
                • Manchester carry chain
                • carry select
                • parallel prefix
                • conditional sum
                • carry skip
            • residue adders

    Complexity annotations in the figure (T = delay, A = area):

    T = O(N), A = O(N)
    T = O(1), A = O(N)
    T = O(log N), A = O(N log N)
    T = O(N), A = O(N)
    T = O(N), A = O(N)

  • Slide 20

    Carry-Skip (Carry-Bypass) Adder

    If (P0 & P1 & P2 & P3 = 1) then Co,3 = Ci,0; otherwise the block itself kills or generates the carry internally

    [Figure: 4-bit FA ripple chain (A0..A3, B0..B3, S0..S3) from Ci,0 to Co,3, with a bypass path for the carry selected by BP]

    BP = P0 P1 P2 P3 (block propagate)
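The bypass rule can be written out as a small block-level model. This Python sketch is illustrative only; the name `skip_block` and the choice to return BP alongside the sum and carry are assumptions made for the example.

```python
def skip_block(a: int, b: int, cin: int, width: int = 4) -> tuple[int, int, int]:
    """One carry-skip block: returns (sum_bits, carry_out, block_propagate)."""
    carry, total, bp = cin, 0, 1
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p = ai ^ bi
        bp &= p                               # BP = P0 & P1 & ... & P(width-1)
        total |= (p ^ carry) << i             # Si = Pi xor Ci
        carry = (ai & bi) | (p & carry)       # ripple path inside the block
    cout = cin if bp else carry               # bypass: forward Ci,0 when BP = 1
    return total, cout, bp


print(skip_block(0b1010, 0b0101, 1))   # every bit propagates, so the carry is bypassed
print(skip_block(0b1100, 0b0100, 1))   # bit 2 generates, so the carry comes from inside the block
```

The bypass never changes the logical result (when BP = 1 the rippled carry equals Ci,0 anyway); its benefit is purely in delay, since the carry-out no longer has to wait for the ripple path.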

  • Slide 21

    Carry-Skip Chain Implementation

    [Figure: carry chain with Cin, G0..G3, P0..P3, and output !Cout, plus a bypass switch controlled by BP between the block carry-in and the block carry-out]

  • Slide 22

    4-bit Block Carry-Skip Adder

    [Figure: 16-bit adder built from four 4-bit blocks (bits 0 to 3, bits 4 to 7, bits 8 to 11, bits 12 to 15), each with Setup, Carry Propagation, and Sum stages; Ci,0 enters the first block]

    Worst-case delay, carry from bit 0 to bit 15: the carry is generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups, and ripples in the last group from bit 12 to bit 15 (B is the group size in bits)

    Tadd = tsetup + B·tcarry + ((N/B) - 1)·tskip + B·tcarry + tsum
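As a quick numeric illustration, the delay expression can be evaluated for the 16-bit, 4-bit-block case. The Python function below transcribes the formula term by term; its name is hypothetical and the unit delays in the example are normalized placeholders, not measured values.

```python
def carry_skip_delay(n: int, b: int, t_setup: float, t_carry: float,
                     t_skip: float, t_sum: float) -> float:
    """Worst-case delay of an n-bit carry-skip adder with b-bit blocks."""
    if n % b != 0:
        raise ValueError("block size must divide the word length")
    return (t_setup
            + b * t_carry               # ripple in the first block
            + (n // b - 1) * t_skip     # skip stages
            + b * t_carry               # ripple in the last block
            + t_sum)


# Hypothetical normalized delays: every stage costs 1 unit.
print(carry_skip_delay(16, 4, 1, 1, 1, 1))
```

With unit delays the 16-bit, B = 4 case evaluates to 1 + 4 + 3 + 4 + 1 = 13 units.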

  • Slide 23

    Optimal Block Size and Time

    • Assuming one stage of ripple (tcarry) has the same delay as one skip logic stage (tskip) and both are 1:

      TCSkA = 1 + B + (N/B - 1) + B + 1
            = 2B + N/B + 1

      where the five terms are, in order, tsetup, ripple in block 0, skips, ripple in last block, and tsum

    • So the optimal block size, B, is

      dTCSkA/dB = 0  ⇒  Bopt = √(N/2)

    • And the optimal time is

      Optimal TCSkA = 2√(2N) + 1
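The closed-form optimum can be cross-checked numerically. The Python sketch below uses the normalized delay 2B + N/B + 1 from this slide; the function names are illustrative, and the brute-force search is restricted to block sizes that divide N, which is an added assumption for the example.

```python
import math


def cska_delay(n: int, b: float) -> float:
    """Normalized carry-skip delay 2B + N/B + 1 (all stage delays equal to 1)."""
    return 2 * b + n / b + 1


def optimal_block(n: int) -> tuple[float, float]:
    """Closed-form optimum: Bopt = sqrt(N/2), T = 2*sqrt(2N) + 1."""
    return math.sqrt(n / 2), 2 * math.sqrt(2 * n) + 1


def best_integer_block(n: int) -> int:
    """Brute-force search over block sizes that divide n (smallest wins ties)."""
    divisors = [b for b in range(1, n + 1) if n % b == 0]
    return min(divisors, key=lambda b: cska_delay(n, b))


for n in (16, 32, 64):
    b_opt, t_opt = optimal_block(n)
    b_int = best_integer_block(n)
    print(n, round(b_opt, 2), round(t_opt, 2), b_int, cska_delay(n, b_int))
```

For N = 32 the closed form gives an integer block size (Bopt = 4, T = 17); for other word lengths Bopt is generally not an integer, so a practical design rounds to a nearby block size and pays a slightly larger delay.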

  • Slide 24

    Carry-Skip Adder Extensions

    • Variable block sizes
        • A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be bigger without increasing the overall delay

    [Figure: carry-skip adder from Cin to Cout with variable block sizes]

    • Multiple levels of skip logic

    [Figure: carry-skip adder from Cin to Cout with skip level 1 and skip level 2]

    AN