Improved Algorithms for Link-Based Non-tree Clock Network for Skew Variability Reduction


Page 1

Anand Rajaram†‡, David Z. Pan†, Jiang Hu*

† Dept. of ECE, UT-Austin
‡ Texas Instruments, Dallas
* Dept. of EE, TAMU

Page 2

Outline

Introduction
Review of link-based non-tree clock network
Improved algorithms (over [Rajaram et al., DAC’04])
› Rule based algorithm (δ rule)
› Graph theoretical approach (MST-based)
Experimental results
Conclusions

Page 3

Clock Distribution Network

[Figure: registers 1 and 2 driven by the clock network with clock delays d1 and d2; register 1 launches signals and register 2 catches them. Other labels: Dmax, T.]

Signal transfer coordinated by clock signal

All registers are supplied with clock signal by clock distribution network

Skew = d1 – d2

Zero skew: d1 = d2

Useful skew, d1 – d2 = δ12
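The skew definitions above can be illustrated with a minimal sketch. The function names and the delay values (in picoseconds) are hypothetical and chosen only for illustration.

```python
def skew(d1: float, d2: float) -> float:
    """Clock skew between two registers, defined as d1 - d2."""
    return d1 - d2


def meets_target(d1: float, d2: float, target: float = 0.0, tol: float = 1e-12) -> bool:
    """True if the skew matches the target.

    target = 0 corresponds to zero skew (d1 = d2);
    a non-zero target corresponds to useful skew (d1 - d2 = delta_12).
    """
    return abs(skew(d1, d2) - target) <= tol


# Hypothetical clock arrival times at two registers, in picoseconds.
print(skew(105.0, 100.0))                      # skew of register 1 relative to register 2
print(meets_target(100.0, 100.0))              # zero-skew check
print(meets_target(105.0, 100.0, target=5.0))  # useful-skew check with delta_12 = 5
```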

Page 4

Clocks: Important Considerations & Objectives

One of the biggest & most frequently switching nets
Very sensitive to unwanted skew introduced by PVT
› Manufacturing process variations (P)
› Power supply voltage noise (V)
› Temperature variations (T)
Less clock skew variation is a “MUST” for nanometer VLSI designs
Minimizing clock routing wire-length can
› Reduce power consumption

Page 5

Approaches for Reducing Skew Variability

Buffer & wire sizing [Pullela et al., DAC’93; Chung et al., ICCAD’94; Wang et al., ISPD’04]
Variation aware routing [Lin et al., ICCAD’94; Lu et al., ISPD’03]
Non-tree clock networks
› McCoy et al., ETC’94; Vandenberghe et al., ICCAD’97; Xue et al., ICCAD’95
› Link based non-tree clock networks [Rajaram et al., DAC’04]

Page 6

Non-tree: 1-D Spine [Kurd et al., JSSC’01]

1-D spine
Applied in Intel Pentium processor design
Variations between spines still exist

[Figure: spines feeding clock sinks or local sub-networks]

Page 7

Non-tree: 2-D Mesh

Top level mesh [Su et al., ICCAD’01]
› Less wire, less effective
Leaf level mesh [Restle et al., JSSC’01]
› Very effective, huge wire
› Applied in IBM microprocessors

[Figure: top level and leaf level meshes, each feeding clock sinks or local sub-networks]

Page 8

Linked Non-tree = Tree + Links [Rajaram et al., DAC’04]

Non-tree = tree + links
How to select link pairs is the key!
Link = link_capacitors + link_resistor

[Figure: a link between nodes u and w, modeled as a resistor Rl with a capacitor C/2 at each endpoint]
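A minimal sketch of the link model (a resistor Rl between u and w with a capacitor C/2 at each endpoint). Deriving Rl and C from a wire length and per-unit-length constants is an assumption made here for illustration; the `Link` class, `make_link`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    u: str              # first endpoint
    w: str              # second endpoint
    resistance: float   # Rl, the link resistor
    capacitance: float  # C, the total link capacitance

    def endpoint_cap(self) -> float:
        """Capacitance lumped at each endpoint (C/2)."""
        return self.capacitance / 2.0


def make_link(u: str, w: str, length: float, r_per_unit: float, c_per_unit: float) -> Link:
    """Build a link from a wire length and per-unit-length R and C (assumed model)."""
    return Link(u, w, r_per_unit * length, c_per_unit * length)


# Hypothetical wire: 100 length units, 0.5 ohm and 0.25 fF per unit.
link = make_link("u", "w", 100.0, 0.5, 0.25)
print(link.resistance, link.endpoint_cap())
```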

Page 9

Skew Between Link Endpoints

New skew with link (u, w): $\hat{q}_{u,w}$, a function of $\tilde{q}_{u,w}$, $R_{link}$ and $R_{loop}$ (the expression itself is not recoverable from this transcript)

[Figure: nodes u and w joined by a link of resistance Rlink; the loop resistance is labeled Rloop]

The value of $\hat{q}_{u,w}$ becomes smaller when the link is closer to the leaf nodes, for a given Rlink
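A minimal sketch of this trend, assuming a simple resistive-divider model in which the link scales the skew without the link resistor by Rlink / (Rlink + Rloop). This form is an assumption made for illustration, not a transcription of the slide's equation; it is chosen because it leaves the skew unchanged for a very large Rlink and shrinks it as Rloop grows.

```python
def skew_with_link(q_tilde: float, r_link: float, r_loop: float) -> float:
    """Skew between link endpoints under an ASSUMED resistive-divider model.

    q_tilde: skew between u and w without the link resistor
    r_link:  resistance of the link
    r_loop:  resistance of the loop closed by the link
    """
    return q_tilde * r_link / (r_link + r_loop)


# For a fixed Rlink, a larger Rloop gives a smaller remaining skew.
for r_loop in (1.0, 3.0, 9.0):
    print(r_loop, skew_with_link(10.0, 1.0, r_loop))
```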

Page 10

Skew Between any Two Nodes (i, j) with Link (u, w)

Skew variation between any node pair (i, j):

Scenario 1: i ∈ Tg, j ∈ Th => always smaller
Scenario 2: i and j ∈ Tg (or Th) => could be worse
Scenario 3: i ∈ Tp, j ∉ Tp => could be much worse

Key idea: try to avoid Scenarios 3 and 2 for link insertion

[Figure: tree containing nodes P, g, h, u and w]

P: nearest common ancestor of u and w
Tx: sub-tree rooted at x
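The three scenarios can be sketched as a classifier over a tree stored as child-to-parent pointers. Two assumptions are made here: g and h are taken to be the children of P on the paths toward u and w, and Scenario 3 is read as "one node inside the sub-tree of P, the other outside it". The node names and the tree are hypothetical, and u and w are assumed to lie in different sub-trees of P.

```python
def path_to_root(node, parent):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def classify(i, j, u, w, parent):
    """Return 1, 2 or 3 for the scenario of pair (i, j) given link (u, w), else None."""
    pu, pw = path_to_root(u, parent), path_to_root(w, parent)
    on_pw = set(pw)
    p = next(a for a in pu if a in on_pw)  # nearest common ancestor P
    g = pu[pu.index(p) - 1]                # child of P toward u (assumed meaning of g)
    h = pw[pw.index(p) - 1]                # child of P toward w (assumed meaning of h)

    def inside(x, root):
        return root in path_to_root(x, parent)

    if (inside(i, g) and inside(j, h)) or (inside(i, h) and inside(j, g)):
        return 1
    if (inside(i, g) and inside(j, g)) or (inside(i, h) and inside(j, h)):
        return 2
    if inside(i, p) != inside(j, p):
        return 3
    return None


# Hypothetical tree: root -> {P, X}, P -> {g, h}, g -> {u, a}, h -> {w, b}
parent = {"P": "root", "X": "root", "g": "P", "h": "P",
          "u": "g", "a": "g", "w": "h", "b": "h"}
print(classify("a", "b", "u", "w", parent))
print(classify("u", "a", "u", "w", parent))
print(classify("a", "X", "u", "w", parent))
```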

Page 11

Rule Based Algorithms [Rajaram et al., DAC’04]

α-rule: $\alpha = R_{link} / R_{loop} < \alpha_{max}$
Lower the α, better the link

β-rule: $\beta = \frac{C}{2}\,|R_{u,u} - R_{w,w}| < \beta_{max}$
Lower the β, lesser the tuning required

γ-rule: the depth of the nearest common ancestor from the root is < γmax
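A minimal sketch of a rule check for one candidate link. It assumes the readings α = Rlink / Rloop and β = (C/2)·|R_uu − R_ww|, where R_uu and R_ww are taken to be the source-to-node path resistances of u and w; these readings, the strict inequalities, the `Candidate` record, and all numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    r_link: float   # resistance of the link
    r_loop: float   # resistance of the loop closed by the link
    r_uu: float     # assumed: source-to-u path resistance
    r_ww: float     # assumed: source-to-w path resistance
    c_link: float   # total link capacitance C
    nca_depth: int  # depth of the nearest common ancestor from the root


def alpha(c: Candidate) -> float:
    return c.r_link / c.r_loop


def beta(c: Candidate) -> float:
    return abs(c.r_uu - c.r_ww) * c.c_link / 2.0


def passes_rules(c: Candidate, alpha_max: float, beta_max: float, gamma_max: int) -> bool:
    """True if the candidate satisfies the alpha, beta and gamma rules."""
    return alpha(c) < alpha_max and beta(c) < beta_max and c.nca_depth < gamma_max


cand = Candidate(r_link=2.0, r_loop=8.0, r_uu=5.0, r_ww=4.0, c_link=0.5, nca_depth=1)
print(alpha(cand), beta(cand), passes_rules(cand, 0.5, 0.5, 3))
```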

Page 12

Guidelines for Node Pair Selection for Link Insertion

Select nodes which are hierarchically far apart
Select nodes physically close to each other
Select nodes with equal nominal delay
Select nodes closer to leaf nodes
For zero skew routing, only select leaf nodes

Page 13

Rule Based Algorithms [Rajaram et al., DAC’04]

Merits
› Physical characteristics of the links are considered, so bad links are avoided
› Independent of the balanced nature of the clock structure
› Efficient run time

Demerits
› No control over the distribution of links
› Possibility of links getting added in the same region

Solution
› δ-rule: no two links should have the same pair of ancestors at depth δ from the clock source
› Retains the merits of the previous rules and addresses the demerit

[Figure: two panels showing subtrees A, B, C and D, using δ = 2]

Page 14

δ Rule – An Example

[Figure: subtrees A, B, C and D, using δ = 2]

Crowding of links. Subtrees A and D not linked!

δ is the node level from the clock source
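A minimal sketch of a δ-rule filter, assuming candidate links arrive in priority order and the first link seen for a given pair of depth-δ ancestors is the one kept. The tree, node names and candidate list are hypothetical.

```python
def depths(parent):
    """Depth of every node; the clock source (node without a parent) is at depth 0."""
    depth = {}

    def depth_of(n):
        if n not in depth:
            depth[n] = 0 if n not in parent else depth_of(parent[n]) + 1
        return depth[n]

    for n in set(parent) | set(parent.values()):
        depth_of(n)
    return depth


def ancestor_at_depth(node, delta, parent, depth):
    """Ancestor of `node` at depth delta (the node itself if it is not deeper)."""
    while depth[node] > delta:
        node = parent[node]
    return node


def apply_delta_rule(candidate_links, delta, parent):
    """Keep at most one link per unordered pair of depth-delta ancestors."""
    depth = depths(parent)
    used, kept = set(), []
    for u, w in candidate_links:
        key = frozenset((ancestor_at_depth(u, delta, parent, depth),
                         ancestor_at_depth(w, delta, parent, depth)))
        if key not in used:
            used.add(key)
            kept.append((u, w))
    return kept


# Hypothetical tree: src -> {L, R}, L -> {A, B}, R -> {C, D}, two sinks per subtree.
parent = {"L": "src", "R": "src", "A": "L", "B": "L", "C": "R", "D": "R",
          "a1": "A", "a2": "A", "b1": "B", "b2": "B",
          "c1": "C", "c2": "C", "d1": "D", "d2": "D"}
candidates = [("a2", "b1"), ("a1", "b2"), ("b2", "c1"), ("c2", "d1")]
kept = apply_delta_rule(candidates, 2, parent)
print(kept)  # the second A-B link is dropped
```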

Page 15

Graph Theoretical Approach

Select_Node_Pairs(Tv) {
    l = v.left_child
    r = v.right_child
    P = Select_node_pair_between(Tl, Tr, k)
    if Depth(v) ≥ depth_limit, return P
    P = P ∪ Select_Node_Pairs(Tl)
    P = P ∪ Select_Node_Pairs(Tr)
    return P
}

[Figure: node v with children l and r and sub-trees Tl1, Tl2, Tr1, Tr2; a graph with edges between the Tli and the Trj]

The entire clock tree is recursively divided into two parts, and links are added between them

This ensures that links are distributed throughout the clock tree

Edge weight = minimum distance between the sinks of Tli and Trj
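The recursion can be sketched in Python as follows. `closest_pairs` is a simple stand-in for Select_node_pair_between (it returns the k closest sink pairs by Manhattan distance) and is not the selection method of the slides; the `Node` class and the sink coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    name: str
    xy: Tuple[float, float] = (0.0, 0.0)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sinks(t):
    """All leaf nodes (sinks) of the sub-tree rooted at t."""
    if t.left is None and t.right is None:
        return [t]
    return [s for c in (t.left, t.right) if c is not None for s in sinks(c)]


def closest_pairs(tl, tr, k):
    """Stand-in pair selection: the k closest sink pairs between two sub-trees."""
    dist = lambda a, b: abs(a.xy[0] - b.xy[0]) + abs(a.xy[1] - b.xy[1])
    ranked = sorted((dist(a, b), a.name, b.name) for a in sinks(tl) for b in sinks(tr))
    return [(a, b) for _, a, b in ranked[:k]]


def select_node_pairs(v, k, depth_limit, depth=0):
    """Select pairs between the two halves of T_v, then recurse into each half."""
    if v.left is None or v.right is None:
        return []
    pairs = closest_pairs(v.left, v.right, k)
    if depth >= depth_limit:
        return pairs
    pairs += select_node_pairs(v.left, k, depth_limit, depth + 1)
    pairs += select_node_pairs(v.right, k, depth_limit, depth + 1)
    return pairs


s1, s2 = Node("s1", (0, 0)), Node("s2", (1, 0))
s3, s4 = Node("s3", (2, 0)), Node("s4", (3, 0))
root = Node("v", left=Node("l", left=s1, right=s2), right=Node("r", left=s3, right=s4))
print(select_node_pairs(root, k=1, depth_limit=1))
```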

Page 16

Graph theoretical approach – Min-matching [Rajaram et al., DAC’04]

Bipartite min-matching algorithm to select the node pairs

Merits
› Distributes links evenly through all regions of the clock network

Demerits
› Due to the nature of the min-matching algorithm, only one link per sub-tree is allowed
› May result in some very lengthy links and increased wire length
› Lengthy links might be difficult to route
› Complexity of min-matching is O(n^3). Not scalable!

[Figure: node v with children l and r, showing lengthy links]

Page 17

New graph theoretical approach – Minimum Spanning Tree Based

MST algorithm allows more than one link per sub-tree
› A larger number of short links (cf. bipartite approach)
Retains the merits of the min-matching based approach
› Evenly distributes the links
Complexity is O(n log n)
› Much faster than the O(n^3) bipartite matching algorithm

[Figure: node v with children l and r]

Page 18

MST Based Algorithm

MST_node_pair_select(Tl, Tr, k) {
    Divide Tl into k sub-trees, Sl = {Tl1, Tl2, Tl3, …, Tlk}
    Divide Tr into k sub-trees, Sr = {Tr1, Tr2, Tr3, …, Trk}
    Find the MST of the completely connected bipartite graph between Sl and Sr
}

[Figure: node v with children l and r; sub-trees Tl1, Tl2 form Sl and Tr1, Tr2 form Sr]

After MST pair selection, iteratively delete edges violating the four rules (α, β, γ, and δ)
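A minimal sketch of the MST step, assuming each sub-tree is given as a list of sinks `(name, x, y)`, that the edge weight is the minimum Manhattan distance between sinks, and that Kruskal's algorithm with union-find is used. The choice of Kruskal, the data layout and the coordinates are assumptions; the rule-based edge deletion that follows MST selection is not shown.

```python
from itertools import product


def min_distance_pair(group_a, group_b):
    """Closest sink pair between two groups: (distance, name_a, name_b)."""
    return min((abs(a[1] - b[1]) + abs(a[2] - b[2]), a[0], b[0])
               for a, b in product(group_a, group_b))


def mst_node_pairs(left_groups, right_groups):
    """Kruskal MST on the complete bipartite graph between Sl and Sr."""
    edges = []
    for i, gl in enumerate(left_groups):
        for j, gr in enumerate(right_groups):
            d, u, w = min_distance_pair(gl, gr)
            edges.append((d, ("L", i), ("R", j), u, w))
    edges.sort()

    # Union-find over the sub-tree vertices of both sides.
    root = {("L", i): ("L", i) for i in range(len(left_groups))}
    root.update({("R", j): ("R", j) for j in range(len(right_groups))})

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    chosen = []
    for d, a, b, u, w in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so it belongs to the MST
            root[ra] = rb
            chosen.append((u, w, d))
    return chosen


# Hypothetical sinks: Tl1 = {a}, Tl2 = {b} on the left; Tr1 = {c}, Tr2 = {d} on the right.
left = [[("a", 0, 0)], [("b", 0, 10)]]
right = [[("c", 2, 0)], [("d", 2, 10)]]
print(mst_node_pairs(left, right))
```

With 2k sub-trees the MST has 2k − 1 edges, so at least one sub-tree receives more than one link, which is the property that distinguishes this approach from one-to-one matching.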

Page 19

Experimental Setup

Benchmarks: r1 – r5 from bounded skew tree work [Cong et al., ICCAD’95]

Interconnect width variation
› Smaller than thickness
› More sensitive to variations

Load capacitance variation

All variables assumed to be Gaussian

[Figure: Gaussian curve marked from -3σ to +3σ, with Min, Nom and Max labels and 99.74% coverage]

Skew variability measure: standard deviation

$\text{Standard Deviation} = \sqrt{\dfrac{\sum_{T\_trials} \sum_{N\_sinks} (d_i - d_{ref})^2}{T \cdot N}}$

$d_i$: delay of sink i
$d_{ref}$: delay of reference sink
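A minimal sketch of this measure, assuming it is the root-mean-square of (d_i − d_ref) over all trials and all sinks, with the reference sink itself included in the count. The Gaussian sampling of sink delays around nominal values is a simplified stand-in for a real variation model; the function names and numbers are hypothetical.

```python
import math
import random


def skew_variability(delay_trials, ref_index=0):
    """RMS of (d_i - d_ref) over all trials and sinks (assumed form of the measure)."""
    total, count = 0.0, 0
    for delays in delay_trials:
        d_ref = delays[ref_index]
        for d in delays:
            total += (d - d_ref) ** 2
            count += 1
    return math.sqrt(total / count)


def sample_trials(nominal_delays, sigma, n_trials, seed=0):
    """Draw Gaussian-perturbed sink delays; a stand-in for a circuit-level simulation."""
    rng = random.Random(seed)
    return [[rng.gauss(d, sigma) for d in nominal_delays] for _ in range(n_trials)]


# Hand-checkable case: squared differences 0, 4, 0, 4 over 4 samples.
print(skew_variability([[10.0, 12.0], [10.0, 8.0]]))

# Hypothetical Monte Carlo run: 4 sinks with equal nominal delay, 1000 trials.
trials = sample_trials([100.0] * 4, sigma=2.0, n_trials=1000)
print(skew_variability(trials))
```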

Page 20

Experimental Result on Skew Variability

[Figure: bar chart "Skew Variability". Y-axis: standard deviation w.r.t. clock trees (0 to 1). X-axis: test cases r1 to r5. Series: Sparse Mesh, Dense Mesh, Link-M, Link-RD, Link-MST.]

Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

Page 21

HSPICE Validation

[Figure: bar chart "Skew Variability w.r.t. Clock Tree". Y-axis: standard deviation w.r.t. clock trees (0 to 0.18). X-axis: test cases r1 to r5. Series: Link-MST-SPICE, Link-MST-Elmore.]

Benchmark      r1    r2    r3    r4     r5
No. of sinks   267   598   862   1903   3100

Page 22

Experimental Result on Wire-length

[Figure: bar chart "Wire-length Comparison". Y-axis: wire-length w.r.t. clock trees (0 to 3). X-axis: test cases r1 to r5. Series: Sparse Mesh, Dense Mesh, Link-M, Link-RD, Link-MST.]

Page 23

Wire-length comparison between link insertion methods

[Figure: bar chart "Wire-length Comparison". Y-axis: wire-length w.r.t. clock trees (0.9 to 1.2). X-axis: test cases r1 to r5. Series: Link-M, Link-RD, Link-MST.]

Page 24

Conclusions

Two new efficient algorithms for link insertion have been proposed

› Significant skew variability reduction with very small wire-length increase

› Scale very well with size of clock network for both runtime and QOR

Proposed methodology is independent of the nature of variability effects

Friendly to incremental changes