Optimization algorithms using SSA


Transcript of Optimization algorithms using SSA

Page 1: Optimization algorithms using SSA

Optimization algorithms using SSA

Software Optimizations & Restructuring Research Group, School of Electrical Engineering

Seoul National University, 2006-21166, Wonsub Kim

Page 2: Optimization algorithms using SSA


Using static single assignment form (SSA)

Review of SSA translation: placing Φ functions using the dominance frontier

Overview of SSA form: why SSA form? SSA form transformation steps; role and benefits of SSA form

Further optimizations using SSA form: constant propagation, dead code elimination, induction variable reduction, and other issues

Page 3: Optimization algorithms using SSA


Review of SSA form translation

Observation: node X (which contains a definition of V) does not dominate Z, but X dominates an immediate predecessor of Z.

Key observation: insert the Φ function at the first node Z that is common to another path originating in a node Y that also contains an assignment to V.

The definition of such a node Z is exactly the dominance frontier definition.

So: place a Φ function for V in the dominance frontier nodes of every node that contains a definition of V.
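A minimal sketch of this placement rule, assuming the dominance frontiers df[n] have already been computed and that defsites[v] lists the blocks containing an original definition of v (all names here are hypothetical):

```python
from collections import deque

def place_phi_functions(defsites, df):
    """For each variable v, place a Phi at the dominance frontier of every
    block that defines v.  An inserted Phi is itself a new definition of v,
    so its block is pushed back onto the worklist (iterated dominance frontier)."""
    phi_blocks = {}
    for v, def_blocks in defsites.items():
        placed = set()
        work = deque(def_blocks)
        while work:
            n = work.popleft()
            for y in df.get(n, ()):          # every dominance-frontier node of n
                if y not in placed:
                    placed.add(y)            # conceptually: insert "v := Phi(...)" at entry of y
                    if y not in def_blocks:  # the Phi is a new def of v, so process y too
                        work.append(y)
        phi_blocks[v] = placed
    return phi_blocks

# Example: v defined in B1 and B5, df = {'B1': {'B2'}, 'B5': {'B7'}, 'B7': {'B2'}}
# -> Phi functions for v are placed in B2 and B7.
```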

Page 4: Optimization algorithms using SSA


Review of SSA form translation

Dominance frontier definition
Definition – the dominance frontier of a node A is the set of nodes Y such that A dominates an immediate predecessor of Y but does not strictly dominate Y itself. From A's point of view, these are the nodes at which other control paths that don't go through A make their earliest appearance.
Case 1 – for every node that a definition dominates, that definition is reachable.
Case 2 – only when control leaves those case-1 nodes and enters the dominance frontier do we first have to consider flows carrying another definition of the same variable.

What about assignments in a loop? They also need a Φ function to merge the multiple definitions that reach the loop header along the entry edge and the back edge.
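As a sketch of how these frontiers can be computed (the Cooper/Harvey/Kennedy formulation over the immediate-dominator tree; the idom and preds maps are assumed to be available):

```python
def dominance_frontiers(nodes, preds, idom):
    """DF(X) = { Y : X dominates a predecessor of Y, but X does not strictly
    dominate Y }.  Only join points (>= 2 predecessors) can be in any frontier."""
    df = {n: set() for n in nodes}
    for y in nodes:
        if len(preds.get(y, [])) < 2:
            continue
        for p in preds[y]:
            runner = p
            # walk up the dominator tree from each predecessor until reaching
            # y's immediate dominator; every node on the way has y in its frontier
            while runner is not None and runner != idom.get(y):
                df[runner].add(y)
                runner = idom.get(runner)
    return df
```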

Page 5: Optimization algorithms using SSA


Static single assignment (SSA) form

Single assignment form
In MA (multiple assignment) form, a variable names a memory location, not a value. In SA form, a variable names a value.
Simplifying property: each variable has only a single assignment (a single definition) but may have many uses (one def site, a list of use sites). A def must dominate all of its uses!! Variables are renamed to remove multiple assignments.

Role of SSA form in optimization
Data flow analysis and compiler optimizations are more efficient with SA form, and SSA form simplifies many optimizations. The need for separate use-def chains is removed; that information appears explicitly. A quadratic number of use-def chains, O(n^2), becomes a linear number, O(n). SSA also eliminates false dependences (a simpler analysis context).

Page 6: Optimization algorithms using SSA


Why SSA form?

SSA form provides the compiler with an answer to the question "which definitions of a variable reach the points where it is used?" -> it makes def-use chains explicit.

Every definition knows its uses, and every use knows its single definition.

This makes dataflow optimizations easier and faster; for most optimizations it reduces space/time requirements.

DU chains in SSA form take much less space (at the cost of extra variables and Φ functions). In MA form, with V variables, def-use chains are expensive: the worst-case number of chains is O(# of defs(v) * # of uses(v)) per variable, which is <= O(E * E * V) overall, since # of defs(v) and # of uses(v) are each proportional to E (the number of CFG edges).

In SSA form, the number of chains is O(E * V).

Page 7: Optimization algorithms using SSA


Space reduction in DU chain

The def-use chain structure becomes much simpler.

[Figure: def-use chains for x in MA form vs. SSA form. In MA form, several defs (x := ...) reach several uses (... := x), so finding the definition that reaches a use is a multi-step search. In SSA form, the separate defs x1 := ..., x2 := ... are merged by x4 := Φ(x1, x2, x3), and every use (... := x4) points back to exactly one definition: a one-step search.]

Page 8: Optimization algorithms using SSA


SSA transformation steps

1. To get some efficiency benefit from SA form, translate the original code into SA form statically.

2. SSA form is not directly executable because of its pseudo-instructions (it is a compiler-internal form), so it must be translated back to MA form for execution.

[Figure: Original code (MA form) -> SSA form -> optimizations (code motion, redundancy elimination, constant propagation, ...) -> optimized code (MA form)]
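Step 2 is commonly done by deleting each Φ and inserting ordinary copies into the corresponding predecessor blocks; a minimal sketch under that assumption (the block/statement encoding is hypothetical, and a real compiler must also split critical edges and order the copies):

```python
def remove_phis(blocks, preds):
    """Naive out-of-SSA translation: a statement  x := phi(a1, ..., ak)  in a
    block with predecessors P1..Pk becomes the copy  x := ai  placed in Pi."""
    pending = {name: [] for name in blocks}     # copies to append per block
    for name, stmts in blocks.items():
        kept = []
        for stmt in stmts:
            if stmt[1] == 'phi':                # stmt = (dest, 'phi', [a1..ak])
                dest, _, args = stmt
                for pred, arg in zip(preds[name], args):
                    pending[pred].append((dest, 'copy', arg))
            else:
                kept.append(stmt)
        blocks[name] = kept
    for name, copies in pending.items():
        blocks[name].extend(copies)             # a real compiler puts these before the branch
    return blocks
```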

Page 9: Optimization algorithms using SSA


Static single assignment form

Further optimizations made possible by SSA: constant propagation (simple constants, conditional constants), dead code elimination, induction variable identification, global value numbering (pp. 349-355), data dependences, register allocation.

Other considerations: SSA introduces more variables and increases code size because of the Φ-functions, but only linearly (in practice, SSA is 0.6-2.4 times larger). Some optimizations become more awkward, but on the whole SSA is a win for compilers. How does a Φ function choose which xi to use at run time?

We don't really need to care: Φ functions are only a compile-time notation and are removed when translating back to MA form.

Page 10: Optimization algorithms using SSA


Dead code elimination (1)

Problem definition: an assignment to a variable with no uses can be removed.

SSA structure: only one definition site and a list of use sites, so liveness is easy to check.

Worklist algorithm:
W <- a list of all variables in the SSA program
while W is not empty:
  remove some variable v from W
  if v's list of uses is empty (the use list of its def-use chain is empty):
    let S be v's defining statement
    if S has no side effects other than the assignment to v:
      delete S from the program
      for each variable xi used by S (keep the use-def information up to date!):
        delete S from the list of uses of xi
        put xi back on W, since it may now have become dead
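A runnable sketch of this worklist over a toy SSA representation (the maps defstmt and uses and the helpers stmt_uses / has_side_effect are assumptions, not part of the slides):

```python
def dead_code_elimination(variables, defstmt, uses, stmt_uses, has_side_effect, program):
    """variables: all SSA variable names; defstmt[v]: v's single defining statement;
    uses[v]: set of statements that read v; program: set of live statements."""
    work = list(variables)
    while work:
        v = work.pop()
        if uses[v]:                        # v still has uses, nothing to do
            continue
        s = defstmt.get(v)
        if s is None or s not in program or has_side_effect(s):
            continue
        program.discard(s)                 # delete the dead definition
        for x in stmt_uses(s):             # every variable read by s
            uses[x].discard(s)             # s no longer counts as a use of x
            work.append(x)                 # x may have become dead in turn
    return program
```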

Page 11: Optimization algorithms using SSA


Dead code elimination (2)

Just check which variable values vi no longer show up in any use. Remove the Φ function or statement that creates such a vi. Recurse, since values used only by the removed statement may now have become dead as well.

Page 12: Optimization algorithms using SSA


Dead code elimination example

DU chains in MA form (note: deleting statements needs extra bookkeeping to avoid dangling pointers from the DU chains to removed statements):

Original code:
  x := 4
  y := y+3
  goto L on x<5
  z := y*2
  x := y+9
L: z := y*5

After constant propagation:
  x := 4
  y := y+3
  goto L on 4<5
  z := y*2
  x := y+9
L: z := y*5

After dead code elimination:
  x := 4
  y := y+3
L: z := y*5

SSA forms:

Original code:
  x1 := 4
  y2 := y1+3
  goto L on x1<5
  z1 := y2*2
  x2 := y2+9
L: x3 := phi(x1,x2)
   z2 := phi(z0,z1)
   z3 := y2*5

After constant propagation:
  x1 := 4
  y2 := y1+3
  goto L on 4<5
  z1 := y2*2
  x2 := y2+9
L: x3 := phi(x1,x2)
   z2 := phi(z0,z1)
   z3 := y2*5

After dead code elimination:
  x1 := 4
  y2 := y1+3
L: x3 := x1
   z2 := z0
   z3 := y2*5

Page 13: Optimization algorithms using SSA


Constant propagation

Evaluate expressions at compile time, eliminate dead code, and improve the effectiveness of other optimizations.

What are constants?
Simple constants – constant along all paths through the program. Faster algorithm: the sparse simple constants algorithm using the SSA translation.
Conditional constants – constant along the paths actually taken. Faster algorithm: the sparse conditional constants algorithm using the SSA translation.

Key point: traversing the sparse SSA edges is more efficient than traversing all the CFG edges.

Page 14: Optimization algorithms using SSA


Simple constant propagation using SSA

Standard worklist algorithm to identify simple constants:
If a variable is defined using only one constant value, it is a simple constant.
If a variable's definition is a phi function and all of the phi's arguments are the same constant, the variable is a simple constant!!

Simple constants found in the example below: first iteration – i1, j1, k1; second iteration – j3 is added.

The standard algorithm traverses and processes all edges in the CFG <- inefficient!!

B1: i1 <- 1
    j1 <- 1
    k1 <- 0
B2: j2 <- phi(j4, j1)
    k2 <- phi(k4, k1)
    if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 <- i1
    k3 <- k2 + 1
B6: j5 <- k2
    k5 <- k2 + 2
B7: j4 <- phi(j3, j5)
    k4 <- phi(k3, k5)

Page 15: Optimization algorithms using SSA


Sparse simple constant propagation

The standard simple constant propagation algorithm is inefficient!!

For each program point (each node connected by CFG edges), it maintains one constant value per variable: O(E*V), where E is the number of edges in the CFG. This is inefficient, since constants may have to be propagated through irrelevant nodes -> exploit sparse dependences instead.

Exploit SSA edges (which explicitly connect defs with uses): iterate over SSA edges instead of over all CFG edges. The SSA graph has fewer edges than the def-use graph.

Page 16: Optimization algorithms using SSA


Sparse simple constant propagation

Sparse simple constant worklist algorithm:
W <- a list of all statements in the SSA program
while W is not empty:
  remove a statement S from W
  if S is v <- phi(c1, c2, c3, ...) and ci == cj for all i, j:
    replace S by v <- c (the common constant)
  if S is v <- c for some constant c:
    delete S from the program
    for each statement T that uses v:
      substitute c for v in T
      W <- W union {T}

Because the search follows only the use edges of constant definitions, the algorithm becomes a sparse search algorithm.
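A runnable sketch of this worklist over a toy SSA encoding (the statement and use-edge representations here are assumptions; only copies and phi arguments are substituted):

```python
def sparse_simple_constants(stmts, uses_of):
    """stmts: id -> ('phi', dest, [args]) or ('copy', dest, arg), where each
    arg is either an int constant or an SSA variable name.
    uses_of: variable -> set of statement ids that read it (the SSA use edges)."""
    work = list(stmts)
    constants = {}
    while work:
        sid = work.pop()
        if sid not in stmts:
            continue
        kind, dest, rhs = stmts[sid]
        if kind == 'phi' and all(isinstance(a, int) for a in rhs) and len(set(rhs)) == 1:
            stmts[sid] = ('copy', dest, rhs[0])        # v <- phi(c, ..., c)  becomes  v <- c
            work.append(sid)
        elif kind == 'copy' and isinstance(rhs, int):
            constants[dest] = rhs
            del stmts[sid]                             # delete  v <- c
            for t in uses_of.get(dest, ()):            # follow only v's SSA use edges
                if t in stmts:
                    k, d, r = stmts[t]
                    r = [rhs if a == dest else a for a in r] if k == 'phi' else (rhs if r == dest else r)
                    stmts[t] = (k, d, r)
                    work.append(t)
    return stmts, constants
```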

Page 17: Optimization algorithms using SSA


Conditional constant propagation

Delete infeasible branches using the discovered constants.

Data flow analysis
Lattice: (overdefined (Top), defined constant, never defined (Bot))
Executability – is there evidence that block B can ever be executed?
Executable assignments – the assignments in an executable block B
All of this is processed at compile time, not at run time.

Algorithm: simultaneously find constants and eliminate infeasible branches.
First, find the executable blocks using the following observations:
  For "if x < y goto L1 else L2": if V[x] = Top or V[y] = Top, mark both L1 and L2 executable (E(L1) = E(L2) = T).
  For "if x < y goto L1 else L2": if V[x] = c1 and V[y] = c2 are constants, mark only the branch actually taken (L1 or L2) executable.
Second, for every assignment in an executable block, compute the lattice value of the variable it defines.
Block executability and variable lattice values are updated repeatedly until nothing changes.
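A small sketch of just these lattice rules (BOT for "never defined", TOP for "overdefined"; integers stand for constants); this illustrates the two observations above, not the full conditional constant propagation algorithm:

```python
BOT, TOP = 'never-defined', 'overdefined'    # lattice: BOT < constant < TOP

def meet(a, b):
    """Join two lattice values, e.g. at a phi over executable incoming edges."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    if a == TOP or b == TOP:
        return TOP
    return a if a == b else TOP              # two different constants -> overdefined

def executable_successors(vx, vy, l1, l2):
    """Branch rule for 'if x < y goto L1 else L2': with both operands constant
    only the taken edge is executable; if either is Top, both edges are."""
    if isinstance(vx, int) and isinstance(vy, int):
        return [l1] if vx < vy else [l2]
    return [l1, l2]                          # Top (or not-yet-known) operand
```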

Page 18: Optimization algorithms using SSA


Conditional constant propagation example

[Figure: source code, its SSA form transform, the analysis for conditional constant propagation, and the result after dead code elimination.]

By the analysis, j2, j3, and j4 are always 1 and the else part (block B6) is not reachable. The unreachable code is then eliminated by dead code elimination.

Page 19: Optimization algorithms using SSA


Conditional constant propagation result

B1: i1 <- 1
    j1 <- 1
    k1 <- 0

B2: j2 <- phi(j4, j1)
    k2 <- phi(k4, k1)
    if k2 < 100

B3: if j2 < 20

B4: return j2

B5: j3 <- i1
    k3 <- k2 + 1

B6: j5 <- k2
    k5 <- k2 + 2

B7: j4 <- phi(j3, j5)
    k4 <- phi(k3, k5)

B    Exec[B]
B1   T
B2   T
B3   T
B4   T
B5   T
B6   F
B7   T

x    V[x]
i1   1
j1   1
j2   1
j3   1
j4   1
j5   Bot
k1   0
k2   Top
k3   Top
k4   Top
k5   Bot

Resulting code:
  k2 <- phi(k3, 0)
  if k2 < 100
  k3 <- k2 + 1
  return 1

j2: constant propagated (always 1)
B6: dead block, removed
j4: constant propagated
k4: its phi function has only one remaining argument, so it is removed by copy propagation

Page 20: Optimization algorithms using SSA


Induction variable identification

Induction variable reduction
Optimize over the SSA graph rather than handling the CFG directly; the SSA graph makes the link from each use of a value back to its definition explicit.
An induction variable's value is increased or decreased by a constant in each loop iteration.
  i0 – value on first entry to the loop
  i2 – value after going through the loop
  RC – a loop-invariant expression

Page 21: Optimization algorithms using SSA


Induction variable identification

SSA-based algorithm
Build the SSA representation.
Iterate from the innermost CFG loop to the outermost loop (just like the search order for loop-invariant code motion: innermost -> outermost), finding SSA cycles.
A cycle may define a basic induction variable if a variable in the cycle is a function only of loop invariants and of its own value on the current iteration. How do we detect this condition?
  The Φ function in the cycle has as its inputs a def from inside the loop and a def from outside the loop.
  The def inside the loop (the Φ function's input) is part of the cycle; it gets one operand from the Φ function, and all its other operands are loop invariant.
Finally, find the derived induction variables.
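A sketch of just this Φ-cycle test over a toy SSA encoding (the statement shapes and the loop_invariant set are assumptions, not part of the slides):

```python
def basic_induction_variables(header_phis, loop_defs, loop_invariant):
    """header_phis: phi dest -> (outside_arg, inside_arg), e.g.  i2 -> (i1, i3)
    loop_defs: variable defined inside the loop -> ('add', op1, op2)
    loop_invariant: names (besides int constants) known invariant in this loop.
    A phi defines a basic induction variable when its inside argument is
    computed as  phi_result + loop_invariant,  closing the SSA cycle."""
    bivs = []
    for phi_dest, (outside, inside) in header_phis.items():
        d = loop_defs.get(inside)
        if d is None or d[0] != 'add':
            continue
        _, a, b = d
        if phi_dest not in (a, b):               # one operand must come from the phi
            continue
        other = b if a == phi_dest else a
        if isinstance(other, int) or other in loop_invariant:
            bivs.append(phi_dest)                # e.g. i2 <- phi(i1, i3), i3 <- i2 + 1
    return bivs

# basic_induction_variables({'i2': ('i1', 'i3')}, {'i3': ('add', 'i2', 1)}, set())
# -> ['i2']
```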

Page 22: Optimization algorithms using SSA


Induction variable identification example

[Figure: source code and its transformation to SSA form; the SSA value graph built from it (shown for variable x); a cycle found in the SSA graph.]

Since the cycle's phi function satisfies the condition mentioned above, it is a basic induction variable!!

Page 23: Optimization algorithms using SSA


Induction variable identification example

1. Transform to SSA form.
2. Find SSA cycles: i2 and m2 each have an SSA cycle <- candidates.
3. For each candidate, check the basic induction variable condition.
   Case i2: i1 is the def from outside the loop, i3 the def from inside. i3 (the inside def) gets one operand from the phi function and the other from a constant -> basic induction variable!!

Page 24: Optimization algorithms using SSA


Global Value numbering

Global value numbering (GVN)
Symbolic evaluation (not run-time evaluation): if two computations have the same symbolic value number, they compute the same value.
A compiler optimization based on the SSA IR.
Builds a value graph from the SSA form, which prevents false mappings between variable names and value names.
More powerful than global common subexpression elimination (CSE) in some cases.
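A minimal hash-based sketch of the idea (the expression encoding is an assumption); because every SSA name has exactly one definition, names can serve directly as value keys:

```python
import itertools

def value_number(ssa_defs):
    """ssa_defs: ordered mapping  var -> expr, where expr is a constant,
    another variable, or a tuple (op, arg1, arg2).  Variables whose defining
    expressions get the same value number compute the same value."""
    fresh = itertools.count()
    table = {}                         # canonical expression key -> value number
    vn = {}                            # variable / constant -> value number
    def number_of(x):
        if x not in vn:
            vn[x] = next(fresh)        # unseen leaf (constant or live-in variable)
        return vn[x]
    for var, expr in ssa_defs.items():
        if isinstance(expr, tuple):
            op, a, b = expr
            key = (op, number_of(a), number_of(b))
        else:
            key = ('leaf', number_of(expr))
        if key not in table:
            table[key] = next(fresh)
        vn[var] = table[key]           # equal keys share a value number
    return vn

# value_number({'a1': ('add', 'x0', 'y0'), 'b1': ('add', 'x0', 'y0')})
# gives a1 and b1 the same number, so the second addition is redundant.
```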

Page 25: Optimization algorithms using SSA


Value numbering example

[Figure: value numbering example after the SSA transform.]

Page 26: Optimization algorithms using SSA


Dependency issues

In optimization, parallelization, and scheduling, dependence checking is important.

Three data dependences: true (read after write), anti (write after read), and output (write after write).

In SSA form, true dependences are evident (def site, use list), and there are no anti or output dependences. SSA form has a single definition of each variable, so a write-after-write (output) dependence cannot occur: the same name is never written twice. And because a variable's definition dominates all of its uses, every read of the name comes after its only write, so a write-after-read (anti) dependence cannot occur either.
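A tiny before/after illustration of this point (the variable names are hypothetical):

```python
a, c = 10, 20     # arbitrary inputs for the illustration

# MA form: the two writes to x form an output (write-after-write) dependence,
# and reading x before the second write forms an anti (write-after-read) one.
x = a + 1
b = x * 2
x = c + 3

# SSA form: renaming gives each assignment its own name, so only the true
# (read-after-write) dependence from the def of x1 to its use remains.
x1 = a + 1
b1 = x1 * 2
x2 = c + 3
```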

Page 27: Optimization algorithms using SSA


Other issues

Copy propagation
In general, copy propagation must check the live range of each variable: uses of the copy's target variable that follow the copy operation can be replaced by the source variable only while that copy relationship still holds.
In SSA form, a variable's value is guaranteed to remain statically single (it is never reassigned).
So there is no need to check live ranges: simply replace the target with the source! (A small sketch follows at the end of this section.)

Register allocation, common sub-expression elimination
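A minimal sketch of SSA copy propagation under that observation (the statement encoding is an assumption):

```python
def copy_propagate(stmts):
    """stmts: list of (dest, op, args).  A copy  x <- y  is dropped and every
    later use of x is rewritten to y; no live-range check is needed because
    in SSA neither x nor y is ever reassigned."""
    replacement = {}
    out = []
    for dest, op, args in stmts:
        args = [replacement.get(a, a) for a in args]   # apply earlier copies
        if op == 'copy':
            replacement[dest] = args[0]                # remember x -> y, drop the copy
        else:
            out.append((dest, op, args))
    return out

# copy_propagate([('y1', 'copy', ['x1']), ('z1', 'add', ['y1', 'w1'])])
# -> [('z1', 'add', ['x1', 'w1'])]
```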