Static Program Analysis
Part 4 – flow sensitive analyses
http://cs.au.dk/~amoeller/spa/
øller & . Schwartzbach Computer Science, Aarhus University
Copyright By PowCoder代写 加微信 powcoder
• Constantpropagationanalysis • Live variables analysis
• Available expressions analysis • Very busy expressions analysis • Reaching definitions analysis
• Initialized variables analysis
Constant propagation optimization
var x,y,z;
y = input,
z = 2*x+y;
if (x<0) { y=z-3; } else { y=12; }
var x,y,z;
y = input;
if (0) { y=z-3; } else { y=12; }
y = input;
output 12;
Constant propagation analysis
• Determinevariableswith⊤aconstantvalue • Flatlattice:
- 3 - 2 - 1 ⊥0 1
Constraints for constant propagation
• EssentiallyasfortheSignanalysis... • Abstractoperatorforaddition:
⊥ if n=⊥ ∨ m=⊥ +(n,m) = ⊤ else if n=⊤ ∨ m=⊤
n+m otherwise
• Constant propagation analysis • Live variables analysis
• Available expressions analysis • Very busy expressions analysis • Reaching definitions analysis
• Initialized variables analysis
Liveness analysis
• Avariableisliveataprogrampointifitscurrentvalue may be read in the remaining execution
• Thisisclearlyundecidable,butthepropertycanbe conservatively approximated
• Theanalysismustonlyanswer“dead” if the variable is really dead
– no need to store the values of dead variables
A lattice for liveness
A powerset lattice of program variables
var x,y,z;
x = input;
while (x>1) {
if (y>3) x = x-y;
if (z>0) x = x/2;
L = (P({x,y,z}), ⊆)
the trivial answer
{x,y} {y,z} {x,z}
{x} {y} {z}
The control flow graph
may be too large
• Auxiliarydefinition:
Setting up
• ForeveryCFGnode,v,wehaveavariable⟦v⟧: – the set of program variables that are live
at the program point before v
• Sincetheanalysisisconservative,thecomputedsets
JOIN(v) = ∪⟦w⟧ w∈succ(v)
Liveness constraints
• For the exit node: ⟦exit⟧ = ∅
• Forconditionsandoutput:
⟦if (E)⟧ = ⟦outputE ⟧ = JOIN(v) ∪ vars(E)
• Forassignments:
⟦ x=E ⟧ = JOIN(v) \ {x} ∪ vars(E)
• Forvariabledeclarations:
⟦varx ,…,x ⟧ = JOIN(v) \ {x , …, x }
1n1n • Forallothernodes:
⟦v⟧ = JOIN(v)
vars(E) = variables occurring in E
right-hand sides are monotone since JOIN is monotone, and …
Generated constraints
⟦var x,y,z⟧ = ⟦x=input⟧ \ {x,y,z} ⟦x=input⟧ = ⟦x>1⟧ \ {x}
⟦x>1⟧ = (⟦y=x/2⟧ ∪ ⟦output x⟧) ∪ {x} ⟦y=x/2⟧ = (⟦y>3⟧ \ {y}) ∪ {x}
⟦y>3⟧ = ⟦x=x-y⟧ ∪ ⟦z=x-4⟧ ∪ {y} ⟦x=x-y⟧ = (⟦z=x-4⟧ \ {x}) ∪ {x,y} ⟦z=x-4⟧ = (⟦z>0⟧ \ {z}) ∪ {x} ⟦z>0⟧ = ⟦x=x/2⟧ ∪ ⟦z=z-1⟧ ∪ {z} ⟦x=x/2⟧ = (⟦z=z-1⟧ \ {x}) ∪ {x} ⟦z=z-1⟧ = (⟦x>1⟧ \ {z}) ∪ {z} ⟦output x⟧ = ⟦exit⟧ ∪ {x}
⟦exit⟧ = ∅
Least solution
⟦entry⟧ = ∅
⟦var x,y,z⟧ = ∅ ⟦x=input⟧ = ∅ ⟦x>1⟧ = {x} ⟦y=x/2⟧ = {x} ⟦y>3⟧ = {x,y} ⟦x=x-y⟧ = {x,y} ⟦z=x-4⟧ = {x}
Many non-trivial answers!
⟦z>0⟧ = {x,z} ⟦x=x/2⟧ = {x,z} ⟦z=z-1⟧ = {x,z} ⟦outputx⟧ = {x} ⟦exit⟧ = ∅
Optimizations
• Variablesyandzareneversimultaneouslylive ⇒ they can share the same variable location
• Thevalueassignedinz=z-1isneverread ⇒ the assignment can be skipped
x = input;
while (x>1) {
if (yz>3) x = x-yz;
if (yz>0) x = x/2;
• better register allocation • a few clock cycles saved
Time complexity (for the naive algorithm)
• With n CFG nodes and k variables: – thelatticeLnhasheightk⋅n
– sothereareatmostk⋅niterations
• Subsets of Vars (the variables in the program) can be represented as bitvectors:
– eachelementhassizek
– each∪,\,=operationtakestimeO(k)
• Each iteration uses O(n) bitvector operations: – soeachiterationtakestimeO(k⋅n)
• Total time complexity: O(k2n2)
• Exercise: what is the complexity for the worklist algorithm?
• Constant propagation analysis • Live variables analysis
• Available expressions analysis • Very busy expressions analysis • Reaching definitions analysis
• Initialized variables analysis
Available expressions analysis
• A (nontrivial) expression is available at a program point if its current value has already been computed earlier in the execution
• Theapproximationgenerallyincludestoofew expressions
– the analysis can only report “available” if the expression is definitely available
– no need to re-compute available expressions (e.g. common subexpression elimination)
A lattice for available expressions
A reverse powerset lattice of nontrivial expressions
var x,y,z,a,b;
while (y > a+b) {
x = a+b; }
L = (P({a+b, a*b, y>a+b, a+1}), ⊇)
Reverse powerset lattice
the trivial answer
{a+b, a*b}
{a+b} {a*b}
{a+b, y>a+b} {a+b, a+1}
{y>a+b} {a+1}
{a*b, y>a+b} {a*b, a+1} {y>a+b, a+1}
{a+b, a*b, y>a+b} {a+b, a*b, a+1}
{a+b, a*b, y>a+b, a+1}
{a+b, y>a+b, a+1} {a*b, y>a+b, a+1}
The control flow graph
var x,y,z,a,b
Setting up
• ForeveryCFGnode,v,wehaveavariable⟦v⟧: – the set of expressions that are available
at the program point after v
• Sincetheanalysisisconservative,thecomputedsets
may be too small
• Auxiliarydefinition:
JOIN(v) = ∩⟦w⟧ w∈pred(v)
Auxiliary functions
• The function S↓x removes all expressions that contain the variable x from the set S
• Thefunctionexps(E)isdefinedas: – exps(intconst) = ∅
– exps(x) = ∅
– exps(input) = ∅
– exps(E1 op E2) = {E1 op E2} ∪ exps(E1) ∪ exps(E2)
but don’t include expressions containing input
Availability constraints
• Fortheentrynode: ⟦entry⟧ = ∅
• Forconditionsandoutput:
⟦if (E)⟧ = ⟦outputE ⟧ = JOIN(v) ∪ exps(E)
• Forassignments:
⟦ x = E ⟧ = (JOIN(v) ∪ exps(E))↓x
• Foranyothernodev: ⟦v⟧ = JOIN(v)
Generated constraints
⟦entry⟧ = ∅
⟦var x,y,z,a,b⟧ = ⟦entry⟧
⟦z=a+b⟧ = exps(a+b)↓z
⟦y=a*b⟧ = (⟦z=a+b⟧ ∪ exps(a*b))↓y
⟦y>a+b⟧ = (⟦y=a*b⟧ ∩ ⟦x=a+b⟧) ∪ exps(y>a+b) ⟦a=a+1⟧ = (⟦y>a+b⟧ ∪ exps(a+1))↓a
⟦x=a+b⟧ = (⟦a=a+1⟧ ∪ exps(a+b))↓x
⟦exit⟧ = ⟦y>a+b⟧
Least solution
⟦entry⟧ = ∅
⟦var x,y,z,a,b⟧ = ∅ ⟦z=a+b⟧ = {a+b} ⟦y=a*b⟧ = {a+b, a*b} ⟦y>a+b⟧ = {a+b, y>a+b} ⟦a=a+1⟧ = ∅
⟦x=a+b⟧ = {a+b}
⟦exit⟧ = {a+b}
Again, many nontrivial answers!
Optimizations
• We notice that a+b is available before the loop • Theprogramcanbeoptimized(slightly):
var x,y,x,a,b,aplusb;
aplusb = a+b;
z = aplusb;
while (y > aplusb) {
aplusb = a+b;
x = aplusb;
• Constant propagation analysis • Live variables analysis
• Available expressions analysis
• Very busy expressions analysis • Reaching definitions analysis
• Initialized variables analysis
Very busy expressions analysis
• A(nontrivial)expressionisverybusyifitwilldefinitely be evaluated before its value changes
• Theapproximationgenerallyincludestoofew expressions
– the answer “very busy” must be the true one – very busy expressions may be pre-computed
(e.g. loop hoisting)
• Samelatticeasforavailableexpressions
An example program
var x,a,b;
x = input;
while (x > 0) {
output a*b-x;
x = x-1; }
output a*b;
The analysis shows that a*b is very busy right before the while loop
var x,a,b;
x = input;
while (x > 0) {
output a*b-x;
x = x-1; }
output a*b;
Code hoisting
var x,a,b,atimesb;
x = input;
atimesb = a*b;
while (x > 0) {
output atimesb-x;
x = x-1; }
output atimesb;
may be too small
• Auxiliarydefinition:
Setting up
• ForeveryCFGnode,v,wehaveavariable⟦v⟧: – the set of expressions that are very busy
at the program point before v
• Sincetheanalysisisconservative,thecomputedsets
JOIN(v) = ∩⟦w⟧ w∈succ(v)
Very busy constraints
• Fortheexitnode: ⟦exit⟧ = ∅
• Forconditionsandoutput:
⟦if (E)⟧ = ⟦outputE ⟧ = JOIN(v) ∪ exps(E)
• Forassignments:
⟦ x=E ⟧ = JOIN(v)↓x ∪ exps(E)
• Forallothernodes: ⟦v⟧ = JOIN(v)
same ↓ operator as for available expressions analysis
• Constant propagation analysis • Live variables analysis
• Available expressions analysis • Very busy expressions analysis • Reaching definitions analysis • Initialized variables analysis
Reaching definitions analysis
• Thereachingdefinitionsforaprogrampointare those assignments that may define the current values of variables
• Theconservativeapproximationmayincludetoo many possible assignments
A lattice for reaching definitions
The powerset lattice of assignments
L = (P({x=input, y=x/2, x=x-y, z=x-4, x=x/2, z=z-1}),⊆)
var x,y,z;
x = input;
while (x > 1) {
if (y>3) x = x-y;
if (z>0) x = x/2;
Reaching definitions constraints
• Forassignments:
⟦ x = E ⟧ = JOIN(v)↓x ∪ { x = E }
• Forallothernodes: ⟦v⟧ = JOIN(v)
• Auxiliarydefinition: JOIN(v) = ∪⟦w⟧
• The function S↓x removes assignments to x from the set S
Def-use graph
Reaching definitions define the def-use graph: – like a CFG but with edges from def to use nodes – basis for dead code elimination and code motion
Forward vs. backward
• Aforwardanalysis:
– computes information about the past behavior
– examples: available expressions, reaching definitions
• Abackwardanalysis:
– computes information about the future behavior – examples: liveness, very busy expressions
May vs. must
• Amayanalysis:
– describes information that is possibly true
– an over-approximation
– examples: liveness, reaching definitions
• Amustanalysis:
– describes information that is definitely true
– an under-approximation
– examples: available expressions, very busy expressions
Classifying analyses
example: reaching definitions ⟦v⟧ describes state after v
JOIN(v) = ⨆⟦w⟧ =
w∈pred(v) w∈pred(v)
example: liveness
⟦v⟧ describes state before v
JOIN(v) = ⨆⟦w⟧ =
⟦w⟧ w∈succ(v) w∈succ(v)
example: available expressions
⟦v⟧ describes state after v
JOIN(v) = ⨆⟦w⟧ =
w∈pred(v) w∈pred(v)
example: very busy expressions ⟦v⟧ describes state before v
JOIN(v) = ⨆⟦w⟧ =
⟦w⟧ w∈succ(v) w∈succ(v)
• Constant propagation analysis • Live variables analysis
• Available expressions analysis • Very busy expressions analysis • Reaching definitions analysis
• Initialized variables analysis
Initialized variables analysis
• Computeforeachprogrampointthosevariables that have definitely been initialized in the past
• (CalleddefiniteassignmentanalysisinJavaandC#)
• ⇒forwardmustanalysis
• Reversepowersetlatticeofallvariables
JOIN(v) = ∩⟦w⟧ w∈pred(v)
• Forassignments:⟦x=E⟧=JOIN(v)∪{x}
• Forallothers:⟦v⟧=JOIN(v)
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com