COMP90045 Programming Language Implementation
Basic Blocks and Optimization
Harald Søndergaard Lecture 20
Semester 1, 2019
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 1 / 17
Basic Blocks
A basic block is a sequence of instructions that can be entered only at the first instruction (the leader) and can be left only from the last instruction. Every instruction in a basic block is thus executed the same number of times.
To convert an instruction sequence to basic blocks, first determine the leaders:
The first instruction is a leader.
Any instruction that is the target of a branch of any kind (conditional, unconditional, call, return) is a leader.
Any instruction immediately following a branch instruction of
any kind is a leader.
The basic block of each leader extends to the instruction just before the next leader (if there is one).
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 2 / 17
Quiz: Basic Blocks
Identify the basic blocks of this three-address code fragment.
48 read_int(n)
49 ifn>1goto60 50 t1:=n/2
51 t2:=t1*2
52 ifn==t2goto57 53 t3:=3*n
54 t4:=t3-1
55 n:=t4
56 goto 49
57 t5:=n/2
58 n:=t5
59 goto 49
60 halt
PLI (Sem 1, 2019)
Basic Blocks and Optimization ⃝c University of Melbourne 3 / 17
Quiz: Basic Blocks
First mark, as leaders, the first instruction and every instruction that is the target of a jump.
* 48 read_int(n) *49 ifn>1goto60
50 t1:=n/2
51 t2:=t1*2
52 ifn==t2goto57 53 t3:=3*n
54 t4:=t3-1
55 n:=t4
56 goto 49
*57 t5:=n/2 58 n:=t5
59 goto 49
* 60 halt
PLI (Sem 1, 2019)
Basic Blocks and Optimization ⃝c University of Melbourne 4 / 17
Quiz: Basic Blocks
Add to that every instruction that follows immediately after a jump (conditional or unconditional).
How many basic blocks in this example?
Can you picture the code as a graph?
* 48 read_int(n) *49 ifn>1goto60 * 50 t1 := n / 2
51 t2 := t1 * 2
52 if n == t2 goto 57 * 53 t3 := 3 * n
54 t4:=t3-1 55 n:=t4
56 goto 49
*57 t5:=n/2 58 n:=t5
59 goto 49
* 60 halt
PLI (Sem 1, 2019)
Basic Blocks and Optimization ⃝c University of Melbourne 5 / 17
Quiz: Basic Blocks
B1 B2
B3
read int(n)
if n > 1 goto B6
t1 := n / 2
t2 := t1 * 2
if n == t2 goto B5
t3 := 3 * n t4 := t3 – 1 n := t4 goto B2
t5 := n / 2 n := t5 goto B2
halt
B5
B6 PLI (Sem 1, 2019)
Basic Blocks and Optimization
⃝c University of Melbourne
6 / 17
B4
The Control Flow Graph
We can identify each basic block by its label.
You can consider the implementation of a procedure to be a graph, whose nodes are the basic blocks of the procedure and in which an edge from basic block B1 to basic block B2 represents a potential transfer of control from the last instruction of B1 to the first instruction of B2.
The transfer may be via a branch instruction or by simple fallthrough. If there is an edge from B1 to B2, then B1 is a predecessor of B2,
and B2 is a successor of B1.
Flow graphs are very useful for optimizers. For example, basic blocks
that have no predecessors can eliminated as dead code.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 7 / 17
Reaching Definitions
A program point lies just before and just after each instruction.
A path is a sequence of program points, each separated from the previous one by a single instruction or the traversal of an edge in the flow graph.
A definition of an lvalue is an instruction that updates the rvalue stored in that lvalue.
An ambiguous definition of an lvalue is an instruction (such as a call or an assignment through a pointer) that may update the rvalue stored in that lvalue.
A definition d reaches a point p iff there is a path from d to p with no unambiguous definition of the lvalue.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 8 / 17
Liveness Analysis
A use of an lvalue is an instruction whose outcome depends on the rvalue stored in that lvalue.
An lvalue lv is live at program point p if there is a path starting at p on which lv is used before being (unambiguously) defined.
Next week we shall look in detail at liveness analysis as a global analysis that determines which lvalues are live at each program point, and in particular at the end of each basic block.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 9 / 17
Renaming Lvalues Apart
Two basic blocks are equivalent and thus interchangeable if they compute the same rvalues for the all lvalues that are (or are assumed to be) live at the end of the block.
If a basic block contains two instructions that are known to define the same lvalue, then we can replace all occurrences of that lvalue from the first definition (including its output operands but not its input operands) until the second definition (including its input operands but not its output operands) with a fresh temporary register.
r0 := r1 + r2 r5 := r1 + r2 r0:=r0+r3 => r0:=r5+r3
We can use this to ensure that the lvalues defined by the instructions of a basic block are syntactically distinct.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 10 / 17
Reordering Inside Basic Blocks
Avoiding repeated assignments to an lvalue in a basic block gives the optimizer more freedom to reorder the instructions of the block.
Modern machines try to execute several instructions at once, but cannot start an instruction until its input is ready. Below, the result of the multiply won’t be ready when the next assignment needs it. Putting unrelated instructions between them can reduce execution time:
r4 := r2 * r3 *r0:=r4
r0 := r1 + r2 r0 := r0 + r3
r4 := r2 * r3 => *r0:=r4
r5 := r1 + r2
r0 := r5 + r3
r4 := r2 * r3 => r5:=r1+r2
*r0 := r4
r0 := r5 + r3
*r0 := r4 and r5 := r1 + r2 can be swapped because neither instruction affects the other’s inputs.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 11 / 17
Reusing Lvalues
Many code sequences compute the same subexpression several times, even when this is not apparent in the source. For example,
a[i,j].f1 = b[i,j].f1 + c[i,j].f1;
computes three addresses. If the three arrays are the same size and have the same element type, then these three addresses are
a start + ((i ∗s2) + j)∗eltsize + offsetof(f1) b start +((i∗s2)+j)∗eltsize +offsetof(f1) c start + ((i ∗s2) + j)∗eltsize + offsetof(f1)
After we have computed ((i ∗ s 2) + j ) ∗ eltsize + offsetof(f 1) into a register, we don’t want to compute it again; we want to reuse the value in that register.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 12 / 17
Common Subexpression Elimination
Start with an empty set of facts, and process each instruction of a basic block in sequence.
When processing an instruction that applies an operation op to input operands i1, i2, … to compute output operand o, check whether there is a fact of the form AVAIL(o′ = op(i1, i2, …)).
If yes, replace the instruction by the instruction o := o′. If not, then keep it. In both cases, assert the fact AVAIL(o = op(i1, i2, …)).
Delete any AVAIL facts in which o occurs or may occur as an input, since they apply to the old version of o.
For example, an assignment to *r1 requires invalidating all facts involving *r2 unless we know that *r1 and *r2 don’t overlap.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 13 / 17
Copy Elimination
Copy instructions are very small and fast, but on some modern machines r4 := r1 + r2 is the same size and speed as r4 := r3.
If o is not live at the program point after the copy instruction o := o′, then the copy instruction can be deleted.
If o is live at the program point after the copy instruction o := o′, but it is not live at the end of the containing basic block, then that instruction can be deleted, provided that all references to o in the block up to its next definition (if any) are replaced with o′.
If o is live even at the end of the containing basic block, then the copy instruction must be kept, unless we can perform optimizations across basic blocks.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 14 / 17
Classes of Optimization
An optimization that considers each basic block in isolation is a local optimization.
If it takes several basic block into consideration then it is a global optimization. Amongst global optimizations we distinguish those that consider interactions of procedures/functions:
An optimization that considers each procedure body in isolation is an intraprocedural optimization.
An optimization that considers procedure interactions is an interprocedural optimization.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 15 / 17
Global Optimization
for (i = 0; i < n; i++) { if (a[i] < x * y) {
++count; }
}
If the inputs to a computation inside a loop body don’t depend on any lvalue possibly defined by the loop body, then that computation can be moved outside the loop.
prod = x * y;
for (i = 0; i < n; i++) {
if (a[i] < prod) {
++count;
} }
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 16 / 17
Next Up
Register allocation; then we turn to global optimization.
PLI (Sem 1, 2019) Basic Blocks and Optimization ⃝c University of Melbourne 17 / 17