程序代写代做代考 cache compiler C algorithm graph COMP90045 Programming Language Implementation

COMP90045 Programming Language Implementation
Register Allocation
Harald Søndergaard Lecture 21
Semester 1, 2019
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 1 / 32

The Use of Registers
Up until now we have freely used as many temporaries as was convenient.
We have assumed each temporary corresponds to a register, although a typical processor has a very limited number of registers.
Accessing memory is much slower than accessing registers even in the presence of caches, hardware devices specifically designed to speed up memory accesses. Caches are small, fast storage areas close to the CPU, and they hold the contents of recently accessed memory locations.
Good performance typically isn’t possible unless the compiler can keep the values of not just temporaries but also variables in registers.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 2 / 32

Register Allocation
If there are more variables than free registers, figuring out which variables to keep in registers can be quite complex. No efficient optimal algorithm exists, since the problem of optimal register allocation is NP-complete. We have to settle for heuristic approximations.
Register allocation can be done at the IR or the machine language level, and the techniques are the same.
It is more commonly done in the machine language, because, as we have seen, a single IR instruction can result in several machine-code instructions, and several IR instructions can sometimes be folded into a single machine-code instruction, and the need for registers may not be preserved. Also, some target machines impose constraints on which registers can be used by which instructions.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 3 / 32

Programmer Intervention
C allows a programmer to provide hints to the compiler about local variables and parameters that will get heavy use:
register int n;
register char c;
However, the annotations entail some restrictions on these variables (for example, the address-of operator & cannot be applied), and yet there was never any guarantee that the compiler would pay attention to the hints.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 4 / 32

Register Allocation by Usage Count
The following assumes that liveness information is available. The critical register allocation decisions are for loops L.
For each variable x, we can try to get a rough estimate of the benefit we could expect from keeping x in a register. Savings, roughly:
1 for each use of x in a block, where the use is not preceded by a definition of x in that block, plus
2 for each block where x gets defined and is live at exit.
Hence the “benefit” of choosing x to be kept in a register within loop
L is
􏰆
block B∈L
benefit =
use(x, B) + 2 · live(x, B)
PLI (Sem 1, 2019)
Register Allocation ⃝c University of Melbourne 5 / 32

Usage Count Example
a := b+c
d := d-b
e := a+f
f := a-d
b := d+f
e := a-c
b := d+c
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 6 / 32

Usage Count Example
a,c,d,e
c,d,e,f
a,c,d,f
a := b+c
d := d-b
e := a+f
b,c,d,f
a,c,d,e,f
b,c,d,e,f
b,c,d,e,f
– with liveness info
b := d+f
e := a-c
f := a-d
c,d,e,f
b := d+c
b,c,d,f
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
6 / 32

Usage Count Example
b,c,d,f
a := b+c
d := d-b
e := a+f
a,c,d,e
c,d,e,f
a,c,d,f
a,c,d,e,f
b,c,d,e,f
b,c,d,e,f
– with liveness info
Now calculate:
use live benefit
a: 2 1 4 b: 2 2 6 c: 3 0 3 d: 4 1 6 e: 0 2 4 f: 2 1 4
b := d+f
e := a-c
f := a-d
b,c,d,f
b := d+c
c,d,e,f
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
6 / 32

Usage Count Example
b,c,d,f
a := b+c
d := d-b
e := a+f
a,c,d,e
c,d,e,f
b,c,d,e,f
b,c,d,e,f
a,c,d,e,f
a,c,d,f
– with liveness info
Now calculate:
use live benefit
a: 2 1 4 b: 2 2 6 c: 3 0 3 d: 4 1 6 e: 0 2 4 f: 2 1 4
b := d+f
e := a-c
f := a-d
c,d,e,f
b := d+c
b,c,d,f
So if four registers are available, give three to b, d, and, say, a,
keeping the fourth available for others.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 6 / 32

Usage Count Example
SUB R3, R0, R2
ST f, R3
LD R1, b
LD R2, d
UseR0fora,R1for b, and R2 for d.
Note: a is not live on entry, so no need to load it into R0 from the outset.
Similarly, no need to store a on exit.
Operations involving c, e, and f need to use load and store.
LD R3, c
ADD R0, R1, R3
SUB R2, R2, R1
LD R3, f
ADD R3, R0, R3
ST e, R3
LD R3, f
ADD R1, R2, R3
LD R3, c
SUB R3, R0, R3
ST e, R3
LD R3, c
ADD R1, R2, R3
ST b, R1
ST d, R2
ST b, R1
ST d, R2
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne 7 / 32

Register Sharing and Variable Interference
If two or more variables live independent lives, they should be able to share the same register.
We say that u interferes with v if v is ever alive immediately after u is defined.
The condition implies that u and v can be alive at the same time, and so they cannot be stored in the same register.
Actually we can weaken the condition: For a copy instruction u := v, the two variables are alive at the same time, but since they hold the same value anyway, they can still share a register. Even if v is live after u := v, we do not consider it interfering with u.
We will do global register allocation in the sense that we ensure a variable can stay in the same register throughout a procedure body.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 8 / 32

Interference Graph
Given the code of a procedure, consider each statement as if it were an IR instruction. Break the code into basic blocks and construct a CFG. Find out which variables are live at which program points. The interference graph has a node for each variable.
There is an edge between u and v if they interfere. So the edge means that the two are alive at the same time, and thus cannot be
stored in the same register.
Interference graph:
y z
y,z
x,z
y,z
y,z
x
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
9 / 32
x := y+z
y := x+z
z := z+1

Register Allocation via Graph Colouring
We can apply standard algorithms for graph colouring to the interference graph, with each colour representing a register.
The algorithms tell us whether the graph can be coloured with a given number of colours (the number of available registers), and if so, which sets of nodes should share a colour (which sets of variables should be stored in the same variable).
xR2 yR2
z R1 In this case, x and y can share R2.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 10 / 32

Graph Colouring
Given k registers. How to colour the graph with k colours, so no edge connects two nodes of the same colour (or decide it is impossible)?
This is an NP-complete problem (for k > 2), so we cannot hope for an optimal algorithm.
But a good, sub-optimal algorithm is fine—we will just generate a target program that is slightly slower that it might have been, if we could colour optimally.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 11 / 32

Reasonably Good Graph Colouring
A good fast heuristic is this: Until the graph is empty,
pick a node with fewer than k adjacent nodes; remove it and its edges.
We then allocate colours to nodes in the reverse order of removal.
Of course we may encounter a graph where no node has less than k adjacent nodes.
At that stage we produce code for register spilling to memory.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 12 / 32

Register Allocation with Graph Colouring
d := b*e
b := a
c := b-d
Assume a, c, and e are live at the end.
Let us track the liveness information backwards through each
program point, building the register interference graph along the way.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 13 / 32

Register Allocation with Graph Colouring
d := b*e
b := a
c := b-d
ad
c
a,c,e
eb
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne 14 / 32

Register Allocation with Graph Colouring
d := b*e
b := a
c := b-d
ad
c
a,b,d,e
a,c,e
eb
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne 15 / 32

Register Allocation with Graph Colouring
(a,b,e)
d := b*e
a,d,e b := a
a,b,d,e
c := b-d
a,c,e
ad
c
eb
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne 16 / 32

Register Allocation with Graph Colouring
d := b*e
b := a
c := b-d
(a,b,e)
a,d,e
a,b,d,e
a,c,e
a d
c
eb
Assume three registers available: R1, R2, and R3.
Now remove nodes one by one, giving preference to nodes with fewer
than three neighbours (give preference to the less connected).
As a node is removed, push it onto a stack, together with its (remaining) edges.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 17 / 32

Register Allocation with Graph Colouring
ad
The stack:
c
eb
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 18 / 32

Register Allocation with Graph Colouring
The stack:
d
c
eb
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
19 / 32

Register Allocation with Graph Colouring
The stack:
d
eb
c [e]
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
20 / 32

Register Allocation with Graph Colouring
e
d
The stack:
b [d,e] c [e]
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
21 / 32

Register Allocation with Graph Colouring
e
The stack:
d [e]
b [d,e] c [e]
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
22 / 32

Register Allocation with Graph Colouring
The stack:
e[]
d [e]
b [d,e] c [e]
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
23 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in its associated list:
(a,b,e)
d := b*e
b := a c := b-d
The stack: a,d,e e[]
d [e] a,b,d,e b [d,e]
c [e] a,c,e a [c,d]
The allocations: a:
b: c: d: e:
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
24 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in the associated list:
(a,b,e)
d := b*e
The stack: d [e]
b [d,e] c [e]
a [c,d]
The allocations:
a:
b:
c:
d:
e: R1
b:=a c:=b-d
a,d,e
a,b,d,e
a,c,e
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
25 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in the associated list:
(a,b,e)
d := b*e
The stack:
b [d,e] c [e]
a [c,d]
The allocations:
a:
b:
c:
d: R2 e: R1
b:=a c:=b-d
a,d,e
a,b,d,e
a,c,e
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
26 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in the associated list:
(a,b,e)
d := b*e
The stack:
The allocations:
a:
b: R3 c:
d: R2 e: R1
b := a c:=b-d
a,d,e
a,b,d,e
a,c,e
c [e]
a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
27 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in the associated list:
(a,b,e)
d := b*e
The stack:
The allocations:
a:
b: R3 c: R2 d: R2 e: R1
b := a c := b-d
a,d,e
a,b,d,e
a,c,e a [c,d]
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
28 / 32

Register Allocation with Graph Colouring
Now pop nodes off the stack, assigning colours, so that each node is coloured differently to those in the associated list:
(a,b,e)
d := b*e
The stack:
The allocations:
a:R1 b:R3 c:R2 d:R2 e:R1
b := a c := b-d
a,d,e
a,b,d,e
a,c,e
PLI (Sem 1, 2019)
Register Allocation
⃝c University of Melbourne
29 / 32

The Algorithm, in General
Suppose we have N colours available. The greedy colouring algorithm says to choose a node with fewer than N edges, if possible, and remove that node.
When that is not possible we can choose randomly (or follow any one of many suggested sub-heuristics).
However, in the next phase, when colours are allocated, we have a real problem if we get to a node whose list of neighbours have already used up N colours.
In that case we pop the node and leave it uncoloured. The corresponding variable will need to spend (most of) its time in memory—no register can be globally allocated.
We say that the variable is spilled.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 30 / 32

Spills and Other Complications
If spilling occurs, the code needed to effect the spilling usually introduces fresh variables, so liveness analysis and register allocation may need to be repeated!
A compiler should allocate all variables accessed in inner loops to registers if this is at all possible. Beyond that, there are no universally superior heuristics about the choice of which variable to spill.
Choices about spills are also complicated by calls.
Register allocation may also be complicated by the fact that on some target machines (for example, x86), some instructions have their operands in fixed registers and cannot be told that their operands are in registers chosen by the compiler.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 31 / 32

Further Reading
Mogensen’s ICD has more detail about how to organise register spilling (Section 8.5) and common heuristics used in register allocation (Section 8.6).
In the next lecture we will take a good look at global optimization, and look at liveness analysis in particular.
PLI (Sem 1, 2019) Register Allocation ⃝c University of Melbourne 32 / 32