COMP90045 Programming Language Implementation
Runtime Environments, Calls and Returns
Harald Søndergaard Lecture 17
Semester 1, 2019
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 1 / 24
Function/Procedure Invocation
In the previous lectures we discussed the translation of expressions and statements to IR.
That clarified what to do with function and procedure bodies, but we did not discuss how to handle calls and returns.
Each function invocation gives rise to new incarnations of it local variables, and formal parameters.
Since, in most programming languages, functions and procedures may be defined recursively, and there is no a priori bound on the depth of recursion, the compiler and the runtime system deal with an unbounded number of variables.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 2 / 24
Activation Trees
In traditional imperative languages
control flow is sequential, and
calls return to the point after the call.
We can thus depict a run of a program as a tree in which
each node represents a function call or function activation, each edge represents a call from the parent to the child, and a child on the left finishes before a child on its right begins.
fib(4)
fib(2) fib(2) fib(1) fib(1) fib(0)
fib(1) fib(0)
fib(3)
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 3 / 24
Control Stack
It is natural to use a stack to handle function activations, which is why activation records are also called stack frames.
A call pushes a new frame on the stack, a return pops a frame from the stack. The state of the stack at any one time corresponds to a path from the root of the activation tree to a node.
The general format of stack frames is set by the designers of the platform (instruction set + OS) for inter-operability between languages. Some of the detail (such as which variables are stored in which stack slots, and which variables are stored outside the stack) are controlled by the compiler of each language, which keeps its information about such things in its symbol table.
For variables stored on the stack, their lifetime is limited by the lifetime of the stack frame in which they are stored.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 4 / 24
Runtime Address Space Layout
Text: Executable code.
Read-only data section: Constant global variables.
Data section: Initialized writeable global variables.
Bss section: Uninitialized writeable global variables.
Heap: Dynamically allocated memory.
Stack: Formal parameters and local variables.∗
∗The exception is that local variables that keep their values from one call to the next (static locals in C) are stored as if they were globals, and have corresponding lifetimes.
stack
↓
unused
↑
heap
bss
data
rodata
text
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 5 / 24
Scope
A declaration associates some information with a name. The information includes the type of entity the name represents
(a variable, a function etc) as well as, e.g., the type of a variable or the types of the arguments and return value of a function.
The scope of a declaration is the part of the program in which the declaration applies. The scope rules of the language specify which declaration applies to each occurrence of a name.
int x;
void f(float x) {
… /* here, x is a float */ if (…) {
}
} …
char x; …
/* here, x is a char */ /* here, x is a float */
PLI (Sem 1, 2019)
Runtime Environments, Calls and Returns ⃝c University of Melbourne 6 / 24
Binding of Names
Name Storage x sp-8:
Environment
Value sp-8:
State
23
An environment is a function that binds names to storage locations (lvalues) that are either absolute or relative to a pointer (e.g., the stack pointer). The environment changes when execution moves from one scope to another.
A state is a function that maps lvalues to rvalues. The state changes when the program executes an assignment statement.
The environment is controlled by the compiler, with help from the linker and runtime system. The state is controlled by the program.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 7 / 24
Activation Records
actual parameters
return value
saved frame pointer
saved stack pointer
return address
local data
temporaries
Most platforms have an Application Binary In- terface (ABI) standard that dictates the form of activation records. A common example is shown here.
direction of growth ⇓
←− frame pointer (fp)
←− stack pointer (sp)
PLI (Sem 1, 2019)
Runtime Environments, Calls and Returns
⃝c University of Melbourne
8 / 24
Creating and Destroying Activation Records
The ABI dictates the tasks of the caller and the callee, even if they are written in different languages. For example, it may say:
Call sequence:
1 Caller pushes actual parameters and space for return value;
2 Caller pushes saved fp, saved sp and return address;
3 Caller sets both sp and fp to point to return address slot;
4 Callee increments sp to reserve space for its locals and temps.
Return sequence:
1 Callee places return value;
2 Callee restores the saved sp and fp;
3 Callee branches to return address;
4 Caller picks up return value and pops the actual parameters.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 9 / 24
The Frame Pointer
Every parameter and local variable is at a fixed offset from the memory location pointed to by the frame pointer. This offset remains constant as the program performs pushes and pops.
If the stack frame contains variable-sized data structures such as arrays, then figuring out the offsets of parameters and local variables from the stack pointer would be too difficult.
procedure p(n: int) begin
declare x as array[n] of float;
for i := 1 to n do …
end
The procedure’s code starts by assigning to x’s stack slot the address
of the first free word on the stack, and then incrementing sp by the
size of n floats.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 10 / 24
Simpler Activation Records
If the stack frame cannot contain variable-sized data structures (as is the case in C) then keeping track of offsets from the stack pointer as it changes is not too difficult, and the compiler can dispense with the frame pointer (including its saved copy).
Most ABIs dictate that the return value should be returned in a register, not a stack slot.
Likewise, parameters should be passed in registers whenever possible. Typically, the ABI says that the first (say) five parameters should be passed in registers, with the rest passed on the stack.
If the machine has a separate set of FP registers, the ABI may say, for example, “pass the first five FP parameters in f0 through f4, and pass the first six non-FP parameters in r1 through r6”.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 11 / 24
Variable Numbers of Arguments
Some functions, like C’s printf, take a variable number of arguments. Usually, there is a fixed number (usually one or two) of required arguments (for example, the format string) that determine how many other arguments are expected.
Variable-length argument lists are easy to support if all arguments are passed on the stack; you simply access the argument list as an array. Support is much more complicated if some arguments are passed in registers, which is why many compiler writers wish variable-length argument lists would go away.
Supporting such functions imposes restrictions on ABI design. The ABI can no longer stipulate that the caller pushes the arguments but the callee pops them.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 12 / 24
Formal and Actual Parameters
Formal parameters are always lvalues.
Actual parameters are rvalues that may or may not also be lvalues.
All languages insist on the actual parameters and formal parameters agreeing in number (at least if the function expects a fixed number) and in type (if the language has types, and modulo allowed type conversions).
Different languages have very different ideas of what passing a parameter means.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 13 / 24
Parameter Passing Conventions
These are the most popular parameter passing mechanisms:
Call by value
Call by value-result Call by reference
C, Java, ML
Ada, some Fortran Pascal
C allows the passing of pointers. In the case of objects and arrays, Java passes (and copies) only references. That is how these languages facilitate call by reference.
Algol 60 used as its default mechanism call by name which aims at giving the impression of parameter passing as literal substitution; however, this is hard to implement for an imperative language, and the effect is sometimes hard to understand.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 14 / 24
Call by Value and by Value-Result
The simplest way to implement parameter passing is call by value. Given an actual parameter ap and a formal parameter fp, call by value executes fp := ap at the time of call.
The next simplest way is call by value-result, which is also called copy/restore and copy-in/copy-out. Given an actual parameter ap and a formal parameter fp, call by value-result executes fp := ap at the time of call and ap := fp at the time of return.
If ap is not an lvalue, then call by value-result turns it into one by creating a temporary variable and assigning ap to it before the call.
In Ada, each argument is either “in”, “out”, or “inout”. “in” arguments get copy-in only, “out” arguments get copy-out only, and “inout” arguments get both copy-in and copy-out.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 15 / 24
Call by Reference
With call by value, if you want the callee to modify the value of an actual parameter, you must pass its address as a pointer, and have the callee assign through that pointer.
Call by reference automates this process. It creates a new variable
fp ptr, executes fp ptr := &ap at the time of call, and replaces all references to fp in the body of the callee with *fp ptr.
As with call by value-result, the caller must use a temporary variable to turn the actual parameter into an lvalue if it isn’t one naturally.
This temporary should not be used anywhere else.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 16 / 24
Effect of Parameter Passing Conventions
copyout(int i, int x) { x = 2; a[i] = 3;
}
a[1] = 4; copyout(1, a[1]);
By reference: a[1] = 3
By value-result: a[1] = 2
With call by reference, the assignments in the body of copyout take place in program order. With call by value-result, the assignment to the formal parameter x has its effect delayed until the call returns, overwriting the textually later assignment.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 17 / 24
Effect of Parameter Passing Conventions
swap(int x, int y) { int tmp;
tmp = x; x = y; y = tmp;
}
i = 1; a[1] = 3; swap(i, a[i]);
By value: i=1 a[1] = 3
By reference: i=3
a[1] = 1
With value-result, the copy-back to i could affect which element of a is updated by the copy-back to a[i]. Whether it does depends on the copy-back order, which is usually not guaranteed by the language.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 18 / 24
Call by Name
With call by name the intended effect is that of macro replacement.
So
must have the same effect as
i = 1; a[1] = 3; tmp = i; i = a[i]; a[i] = tmp; Hence after the call we havei= 3,a[1]= 3 anda[3]= 1
(which presumably was not the intended outcome).
i = 1; a[1] = 3; swap(i, a[i]);
swap(int x, int y) { int tmp;
tmp = x; x = y; y = tmp;
}
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 19 / 24
Call by Need
Call by name may be hard to understand in an imperative language, but it is a perfect match for a pure functional programming language (no side effects).
Haskell uses a refinement of call by name known as call by need, ensuring that a function argument is evaluated at most once.
Suppose a call to function g is very expensive. Under call by name f (g u) (g v) where f x y = x + x
is effectively reduced to (g u) + (g u); so g v is never evaluated (good), but g u is evaluated twice (bad).
Call by need ensures that g v is not evaluated and g u is evaluated
only once (the result being reused).
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 20 / 24
Argument Evaluation Order
When passing arguments on the stack, it is usually most convenient for callers to evaluate arguments right to left. That way, each evaluated actual parameter can be pushed on the stack, with the last pushed parameter being the first parameter. This parameter ends up on top, where any callees with variable-length argument lists can use it to figure out how many other arguments there are.
When passing arguments in registers, it is usually most convenient for callers to evaluate arguments left to right. That way, when evaluating argument N into (say) register N, all the registers with numbers higher than N are available for subexpressions.
To leave themselves freedom to pick either of these approaches, language designers usually explicitly leave the argument evaluation order undefined.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 21 / 24
Register Save/Restore
At the call site in the caller, some of the registers may be live, that is, they may contain information that is needed later.
The code of the callee function may want to use those same registers.
Registers that are (a) live in the caller and (b) used in the callee must have their values saved in stack slots at call, and restored at return.
To support separate compilation and higher order calls (calls through function pointers in C), neither the caller nor the callee should be required to know the identity of the other.
The caller and callee thus each know only half of that condition. Compilers therefore use a conservative approximation of the set of registers that need saving and restoring: they guarantee saving and restoring those registers, but may also save and restore some others.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 22 / 24
Caller Save vs Callee Save Registers
The algorithm relies on the ABI classifying registers into two kinds: caller save registers and callee save registers.
If a caller save register is live in the caller, then the caller must save its value in its own stack frame before the call, and restore its value after the return.
If a callee save register is used in the callee, then the callee must save its value in its own stack frame just after the call, and restore its value just before it returns.
The code of each procedure typically starts with a prologue that performs the callee’s side of the call sequence and saves the callee save registers at risk, and ends with an epilogue that restores those callee save registers and performs the callee’s side of the return sequence.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 23 / 24
Up Next
Back to symbol tables; memory management.
PLI (Sem 1, 2019) Runtime Environments, Calls and Returns ⃝c University of Melbourne 24 / 24