COMP90045 Programming Language Implementation
Symbol Tables; Memory Management
Harald Søndergaard Lecture 18
Semester 1, 2019
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 1 / 26
Handling Nested Scopes
In most languages, scopes can nest inside each other. For example, C has three kinds of scope: global, function and block, and all nest.
Compilers usually keep a separate symbol table for each scope. Each of these symbol tables maps each name declared in the corresponding scope to information about that declaration.
The symbol tables themselves form a tree based on the nesting. At any point in the code, there is a current symbol table, the symbol table for the current scope.
Looking up a name requires looking in the current symbol table, and if not found, searching its ancestors in turn.
A declaration in an inner scope can hide (shadow) a declaration in an outer scope.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 2 / 26
Handling Nested Scopes
int n;
int fib(int n) {
if (n <= 1)
return 1;
else {
int t1, t2;
t1 = fib(n-1);
t2 = fib(n-2);
return t1 + t2;
} }
/
header
n
fib
header
n
(block)
header
t1
t2
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 3 / 26
Symbol Table Structure
Each symbol table stores
a pointer to the symbol table of the enclosing scope (if any)
pointers to the symbol tables of the scopes it encloses
whether it has its own stack frame (for example, blocks don’t)
information about the named entities (such as types, variables, functions) declared in its scope.
The information associated with each name depends on what it is declared to be; for example, you need to record different information for functions and for variables.
In standard C, only the global scope may contain function definitions. Other languages (and GNU C) don’t have this restriction.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 4 / 26
Nested Function Definitions
int n = 10;
int f(int p) {
int a;
int g(int q) {
return q + a + n;
}
int h(int r) {
return g(r);
}
a = p/2;
return h(p);
}
When g is executing, accessing a requires accessing the stack frame of its enclosing scope. This is not the same as its caller’s stack frame.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 5 / 26
Implementing Nested Procedures
If you want to support nested procedures, each stack frame should have an extra slot, the access link. The access link points to the stack frame of the nearest scope enclosing the called procedure that has a stack frame, if there is such a scope.
In the example, when calling f, the access link is null. When calling h, and when calling g, the access link points to f’s stack frame.
When the compiler looks up a variable name in the symbol table, it counts how many scopes with stack frames it needs to walk past (for example, to look up a, it must walk past h).
Accessing the associated variable requires following the access link slot that many times (in this case once), and using the offset recorded in the relevant symbol table from the resulting stack pointer.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 6 / 26
Sizes of Primitive Data Types
On most instruction sets, the sizes of the primitive data types are set in stone. For example, characters may be 8 bits, standard integers 32 bits, and single- and double-precision floating point numbers may be 32 and 64 bits, respectively.
However, some machines offer a choice. For example, AMD’s Opteron CPUs implemented both the usual x86 instruction set and AMD’s own AMD64 instruction set (a variant of the x86), with the latter supporting both 32 bit and 64 bit integers.
When generating code for AMD64, a compiler writer had to choose whether to use 32 bit or 64 integers; pieces of code compiled with different assumptions about integer sizes would not work together.
32 bit integers are more compact; 64 bit integers can hold pointers.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 7 / 26
Alignment
Many instruction sets require that 4-byte integers, if stored in memory, have addresses that are multiples of 4, 8-byte doubles have addresses that are multiples of 8, and in general, N-byte primitive data types (where N is always a power of 2) have addresses that are multiples of N.
Instruction set designers impose alignment requirements because it allows them to use simpler, faster circuits.
The compiler will usually align data even when targeting an instruction set that doesn’t itself impose any alignment requirements (e.g., x86) because unaligned accesses are slower than aligned accesses even on such machines. However, the alignment chosen may be less strict (e.g., 4-byte alignment of 8-byte doubles).
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 8 / 26
Stack Slots and Alignment
In non-optimizing compilers, every variable is stored in its own stack slot.
The ABI dictates the offset from the frame pointer of the first stack slot containing a local variable. It also specifies what the callee may assume about the alignments of the frame and stack pointers at the call (usually that they are multiples of 4).
If a variable of type char is stored at offset -9, and the next variable is a four-byte int and must be aligned, then we can’t store it at offsets -13 through -10. We need to leave the bytes at offsets -10, -11 and -12 unused (these constitute three bytes of padding), and store it at offsets -16 through -13.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 9 / 26
Stack Slot Allocation
We can minimize the padding needed by allocating the largest variables, the ones with the strictest alignment requirements, first.
Once you know the order, allocating stack slots to local variables is easy. You keep an offset counter. When allocating space for a local, you round up the counter to the next multiple of the alignment requirement of the variable, and then increment the counter by its size. (If the offsets are negative, you reverse the order of those two steps because such slots’ addresses refer to the byte whose offset has the highest absolute value.)
If some local variables or temporaries require stricter alignment than the ABI promises, then the procedure prologue must start with code to adjust the frame pointer and stack pointer if necessary.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 10 / 26
Memory Allocation
The algorithm we use to allocate stack slots can also be used to allocate slots to parameters passed on the stack and to the temporaries stored after the local variables. The only difference, besides the fact that parameters must be passed in their original order, is that the starting offsets are different.
For temporaries, it is the first stack slot free after the locals. For parameters, it is the first parameter slot (dictated by the ABI), and the direction of allocation is in reverse.
The same algorithm also works for allocating memory to globals, and to locals that preserve their values across calls. The compiler must figure out which section should contain the variable (rodata, data or bss), and use the counter for that section.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 11 / 26
The Symbol Table
The symbol table for a scope should map each name declared in the scope to all the information the compiler has about that name. This includes information given in the declaration (e.g., type) as well as information computed by the compiler: size, alignment requirement, and address, as offset from the frame pointer or the start of this module’s rodata, data or bss section.
The counters you need for memory allocation are logically also part of the symbol table.
The structure of the symbol table should allow quick access not only when the table is small (which is typical for human-written code) but also when it is very big (which can happen when the source code itself is created automatically, by tools such as lex and yacc).
This usually requires a balanced tree or an expandable hash table.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 12 / 26
Structures
The layout of structures is also governed by alignment considerations, since most languages do not allow fields to be reordered. In the structure below, bytes 5-7 and 12-15 are padding.
The alignment requirement on a structure is the most severe alignment requirement of any of its fields, so the structure below must have an address divisible by 8. Even though one structure needs only 25 bytes, arrays of it need 32 bytes per element.
typedef struct { int f1; char f2; int f3; double f4; char f5;
} padded;
/* size
/*4 /*1 /*4 /*8 /*1
bytes */
0-3 */ 4 */ 8-11*/ 16-23 */ 24 */
PLI (Sem 1, 2019)
Symbol Tables; Memory Management ⃝c University of Melbourne 13 / 26
Structures
When processing the definition of a structure type, the compiler should record, in the symbol table entry for the type, the name, type, size, and alignment requirement of each field, and its offset from the start of the structure.
It should also record the alignment requirement and overall size of the structure. The size should be rounded up to the next multiple of the alignment requirement, since it is inconvenient to have sizeof(a[0]) differ from the difference in the addresses of &a[0] and &a[1].
Accessing a field requires computing its address, which is just the address of the structure plus the field’s offset.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 14 / 26
Arrays
month: array[1..12] of integer
var: array[lo..hi] of type
The amount of memory required by the array is the number of elements (hi − lo + 1) multiplied by the size of the element type (eltsize ).
If the memory allocated to var starts at start, then the address of var[i] is start + (i − lo) ∗ eltsize.
Some languages, such as C, require lo to be zero to avoid the subtraction. The multiplication is usually done using shifts, even if eltsize is not a power of 2. For example, multiplication by 12 can be done using two shifts and an add:
x ∗ 12 = x ∗ 8 + x ∗ 4 = x ≪ 3 + x ≪ 2
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 15 / 26
Multidimensional Arrays
N-dimensional arrays can be handled as arrays whose elements are themselves (N-1)-dimensional arrays.
var: array[lo1..hi1, lo2..hi2, lo3..hi3] of type
With the standard row-major layout, the address of var[i,j,k] is
start + (i −lo1)∗((hi2−lo2+1)∗(hi3−lo3+1)∗eltsize) + (j −lo2)∗((hi3−lo3+1)∗eltsize)
+ (k − lo3) ∗ (eltsize)
This can be more quickly computed as
start +(((((i −lo1)∗s2)+(j −lo2))∗s3)+(k −lo3))∗eltsize
using the constants s2 = hi2 − lo2 + 1 and s3 = hi3 − lo3 + 1.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 16 / 26
Dangling References
Accessing the variables of an ancestor function from a descendant is OK, but the reverse is a bug, because it accesses the descendant’s stack frame after it has been deallocated.
What does this C program print?
int main(void) {
int *p, *q;
p = dangle();
q = mangle();
printf("%d %d\n", *p, *q); return 0;
}
int *dangle(void) { int i = 17; return &i;
}
int *mangle(void) {
int j = 42;
return &j; }
PLI (Sem 1, 2019)
Symbol Tables; Memory Management ⃝c University of Melbourne 17 / 26
Heap Allocation
When a function wants to create data that outlives its stack frame, it must allocate that data on the heap.
Since the sizes of both local and global variables are fixed, any variable-sized data structures must also be allocated on the heap (with a few exceptions, e.g., declare x as array[n] of float).
The heap is managed by the runtime system, which provides functions such as malloc and free to allocate and deallocate memory on the heap.
They maintain a list of free memory blocks and their sizes as well as a list of in-use memory blocks and their sizes. When there is no free block that can accommodate a request, malloc asks the operating system for more memory. Usually (not always) the request is granted.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 18 / 26
Memory Management
Using malloc and free is error prone. If you free a memory block that still has pointers to it, any reference through that pointer after malloc has handed out that block for another use will be an error.
On the other hand, not freeing a memory block when it becomes unreachable causes that memory block to be unavailable for any use while the process is alive; this is a memory leak.
If two modules share access to a data structure, neither can deallocate it safely without knowing whether the other one still needs it. Sometimes a module makes copies of the data structures it is about to pass to another module, just to avoid this sharing and allow each module to guarantee the deallocation of its data.
This is clumsy as well as slow. The solution adopted in some
languages (Haskell, OCaml, Prolog, Java, Scala,...) is to require the
runtime system to support garbage collection.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 19 / 26
Reference Counting Garbage Collection
The simplest garbage collector extends each heap block with an extra field, the reference count, counting the pointers pointing to the block.
The count of each block is initialized to one when created. Whenever the program duplicates a pointer to such a block, it increments the reference count, and whenever a pointer stops pointing to such a block (because it now points somewhere else or because it is deallocated) it decrements the count. When the count reaches zero, the block is returned to the list of free blocks, and any other blocks pointed to by the newly deallocated block have their reference counts decremented as well.
Reference counting has high overhead, and cannot recover blocks of garbage with cyclic references pointing to each other.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 20 / 26
Mark and Sweep Garbage Collection
Other garbage collection algorithms kick in only when memory has run out or is about to. Then they find and mark all live heap blocks using a simple recursive algorithm:
find all pointers in global variables and on the stack, and mark all blocks they point to as live;
find all pointers in live blocks, and mark all blocks they point to.
If a block is not marked, then it is not reachable by a pointer from a global variable, from a local variable in a stack frame or from any live block. It thus cannot be referred to in the future, and the collector therefore returns it to the free list.
Garbage collection is inherently approximate: “Reachable” means “could be used later”.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 21 / 26
Non-Moving vs Moving Garbage Collectors
A non-moving garbage collector links all garbage items together in a freelist; future allocations take memory from there.
A moving garbage collector consolidates non-garbage into a contiguous region of memory, adjusting all pointers to the moved blocks as they go. This
allows memory to be allocated by simply advancing a pointer, gives greater locality of reference, and
de-fragments free memory as items are consolidated.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 22 / 26
Identifying Pointers
In the implementation of languages such as Lisp and Prolog, every word of memory usually has a few bits that serve as a tag, identifying what the rest of the word holds: an integer, a float, a pointer, . . .
For garbage collection, it is particularly important to be able to tell what is a pointer and what is not. A conservative collector treats anything that looks like a pointer as if it is a pointer. A type-accurate collector maintains type information to decide which values are pointers. It relies on the compiler to generate read-only data structures (RunTime Type Information or RTTI) that tell the runtime system, for each stack slot of each function and for each global, the type of the value stored in that memory location.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 23 / 26
Identifying Pointers
The Boehm-Demers conservative collector for C, considers that all memory locations may contain pointers. If they contain a bit pattern that matches the address of a heap block, that heap block is considered live, even though the bit pattern may be an integer. Such a system cannot move blocks, because it cannot update “pointers” to them.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 24 / 26
Generational Garbage Collection
A generational garbage collector tries to capitalize on the fact that allocated items tend to have very different lifetimes, and a recently allocated item is more likely to next become garbage than an item that has been alive for a long time.
Hence it maintains separate regions for the different generations, looking more frequently for garbage in young generations, compared to old generations.
When younger items have been around for long enough, it promotes them to the next older generation, typically consolidating memory in the process.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 25 / 26
Coming to a Theatre Near You
Final code generation; then: local code optimization.
PLI (Sem 1, 2019) Symbol Tables; Memory Management ⃝c University of Melbourne 26 / 26