2020 – Winter Term 2
Unit 1c – Instance Variables and Dynamic Allocation
1
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
▪ Reading
▪ Companion: 2.4.4-5
▪ Textbook: 3.9.1, 9.9, 3.10 (reference)
▪ Learning Objectives
▪ read and write C code that includes structs
▪ compare Java classes/objects with C structs
▪ explain the difference between static and non-static variables in Java
▪ explain the usage of ISAs’ base-plus-offset and indexed addressing modes
▪ distinguish static and dynamic computation for access to members of a static struct variable in C
▪ distinguish static and dynamic computation for access to members of a non-static struct variable in C
▪ read and write assembly code that uses base-plus-offset instructions
▪ compare Java dynamic allocation the C’s explicit-delete approach by identifying relative strengths and weaknesses
▪ identify and correct dangling-pointer and memory leak bugs in C caused by improper use of free()
▪ write C code that uses techniques that avoid dynamic allocation as a way to minimize memory-allocation
bugs
▪ write C code that uses reference counting as a way to minimize memory-allocation bugs
▪ explain why Java code does not have dangling-reference errors but can have memory-leak errors
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 2
▪Static variables
▪the compiler allocates them: memory address is a constant ▪ Advantages:
▪ Disadvantages:
▪Scalar variable stores a single value
▪Array stores multiple values named by variable and index
▪data can be allocated either statically or dynamically
▪array variable can be the array (static) or store a pointer to it (dynamic)
▪index is generally considered dynamic
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 3
▪Variables that are an instance of a class or struct ▪Created dynamically
▪Many instances of the same class/struct can co-exist
▪Java vs C
▪Java: objects are instances of non-static variables of a class ▪C: structs are named variable groups, or one of its instances
▪Accessing an instance variable
▪requires a reference to a particular object (pointer to a struct) ▪then variable name chooses a variable in that object (struct)
Class X
static int i; int j;
X someX
Object instance of X
Object instance of X
Object instance of X
Object instance of X
Object instance of X
Object instance of X
int j; int j;
int j; int j;
int j; int j;
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
4
▪A struct is a collection of variables of arbitrary type ▪allocated and accessed together
▪Declaration
▪similar to declaring a Java class without methods ▪name is “struct” plus name provided by programmer
▪ static:
▪ dynamic:
▪Access:
▪ static:
▪ dynamic:
struct D d0;
struct D* d1;
d0.e = d0.f;
d1->e = d1->f;
≈
class D { public int e; public int f;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
5
struct D { int e; int f;
}
▪Static structs are allocated by the compiler Static Memory Layout
0x1000: value of d0.e
0x1004: value of d0.f
▪Dynamic structs are allocated at runtime
▪variable that stores struct pointer may be static or dynamic
▪struct itself is allocated with a call to malloc Static Memory Layout
0x1000: value of d1
(address of instance)
struct D d0;
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
6
struct D {
int e;
int f;
}
struct D* d1;
▪Runtime allocation of dynamic struct
▪Example: assume malloc returned 0x2000 Static Memory Layout
0x1000: 0x2000 # d1
…
0x2000: value of d1->e 0x2004: value of d1->f
void foo() {
d1 = malloc(sizeof(struct D));
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
7
struct D {
int e;
int f;
}
▪Struct members can be accessed using offset ▪Offset to each variable from base of struct is static
▪As before, static and dynamic differ by extra memory access
d0.e = d0.f;
d1->e = d1->f;
r[0] ← 0x1000
ld $0x1000, r0
r[1] ← m[r[0]+4]
ld 4(r0), r1
m[r[0]] ← r[1]
st r1, (r0)
r[2] ← 0x1000
ld $0x1000, r2
r[0] ← m[r[2]]
ld (r2), r0
r[1] ← m[r[0]+4]
ld 4(r0), r1
m[r[0]] ← r[1]
st r1, (r0)
struct D {
int e;
int f;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 8
▪Machine format for base+offset:
▪offset in our case will always be a multiple of 4
▪note that we only have 4 bits to store offset
▪so, we will store 𝑝 = Τ in the instruction (offset divided by 4) 4
𝑜
Name
Semantics
Assembly
Machine
load immediate
r[d] ← v
ld $v, rd
0d– vvvvvvvv
load base+offset
r[d] ← m[r[s]+o]
ld o(rs), rd
1psd
load indexed
r[d] ← m[r[s]+4*r[i]]
ld (rs,ri,4), rd
2sid
store base+offset
m[r[d]+o] ← r[s]
st rs, o(rd)
3spd
store indexed
m[r[d]+4*r[i]] ← r[s]
st rs, (rd,ri,4)
4sdi
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
9
▪Scalars
▪access memory at address in register
▪Arrays
▪base address in register
▪dynamic index in register
▪access memory at base plus index times element-size (4)
▪Struct members
▪base address in register (start of struct)
▪static (constant) offset (bytes from start of struct) ▪access memory at base plus static offset
▪equivalent to access to array element with static index
ld (r1), r0
ld (r1,r2,4), r0
ld X(r1), r0
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 10
▪Struct variables can be declared inside other structs
▪Struct members can be arrays or pointers
struct X {
int i;
struct D d;
int j;
}
struct Y {
int i;
struct D* d;
int j;
}
struct Z {
int i;
struct Z* d;
int j;
}
struct D {
int e;
int f;
}
struct W { int i;
int a[10];
int j; }
struct W { int i; int* a; int j;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 11
What is the offset of member j in struct X below?
struct D { int e; int f;
}
struct X {
int i;
struct D d;
int j;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 12
What is the offset of member j in struct X below?
struct D { int e; int f;
}
struct X { int i; struct D *d; int j;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 13
▪Alignment rules apply inside a struct
▪Each instance variable will be aligned according to its type size ▪Structs are usually aligned according to their largest member type
▪Structs can mix types (padding might be added) ▪What are the sizes of the structs below?
▪What are the offsets of each of their members?
struct X { char a; char b;
int i; }
struct Y {
char a;
int i;
char b; }
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 14
▪There can also be arrays of structs (or of pointers)
struct X { int i; int j; int k;
};
struct X a[10];
struct X *b[10];
a[3].i = b[5]->j;
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 15
What is the offset of member j in struct X below?
struct D { int e; int f;
}
struct X {
int i; struct D d[10]; int j;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 16
What is the offset of member j in struct X below?
struct Y { char c; short s;
}
struct X {
int i; struct Y y[10]; int j;
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 17
▪Programs can allocate memory dynamically ▪allocation reserves a range of memory for a purpose
▪In Java, instances of classes are allocated by the new statement
▪In C, byte ranges are allocated by call to malloc procedure ▪these bytes can be used for any type that can fit in them
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 18
▪Memory allocation
void *malloc(unsigned long n); ▪n is the number of bytes to allocate
▪ returning type is void *
▪ pointer to anything (no specific type assigned)
▪ can be cast to/from any other pointer type ▪ cannot be dereferenced directly
▪Use sizeof to determine number of bytes to allocate struct Foo* f = malloc(sizeof(struct Foo));
▪statically computes number of bytes in type or variable
▪Caution: sizeof(pointer) gives the size of a pointer, not what it points to
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 19
▪Wise management of memory requires deallocation ▪Memory is a scarce resource
▪Deallocation frees previously allocated memory for re-use
▪In Java:
▪Garbage collection: memory is deallocated when no longer in use ▪Requires keeping track of every reference to an object
▪Java does it on its own
▪In C:
▪Dynamic memory must be deallocated explicitly by calling free ▪Memory is deallocated immediately, no checks if it’s still in use
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 20
▪The heap is a large section of memory from which malloc allocates objects
▪malloc finds unused space in the heap, marks as used and returns it ▪free marks space as unused
▪heap may be increased if necessary
▪Java: all objects are stored in the heap
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 21
▪What free(x) does
▪deallocates “object” at address x
▪this memory can be reused by subsequent call to malloc
▪What free(x) does not do
▪ after call to free, x still points at the freed object ▪other variables may still point there too
▪the data stored at address x is not erased
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 22
▪What bad things can happen below? struct buffer *create() {
struct buffer *buf = malloc(sizeof(struct buffer)); …
return buf;
}
void destroy(struct buffer *buf) {
…
free(buf); }
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 23
▪A dangling pointer is a pointer to an object that is not allocated for the purpose it’s been called for
▪often caused by use-after-free
▪could point to unallocated memory
▪if another malloc has been called since last free, could point to another object
▪also happens with incorrect pointer manipulation
▪Why is this a problem?
▪program thinks it’s writing to one object, but is writing to another ▪program may be writing to object of another type
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 24
▪A memory leak is a dynamic memory object with no pointers pointing to it
▪Usually happens if a program doesn’t free object properly
▪May also happen if a pointer to valid object is changed to another
value
▪Why is this a problem?
▪If object had useful information, it can’t be accessed anymore
▪If object is large (or if many memory leaks happen in sequence), program will use too much memory
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 25
▪Understand the problem ▪where is the memory allocated? ▪where is the memory freed?
▪Avoid the program cases
▪if possible, restrict dynamic allocation/free to single procedure
▪if possible, don’t write procedures that return pointers
▪if possible, use local variables instead
▪ local variables are allocated on call and freed on return, automatically
▪Engineer for memory management
▪define rules for which procedure is responsible for deallocation
▪use explicit reference counting if multiple potential deallocators ▪define rules for which pointers can be stored in data structures
▪use coding conventions and documentation to ensure rules are followed
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 26
▪What can be improved in the following code?
char *copy(char *s) {
int len = strlen(s);
char *d = malloc(len + 1);
for (int i = 0; i <= len; i++)
d[i] = s[i];
return d;
}
void foo(char *s) { char *d = copy(s); printf("%s\n", d); free(d);
}
void bar() {
}
foo("Hello, World!");
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
27
▪Pros? Cons?
void copy(char *d, char *s) { int len = strlen(s);
for (int i = 0; i <= len; i++)
d[i] = s[i]; }
void foo(char *s) {
int len = strlen(s); char *d = malloc(len+1); copy(d, s); printf("%s\n", d); free(d);
}
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
28
▪Pros? Cons?
void copy(char *d, int size, char *s) {
void foo(char *s) {
int len = strlen(s); char *d = malloc(len+1); copy(d, len+1, s); printf("%s\n", d); free(d);
}
}
int len = strlen(s);
if (len > size-1) len = size-1; for (int i = 0; i < len; i++)
d[i] = s[i];
d[len] = 0;
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder
29
▪If a reference needs to be passed among multiple modules ▪then each module knows when it needs and doesn’t need
reference
▪What do you need to know in order to correctly free an object?
▪free should only happen if the object is not in use
▪you need to know if the object is still in use
▪you need to keep track of all its uses ▪when it starts being used
▪when it stops being used
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 30
▪You can use reference counting to track object use ▪any procedure that stores a reference (starts using object)
increments the count
▪any procedure that discards a reference (stops using object)
decrements the count
▪the object is freed when count goes to zero
▪Reference counter can be part of a struct
▪Updates to reference counter are done through special methods
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 31
struct buffer *malloc_buf() {
struct buffer *buf = malloc(sizeof(struct buffer)); buf->ref_count = 1;
return buf;
}
void add_reference(struct buffer *buf) {
buf->ref_count++;
}
void free_reference(struct buffer *buf) { buf->ref_count–;
if (buf->ref_count == 0)
free(buf); }
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 32
▪On Canvas: ref-count-grades-example.zip
▪refcount.c: malloc/free wrapper with ref-count implementation ▪map.c: hashmap where values are ref-counted
▪intlist.c: ref-counted list of integers
▪grades.c: implements a map from student ID to grade list
▪Key memory-management challenge
▪map.c and grades.c have references to grade lists
▪lists may be released in different points of the code ▪mapping may change from one value to another
▪marks may still be in use when mapping changes
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 33
▪Valgrind is a program that performs dynamic analysis of the runtime of a program
▪Example:valgrind ./grade
▪It runs your program and monitors dynamic allocation and
deallocation
▪It can tell if your program has: ▪memory leaks
▪use after free (dangling pointers)
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 34
▪In Java objects are deallocated implicitly
▪the program never needs to free
▪the runtime system tracks every object reference
▪a garbage collector runs periodically to deallocate unreachable objects
▪Advantage compared to explicit free ▪no dangling pointers
▪no memory leaks
▪handles reference cycles
▪e.g., object a has pointer to b, object b has pointer to a
▪these would cause memory leak when using reference counters alone
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 35
▪What are advantages of explicit free?
▪What are advantages of reference counting? ▪What are advantages of garbage collection? ▪Should we ignore deallocation in Java programs?
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 36
▪Memory leaks can still occur
▪when garbage collector fails to reclaim unneeded objects
▪it’s a huge problem for long-running programs where garbage accumulates
▪Why would an object not be reclaimed?
▪Java only reclaims unreachable objects
▪Unneeded/unused objects that keep references are not reclaimed ▪Collections and maps may maintain old unneeded objects
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 37
▪Solution: plan for memory deallocation explicitly ▪Remove references to unneeded objects ▪http://www.ibm.com/developerworks/library/j-leaks/
▪Imperative approach: explicit reference annulling ▪explicitly set references to NULL when no longer needed ▪add close() or free() methods and call them explicitly
▪use try-finally block to ensure clean-up steps are always taken
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 38
▪Declarative approach: reference objects
▪refer to objects without requiring long-term retention
▪store object references that garbage collector can reclaim WeakReference
▪different levels of reference “stickiness”
▪ soft: discarded only when new allocations put pressure on available memory ▪ weak: discarded on next GC cycle when no stronger reference exists
▪ phantom: unretrievable, used to register with GC reference queue
▪referenced object can be reclaimed, but WeakReference not ▪ Alternative: WeakHashMap
For personal enrichment only, you are not required to know this
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 39