Carnegie Mellon
Dynamic Memory Allocation: Basic Concepts
15-213/18-213/15-513: Introduction to Computer Systems 15th Lecture, June 19, 2020
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1
Carnegie Mellon
Today
Basic concepts
Implicit free lists
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 2
Carnegie Mellon
Dynamic Memory Allocation
Memory invisible to user code
%rsp
(stack pointer)
Kernel virtual memory
User stack (created at runtime)
Memory-mapped region for shared libraries
Run-time heap (created by malloc)
Read/write segment (.data, .bss)
Read-only segment (.init, .text, .rodata)
Unused
Application
Dynamic Memory Allocator
Heap
Programmers use dynamic memory allocators (such as malloc) to acquire virtual memory (VM) at runtime
▪ For data structures whose size is only known at runtime
Dynamic memory allocators manage an area of process VM known as the heap
0x400000
brk
Loaded from
the executable file
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 0
3
Carnegie Mellon
Dynamic Memory Allocation
Allocator maintains heap as collection of variable sized
blocks, which are either allocated or free
Types of allocators
▪ Explicit allocator: application allocates and frees space
▪ e.g., malloc and free in C
▪ Implicit allocator: application allocates, but does not free space
▪ e.g., new and garbage collection in Java
Will discuss simple explicit memory allocation today
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 4
Carnegie Mellon
The malloc Package #include
void *malloc(size_t size)
▪ Successful:
▪ Returns a pointer to a memory block of at least size bytes
aligned to a 16-byte boundary (on x86-64) ▪ If size == 0, returns NULL
▪ Unsuccessful: returns NULL (0) and sets errno to ENOMEM
void free(void *p)
▪ Returns the block pointed at by p to pool of available memory
▪ p must come from a previous call to malloc, calloc, or realloc
Other functions
▪ calloc: Version of malloc that initializes allocated block to zero ▪ realloc: Changes the size of a previously allocated block
▪ sbrk: Used internally by allocators to grow or shrink the heap
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 5
Carnegie Mellon
malloc Example
#include
void foo(long n) {
long i, *p;
/* Allocate a block of n longs */
p = (long *) malloc(n * sizeof(long)); if (p == NULL) {
perror(“malloc”);
exit(0); }
/* Initialize allocated block */
for (i=0; i
Getting header from payload
// block_t *block
// bp points to a payload
return (block_t *) ((unsigned char *) bp
– offsetof(block_t, payload));
C function offsetof(struct, member) returns offset of member within struct Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 27
Carnegie Mellon
Implicit List: Header access
Getting allocated bit from header
return header & 0x1;
Getting size from header
return header & ~0xfL; Initializingheader
block->header = size | alloc;
//block_t*block
Size
a
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
28
Carnegie Mellon
Implicit List: Traversing list
block size
Findnextblock
header
payload
unused
header
payload
static block_t *find_next(block_t *block) {
return (block_t *) ((unsigned char *) block + get_size(block));
}
Unused
End Block
16/0
32/1
64/0
32/1
8/1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
29
Carnegie Mellon
Implicit List: Finding a Free Block
Firstfit:
▪ Search list from beginning, choose first free block that fits: ▪ Finding space for asize bytes (including header):
static block_t *find_fit(size_t asize) {
block_t *block;
for (block = heap_start; block != heap_end;
block = find_next(block)) {
{
if (!(get_alloc(block))
&& (asize <= get_size(block)))
return block; }
return NULL; // No fit found
}
heap_start
heap_end
16/0
32/1
64/0
32/1
8/1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 30
Carnegie Mellon
Implicit List: Finding a Free Block
Firstfit:
▪ Search list from beginning, choose first free block that fits:
▪ Can take linear time in total number of blocks (allocated and free) ▪ In practice it can cause “splinters” at beginning of list
Nextfit:
▪ Like first fit, but search list starting where previous search finished
▪ Should often be faster than first fit: avoids re-scanning unhelpful blocks ▪ Some research suggests that fragmentation is worse
Bestfit:
▪ Search the list, choose the best free block: fits, with fewest bytes left over ▪ Keeps fragments small—usually improves memory utilization
▪ Will typically run slower than first fit
▪ Still a greedy algorithm. No guarantee of optimality
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 31
Carnegie Mellon
Comparing Strategies
1.4
1.2
1.0
0.8 NextFit First Fit Best Fit
0.6
0.4
0.2
0.0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Operation / Operation Count
Perfect Fit Data Fit Data
Total Overheads (for this benchmark) ▪ Perfect Fit: 1.6%
▪ Best Fit: 8.3%
▪ First Fit: 11.9%
▪ Next Fit: 21.6%
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 32
Memory Used / Peak Data
Carnegie Mellon
Implicit List: Allocating in Free Block Allocating in a free block: splitting
▪ Since allocated space might be smaller than free space, we might want to split the block
32
32
48
16
8
split_block(p, 32)
p
32
32
32
16
16
8
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
33
Carnegie Mellon
Implicit List: Splitting Free Block
split_block(p, 32)
p
64
16
32
32
16
// Warning: This code is incomplete
static void split_block(block_t *block, size_t asize){
size_t block_size = get_size(block);
if ((block_size - asize) >= min_block_size) { write_header(block, asize, true);
block_t *block_next = find_next(block); write_header(block_next, block_size – asize, false);
}
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 34
Carnegie Mellon
Implicit List: Freeing a Block
Simplest implementation:
▪ Need only clear the “allocated” flag
▪ But can lead to “false fragmentation”
32
32
32
16
16
8
free(p)
malloc(5*SIZ)
p
32
32
32
16
16
8
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
35
Yikes!
There is enough contiguous free space, but the allocator won’t be able to find it
Carnegie Mellon
Implicit List: Coalescing
Join (coalesce) with next/previous blocks, if they are free
▪ Coalescing with next block
free(p)
p
logically gone
32
32
32
16
16
8
32
32
48
16
16
1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
36
Carnegie Mellon
Implicit List: Coalescing
Join (coalesce) with next block, if it is free
▪ Coalescing with next block
free(p)
p
logically gone
64
32
16
16
8
64
48
16
16
8
▪ How do we coalesce with previous block?
▪ How do we know where it starts?
▪ How can we determine whether its allocated?
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
37
Carnegie Mellon
Implicit List: Bidirectional Coalescing
Boundary tags [Knuth73]
▪ Replicate size/allocated word at “bottom” (end) of free blocks
▪ Allows us to traverse the “list” backwards, but requires extra space ▪ Important and general technique!
8
32
32
32
32
48
48
32
32
8
Size
a
Payload and padding
Size
a
Header
Format of allocated and free blocks
Boundary tag (footer)
a = 1: Allocated block a = 0: Free block
Size: Total block size
Payload: Application data (allocated blocks only)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
38
Carnegie Mellon
Implementation with Footers
asize
asize
dsize
Locating footer of current block
header
payload
unused
footer
header
payload
const size_t dsize = 2*sizeof(word_t);
static word_t *header_to_footer(block_t *block) {
size_t asize = get_size(block);
return (word_t *) (block->payload + asize – dsize); }
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 39
Carnegie Mellon
Implementation with Footers
1 word
Locating footer of previous block
header
payload
unused
footer
header
payload
static word_t *find_prev_footer(block_t *block) {
return &(block->header) – 1;
}
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 40
Carnegie Mellon
Splitting Free Block: Full Version
split_block(p, 32)
p
64
64
16
32
32
32
32
16
static void split_block(block_t *block, size_t asize){ size_t block_size = get_size(block);
if ((block_size – asize) >= min_block_size) { write_header(block, asize, true); write_footer(block, asize, true);
block_t *block_next = find_next(block); write_header(block_next, block_size – asize, false); write_footer(block_next, block_size – asize, false);
}
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 41
Carnegie Mellon
Constant Time Coalescing
Case 1 Case 2 Case 3
Block being freed
Case 4
Allocated
Allocated
Allocated
Free
Free
Allocated
Free
Free
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
42
Carnegie Mellon
Constant Time Coalescing (Case 1)
m1
1
m1
1
n
1
n
1
m2
1
m2
1
m1
1
m1
1
n
0
n
0
m2
1
m2
1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 43
Carnegie Mellon
Constant Time Coalescing (Case 2)
m1
1
m1
1
n
1
n
1
m2
0
m2
0
m1
1
m1
1
n+m2
0
n+m2
0
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 44
Carnegie Mellon
Constant Time Coalescing (Case 3)
m1
0
m1
0
n
1
n
1
m2
1
m2
1
n+m1
0
n+m1
0
m2
1
m2
1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 45
Carnegie Mellon
Constant Time Coalescing (Case 4)
m1
0
m1
0
n
1
n
1
m2
0
m2
0
n+m1+m2
0
n+m1+m2
0
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 46
Carnegie Mellon
Heap Structure
Dummy Footer
Dummy Header
Start of
heap
8/1
16/0
32/1
64/0
32/1
8/1
heap_start
heap_end
Dummy footer before first header
▪ Marked as allocated
▪ Prevents accidental coalescing when freeing first block
Dummy header after last footer
▪ Prevents accidental coalescing when freeing final block
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
47
Carnegie Mellon
Top-Level Malloc Code
const size_t dsize = 2*sizeof(word_t);
void *mm_malloc(size_t size)
{
size_t asize = round_up(size + dsize, dsize);
block_t *block = find_fit(asize);
if (block == NULL) return NULL;
size_t block_size = get_size(block);
write_header(block, block_size, true);
write_footer(block, block_size, true);
split_block(block, asize);
return header_to_payload(block); }
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
48
round_up(n, m) =
m *((n+m-1)/m)
Carnegie Mellon
Top-Level Free Code
void mm_free(void *bp)
{
block_t *block = payload_to_header(bp);
size_t size = get_size(block);
write_header(block, size, false); write_footer(block, size, false);
coalesce_block(block);
}
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 49
Carnegie Mellon
Disadvantages of Boundary Tags Internal fragmentation
Can it be optimized?
▪ Which blocks need the footer tag? ▪ What does that mean?
Size
a
Payload and padding
Size
a
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 50
Carnegie Mellon
No Boundary Tag for Allocated Blocks Boundary tag needed only for free blocks
When sizes are multiples of 16, have 4 spare bits
1 word
1 word
Size
b0
Unallocated
Size
b0
Size
b1
Payload
Optional padding
Allocated Block
Free Block
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
51
a = 1: Allocated block
a = 0: Free block
b = 1: Previous block is allocated b = 0: Previous block is free
Size: block size
Payload: application data
Carnegie Mellon
No Boundary Tag for Allocated Blocks (Case 1)
previous block
block being freed
next block
Use 2 bits (address bits always zero due to alignment): (previous block allocated)<<1 | (current block allocated)
m1
?1
n
11
m2
11
m1
?1
n
10
n
10
m2
01
Header:
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 52
Carnegie Mellon
No Boundary Tag for Allocated Blocks (Case 2)
previous block
block being freed
next block
Header: Use 2 bits (address bits always zero due to alignment): (previous block allocated)<<1 | (current block allocated)
m1
?1
n
11
m2
10
m2
10
m1
?1
n+m2
10
n+m2
10
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 53
Carnegie Mellon
No Boundary Tag for Allocated Blocks (Case 3)
previous block
block being freed
next block
Use 2 bits (address bits always zero due to alignment): (previous block allocated)<<1 | (current block allocated)
m1
?0
m1
?0
n
01
m2
11
n+m1
?0
n+m1
?0
m2
01
Header:
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 54
Carnegie Mellon
No Boundary Tag for Allocated Blocks (Case 4)
previous block
block being freed
next block
Header: Use 2 bits (address bits always zero due to alignment): (previous block allocated)<<1 | (current block allocated)
m1
?0
m1
?0
n
01
m2
10
m2
10
n+m1+m2
?0
n+m1+m2
?0
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 55
Carnegie Mellon
Summary of Key Allocator Policies
Placement policy:
▪ First-fit, next-fit, best-fit, etc.
▪ Trades off lower throughput for less fragmentation
▪ Interesting observation: segregated free lists (next lecture) approximate a best fit placement policy without having to search entire free list
Splitting policy:
▪ When do we go ahead and split free blocks?
▪ How much internal fragmentation are we willing to tolerate?
Coalescing policy:
▪ Immediate coalescing: coalesce each time free is called
▪ Deferred coalescing: try to improve performance of free by deferring coalescing until needed.
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 56
Carnegie Mellon
Implicit Lists: Summary
Implementation: very simple
Allocate cost:
▪ linear time worst case
Free cost:
▪ constant time worst case ▪ even with coalescing
Memory Overhead
▪ will depend on placement policy ▪ First-fit, next-fit or best-fit
Not used in practice for malloc/free because of linear- time allocation
▪ used in many special purpose applications
However, the concepts of splitting and boundary tag
coalescing are general to all allocators
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 57