EECS 3221:
OPERATING SYSTEM FUNDAMENTALS
Hamzeh Khazaei
Department of Electrical Engineering and Computer Science
Week 9, Module 1:
Main Memory
March 8 &15, 2020
EECS3221: Operating System Fundamentals 9.1 Main Memory
1
Chapter 9: Memory Management
! Background
! Contiguous Memory Allocation
! Paging
! Structure of the Page T able
! Swapping
! Example: The Intel 32 and 64-bit Architectures
! Example: ARMv8 Architecture
EECS3221: Operating System Fundamentals 9.2 Main Memory
2
1
Background
! Program must be brought (from disk) into memory and placed within a process for it to be run
! Main memory and registers are only storage CPU can access directly
! Memory unit only sees a stream of:
! addresses + read requests, or
! address + data and write requests
! Register access is done in one CPU clock
! Main memory can take many cycles, causing a stall
! Cache sits between main memory and CPU registers
! Protection of memory required to ensure correct operation
EECS3221: Operating System Fundamentals 9.3 Main Memory
3
Protection
! Need to censure that a process can access only those addresses in it address space.
! We can provide this protection by using a pair of base and limit registers define the logical address space of a process
EECS3221: Operating System Fundamentals 9.4 Main Memory
4
2
Hardware Address Protection
! CPU must check every memory access generated in user mode to be sure it is between base and limit for that user
! The instructions to loading the base and limit registers are privileged
EECS3221: Operating System Fundamentals 9.5 Main Memory
5
Address Binding
” Programs on disk, ready to be brought into memory to execute form an input queue
” Addresses represented in different ways at different stages of a program’s life
! Source code addresses usually symbolic (int x)
! Compiled code addresses bind to relocatable addresses
4 i.e. “14 bytes from beginning of this module”
! Linker or loader will bind relocatable addresses to absolute addresses
4 i.e. 74014
! Each binding maps one address space to another
EECS3221: Operating System Fundamentals 9.6 Main Memory
6
3
Binding of Instructions and Data to Memory
! Address binding of instructions and data to memory addresses can happen at three different stages
! Compile time: If starting point of memory location known a priori, absolute code can be generated; must recompile code if starting location changes
! Load time: Must generate relocatable code if memory location is not known at compile time
! Execution time: Binding delayed until run time if the process can be moved during its execution from one memory segment to another
4 Need hardware support for address maps (e.g., base and limit registers)
4 Most OSs use this method. Let’s see how this can be done?
EECS3221: Operating System Fundamentals 9.7 Main Memory
7
Multistep Processing of a User Program
EECS3221: Operating System Fundamentals 9.8 Main Memory
8
4
Logical vs. Physical Address Space
! The concept of a logical address space that is bound to a separate physical address space is central to proper memory management
! Logical address – generated by the CPU; also referred to as virtual address
! Physical address – address seen by the memory unit
! Logical and physical addresses are the same in compile-time and load-
time address-binding schemes
! logical (virtual) and physical addresses differ in execution-time address- binding scheme
! Logical address space is the set of all logical addresses generated by a program
! Physical address space is the set of all physical addresses generated by a program
EECS3221: Operating System Fundamentals 9.9 Main Memory
9
Memory-Management Unit (MMU)
! Hardware device that at run time maps virtual to physical address
! Many methods possible, covered in the rest of this chapter
EECS3221: Operating System Fundamentals 9.10 Main Memory
10
5
Memory-Management Unit (Cont.)
! Consider simple scheme, which is a generalization of the base- register scheme.
! The base register now called relocation register
! The value in the relocation register is added to every address
generated by a user process at the time it is sent to memory
! The user program deals with logical addresses; it never sees the real physical addresses
! Execution-time binding occurs when reference is made to location in memory
! Logical address bound to physical addresses
EECS3221: Operating System Fundamentals 9.11 Main Memory
11
Dynamic Loading
! The entire program does need to be in memory to execute
! Routine is not loaded until it is called
! Better memory-space utilization; unused routine is never loaded
! All routines kept on disk in relocatable load format
! Useful when large amounts of code are needed to handle infrequently occurring cases
! No special support from the operating system is required
! Implemented through program design
! OS can help by providing libraries to implement dynamic loading
EECS3221: Operating System Fundamentals 9.12 Main Memory
12
6
Dynamic Linking
! Static linking – system libraries and program code combined by the loader into the binary program image
! Dynamic linking –linking postponed until execution time
! Small piece of code, stub, used to locate the appropriate memory-
resident library routine
! Stub replaces itself with the address of the routine, and executes the routine
! Operating system checks if routine is in processes’ memory address
! If not in address space, add to address space
! Dynamic linking is particularly useful for libraries
! System also known as shared libraries
! Consider applicability to patching system libraries
! Versioning may be needed
! Unlike dynamic loading, dynamic linking and shared libraries generally
require help from the operating system.
EECS3221: Operating System Fundamentals 9.13 Main Memory
13
Contiguous Allocation
! Main memory must support both OS and user processes
! Limited resource, must allocate efficiently
! Contiguous allocation is one early method
! Main memory usually is divided into two partitions:
! Resident operating system, usually held in low memory with interrupt
vector
! User processes then held in high memory
! Each process contained in single contiguous section of memory
EECS3221: Operating System Fundamentals 9.14 Main Memory
14
7
Contiguous Allocation (Cont.)
! Relocation registers used to protect user processes from each other, and from changing operating-system code and data
! Base register contains value of smallest physical address
! Limit register contains range of logical addresses – each logical
address must be less than the limit register
! MMU maps logical address dynamically
! Can then allow actions such as kernel code being transient and kernel changing size
EECS3221: Operating System Fundamentals 9.15 Main Memory
15
Hardware Support for Relocation and Limit Registers
EECS3221: Operating System Fundamentals 9.16 Main Memory
16
8
Variable Partition
! Multiple-partition allocation
! Degree of multiprogramming limited by number of partitions
! Variable-partition sizes for efficiency (sized to a given process’ needs)
! Hole – block of available memory; holes of various size are scattered
throughout memory
! When a process arrives, it is allocated memory from a hole large enough to accommodate it
! Process exiting frees its partition, adjacent free partitions combined
! Operating system maintains information about:
a) allocated partitions b) free partitions (hole)
EECS3221: Operating System Fundamentals 9.17 Main Memory
17
Dynamic Storage-Allocation Problem
How to satisfy a request of size n from a list of free holes?
! First-fit: Allocate the first hole that is big enough
! Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size
! Produces the smallest leftover hole
! Worst-fit: Allocate the largest hole; must also search entire list
! Produces the largest leftover hole
First-fit and best-fit better than worst-fit in terms of speed and
storage utilization.
First fit is faster than best fit but there is no difference in terms of
storage utilization.
EECS3221: Operating System Fundamentals 9.18 Main Memory
18
9
Fragmentation
! External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous
! Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used
! First fit analysis reveals that given N blocks allocated, N/2 blocks lost to fragmentation
! 1/3 may be unusable -> 50-percent rule
EECS3221: Operating System Fundamentals 9.19 Main Memory
19
Fragmentation (Cont.)
! Reduce external fragmentation by compaction
! Shuffle memory contents to place all free memory together in one
large block
! Compaction is possible only if relocation is dynamic, and is done at execution time (not compilation or load time)
! I/O problem
4 Latch job in memory while it is involved in I/O 4 Do I/O only into OS buffers
! Fragmentation is a general problem in computing that can occur wherever we must manage blocks of data. (disk, external storages, etc.)
! Another possible solution to the external-fragmentation problem is to permit the physical address space of processes to be noncontiguous
EECS3221: Operating System Fundamentals 9.20 Main Memory
20
10
Paging
! Physical address space of a process can be noncontiguous; process is allocated physical memory whenever the latter is available
! Avoids external fragmentation
! Avoids problem of varying sized memory chunks
! Divide physical memory into fixed-sized blocks called frames
! Size is power of 2, between 512 bytes and 16 Mbytes
! Divide logical memory into blocks of same size called pages
! Keep track of all free frames
! To run a program of size N pages, need to find N free frames and load program
! Set up a page table to translate logical to physical addresses
! Backing store likewise split into pages
! Still have Internal fragmentation
EECS3221: Operating System Fundamentals 9.21 Main Memory
21
Address Translation Scheme
! Address generated by CPU is divided into:
! Page number (p) – used as an index into a page table which
contains base address of each page in physical memory
! Page offset (d) – combined with base address to define the physical memory address that is sent to the memory unit
page number page offset m -n n
! For given logical address space 2m and page size 2n
EECS3221: Operating System Fundamentals 9.22 Main Memory
p
d
22
11
Paging Hardware
EECS3221: Operating System Fundamentals 9.23 Main Memory
23
Paging Model of Logical and Physical Memory
EECS3221: Operating System Fundamentals 9.24 Main Memory
24
12
Paging Example
! Logicaladdress: n=2and m=4.Usingapagesizeof4bytesanda physical memory of 32 bytes (8 pages)
EECS3221: Operating System Fundamentals 9.25 Main Memory
25
Second Module
26
13
Paging — Calculating internal fragmentation
! Page size = 2,048 bytes
! Process size = 72,766 bytes
! 35 pages + 1,086 bytes
! Internal fragmentation of 2,048 – 1,086 = 962 bytes
! Worst case fragmentation = 1 frame – 1 byte
! On average fragmentation = 1 / 2 frame size
! So small frame sizes desirable?
! But each page table entry takes memory to track
! Page sizes growing over time
! Solaris supports two page sizes – 8 KB and 4 MB
! In Linux and Mac run:
bash:> getconf PAGESIZE
EECS3221: Operating System Fundamentals 9.27 Main Memory
27
Free Frames
Before allocation After allocation
EECS3221: Operating System Fundamentals 9.28
Main Memory
28
14
Implementation of Page Table
! Page table is kept in main memory
! Page-table base register (PTBR) points to the page table
! Page-table length register (PTLR) indicates size of the page table
! In this scheme every data/instruction access requires two memory accesses
! One for the page table and one for the actual data or instruction
! This is expensive; any idea how we can make this better?
! The two-memory access problem can be solved using a special fast- lookup hardware cache called translation look-aside buffers (TLBs) (also called associative memory).
EECS3221: Operating System Fundamentals 9.29 Main Memory
29
Translation Look-Aside Buffer
! Each entry in the TLB consists of two parts: a key (or tag) and a value.
! When the associative memory is presented with an item, the item is
compared with all keys simultaneously.
! TLBs typically small (64 to 1,024 entries)
! On a TLB miss, value is loaded into the TLB for faster access next time
! Replacement policies must be considered
! Some entries can be wired down for permanent fast access
! Typically, TLB entries for key kernel code are wired down.
EECS3221: Operating System Fundamentals 9.30 Main Memory
30
15
Hardware
! Associative memory – parallel search Page # Frame #
! Address translation (p, d)
! If p is in associative register, get frame # out
! Otherwise get frame # from page table in memory
EECS3221: Operating System Fundamentals 9.31 Main Memory
31
Paging Hardware With TLB
EECS3221: Operating System Fundamentals 9.32 Main Memory
32
16
Effective Access Time
! Hit ratio – percentage of times that a page number is found in the TLB
! An 80% hit ratio means that we find the desired page number in the TLB
80% of the time.
! Suppose that 10 nanoseconds to access memory.
! If we find the desired page in TLB then a mapped-memory access take 10 ns
! Otherwise, we need two memory access, so it is 20 ns
! Effective Access Time (EAT) EAT=0.80×10+0.20×20=12 nanoseconds
implying 20% slowdown in access time
! Consider amore realistic hit ratio of 99%, EAT = 0.99 x 10 + 0.01 x 20 = 10.1ns
implying only 1% slowdown in access time.
EECS3221: Operating System Fundamentals 9.33 Main Memory
33
Memory Protection in Page Table
! Memory protection implemented by associating protection bit with each frame to indicate if read-only or read-write access is allowed
! Can also add more bits to indicate page execute-only, and so on
! Valid-invalid bit attached to each entry in the page table:
! “valid” indicates that the associated page is in the process’ logical address space, and is thus a legal page
! “invalid” indicates that the page is not in the process’ logical address space
! Or use page-table length register (PTLR)
! Any violations result in a trap to the kernel
EECS3221: Operating System Fundamentals 9.34 Main Memory
34
17
Valid (v) or Invalid (i) Bit In A Page Table
EECS3221: Operating System Fundamentals 9.35 Main Memory
35
Shared Pages
! Shared code
! One copy of read-only code shared among processes (i.e., text
editors, compilers, window systems)
! Like multiple threads sharing the same process space
! Also useful for inter-process communication if sharing of read- write pages is allowed
! Private code and data
! Each process keeps a separate copy of the code and data
! The pages for the private code and data can appear anywhere in the logical address space
EECS3221: Operating System Fundamentals 9.36 Main Memory
36
18
Shared Pages Example
EECS3221: Operating System Fundamentals 9.37 Main Memory
37
Structure of the Page Table
” Memory structures for paging can get huge using straight-forward methods
! Consider a 32-bit logical address space as on modern computers
! Page size of 4 KB (212)
! Page table would have 1 million entries (232 / 212)
! If each entry is 4 bytes è each process 4 MB of physical address space for the page table alone
4 Don’t want to allocate that contiguously in main memory 4 Want to divide the page table into smaller pieces
page number page offset m -n n
! One simple solution is to divide the page table into smaller units 4 Hierarchical Paging
4 Hashed Page Tables
4 Inverted Page Tables
EECS3221: Operating System Fundamentals 9.38 Main Memory
p
d
38
19
Hierarchical Page Tables
” Break up the logical address space into multiple page tables
” A simple technique is a two-level page table
” We then page the page table
EECS3221: Operating System Fundamentals 9.39 Main Memory
39
Two-Level Paging Example
! A logical address (on 32-bit machine with 4K page size) is divided into:
! a page number consisting of 20 bits
! a page offset consisting of 12 bits
! Since the page table is paged, the page number is further divided into:
! a 10-bit page number
! a 10-bit page offset
! Thus, a logical address is as follows:
! Where p1 is an index into the outer page table, and p2 is the displacement within the page of the inner page table
! Known as forward-mapped page table
EECS3221: Operating System Fundamentals 9.40 Main Memory
40
20
Address-Translation Scheme
EECS3221: Operating System Fundamentals 9.41 Main Memory
41
64-bit Logical Address Space
! Even two-level paging scheme not enough
! If page size is 4 KB (212)
! Then page table has 252 entries
! If two level scheme, inner page tables could be 210 4-byte entries
! Address would look like
! Outer page table has 242 entries or 244 bytes
EECS3221: Operating System Fundamentals 9.42 Main Memory
42
21
Three-level Paging Scheme
! One solution is to add a 2nd outer page table
! But in the following example the 2nd outer page table is still 234 bytes in
size (i.e., 16 GB)
! And possibly 4 memory access to get to one physical memory location
! You can see from this example why, for 64-bit architectures, hierarchical page tables are generally considered inappropriate.
EECS3221: Operating System Fundamentals 9.43 Main Memory
43
Hashed Page Tables
! Common in address spaces > 32 bits
! The virtual page number is hashed into a page table
! This page table contains a chain of elements hashing to the same location
! Each element contains (1) the virtual page number (2) the value of the mapped page frame (3) a pointer to the next element
! Virtual page numbers are compared in this chain searching for a match
! If a match is found, the corresponding physical frame is extracted
EECS3221: Operating System Fundamentals 9.44 Main Memory
44
22
Hashed Page Table
EECS3221: Operating System Fundamentals 9.45 Main Memory
45
Inverted Page Table
! Rather than each process having a page table and keeping track of all possible logical pages, one page table to track all physical pages
! One entry for each real page of memory
! Entry consists of the virtual address of the page stored in that real
memory location, with information about the process that owns that page
! Decreases memory needed to store each page table, but increases time needed to search the table when a page reference occurs
EECS3221: Operating System Fundamentals 9.46 Main Memory
46
23
Inverted Page Table Architecture
EECS3221: Operating System Fundamentals 9.47 Main Memory
47
Hashed Inverted Page Table
! Use hash table to limit the search to one — or at most a few — page-table entries
! TLB can accelerate access
! But how to implement shared memory?
EECS3221: Operating System Fundamentals 9.48
http://web.eecs.umich.edu/~akamil/teaching/sp04/040104.pdf
Main Memory
48
24
Swapping
! A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution
! Total physical address space of processes can exceed physical memory
! Backing store – fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images
! Roll out, roll in – swapping variant used for priority-based scheduling algorithms; lower-priority process is swapped out so higher-priority process can be loaded and executed
! Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped
! System maintains a ready queue of ready-to-run processes which have memory images on disk
EECS3221: Operating System Fundamentals 9.49 Main Memory
49
Swapping (Cont.)
! Question: does the swapped-out process need to swap back into the same physical addresses?
! Depends on address binding method
! Plus consider pending I/O to-from process memory space
! Modified versions of swapping are found on many systems (i.e., UNIX, Linux, and Windows)
! Swapping normally disabled
! Started if more than threshold amount of memory allocated
! Disabled again once memory demand reduced below threshold
EECS3221: Operating System Fundamentals 9.50 Main Memory
50
25
Schematic View of Swapping
Standard swapping involves moving entire processes between main memory and a backing store.
EECS3221: Operating System Fundamentals 9.51 Main Memory
51
Context Switch Time including Swapping
! If next processes to be put on CPU is not in memory, need to swap out a process and swap in target process
! Context switch time can then be very high
! 100MB process swapping to hard disk with transfer rate of
50MB/sec
! Swap out time of 2000 ms
! Plus swap in of same sized process
! Total context switch swapping component time of 4000ms (4 seconds)
! Can reduce if reduce size of memory swapped – by knowing how much memory really being used
! System calls to inform OS of memory use via request_memory() and release_memory()
EECS3221: Operating System Fundamentals 9.52 Main Memory
52
26
Context Switch Time and Swapping (Cont.)
! Other constraints as well on swapping
! Pending I/O – can’t swap out as I/O would occur to wrong process
! Or always transfer I/O to kernel space, then to I/O device
4 Known as double buffering, adds overhead
! Standard swapping not used in modern operating systems
! But modified version common
4 Swap only when free memory extremely low
EECS3221: Operating System Fundamentals 9.53 Main Memory
53
Swapping on Mobile Systems
! Not typically supported
! Flash memory based
4 Small amount of space
4 Limited number of write cycles (SSD drives)
4 Poor throughput between flash memory and CPU on mobile platform
! Instead use other methods to free memory if low
! iOS asks apps to voluntarily relinquish allocated memory
4 Read-only data thrown out and reloaded from flash if needed
4 Failure to free can result in termination
! Android terminates apps if low free memory, but first writes application
state to flash for fast restart
! Both OSes support paging as discussed below
EECS3221: Operating System Fundamentals 9.54 Main Memory
54
27
Swapping with Paging
EECS3221: Operating System Fundamentals 9.55 Main Memory
55
Example: The Intel 32 and 64-bit Architectures
Dominant industry chips
EECS3221: Operating System Fundamentals 9.56 Main Memory
Pentium CPUs are 32-bit and called IA-32 architecture
Current Intel CPUs are 64-bit and called IA-64 architecture
Many variations in the chips, cover the main ideas here
56
28
Example: The Intel IA-32 Architecture
! Supports both segmentation and segmentation with paging
! Each segment can be 4GB
! Up to 16K segments per process
! The logical address space of a process is divided into two partitions.
4 First partition of up to 8K segments are private to process (kept in local descriptor table (LDT))
4 Second partition of up to 8K segments shared among all processes (kept in global descriptor table (GDT))
EECS3221: Operating System Fundamentals 9.57 Main Memory
57
Intel IA-32 Segmentation
EECS3221: Operating System Fundamentals 9.58 Main Memory
58
29
Intel IA-32 Paging Architecture
EECS3221: Operating System Fundamentals 9.59 Main Memory
59
! ! ! !
Paging went to a 3-level scheme
Top two bits refer to a page directory pointer table
Page-directory and page-table entries moved to 64-bits in size
Net effect is increasing address space to 36 bits – 64GB of physical memory
Intel IA-32 Page Address Extensions
address limits led Intel to create page address extension (PAE), allowing
” 32-bit
32-bit apps access to more than 4GB of memory space
page directory 313029
page directory page pointer table directory
page table
9.60
offset
1211 0
21 20
CR3 register
page table
EECS3221: Operating System Fundamentals
Main Memory
4-KB page
60
30
No PAE, 4 KB pages
With PAE, 4 KB pages
No PAE, 4 MB pages
With PAE, 2 MB pages
No PAE vs With PAE
EECS3221: Operating System Fundamentals 9.61 https://en.wikipedia.org/wiki/Physical_Address_Extension
Main Memory
61
Intel x86-64
! Current generation Intel x86 architecture
! 64 bits is ginormous (> 16 exabytes)
! In practice only implement 48 bit addressing
! Page sizes of 4 KB, 2 MB, 1 GB
! Four levels of paging hierarchy
! Can also use PAE so virtual addresses are 48 bits and physical addresses are 52 bits
unused level 4 48 47
pointer table
39 38 30 29
9.62
offset
page map page directory
page directory
page table
63
21 20
12 11
0
EECS3221: Operating System Fundamentals
Main Memory
62
31
Example: ARM Architecture
” Dominant mobile platform chip (Apple iOS and Google Android devices for example)
” Modern, energy efficient, 32-bit CPU
” 4KBand16KBpages
” 1 MB and 16 MB pages (termed sections)
” One-level paging for sections, two-level for smaller pages
EECS3221: Operating System Fundamentals 9.63
outer page
32 bits
inner page offset
4-KB or 16-KB page
1-MB or 16-MB section
Main Memory
63
End of Chapter 9
64
32