Operating Systems
Hubertus Franke
CSCI-GA.2250-001
Lecture 5: Memory Management
Programmer’s Wish List
Memory that is:
• Private
• Infinitely large
• Infinitely fast
• Non-volatile
• Inexpensive
Programs are getting bigger faster than memories.
Memory Hierarchy
• Cache (SRAM): usually managed by hardware
• Main Memory (DRAM): managed by the OS
• Disk Storage (magnetic media + SSD): managed by the OS
• New technologies (NVMe persistent memory) are entering the hierarchy
Memory Manager
Implications:
• Longer memory access times (measured in CPU cycles) → memory seems “further away”
• New technologies: SDR -> DDR-1/2/3/4/5 -> QDR
• Still, the problem widens
Question: Who Cares About the
Memory Hierarchy?
[Figure: the CPU-DRAM gap keeps widening; µProc performance grows ~60%/yr while DRAM grows only ~7%/yr]
Memory Abstraction
• The hardware and the OS memory manager make the memory in your process appear as a single contiguous entity
• How do they do that?
– Abstraction
• Multiple processes of the same program see the same address space (in principle).
– Addresses range from 0 to 2^N - 1 (N = 32 or N = 64)
Is abstraction really necessary?
No Memory Abstraction
Even with no abstraction, we can have several setups!
No Memory Abstraction
Only one process at a time can be running
(remember threads share the address space of their process)
No Memory Abstraction
• What if we want to run multiple programs?
– OS saves entire memory on disk
– OS brings next program
– OS runs next program
• We can use swapping to run multiple programs
concurrently
– Memory divided into blocks
– Each block assigned protection bits
– The program status word (PSW) contains the same bits for the running process
– Hardware needs to support this
– Example: IBM 360
Swapping
No Memory Abstraction
(in the presence of multiple processes)
• Processes access physical memory directly, so they need to be relocated in order not to overlap in physical memory
– using absolute addresses is wrong here
• Relocation can be done at program load time
• Bad idea:
– very slow
– requires extra information from the program
Bottom line: memory abstraction is needed!
Memory Abstraction
• To allow several programs to co-exist in
memory we need
– Protection
– Relocation
– Sharing
– Logical organization
– Physical organization
• A new abstraction for memory:
– Address Space
• Definition:
Address space = set of addresses that a
process can use to address memory
Protection
• Processes need to acquire permission to
reference memory locations for reading or
writing purposes
• Location of a program in main memory is
unpredictable
• Memory references (load / store) generated
by a process must be checked at run time
Relocation
• Programmers typically do not know in advance which
other programs will be resident in main memory at
the time of execution of their program
• Active processes need to be able to be swapped in
and out of main memory in order to maximize
processor utilization
• Specifying that a process must be placed in the
same memory region when it is swapped back in
would be limiting
– may need to relocate the process to a different area
of memory
Sharing
• It is advantageous to allow several processes to
access the same copy of a program rather than
giving each its own separate copy
• Memory management must allow
controlled access to shared areas of
memory without compromising
protection
Logical Organization
• We see memory as a linear, one-
dimensional address space.
• A program = code + data + … = modules
• Those modules must be organized in
that logical address space
Address Space
• Defines where sections of data and
code are located in a 32- or 64-bit
address space
• Defines the protection of such sections:
ReadOnly, ReadWrite, Execute
• A confined, “private” addressing
concept
→ requires a form of address
virtualization
Physical Organization
• Memory is really a hierarchy
– Several levels of caches
– Main memory
– Disk
• Managing the different modules of
different programs in such a way as:
– To give illusion of the logical organization
– To make the best use of the above hierarchy
Address Space:
Base and Limit
• Map each process address space onto a
different part of physical memory
• Two registers: Base and Limit
– Base: start address of a program in physical
memory
– Limit: length of the program
• For every memory access
– Base is added to the address
– Result compared to Limit
• Only OS can modify Base and Limit
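A minimal sketch of the per-access check, assuming the register names above. Note that the slide’s order (add Base, then compare) is equivalent to first comparing the virtual address against Limit, which is what this sketch does, since Limit holds the program length.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;   /* start address of the program in physical memory */
    uint32_t limit;  /* length of the program (only the OS may set these) */
} mmu_regs_t;

/* Returns true and fills *phys, or false to signal a protection fault. */
static bool translate(const mmu_regs_t *mmu, uint32_t virt, uint32_t *phys)
{
    if (virt >= mmu->limit)
        return false;              /* outside the process: fault */
    *phys = mmu->base + virt;      /* Base is added to the address */
    return true;
}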
Address Space:
Base and Limit
Main drawback:
Need to add and compare on each memory access (can be done in HW)
What if memory space is not enough for all programs?
Then we may need to swap some programs out of memory.
(remember: swapping means moving an entire program to disk)
Swapping
• Programs move in and out of memory
• Holes are created
• Holes can be combined -> memory compaction
• What if a process needs more memory?
– If a hole is adjacent to the process, it is allocated to it
– Otherwise, the process has to be moved to a bigger hole
– Or the process is suspended until enough memory is available
Dealing with unknown memory
requirements by processes
• Anticipate growth of the usable address space and
leave gaps.
• This wastes physical memory, as it will be allocated
whether it is used or not.
Managing Free Memory
• Two basic approaches: bitmap and linked list
• These are universal methods, used in OSes and
applications. Other methods employ heaps.
• Bitmap drawback: finding k consecutive 0s for a new process is slow
Managing Free Memory:
Linked List
• Linked list of allocated and free memory
segments
• Doubly-linked lists are more convenient
Managing Free Memory:
Linked List
• How to allocate?
– First fit
– Best fit
– Next fit
– Worst fit
– …
• Each has its own rationale, with its own pros
and cons (see the first-fit sketch below).
In the end you must weigh space efficiency
(fragmentation) against time (management overhead)
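To make the list-based policies concrete, here is a hedged first-fit sketch over a doubly-linked segment list; the struct layout and names are illustrative, not from the lecture.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct segment {
    bool            is_hole;   /* free hole, or allocated to a process */
    uintptr_t       start;     /* start address of the segment */
    size_t          length;    /* length of the segment */
    struct segment *prev, *next;
};

/* First fit: take the first hole that is large enough. */
struct segment *first_fit(struct segment *head, size_t need)
{
    for (struct segment *s = head; s != NULL; s = s->next) {
        if (s->is_hole && s->length >= need) {
            /* splitting a too-large hole into [allocation | rest] omitted */
            s->is_hole = false;
            return s;
        }
    }
    return NULL;   /* no fit: caller must compact, swap, or wait */
}

Best fit would instead scan the whole list for the smallest adequate hole; next fit resumes scanning where the previous search stopped.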
What are really the problems?
• Memory requirements are unknown for most apps
• Not enough memory
– having enough memory is not possible with current
technology
– how do you even determine “enough”?
• Exploit and enforce one condition:
the processor does not execute or access anything that
is not in memory
(how would that even be possible?)
• Enforce it transparently (the user is not involved)
Memory Management Techniques
• Memory management brings processes
into main memory for execution by the
processor
– involves virtual memory
– based on paging and segmentation
But We Can See That …
• All memory references are logical addresses in
a process’s address space that are dynamically
translated into physical addresses at run time
• An address space may be broken up into a
number of pieces that don’t need to be
contiguously located in main memory during
execution.
So:
It is not necessary that all of the pieces of an
address space be in main memory during execution.
Only the ones that I am “currently” accessing
Scientific Definition of
Virtual Memory
Mapping from
logical (virtual) address space
(user / process perspective)
to
physical address space
(hardware perspective)
Address Space
(reminder)
• Defines where sections of data
and code are located in a 32- or
64-bit address space
• Defines the protection of such
sections
• ReadOnly, ReadWrite, Execute
• A confined “private” addressing
concept
– → requires a form of address
virtualization
The Story
1. Operating system brings into main memory a few pieces of the program.
2. An exception is generated when an address is needed that is not in main
memory (will see how).
3. Operating system places the process in a blocking state.
4. Operating system allocates memory and optionally issues a disk I/O Read
request to retrieve the content of that memory
5. Another process is dispatched to run while the disk I/O takes place.
6. An interrupt is issued when disk I/O is complete, which causes the
operating system to place the affected process in the Ready state
7. Piece of process that contains the logical address has been brought into
main memory.
The Story
Questions over Questions
1. Operating system brings into main memory a few pieces of the program.
What do you mean by “pieces”?
2. An exception is generated when an address is needed that is not in main memory (will
see how).
How do you know it isn’t in memory?
3. Operating system places the process in a blocking state.
4. Operating system allocates memory and optionally issues a disk I/O Read request to
retrieve the content of that memory or set it otherwise.
How do I know which one?
What if memory is full and I can’t allocate any more?
5. Another process is dispatched to run while the disk I/O takes place.
6. An interrupt is issued when disk I/O is complete, which causes the operating system to
place the affected process in the Ready state
7. Piece of process that contains the logical address has been brought into main memory.
Virtual Memory
• Each program has its own address space
• This address space is divided into pages (e.g., 4 KB)
• Pages are mapped onto chunks of physical memory
(called frames)
• By definition, then, sizeof(page) == sizeof(frame)
(the size is determined by the hardware manufacturer)
Virtual Memory
Single process / address space view first
• Allows a larger virtual address space
than the physical memory available
• “X” → when you access such a virtual page you
get a page fault (see the prior story)
that needs to be resolved first
• All other accesses run at hardware speed
and don’t require any OS intervention
(with the exception of protection
enforcement)
Virtual Memory
• Main memory can act as a cache for the
secondary storage (disk)
• Advantages:
– illusion of having more physical memory
– program relocation
– protection
[Figure: address translation maps each virtual address to a physical address in memory or to a disk address, one block of memory at a time]
Virtual Memory
[Figure: Process-1’s and Process-2’s virtual address spaces mapped through their PageTables onto physical memory and a swap disk]
• Virtual memory mapped to physical memory: good for access
• Virtual memory saved on the swap disk: access will create a page fault
• Non-valid segment: virtual memory never touched
How does address translation work?
[Figure: the CPU issues a 32-bit virtual address = (virtual page number: bits 31..12, page offset: bits 11..0); translation replaces the virtual page number with a physical page number (bits 29..12) while the page offset passes through unchanged, yielding a physical address in main memory]
The page table is referenced logically inside the MMU,
but really resides in memory.
MMU = Memory Management Unit
The translation uses PageTable[]
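A small sketch of the bit arithmetic in the figure, assuming 32-bit virtual addresses and 4 KB pages (the macro and helper names are illustrative):

#include <stdint.h>

#define PAGE_SHIFT 12u                        /* 4 KB pages/frames */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)  /* low 12 offset bits */

static inline uint32_t vpn_of(uint32_t va)  { return va >> PAGE_SHIFT; }
static inline uint32_t off_of(uint32_t va)  { return va & PAGE_MASK;  }

/* The MMU swaps the VPN for the physical frame number; the offset passes through. */
static inline uint32_t phys_addr(uint32_t pfn, uint32_t va)
{
    return (pfn << PAGE_SHIFT) | off_of(va);
}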
Structure of a Page Table Entry
• Present bit: ‘1’ if this entry is valid; otherwise the translation is invalid
and a page-fault exception will be raised
• Frame number: the physical frame that is accessed based on the translation
• Protection bits: ‘kernel’ + ‘w’ specify who may access this page and what can be done
– if the kernel bit is set, only the kernel can translate this page; if the user accesses the
page, a ‘privilege exception’ is raised
– if the write-protect bit is set, the page can only be read (load instructions); an attempted
write (store instruction) raises a write-protection exception
• Reference bit: set every time the page is accessed (load or store)
• Modified bit: set every time the page is written to (store)
• Caching disabled: required to access I/O devices, whose contents must not be served from the CPU cache
• Other unused bits are typically available for the operating system to “remember” information in
struct PTE page_table[];
[Figure: PTE fields, split into those accessed by hardware and those available for the OS]
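The fields above could be pictured as the following C bitfield. This is an illustrative layout only, not the real x86 encoding shown on the next slide:

#include <stdint.h>

struct PTE {
    uint32_t present    : 1;  /* 0 -> page-fault exception on access     */
    uint32_t writable   : 1;  /* 0 -> store raises write-protect fault   */
    uint32_t kernel     : 1;  /* 1 -> user access raises privilege fault */
    uint32_t referenced : 1;  /* set by HW on every load/store           */
    uint32_t modified   : 1;  /* set by HW on every store (dirty)        */
    uint32_t nocache    : 1;  /* caching disabled, for I/O devices       */
    uint32_t os_bits    : 6;  /* unused by HW; OS "remembers" state here */
    uint32_t frame      : 20; /* physical frame number (4 KB frames)     */
};

struct PTE page_table[1u << 20];  /* one entry per virtual page (32-bit VA) */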
X86 examples for PTE
[Figure: 32-bit and 64-bit x86 page-table entry layouts]
Multi-Level Page Table
aka RadixTree or Hierarchical Page Table
• Reduces storage overhead in the case of large memories
• For sparse address spaces (most processes), only a few
2nd-level tables are required
• Page-table components exactly fit into a frame!
• The virtual address is broken into “sections” used to
index the radix tree for translation
The Intel Pentium
• Each running program has a page table; the privileged CR3 hardware register points to its root
• In its entirety, this hierarchy is the “page table”
4-level PageTable for 64-bit architectures
• The OS has to make sure no segment is allocated in the high range of the address space (bits 63-48 are unused)
• If those bits are set, the MMU will raise an exception (this is really an OS bug then)
General Formula of PageTable
Management
• The hardware defines the frame size as 2^N
• The (virtual) page size is typically equal to the frame size (otherwise it’s a power-of-
two multiple, but that is rare and we shall not build on this exception)
• Everything the OS manages is based on the page size (hence the frame size)
• You can compute all the semantics from a few basic variables defined by
the hardware:
– virtual address range (e.g., a 48-bit virtual address means that the hardware only considers
the 48 LSBs of a virtual address for ld/st; all others raise a SEGV error)
– frame size
– PTE size (e.g., 4 bytes or 8 bytes)
– from there you can determine the number of page table hierarchies and the offsets
• Example (see the sketch below):
– frame size = 4 KB → 12-bit offset to index into the frame: 12 bits
– PTE size = 8 bytes -> 512 entries in each page table hierarchy: 9 bits per level
– 48-bit virtual address range → 12 + N*9 bits must add up to 48 bits → N = 4 hierarchies
– the hardware does all the indexing as described before
• The OS must create its page table and VMM management following the
hardware definition.
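The example can be checked mechanically; a hedged sketch:

#include <stdio.h>

int main(void)
{
    unsigned offset_bits = 12;                   /* 4 KB frames          */
    unsigned pte_size    = 8;                    /* 8-byte PTEs          */
    unsigned va_bits     = 48;                   /* HW-defined VA range  */

    unsigned entries = (1u << offset_bits) / pte_size;   /* per table    */
    unsigned level_bits = 0;
    while ((1u << level_bits) < entries)         /* log2(entries)        */
        level_bits++;
    unsigned levels = (va_bits - offset_bits) / level_bits;

    printf("entries=%u bits/level=%u levels=%u\n", entries, level_bits, levels);
    return 0;   /* prints: entries=512 bits/level=9 levels=4 */
}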
Speeding Up Paging
• Challenges:
– Mapping a virtual to a physical address must be fast:
we cannot afford to traverse the page table on every
access to get the VA -> PA mapping
– If the address space is large, the page table will be large
(but remember the sparsity: it is not fully populated)
Translation Lookaside Buffer (TLB)
TLB
• Observation: most programs tend to make a
large number of references to a small number
of pages over a period of time -> only fraction
of the page table is heavily used
( data and instruction locality !!!)
• TLB
– Hardware cache inside the MMU
– Caches PageTable translations ( VA -> PA )
– Maps virtual to physical address without going to
the page table (unless there’s a miss)
TLB
• On a TLB miss -> the MMU accesses the page table
and loads the entry from the page table into the TLB
• TLB misses occur more frequently than page
faults
• Optimizations
– Software TLB management
• simpler MMU
• has become rare in modern systems
TLB
• Sample content of a TLB
• Size often 256 – 512 entries
• 512*4KB = 2^9 * 2^12 = 2^21 =
2MB address coverage at any given time
TLB management
• TLB entries record the R and M bits on access rather than
writing them through to the PTE on every access (it is a
cache, after all)
• On a TLB capacity miss, if the evicted entry is dirty, its bits
are written back to the associated PageTableEntry (PTE)
• On changes to the PageTable, affected entries in
the TLB need to be flushed or invalidated
– “TLB invalidate”: e.g., when an mmap area disappears
– “TLB flush”: write back TLB changes to the
PageTable (either globally or per address) when they must be
recorded; think about when a “TLB invalidate” alone might be
insufficient
TLB based translation
Shared Pages
(efficient usage of memory)
• Shared code
– One copy of read-only (reentrant) code shared among
processes (i.e., text editors, compilers, window systems)
– Similar to multiple threads sharing the same process space
– Also useful for inter-process communication if sharing of
read-write pages is allowed
• Private code and data
– Each process keeps a separate copy of the code and data
– The pages for the private code and data can appear
anywhere in the logical address space
Shared Library Using Virtual
Memory
Copy-on-Write
• Copy-on-Write (COW) allows both parent and
child processes to initially share the same pages
in memory
– If either process modifies a shared page, only
then is the page copied
• COW allows more efficient process creation as
only modified pages are copied
Copy-on-Write (cont)
• In general, free pages are allocated from a pool of
zero-fill-on-demand pages
– Pool should always have free frames for fast
demand page execution
• We don’t want to have to free a frame at the
moment one is needed to service a page fault
– Why zero out a page before allocating it?
-> so we get a known state of the page
• vfork(): a variation on the fork() system call that has
the parent suspend and the child use the parent’s
address space directly
– Designed for the case where the child calls exec() right away
– Very efficient
Virtual Memory
[Figure, repeated for the fork() discussion: Process-1’s (and then Process-2’s) virtual address space mapped onto physical memory and the swap disk; mapped pages are good for access, swapped-out pages fault on access, never-touched pages are non-valid segments]
Fork()
[Figure: after fork(), both processes’ page tables map the same frames and every PTE is marked WP (write-protected)]
1) Mark PTE.wp = write_protect
2) fork() == create a copy of the PageTable
Virtual Memory (COW)
[Figure: Process-2 issues a write to one of the shared, write-protected pages; the store traps]
Virtual Memory (COW)
[Figure: on the faulting write, the OS copies the page into a new frame for the writer and removes write protection where only a single user remains]
• Any write access leads to an exception;
the OS sees that the page actually belongs
to a VMA that is not write-protected and
can therefore conclude that the page must
be copy-on-write protected.
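A sketch of the write-fault path just described. refcount_of() and copy_frame() are illustrative helpers (not a real kernel API), and struct PTE is the bitfield sketch from earlier:

#include <stdint.h>

/* illustrative helpers, assumed to exist elsewhere */
extern unsigned refcount_of(uint32_t frame);
extern uint32_t copy_frame(uint32_t frame);

/* On a write to a WP page inside a writable VMA, the OS concludes COW. */
void cow_write_fault(struct PTE *pte)
{
    if (refcount_of(pte->frame) > 1) {
        uint32_t copy = copy_frame(pte->frame); /* private copy of frame */
        pte->frame = copy;
    }
    pte->writable = 1;   /* drop write protection for this mapping */
    /* the faulting store is retried and now succeeds */
}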
In case of a page fault:
which page do we remove from memory and swap out ➔ steal from ourselves or from some other process?
We are running out of memory;
now what?
Replacement Policies
• Used in many contexts when storage is not
enough (not just operating systems)
– caches
– web servers
– virtual memory pages
• Things to take into account when designing
a replacement policy
– measure of success
– cost
Thin[gk]s to consider
• Local policies:
– search a process’s page table for candidates
– O(N * sizeof(PageTable)), where N = number of processes and
sizeof(PageTable) ~= size of the virtual address space
– no global view of usage
• Global policies:
– search the frames for candidates
– O(F), where F is the number of physical frames (~= size of physical memory)
• In general, global policies are preferred, as O(F) << O(N * sizeof(PageTable))
For now, let’s first:
- talk about a single page table
- talk about page replacement in that single page table
- describe the algorithms that way → a local policy with N = 1
- later, generalize to a global policy and multiple processes
Data Structures Required
• Address Space (one per process)
– Represented as a sequence of VMAs (virtual memory areas)
• [ start-addr, length, Read/Write/Execute, … ]
– PageTable ( struct PTE pgtable[] )
• Frame Table ( struct frame frame_table[] )
– each chunk of physical memory (e.g., 4K) is described by metadata
[ struct frame ]
– usage state: how often used, locked?
– back reference to the PTE(s) that refer to this frame; required
because we need access to the PTE’s R and M bits to make
replacement decisions
– in global algorithms we loop through the frame table but
reach back to the PTEs
– note: only pages that are mapped to frames can be candidates
for paging!
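A hedged rendition of struct frame and the frame table; the size and field names are illustrative, not mandated by the lecture:

#include <stdbool.h>
#include <stdint.h>

#define NUM_FRAMES 128            /* assumption: frames of physical memory */

struct frame {
    bool     in_use;              /* mapped by some PTE?                  */
    bool     locked;              /* pinned, e.g. for I/O: never a victim */
    int      pid;                 /* reverse mapping: owning process ...  */
    uint32_t vpage;               /* ... and owning virtual page          */
    uint32_t age;                 /* helper data for the algorithms       */
};

struct frame frame_table[NUM_FRAMES];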
Optimal Page Replacement
Algorithm
• Each page labeled with the number of
instructions that will be executed
before this page is referenced
• Page with the highest label should be
removed
• Impossible to implement
( just used for theoretical evaluations)
More realistic Algos
• We don’t have to make the best, optimal
decision
• We need to make a reasonable decision
at a reasonable overhead
• It’s even OK to make the absolute worst
decision occasionally, as the system will
still behave correctly nevertheless.
The FIFO Replacement Algorithm
• OS maintains a list of the pages currently in
memory [ that would be for instance frame
table ]
• The most recent arrival at the tail
• On a page fault, the page at the head is
removed
• In lab3, for simplicity, you implement FIFO as
a simple round-robin using a HAND, a pointer
into the frame table (see the sketch below)
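A lab3-style sketch of that round-robin hand, reusing the frame_table sketch above:

static int hand = 0;              /* points into frame_table */

int fifo_select_victim(void)
{
    int victim = hand;                 /* oldest under round-robin */
    hand = (hand + 1) % NUM_FRAMES;    /* advance for next fault   */
    return victim;
}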
The Second-Chance
Page Replacement Algorithm
• Modification to FIFO
• Inspect the R bit of the oldest page
– If R=0 page is old and unused -> replace
– If R=1 then
• bit is cleared
• page is put at the end of the list
• the search continues
• If all pages have R=1, the algorithm
degenerates to FIFO
The Clock
Page Replacement Policy
• Keep page frames on a circular list in the form of a
clock
• The hand points to the oldest uninspected page
• When page fault occurs
– The page pointed to by the hand is inspected
– If R=0
• page evicted
• new page inserted into its place
• hand is advanced
– If R = 1
• R is set to 0
• hand is advanced
• This is essentially an implementation of 2nd-Chance
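A sketch of the clock hand; pte_of() stands for the reverse-mapping lookup through the frame table and is illustrative, and struct PTE is the earlier bitfield sketch:

/* struct PTE *pte_of(int frame); -- assumed reverse-mapping helper */
int clock_select_victim(void)
{
    for (;;) {
        struct PTE *pte = pte_of(hand);
        if (pte->referenced == 0) {        /* old and unused: evict */
            int victim = hand;
            hand = (hand + 1) % NUM_FRAMES;
            return victim;
        }
        pte->referenced = 0;               /* second chance */
        hand = (hand + 1) % NUM_FRAMES;
    }
}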
The Clock
Page Replacement Policy
[Figure: frames arranged in a circle with the hand pointing at the oldest uninspected page]
The Clock
Page Replacement Policy
( an optimization )
• Two hands, head and tail: the tail moves in front and “processes” candidates
– optimistically writing modified pages back to swap
and clearing the M and R bits
The Not Recently Used (NRU)
Replacement Algorithm
Sometimes called: Enhanced
Second Chance Algorithm
• Two status bits with each page
– R: Set whenever the page is referenced (used)
– M: Set when the page is written / dirty
• R and M bits are available in most computers implementing
virtual memory
• Those bits are updated with each memory reference
– Must be updated by hardware
– Reset only by the OS
• Periodically (e.g. on each clock interrupt) the R bit is cleared
– To distinguish pages that have been referenced recently
The Not Recently Used
Replacement Algorithm
R M
Class 0: 0 0
Class 1: 0 1
Class 2: 1 0
Class 3: 1 1
The NRU algorithm removes
a page at random from the
lowest-numbered non-empty
class.
Note: class_index = 2*R+M
Note: in lab3 we select the first page encountered in a class,
so we don’t have to track all of them; we also use a clock-like algorithm
to circle through the pages/frames (see the sketch below).
The R-bit reset doesn’t happen on each and every search.
It is typically done at specific times through a daemon.
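A lab3-flavored sketch: compute class_index = 2*R + M per frame and take the first frame of the lowest class seen in one sweep (hand, NUM_FRAMES, and pte_of() are the illustrative helpers from the earlier sketches):

int nru_select_victim(void)
{
    int victim = -1, victim_class = 4;
    for (int n = 0; n < NUM_FRAMES; n++) {
        int f = (hand + n) % NUM_FRAMES;
        struct PTE *p = pte_of(f);
        int c = 2 * p->referenced + p->modified;   /* class_index */
        if (c < victim_class) {        /* first frame of a lower class */
            victim_class = c;
            victim = f;
            if (c == 0) break;         /* class 0 cannot be beaten */
        }
    }
    hand = (victim + 1) % NUM_FRAMES;
    return victim;
}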
The Least Recently Used (LRU)
Page Replacement Algorithm
• Good approximation to optimal
• When page fault occurs, replace the
page that has been unused for the
longest time
• Realizable but not cheap
LRU
Hardware Implementation 1
• A 64-bit counter is incremented after each
instruction
• Each page table entry has a field large enough
to hold the value of the counter
• After each memory reference, the value of the
counter is stored in the corresponding page
entry
• At a page fault, the page with the lowest value is
discarded
Too expensive!
Too slow!
LRU:
Hardware Implementation 2
• Machine with n page frames
• Hardware maintains a matrix of nxn bits
• Matrix initialized to all 0s
• Whenever page frame k is referenced
– Set all bits of row k to 1
– Set all bits of column k to 0
• The row with lowest value is the LRU
LRU:
Hardware Implementation 2
Pages referenced: 0 1 2 3 2 1 0 3 2 3
LRU
Hardware Implementation 3
• Maintain the LRU order of accesses in a
frame list in hardware
• Example: the list is Head → 4 2 7 3 5 … ← Tail;
after accessing page 3, it moves to the head:
Head → 3 4 2 7 5 … ← Tail
LRU Implementation
• Slow
• Few machines (if any) have required
hardware
Approximating LRU in Software
• Not Frequently Used (NFU) algorithm
• A software counter is associated with each page,
initially zero
• At some periodicity (e.g., every second), the OS scans all
page table entries and adds the R bit to that PTE’s
counter
• At a page fault: the page with the lowest counter is
replaced
Still a lot of overhead (not really practical either)
Aging Algorithm
• NFU never forgets anything
-> high inertia
• Modifications:
– shift the counter right by 1
– add the R bit as the leftmost bit
– reset the R bit
• This modified algorithm is called aging (see the sketch below)
• Each bit in the vector represents one period
• The page whose counter is lowest is replaced at page
replacement
Aging Algorithm
[Figure: the counters of several pages evolving over successive clock ticks as R bits are shifted in]
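The periodic update could look like this sketch; counters[] is illustrative, and the PTE layout and helpers are the earlier assumed ones:

#include <stdint.h>

uint8_t counters[NUM_FRAMES];       /* one aging vector per page/frame */

void aging_tick(void)               /* run once per period */
{
    for (int f = 0; f < NUM_FRAMES; f++) {
        struct PTE *p = pte_of(f);
        counters[f] >>= 1;                            /* shift right by 1 */
        counters[f] |= (uint8_t)(p->referenced << 7); /* R -> leftmost bit */
        p->referenced = 0;                            /* reset R */
    }
    /* at replacement time, evict the frame with the lowest counter */
}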
The Working Set Model
• Working set: the set of pages that a process is
currently using
• Thrashing: a program causing page faults at a
high rate (e.g., by a page-faults-per-instruction metric)
An important question:
In multiprogramming systems, processes are
sometimes swapped to disk (i.e. all their pages
are removed from memory). When they are
brought back, which pages to bring?
The Working Set Model
• Try to keep track of each process’
working set and make sure it is in
memory before letting the process run.
[Figure: w(k,t), the set of pages accessed in the last k references at instant t, plotted against k]
The Working Set Model
• OS must keep track of which pages are in the
working set
• Replacement algorithm: evict pages not in the
working set
• Possible implementation (but expensive):
– working set = set of pages accessed in the last k
memory references
• Approximations
– working set = pages used in the last 100 msec or 1 sec
(etc.)
Working Set
Page Replacement Algorithm
[Figure: scanning pages with R=0; age = current virtual time - time of last use, where the time of last use is when the R bit was last reset; pages whose age exceeds the threshold are not in the working set]
The WSClock Page Replacement Algorithm
(specific implementation of Working Set Replacement)
• Based on the clock algorithm and uses working set
• data structure: circular list of page frames ( clock )
• Each entry contains: time of last use, R bit
• At page fault: page pointed by hand is examined
– If R = 1:
• Record current time , reset R
• Advance hand to next page
– If R = 0
• If age > threshold and the page is clean -> it is reclaimed
• If the page is dirty -> a write to disk is scheduled and the hand advances
(note: in lab3 we skip this step for simplicity, but think about why it
is advantageous)
• Advance the hand to the next page (see the sketch below)
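One pass of WSClock as just described; TAU and last_use[] are illustrative, and the dirty-page writeback is reduced to a comment, as in lab3:

#define TAU 500u                   /* working-set age threshold */
uint32_t last_use[NUM_FRAMES];     /* virtual time of last use  */

int wsclock_select_victim(uint32_t now)
{
    for (;;) {
        struct PTE *p = pte_of(hand);
        if (p->referenced) {
            p->referenced = 0;     /* record use, reset R */
            last_use[hand] = now;
        } else if (now - last_use[hand] > TAU && !p->modified) {
            int victim = hand;     /* old and clean: reclaim */
            hand = (hand + 1) % NUM_FRAMES;
            return victim;
        }
        /* old but dirty: schedule writeback here (skipped in lab3) */
        hand = (hand + 1) % NUM_FRAMES;
    }
}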
The WSClock
Page Replacement Algorithm
[Figure: a clock of (time of last use, R) entries with Tau = 500; on R=1 the time is updated and R reset to 0]
Design Issues for Paging:
Local vs Global Allocation
• How should memory be allocated among the
competing runnable processes?
• Local algorithms: allocating every process a fixed
fraction of the memory
• Global algorithms: dynamically allocate page frames
• Global algorithms work better
– If local algorithm used and working set grows →
thrashing will result
– If working set shrinks → local algorithms waste memory
Global Allocation
• Method 1: Periodically determine the
number of running processes and allocate
each process an equal share
• Method 2 (better): pages allocated in
proportion to each process’s total size
• Page Fault Frequency (PFF) algorithm: tells
when to increase/decrease page allocation
but says nothing about which page to
replace.
Global Allocation: PFF
[Figure: page-fault rate vs. number of page frames assigned. Above the upper threshold the rate is unacceptably high → the process is given more pages; below the lower threshold the rate is so low that the process has too much memory → page frames may be taken away]
Design Issues:
Load Control
• What if PFF indicates that some processes
need more memory but none need less?
• Swap some processes to disk and free up
all the pages they are holding.
• Which process(es) to swap?
– Strive to make CPU busy (I/O bound vs CPU
bound processes)
– Process size
Global Policies and
OS Frame Tables
• In general the OS keeps metadata for each frame, typically 8-16 bytes
→ roughly 0.2-0.4% of memory (for 4 KB frames) is required to describe memory
• Metadata includes
– the reverse mapping (pid, vpage) of the user
– helper data for the algorithms (e.g., age, tau)
• Enables running “global” algorithms for page replacement, which are now O(F), where
F is the number of frames, vs O(N * sizeof(PageTable)), where N is the number of processes
• The replacement algorithm can now run something similar to this:
for (i = 0; i < num_frames; i++) {
    /* do whatever you have to do based on the PTE */
    /* determine the best frame to select */
}

Design Issues: Page Size
• Large page size → internal fragmentation
• Small page size →
– larger page table
– more overhead transferring from disk
• Remember: the hardware designers decide the frame/page sizes!!

Design Issues: Page Size
• Assume:
– s = process size
– p = page size
– e = size of each page table entry
• So:
– number of pages needed = s/p
– occupying se/p bytes of page table space
– wasted memory due to fragmentation: p/2
• We want to minimize the overhead:
– overhead = se/p + p/2
– take the derivative of the overhead and equate it to 0:
– -se/p^2 + 1/2 = 0 → p = sqrt(2se)

Design Issues: Shared Pages
• To save space when the same program is run by several users, for example
• If separate I and D spaces: the process table has pointers to an Instruction page table and a Data page table
• In case of common I and D spaces:
– a special data structure is needed to keep track of shared pages
– copy-on-write for data pages

Design Issues: Shared Libraries
• Dynamically linked
• Loaded when the program is loaded, or when functions in them are called for the first time
• Compilers must not produce instructions using absolute addresses in libraries → position-independent code

Design Issues: Cleaning Policy
• Paging daemon
• Sleeps most of the time
• Awakened periodically to inspect the state of the memory
• If too few pages/frames are free -> the daemon begins selecting pages to evict, and gets more aggressive the lower the free page count sinks

Summary of Paging
• Virtual address space bigger than physical memory
• Mapping of virtual addresses to physical addresses
• Virtual address space divided into fixed-size units called pages
• Physical address space divided into fixed-size units called page frames
• The pages of a process’s virtual address space can be (and are) non-contiguous in physical address space

Paging: MMU Involvement
[Figure: the MMU translates every reference in hardware; the OS is involved only on page faults]

OS Involvement With Paging
• When a new process is created (initially)
– determine how large the program and data will be
– create the page table
– allocate space in memory for the page table
– record info about the page table and swap area in the process table
• When a process is scheduled for execution
– the MMU is reset for the process; the TLB is flushed for the current process
– the process table of the next process is made current
• When a process exits
– the OS releases the process page table
– frees its pages and disk space
• When a page fault (exception) occurs
– the program counter is saved
– the exception leads to kernel entry

Page Fault Exception Handling
1. The hardware saves the program counter; the exception leads to kernel entry
2. An assembly routine saves the general registers and calls the OS
3. The OS tries to discover which virtual page is needed
4. The OS checks address validity and protection and assigns a page frame (page replacement may be needed)

Page Fault Handling
5. If the page frame selected is dirty
– the page is scheduled for transfer to disk
– the frame is marked as busy
– the OS suspends the current process
– a context switch takes place
6. Once the page frame is clean
– the OS looks up the disk address where the needed page is
– the OS schedules a disk operation
– the faulting process is still suspended
7. When the disk interrupt indicates the page has arrived
– the OS updates the page table

Page Fault Handling
8. The faulting instruction is backed up to its original state before the page fault, and the PC is reset to point to it
9. The process is scheduled for execution, and the OS returns to the assembly routine
10. The routine reloads the registers and other state information and returns to user space

Virtual Memory & I/O Interaction
(Interesting Scenario)
• A process issues a syscall() to read a file into a buffer
• The process is suspended while waiting for I/O
• A new process starts executing
• This other process gets a page fault
• If the paging algorithm is global, there is a chance that the page containing the buffer could be removed from memory
• The I/O operation of the first process will then write some data into the buffer and some onto the just-loaded page!
One solution: locking (pinning) pages engaged in I/O, so that they will not be removed by the replacement algorithm

Backing Store
• Swap area: a disk partition with no normal file system on it
• Associate with each process the disk address of its swap area; store it in the process table
• Before a process starts, its swap area must be initialized
– one way: copy the entire process image into the swap area [static swap area]
– another way: don’t copy anything and let the process swap out on demand [dynamic]
• Instead of a disk partition, one or more preallocated files within the normal file system can be used [Windows uses this approach]
[Figure: static swap area vs. dynamic swap allocation]

Page Fault Handler
[Figure: overall control flow of the page fault handler]

Memory Mappings
• Each process consists of many memory areas: segments
– aka regions, or VMAs (virtual memory areas)
– e.g., heap, stack, code, data, read-only data, etc.
• Each has different characteristics
– protection (executable, rw, rdonly)
– fixed, or can grow (up or down) [heap, stack]
• Each process can have 10s-100s of these

Example: emacs
[Figure: the address space of a running emacs process: 336 VMAs]

More on memory regions
• Anonymous memory:
– swap space required
• Memory mapped files:
– typically no swap space required
– the file is the swap space

Memory Mapped Files
• Maps a VMA -> a file segment (see the fd argument of mmap())
• It’s contiguous
• On a page fault, the content is fetched from the file at the appropriate offset into a virtual page of the address space
• On “swapout”, the page is written back to the file (different versions of this behavior exist, though)
[Figure: an AddressSpace with Stack, Code, FileArea1, and FileArea2, where the file areas map File1 and File2]

Benefits
• Can perform load/store operations instead of input/output

Linux Example #1
• Objects: VMA areas
– anonymous vs. file-backed
Source: http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory/

Organization of Memory Regions
• The regions are by default non-overlapping
– called VMAs (Virtual Memory Areas)
• Organized as AVL trees
• Identified in O(logN) time during a page fault
– pgfault(vaddr) → VMA
• Rebalanced when a VMA is added or deleted

VMA organization
• Organized as a balanced tree
• Node: [ start – end ]
• On add/delete → rebalance
• Lookup in O(log(n))

Linux MemMgmt LRU Approximation
• Arrange the FrameTable entries (one entry related to each physical frame) into LRU order
• Two queues: Active Queue and Inactive Queue; maintain a ratio between them
• The Inactive Queue is cleaned proactively
• Minor faults are forced by setting present=0 and setting a “minor-fault sw-bit” in the PTE
– an access to an Inactive Queue page creates a minor fault: move the page from inactive to active and set present again

Conclusions
• We are done with Chapter 3
• Main goal
– provide processes with the illusion of a large and fast memory
• Constraints
– speed
– cost
• Memory management
– protection
– transparency
– efficiency
– paging

Lab3 (how to approach)
• You are emulating HW/SW behavior
• For every instruction simulated
– check whether the PTE is present
– if not present → page fault -> the OS must resolve it (see the end-to-end sketch at the end of the deck):
• select a victim frame to replace (make this pluggable with different replacement algorithms)
• unmap its current user (UNMAP)
• save the frame to disk if necessary (OUT / FOUT)
• fill the frame with the proper content of the current instruction’s address space (IN, FIN, ZERO)
• map its new user (MAP)
• it is now ready for use by the instruction
– mark the referenced/modified bit as indicated by the instruction

Lab3
• Create your frame table
• Make all algorithms run through the frame table
• Make all algorithms maintain a “hand” from which to start the next “selection”
• On process exit(), return all used frames to a free pool, and hand out free frames from there until the pool is empty; then continue with the algorithm at hand
(this can at times select a frame that was just used ... c’est la vie ... you can make algorithms too complicated; rather, keep them simple and occasionally make a non-optimal decision)

Data Structures (also lab3)
• PageTable (one per process)
– PageTable Entry (one per virtual page)
• FrameTable: tracks the usage of each frame by a page table
– one entry related to each physical frame
– this is the data structure the OS maintains to provide the inverse mapping (speak: reverse mappings)
– used by the OS to keep state for each frame:
• reference count
• locked
• linkage
[Figure: Proc-1’s PageTable entries point to frames; FrameTable entries point back to the owning (process, virtual page)]
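Tying the lab3 steps together, a hedged end-to-end sketch of page-fault resolution; select_victim(), pte_lookup(), page_out(), and fill_frame() are illustrative stand-ins for whatever your lab uses, and frame_table / struct PTE are the earlier assumed sketches:

/* illustrative helpers, assumed to exist elsewhere */
extern int select_victim(void);
extern struct PTE *pte_lookup(int pid, uint32_t vpage);
extern void page_out(int pid, uint32_t vpage, int frame);
extern void fill_frame(int pid, uint32_t vpage, int frame);

void resolve_page_fault(int pid, uint32_t vpage)
{
    int f = select_victim();              /* pluggable replacement algo */
    struct frame *fr = &frame_table[f];

    if (fr->in_use) {
        struct PTE *old = pte_lookup(fr->pid, fr->vpage);
        old->present = 0;                 /* UNMAP current user         */
        if (old->modified)
            page_out(fr->pid, fr->vpage, f);  /* OUT / FOUT if dirty    */
    }

    fill_frame(pid, vpage, f);            /* IN / FIN / ZERO            */

    struct PTE *pte = pte_lookup(pid, vpage);
    pte->frame = (uint32_t)f;             /* MAP new user               */
    pte->present = 1;
    fr->in_use = true;  fr->pid = pid;  fr->vpage = vpage;
    /* the faulting instruction can now set R (and M on a write) */
}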