程序代写代做 algorithm C chain data structure cache kernel arm Virtual Memory

Virtual Memory
1

Learning Outcomes
• An understanding of page-based virtual memory in depth.
– Including the R3000’s support for virtual memory.
2

Memory Management Unit (or TLB)
The position and function of the MMU
3

Virtual Address
15 14 13 12 11 10 9 8
Page-based VM
• PhysicalMemory
– Divided into equal-sized frames
F
E
D
C
B
A
Space
• Virtual Memory
– Divided into equal-
sized pages
– Amappingisa
translation between
• A page and a frame
• A page and null
– Mappings defined at runtime
• They can change
– Address space can have holes
– Processdoesnot have to be contiguous in physical memory
77 66
55 44 33 22
11 00
Physical Address Space 4
A
F
D
B
C
E

Virtual Address Space
Kernel
Stack
Shared Libraries
BSS (heap)
Data
Text (Code)
Typical Address Space Layout
• Stack region is at top, and can grow down
• Heap has free space to grow up
• Text is typically read-only
• Kernel is in a reserved, protected, shared region
• 0-th page typically not used, why?
5
K
T
S
L
F
E
D
C
B
A

Virtual Address Space
• Aprocessmay be only partially resident
– Allows OS to store individual pages on disk
– Saves memory for infrequently used data & code
• Whathappensif we access non- resident memory?
Programmer’s perspective: logically present
System’s perspective: Not mapped, data on disk
K
T
S
L
F
E
D
C
B
A
D
T
F
B
L
A
K
E
C
S
Physical Address Space 6
Disk

Proc 1 Address Space
Currently running
Proc 2 Address Space
U
T
S
D
C
B
A
0
Z
Y
X
M
L
K
J
0
Physical Address Space
K
D
X
M
A
C
S
U
T
Y
B
L
Z
J
Disk
Memory Access
7

Page Faults
• Referencing an invalid page triggers a page fault • An exception handled by the OS
• Broadly, two standard page fault types
– IllegalAddress(protectionerror) • Signal or kill the process
– Page not resident
• Get an empty frame
• Load page from disk
• Update page (translation) table (enter frame #, set valid bit, etc.)
• Restart the faulting instruction
8

Virtual Address Space 15 14
• Pagetablefor 13
resident part of
address space
Page Table
15 14 13 12
K
T
S
L
F
E
D
C
B
A
6
0
3
1
7
12
11 11
10 9 8 7
10 9 8 7 6 5 4 3 2
A
K
E
C
S
7 66
5 4 3
5 4 3
22
1 0
1 Physical 1 0 Address Space 90

• Private code and data
– Each process has own copy of code and data
– Code and data can appear anywhere in the address space

Shared code
– Single copy of code shared between all processes executing it
– Code must not be self modifying
– Code must appear at same address in all processes
10
Shared Pages

Proc 1 Address Space
Proc 2 Address Space
0
1
7
2
U
T
S
D
C
B
A
Z
Y
X
N
M
B
A
5
4
7
2
Physical Address Space
B
X
N
A
C
S
Two (or more) processes running the same program and sharing the text section
Page Table
Page Table
11

Page Table Structure
• Page table is (logically) an array of frame numbers
– Index by page number
• Each page-table entry (PTE) also has other bits
5
4
7
2
Page Table
12

PTE Attributes (bits)
• Present/Absent bit
– Also called valid bit, it indicates a valid mapping for the page
• Modified bit
– Also called dirty bit, it indicates the page may have been modified in memory
• Reference bit
– Indicates the page has been accessed
• Protection bits
– Read permission, Write permission, Execute permission – Or combinations of the above
• Caching bit
– Use to indicate processor should bypass the cache when accessing memory
• Example: to access device registers or memory
13

Address Translation
• Every (virtual) memory address issued by the CPU must be translated to physical memory
– Every load and every store instruction – Every instruction fetch
• Need Translation Hardware
• In paging system, translation involves replace page number with a frame number
14

Virtual Memory Summary
virtual and physical mem chopped up in pages/frames • programs use virtual
addresses
• virtual to physical mapping
by MMU
-first check if page present
(present/absent bit)
-if yes: address in page table form
MSBs in physical address
-if no: bring in the page from disk
 page fault

Page Tables
• Assumewehave
– 32-bit virtual address (4 Gbyte address space) – 4 KByte page size
– How many page table entries do we need for one process?
16

Page Tables
• Assumewehave
– 64-bit virtual address (humungous address space)
– 4 KByte page size
– How many page table entries do we need for one process?
• Problem:
– Page table is very large
– Access has to be fast, lookup for every memory reference
– Where do we store the page table? • Registers?
• Main memory?
17

Page Tables
• Page tables are implemented as data structures in main memory
• Most processes do not use the full 4GB address space – e.g.,0.1–1MBtext,0.1–10MBdata,0.1MBstack
• We need a compact representation that does not waste space
– But is still very fast to search
• Three basic schemes
– Use data structures that adapt to sparsity
– Use data structures which only represent resident pages
– Use VM techniques for page tables (details left to extended OS)
18

Two-level Page Table
• 2nd –level page tables representing unmapped pages are not allocated
– Null in the top-level
page table
19

20

Two-level Translation
21

Example Translations
22

Alternative: Inverted Page Table
PID VPN
offset
Hash Anchor Table (HAT)
PID
VPN
ctrl
next
Hash
Index 0
1
2
3
4
5
6 …
IPT: entry for each physical frame

Alternative: Inverted Page Table
PID VPN 0 0x5
offset
0x123
Hash Anchor Table (HAT)
Hash
1 0x1A
2
0x40C 0 0x5 0x40D
Index PID VPN 0
1
2 …
ctrl
next
0x40C 0x0
offset
0x123
… …
ppn
0x40C

Inverted Page Table (IPT)
• “Invertedpagetable”isanarrayofpage numbers sorted (indexed) by frame number (it’s a frame table).
• Algorithm
– Compute hash of page number
– Extract index from hash table
– Use this to index into inverted page table
– Match the PID and page number in the IPT entry
– If match, use the index value as frame # for translation
– If no match, get next candidate IPT entry from chain field
– If NULL chain entry  page fault
25

Properties of IPTs
• IPT grows with size of RAM, NOT virtual address space
• Frame table is needed anyway (for page replacement, more later)
• Need a separate data structure for non-resident pages
• Saves a vast amount of space (especially on 64-bit systems)
• Used in some IBM and HP workstations
26

Given n processes
• how many page tables will the system
have for
– ‘normal’ page tables – inverted page tables?

Another look at sharing…

Proc 1 Address Space
Proc 2 Address Space
U
T
S
D
C
B
A
Z
Y
X
N
M
B
A
Physical Address Space
B
X
N
A
C
S
Two (or more) processes running the same program and sharing the text section
29

Improving the IPT: Hashed Page Table
• Retain fast lookup of IPT
– A single memory reference in best case
• Retain page table sized based on physical memory size (not virtual)
– Enable efficient frame sharing
– Support more than one mapping for same frame
30

Hashed Page Table
PID VPN offset
PID
VPN
PFN
ctrl
next
Hash
HPT: Frame number stored in table

PID VPN 0
offset
Hashed Page Table
0x5
0x123
PID
VPN
PFN
ctrl
next
0
0x5
0x42
0x0
1
0x1A
0x13
0x3
Hash
1 2 3 4 5 6 …
ppn offset
0x42
0x123

PID VPN 0
offset
Sharing Example
0x5
0x123
PID
VPN
PFN
ctrl
next
1
0x5
0x42
0x0
0
0x5
0x42
0x3
Hash
1 2 3 4 5 6 …
ppn offset
0x42
0x123

Sizing the Hashed Page Table
• HPTsizedbasedonphysicalmemorysize
• Withsharing
– Each frame can have more than one PTE
– More sharing increases number of slots used • Increases collision likelihood
• However, we can tune HPT size based on: • Physical memory size
• Expected sharing
• Hash collision avoidance.
– HPT a power of 2 multiple of number of physical memory frame
34

VM Implementation Issue
• Performance?
– Each virtual memory reference can cause two physical memory accesses
• One to fetch the page table entry • One to fetch/store the data Intolerable performance impact!!
• Solution:
– High-speed cache for page table entries (PTEs)
• Called a translation look-aside buffer (TLB)
• Contains recently used page table entries
• Associative, high-speed memory, similar to cache memory • May be under OS control (unlike memory cache)
35

On-CPU hardware device!!!
TLB operation
Data structure in main memory
36

Translation Lookaside Buffer
• Givenavirtualaddress,processorexaminesthe TLB
• If matching PTE found (TLB hit), the address is translated
• Otherwise(TLBmiss),thepagenumberisused to index the process’s page table
– If PT contains a valid entry, reload TLB and restart
– Otherwise, (page fault) check if page is on disk • Ifondisk,swapitin
• Otherwise, allocate a new page or raise an exception
37

TLB properties
• Pagetableis(logically)anarrayofframe numbers
• TLBholdsa(recentlyused)subsetofPTentries
– Each TLB entry must be identified (tagged) with the page # it translates
– Access is by associative lookup:
• All TLB entries’ tags are concurrently compared to the page # • TLB is associative (or content-addressable) memory
38

TLB properties
• TLBmayormaynotbeunderdirectOScontrol
– Hardware-loaded TLB
• On miss, hardware performs PT lookup and reloads TLB • Example: x86, ARM
– Software-loaded TLB
• On miss, hardware generates a TLB miss exception, and exception handler reloads TLB
• Example: MIPS, Itanium (optionally)
• TLBsize:typically64-128entries
• Can have separate TLBs for instruction fetch and data access
• TLBs can also be used with inverted page tables (and others)
39

TLB and context switching
• TLBisasharedpieceofhardware
• Normal page tables are per-process (address space)
• TLB entries are process-specific
– On context switch need to flush the TLB (invalidate all entries)
• high context-switching overhead (Intel x86)
– or tag entries with address-space ID (ASID)
• called a tagged TLB
• used (in some form) on all modern architectures
• TLB entry: ASID, page #, frame #, valid and write-protect bits
40

TLB effect
– Average number of physical memory references per virtual reference
=2
• With TLB (assume 99% hit ratio)
– Average number of physical memory references per virtual reference
= .99 * 1 + 0.01 * 2 = 1.01
• Without TLB
41

Recap – Simplified Components of
Virtual Address Spaces (3 processes)
VM System
Page Tables for 3 processes
Frame Table
1
2
3
Frame Pool
1
2
3
CPU
TLB
Physical Memory
42

Recap – Simplified Components of
Virtual Address Spaces (3 processes)
VM System
Inverted Page Table
Frame Pool
1
2
3
CPU
TLB
Physical Memory
43

Recap – Simplified Components of
Virtual Address Spaces (3 processes)
VM System
Hashed Page Table
Frame Table
Frame Pool
1
2
3
CPU
TLB
Physical Memory
44

MIPS R3000 TLB
• N=Notcacheable
• D = Dirty = Write protect
• G = Global (ignore ASID in lookup)
• V=validbit
• •
64 TLB entries
Accessed via software through Cooprocessor 0 registers
– EntryHi and EntryLo
45

R3000 Address Space Layout
• kseg0:
– 512 megabytes
– Fixed translation window to physical memory
• 0x80000000 – 0x9fffffff virtual = 0x00000000 – 0x1fffffff physical
• TLB not used
– Cacheable
– Only kernel-mode accessible
– Usually where the kernel code and data is placed
0xffffffff
0xC0000000
0xA0000000
0x80000000
0x00000000
kseg2
kseg1
kseg0
kuseg 46
Physical Memory

R3000 Address Space Layout
• kuseg:
– 2 gigabytes
– TLB translated (mapped)
– Cacheable (depending on ‘N’ bit)
– user-mode and kernel mode accessible
– Page size is 4K
0xFFFFFFFF
0xC0000000 0xA0000000
0x80000000
0x00000000
kseg2
kseg1
kseg0
kuseg 47

R3000 Address Space Layout
– Switching processes switches the translation (page table) for kuseg
0xFFFFFFFF
0xC0000000 0xA0000000
0x80000000
0x00000000
kseg2
kseg1
kseg0
Proc 1 kuseg
Proc 2 kuseg
Proc 3 kuseg
48

R3000 Address Space Layout
• kseg1:
– 512 megabytes
– Fixed translation window to physical memory
• 0xa0000000 – 0xbfffffff virtual = 0x00000000 – 0x1fffffff physical
• TLB not used
– NOT cacheable
– Only kernel-mode accessible
– Wheredevicesareaccessed(and boot ROM)
0xffffffff
0xC0000000
0xA0000000
0x80000000
0x00000000
kseg2
kseg1
kseg0
kuseg 49
Physical Memory