编程代写 KB 212 B

Virtual memory (II): System Architecture and Design

instruction
instruction

Copyright By PowCoder代写 加微信 powcoder

Recap: Virtual memory
0x80000000
0x80008000
tual Memory Space
rtual Memory Space
Instructions Data
Instructions Data

Recap: Demand paging
instruction data
0x0 0x80000000 instruction
Page fault! 0x0
Virtual A s Space for Ch m Page fault! u a pple Music
each of this is a fix-sized page
Page fault!
Page fault!
0x80008000
Instructions Data
Instructions Data

Recap: Segmentation v.s. demand paging
How many of the following statements is/are correct regarding
! Segmentscancausemoreexternalfragmentationsthandemandpaging
segmentation and demand paging?
” Pagingcanstillcauseinternalfragmentations
# Theoverheadofaddresstranslationinsegmentationishigher
— the main reason why we love paging! — within a page
— you need to provide finer-grained mapping in paging — you may need to handle page faults!
$ Consecutivevirtualmemoryaddressmaynotbeconsecutiveinphysical address if we use demand paging
A. 0 B. 1 C. 2
We haven’t seen pure/true implementation of segmentations for a while, but we still use segmentation fault errors all the time!

Recap: Address translation
Processor receives virtual addresses from the running Virtual address space is organized into “pages”
Virtual address
code, main memory uses physical memory addresses
The system references the page table to translate addresses
Each process has its own
page table
The page table
content is maintained by OS
Page table
Physical address
In addition to valid bit and physical page #, the page table may also store
Reference bit Modified bit Permissions
virtual page number
0x 0 0 0 0 B
0x D E A D B
physical page number
page offset
page offset
valid access
permission

Assume that we have 64-bit virtual address space, each page is 4KB, each page table entry is 8 bytes (64-bit addresses), what magnitude in size is the page table for 32 processes?
A. MB — 220 Bytes
B. GB—230Bytes 8bytes!264 B =23B!264 B =255 B=32PB
Recap: Size of page table
C. TB — 240 Bytes 4 KB 212 B
D. PB—250Bytes
32 PB ! 32 = 260B = 1 EB
E. EB—260Bytes

“Paged” page table
0xFFFFFFFFFFFFFFFF
Heap Virtual Address Space Stack
Break up entries into pages!
Each of these occupies exactly a page
212 B Thesenodesarespreadout,
— 23 B = 29 PTEs per node how to locate them in the memory? Otherwise, you always need to find more
than one consecutive pages — difficult!
These are nodes are not presented
if they are not referenced at all — save space
Allocate page table entry nodes “on demand”

Hierarchical Page Table
0xFFFFFFFFFFFFFFFF
Heap Virtual Address Space Stack
“log29 264 B # = “log29252# = 6 levels 212 B
264 B 212 B
These are nodes are not presented astheyarenotreferencedatall.
page table entries/leaf nodes (worst case)

Why: If we expose memory directly to the processor (III)
What if both programs need to use memory?
gmentation or pa
Instructions Data
Instructions Data

If we expose memory directly to the processor (I)
What if my program needs more memory?
0f00bb27 509cbd23 00005d24 0000bd24 2ca422a0 130020e4 00003d24 2ca4e2b3
InstructionsInstructions Data

If we expose memory directly to the processor (II)
What if my program runs on a machine with a different memory size?
0f00bb27 00c2e800
00000008 00005d24 00c2f000 0000bd24 00000008
2ca422a0 130020e4 00000008
00003d24 00c30000 2ca4e2b3 00000008
Memory Memory
Instructions Data

Swapping VAX/VMS Design M

Demand Paging + Swapping
0x000000000000
(1) an instruction accesses virtual Code address 0xDEADBEEF
Static Data Heap
0xFFFFFFFFFFFF
(3) running out of space on DRAM
Physical memory
(5) map the requesting page to the freed space (2) page fault! — exception
page table
Virtual memory
(4) kick some page out and store it in the secondary storage

The mechanism: demand paging + swapping
Divide physical & virtual memory spaces into fix-sized units — pages
In case if we are running out of physical memory —
Allocate a physical memory page whenever the virtual memory page
containing your data is absent
• Reservespaceondisks
• Disks are slow: the access time for HDDs is around 10 ms, the access time for SSDs
is around 30us – 1 ms
Disks are orders of magnitude larger than main memory
When you need to make rooms in the physical main memory, allocate a page
in the swap space and put the content of the evicted page there
When you need to reference a page in the swap space, make a room in the 14
physical main memory and swap the disk space with the evicted page

Latency Numbers Every Programmer Should Know (2020 Version)
Operations
Latency (ns)
Latency (us)
Latency (ms)
L1 cache reference
~ 1 CPU cycle
Branch mispredict
L2 cache reference
14x L1 cache
Mutex lock/unlock
Send 2K bytes over network
Main memory reference
20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy
Read 1 MB sequentially from memory
Read 4K randomly from SSD*
Read 1 MB sequentially from SSD*
Round trip within same datacenter
500,000 ns
Read 1 MB sequentially from disk
825,000 ns
2,000,000 ns
4x datacenter roundtrip
Send packet CA-Netherlands-CA
150,000,000 ns
150,000 us
https://colin-scott.github.io/personal_website/research/interactive_latency.html

How much slower (approximately) is your average memory access time in a system when the probability of a page fault/ swapping is 0.1% comparing with the case when there is no page fault/swapping?
https://www.pollev.com/hungweitseng close in
The swapping overhead
(Assume you swap to a hard disk) A. 10x
D. 10000x E. 100000x
Operations
Latency (ns)
L1 cache reference
Branch mispredict
L2 cache reference
Mutex lock/unlock
Main memory reference
Compress 1K bytes with Zippy
Send 1K bytes over 1 Gbps network
Read 4K randomly from SSD*
Read 1 MB sequentially from memory
150,000 ns
250,000 ns
Round trip within same datacenter
Read 1 MB sequentially from SSD*
500,000 ns
1,000,000 ns
10,000,000 ns
Read 1 MB sequentially from disk
Send packet CA-Netherlands-CA
20,000,000 ns
150,000,000 ns

How much slower (approximately) is your average memory access time in a system when the probability of a page fault/swapping is 0.1% comparing with the case when there is no page fault/swapping?
(Assume you swap to a hard disk)
The swapping overhead
Operations
Memory (i.e. RAM) access time: 100ns
Disk access time: 10ms
Pf: probability of a page fault
Effective Access Time = 100 ns + Pf * 107 ns
L1 cache reference
When Pf = 0.001:
Effective Access Time = 10,100ns
When Pf = 0.001, even with an SSD
Effective Access Time = 100 ns + 10-3 * 105
Takeaway: disk accesses are tolerable only
when they are extremely rare 20
Branch mispredict
Latency (ns)
L2 cache reference
Mutex lock/unlock
Main memory reference
Compress 1K bytes with Zippy
Round trip within same datacenter
Read 1 MB sequentially from SSD*
Send 1K bytes over 1 Gbps network
Read 4K randomly from SSD*
150,000 ns
Read 1 MB sequentially from memory
250,000 ns
500,000 ns
1,000,000 ns
10,000,000 ns
Read 1 MB sequentially from disk
Send packet CA-Netherlands-CA
20,000,000 ns
150,000,000 ns

How much slower (approximately) is your average memory access time in a system when the probability of a page fault/ swapping is 0.1% comparing with the case when there is no page fault/swapping?
The swapping overhead
L1 cache reference
(Assume you swap to a hard disk) A. 10x
D. 10000x E. 100000x
Operations
Branch mispredict
Latency (ns)
L2 cache reference
Main memory reference
Mutex lock/unlock
Compress 1K bytes with Zippy
Send 1K bytes over 1 Gbps network
Read 4K randomly from SSD*
150,000 ns
Read 1 MB sequentially from memory
250,000 ns
Round trip within same datacenter
Read 1 MB sequentially from SSD*
500,000 ns
1,000,000 ns
10,000,000 ns
Read 1 MB sequentially from disk
Send packet CA-Netherlands-CA
20,000,000 ns
150,000,000 ns

Page replacement policy
Implementation Goal: Minimize the amount of software and hardware overhead
• Memory (i.e. RAM) access time: 100ns
• Disk access time: 10ms
• Pf: probability of a page fault
• Effective Access Time = 10-7 + Pf * 10-3
Takeaway: Disk access tolerable only when it is extremely rare 22
Goal: Identify page to remove that will avoid future page faults (i.e. utilize
locality as much as possible)
When Pf = 0.001:
Effective Access Time = 10,100ns

Big picture
Virtual Mem
Processor Processor
Processor Core
Core Registers
Page Table
TLB Last-Level $

Virtual Memory Management in the VAX/
VMS Operating System
H. M. Levy and P. H. Equipment Corporation

Why: The goals of VAX/VMS
How many of the following statements is/are true regarding the
optimization goals of VAX/VMS?
! Reducingthediskloadofpaging
” Reducingthestartupcostofaprogram
# Reducingtheoverheadofpagetables
$ Reducingtheinterferencefromheavilypagingprocesses A. 0
https://www.pollev.com/hungweitseng close in

The system needs to execute various types of applications efficiently
The “Why” behind VAX/VMS VM
— Reducing the interference from heavily paging processes
The system runs on different types of
As a result, the memory management system has to be capable of adjusting the changing demands characteristic of time sharing while allowing predictable performance required by real-time and batch processes
— Reducing the overhead of page tables
— Reducing the startup cost of a program
— Reducing the disk load of paging

Why: The goals of VAX/VMS
How many of the following statements is/are true regarding the
optimization goals of VAX/VMS?
! Reducingthediskloadofpaging
” Reducingthestartupcostofaprogram
# Reducingtheoverheadofpagetables
$ Reducingtheinterferencefromheavilypagingprocesses A. 0

What VAX/VMS proposed to achieve these goals?
Considering the optimization goals and the proposed VAX/ VMS mechanisms, which of the following combinations is incorrect?
Goal Optimization
Process startup cost
Demand-zero & copy-on-refernce
Process performance interference
Process-local replacement
Page table lookup overhead
Page clustering
Paging load on disks
Page caching
https://www.pollev.com/hungweitseng close in

What VAX/VMS proposed to achieve these goals?
Considering the optimization goals and the proposed VAX/ VMS mechanisms, which of the following combinations is incorrect?
Goal Optimization
Process startup cost
Demand-zero & copy-on-refernce
Process performance interference
Process-local replacement
Page table lookup overhead
Page clustering
Paging load on disks
Page caching

virtual virtual virtual
Virtual Memory Space for Process #1
page #1 page #2 page #3
virtual virtual virtual
Virtual Memory Space for Process #2
page #1 page #2 page #3
physical page #1
physical page #2
physical page #3
physical page #4
physical page #5
physical page #6
physical page #7
Copy the page content to different locations before the new process can start
What happens on a fork? fork()

virtual virtual virtual
Virtual Memory Space for Process #1
page #1 page #2 page #3
virtual virtual virtual
Virtual Memory Space for Process #2
page #1 page #2 page #3
physical page #1
physical page #2
physical page #3
physical page #4
physical page #5
physical page #6
physical page #7
The modified bit of a writable page will be set when it’s loaded from the executable file The process eventually will have its own copy of that page
Copy-on-write

virtual virtual virtual
Virtual Memory Space for Process #2
page #1 page #2 page #3
physical page #1
physical page #2
physical page #3
physical page #4
physical page #5
physical page #6
physical page #7
The linker does not embed the pages with all 0s in the compiled program
When page fault occurs, allocate a physical page fills with zeros
Set the modified bit so that the page can be written back
page with “0”s
Demand zero

What VAX/VMS proposed to achieve these goals?
Considering the optimization goals and the proposed VAX/ VMS mechanisms, which of the following combinations is incorrect?
Goal Optimization
Process startup cost
Demand-zero & copy-on-refernce
Process performance interference
Process-local replacement
Page table lookup overhead
Page clustering
Paging load on disks
Page caching

Each process has a maximum size of memory
When the process exceeds the maximum size, replaces from its own set of memory Control the paging behavior within each process
Local page replacement policy
Virtual page #1
Virtual Virtual
page #2 page #3
emory Space for Process A
Virtual page #4 can only gooneoftheseif3isthe maximum memory size of the process
Page for Process A
Page for Process A
Page for Process C
Page for Process C
Page for Process
Page for Page for Process Process
What’s the policy?
FIFO! Low overhead!

What VAX/VMS proposed to achieve these goals?
Considering the optimization goals and the proposed VAX/ VMS mechanisms, which of the following combinations is incorrect?
Goal Optimization
Process startup cost
Demand-zero & copy-on-refernce
Process performance interference
Process-local replacement
Page table lookup overhead
Page clustering
Paging load on disks
Page caching

Page clustering
Read or write a cluster of pages that are both consecutive in Combining consecutive writes into single writes
virtual memory and the disk

Latency Numbers Every Programmer Should Know (2020 Version)
Operations
Latency (ns)
Latency (us)
Latency (ms)
L1 cache reference
~ 1 CPU cycle
Branch mispredict
L2 cache reference
14x L1 cache
Mutex lock/unlock
Send 2K bytes over network
Main memory reference
20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy
Read 1 MB sequentially from memory
Read 4K randomly from SSD*
Read 1 MB sequentially from SSD*
sequential access for a larger block i
Round trip within same datacenter
500,000 ns
Read 1 MB sequentially from disk
825,000 ns
for a 512B sector
2,000,000 ns
4x datacenter roundtrip
Send packet CA-Netherlands-CA
150,000,000 ns
150,000 us
https://colin-scott.github.io/personal_website/research/interactive_latency.html

RS of Process B 2 pages 4 pages
Page caching to cover the performance loss
Evicted pages will be put into one of the lists in DRAM
• Freelist:cleanpages
• Modifiedlist:dirtypages—needstocopydatatothedisk
Page fault to any of the page in the lists will bring the page back
• Reducesthedemandofaccessingdisks
page fault!
RS of Process A
page fault!
page fault!
Physical Memory
Modified List

Page caching

What VAX/VMS proposed to achieve these goals?
Considering the optimization goals and the proposed VAX/ VMS mechanisms, which of the following combinations is incorrect?
Page clustering
Process startup cost
Goal Optimization
Process performance interference
Page table lookup overhead
Paging load on disks
Demand-zero & copy-on-refernce
Process-local replacement
Page caching
also helps reduce disk loads

Process memory layout
Stack Other data
System: software vectors, hardware data structures, executive data, executive procedures, record management, dynamic storage
The VAX/VMS allows the OS code to access user-space memory
P0 (Program) Region
P1 (Control) Region
System Region

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com