ECE 391 Virtualization
Portions taken from ECE 391 Lecture Notes by , , , Wikipedia, the free encyclopedia, ’ x86 Assembly Guide, ’s Programming from the Ground Up, and the X86 Assembly Wikibook
Not Tested
On Students
Ad from 1994 Byte Magazine
You should be familiar with the following:
• virtual memory
– virtual/logical address
– linear address
– physical address
• problems addressed by virtual memory
– protection between programs
– code/data sharing
– memory fragmentation
– code/data relocation
• x86 segmentation
– shadow bits (in a segment register)
– Global Descriptor Table
– Local Descriptor Table
– task state segment (TSS)
What is virtualization?
How do we do it?
Virtualization
• Virtualization is the process of creating a software-based, or virtual, representation of something, such as virtual applications, servers, [memory, cpu, ] storage, and networks.
https://www.vmware.com/solutions/virtualization.html
How do we do it?
• Time-based sharing
• Space-based sharing
What is virtual memory?
• A useful abstraction
– between memory addresses seen by software and
those used by hardware
– Enabled by indirection
• Typically done with large blocks, e.g., 4kB in x86
[Figure: per-program virtual addresses mapped to physical memory; some physical regions are memory-mapped I/O; some virtual pages are not actually present, not in use, or not visible]
Why use virtual memory?
What does it cost?
Why use virtual memory?
• protection
– one program cannot accidentally or deliberately
destroy another’s data
– the memory is simply not accessible
Why use virtual memory?
• more effective sharing
– two (or more) programs that share library code can share a single copy of the code in physical memory
– code and data not actively used by a program can
be pushed out to disk to make room for other
programs’ active data; provides the illusion of a
much larger physical memory
Why use virtual memory?
• no fragmentation [little to none, anyway]
– systems without virtual memory suffer
fragmentation effects when they try to multitask
– for example, if we run A followed by B followed by
C, and then B finishes, we can’t give D a
contiguous block of memory, even though it fits in
the absolute sense
Why use virtual memory?
• simplifies program loading and execution:
no relocation of code, rewriting stored pointer
values, etc.
Trade-offs
• Complexity
x86 Support for VM
• protection model
• segmentation
x86 Protection Model
• four rings: kernel (ring 0) through user (ring 3)
– lower numbers are more privileged
– lower numbers never call/trust higher numbers
– higher numbers call lower numbers only through
narrow interfaces (e.g., system calls)
x86 Protection Model
• CPL—current privilege level
(of executing code) [in CS]
• RPL—requested privilege
level; when code at high
privilege level executes on
behalf of code at lower
level, some accesses may
voluntarily lower privilege
to that of caller/beneficiary
• DPL—descriptor privilege
level; level necessary to
execute code/data
not used by Linux
if MAX(CPL,RPL) > DPL, processor generates
an exception (general protection fault)
x86 Segmentation
• x86 actually has two levels of indirection, but
one is mostly unused…(this one!)
• a segment is a contiguous portion of a linear
address space such as the 32-bit space of
physical addresses
• x86 in protected mode always uses
segmentation
[Figure: the global descriptor table (GDT), up to 8192 entries (#0 is not usable; max. index 8191); each 8B descriptor includes a base address, size, DPL, & some other bits; the 48-bit GDTR register points to the table—a 32-bit linear base address—& holds the 16-bit table limit as well (really size – 1)]
Segment descriptors
• descriptors can also differentiate
– code (executable and possibly readable) from
– data (readable and possibly writable)
– and a few other somewhat useful things
• finally, descriptors in the GDT can describe certain aspects of program state (e.g., the task state segment, or TSS), which we talk about later
Segment Registers
[Figure: the segment registers—code segment CS, data segment DS, extra data ES, still more extras FS & GS (floating point + ), stack segment SS; each segment register has 16 bits visible + ~64 bits shadow (not accessible via ISA) that cache the description of the segment # referenced by the visible 16 bits]
[Figure: segment register meaning—the visible 16 bits form a selector: the upper 13 bits are an index in the table (or, since table entries are 8B, an offset to find the entry; 8191 max.); one bit is 1 for the Local Descriptor Table (not mentioned yet, & essentially not used by Linux); the low 2 bits are the RPL—a jumbled mess… ☺]
note: if a descriptor in the table (GDT)
changes, a segment register that references it
must be reloaded to update the shadow
portion of the register
• GDT entries can also describe local descriptor
tables (LDTs)
– LDT originally meant to be a per-task segment table
– LDTR points to current LDT (includes base, size,
and index of LDT in GDT)
[Figure: the Linux GDT (each CPU has its own GDT; look at per_cpu__gdt_page in a debugger, & see asm/segment.h for details); 8191 entries (max.); the kernel data seg. & kernel code seg. and the user data seg. & user code seg. sit [together on one cache line]—each starts at address 0 and has size 4GB, so, effectively, segmentation is not used in Linux; other entries hold BIOS support, per-CPU data, a TSS for double faults, etc., the task state & LDT for this CPU, and thread-local storage segments (glibc, WINE, etc.); the LDT segment is not present by default (all bits are 0)]
https://manybutfinite.com/post/memory-translation-and-segmentation/
x86 Support for VM
• protection model
• segmentation
x86 Paging
• Paging is the second level of indirection in x86
• Each page of a virtual address space is in one
of three states
– doesn’t exist
– exists and is in physical memory
– exists, but is now on the disk rather than in memory
x86 Paging
• We can encode these possibilities as follows using 4B
• These 4B are called a page table entry (PTE); a group
of them is a page table
[Figure: PTE encoding—bit 0 is the present bit; if present in physical memory, the upper 20 bits hold a 4kB-aligned address and the remaining bits are leftovers for other uses; if not present, the OS has 31 bits to differentiate between blocks on disk & blocks that don’t exist]
• If we use 4B for every 4kB, how big is the page
table for a single virtual address space?
• (4 / 4096) × 2^32 = 4MB
• too big…
• Solution?
x86 Paging
• Solution
– page the page table
– i.e., use a hierarchical structure
• The page table is just another 4kB page
– it holds 4096 / 4 = 1024 PTEs
[Figure: the page directory’s present bits indicate which page tables exist; each present page directory entry points to a page table, whose present bits in turn indicate which pages exist]
x86 Paging
• What about the page directory?
• 2^32 bytes total (32-bit address space)
• 2^10 PTEs per table
• 2^12 bytes per page
– the page directory needs 2^32/(2^10 × 2^12) = 2^10
entries total
– which also fits in one page
(and could be paged out to disk, although Linux does not)
x86 Paging
[Figure: a 32-bit virtual address splits into directory # (bits 31–22), table # (bits 21–12), and offset (bits 11–0); the page directory base register (PDBR, usually called control register 3, or cr3) locates the page directory, whose entries are pointers to page tables (plus controls); page table entries in turn point to pages (plus controls)]
x86 Paging
• To translate a virtual address into a physical one:
– start with the PDBR (cr3)
– look up page directory entry (PDE) using the 10 MSb
of virtual address
– extract page table address from PDE
– look up page table entry (PTE) using next 10 bits of
virtual address
– extract page address from PTE
– use last 12 bits of virtual address as offset into page
• What are the drawbacks?
• Way too slow to do on every memory access!
• Solution?
x86 Paging
• Hence the translation lookaside buffers (TLBs)
– keep translations of first 20 bits around and reuse
– only walk tables when necessary (in x86, OS
manages tables, but hardware walks them)
– Caveats?
What Does This Do?
x86 Paging
• Remember the 11 free bits in the PTEs?
• What should we use them to do?
– optimize to improve performance
– User/Supervisor (U/S) page or page table
• User means accessible to anyone (any privilege level)
• Supervisor requires PL < 3 (i.e., MAX (CPL,RPL) < 3)
– Read-Only or Read/Write
x86 Paging
• Optimize
– TLBs must be fast, so you can’t use many:
• 386: 32 TLB entries
• Zen 3: 64 ITLB / 64 DTLB
• : 64+16 ITLB, 64+32+4 DTLB
• some translations are the same for all programs
• bigger translations could be used when possible
(e.g., use one translation for 4MB rather than 1024
translations)
x86 Paging
• x86 supports both
– G flag—global
• TLB not flushed when changing to new program or address
space (i.e., when cr3 changes)
• used for kernel pages (in Linux)
– 4MB pages
• skip the second level of translation
– indicated by PS (page size) bit in PDE
– PS=1 means that the PDE points directly to a 4MB page
• remaining 22 bits of virtual address used as offset
• Many Intel architectures provide separate TLBs for 4kB &
4MB translations
Which Pages Belong in Memory? (1/2)
• Cache philosophy: recent use predicts future use
• Hardware provides Accessed bit to help OS
determine recency-of-access
Which Pages Belong in Memory? (2/2)
• If a page changes after it is brought in from
disk, must be written back to disk (Dirty bit)
TLBs in Multiprocessors
Not Tested
On Students
• TLB entries may be inconsistent with
updated page tables if
the OS is not careful