
Memory Hierarchy Design
• Cache Organization
• Virtual Memory
• 2.3 Six Basic Cache Optimizations
• Ten Advanced Optimizations of Cache Performance
• Memory Technology and Optimizations
• Virtual Memory and Protection
• Protection: Virtual Memory and Virtual Machines

Three Categories of Cache Optimizations
Average Memory Access Time = Hit Time + Miss Rate × Miss Penalty
1) Reducing the miss rate
• Larger block size, larger cache size, higher associativity
2) Reducing the miss penalty
• Multilevel caches, giving reads priority over writes
3) Reducing the time to hit in the cache
• Avoiding address translation when indexing the cache

Three Categories of Misses
To gain better insight into the causes of cache misses, misses are sorted into three categories:
• Compulsory: the very first access to a block cannot be in the cache
• Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses occur
• Conflict: in set-associative and direct-mapped caches, conflict misses occur when too many blocks map to the same set
The small simulation below illustrates how each miss is attributed to a category.

Comparison of Three Categories
[Figure: actual data cache miss rates running SPEC2000, showing the total miss rate and its distribution (the percentage in each category)]

Six Basic Cache Optimizations
• Larger block size
  - Reduces compulsory misses
  - Increases capacity and conflict misses, increases miss penalty
• Larger total cache capacity (to reduce miss rate)
  - Increases hit time, increases power consumption
• Higher associativity
  - Reduces conflict misses
  - Increases hit time, increases power consumption
• Higher number of cache levels
  - Reduces overall memory access time
• Giving priority to read misses over writes
  - Reduces miss penalty
• Avoiding address translation in cache indexing
  - Reduces hit time

2) Larger Caches (Miss Rate↓)
• Good for reducing capacity misses
• Drawback: potentially longer hit time, and higher cost and power
• Popular in off-chip caches

3) Higher Associativity (Miss Rate↓)
In an n-way set-associative cache, we increase n. With a higher associativity, the cache has a smaller number of sets.
[Figure: total miss rate and distribution of miss rate as associativity varies]

3) Higher Associativity (cont’)
Two general rules of thumb:
• Eight-way set associative is, for practical purposes, as effective in reducing misses for these cache sizes as fully associative
• 2:1 cache rule of thumb: a direct-mapped cache of size N has about the same miss rate as a two-way set-associative cache of size N/2

4) Multilevel Caches (Penalty ↓)
• Insert additional levels of cache between the level-1 cache and memory
• Otherwise, missing references would go to memory, which introduces a long latency!

4) Multilevel – Performance Analysis
Average Memory Access Time = Hit Time L1 + Miss Rate L1 × Miss Penalty L1
Miss Penalty L1 = Hit Time L2 + Miss Rate L2 × Miss Penalty L2
Local miss rate: misses at a level divided by the accesses that reach that level (Miss Rate L1 for L1, Miss Rate L2 for L2)
Global miss rate: misses at a level divided by all memory accesses (Miss Rate L1 for L1; Miss Rate L1 × Miss Rate L2 for L2, i.e., the fraction of accesses that can be found neither in L1 nor in L2)

5) Giving Priority to Read Misses (Penalty↓)
[Figure: handling writes without a write buffer vs. with a write buffer]
Idea: serve reads before writes have been completed!

5) Potential Issue
If a read miss is served while an earlier write to the same address is still sitting in the write buffer, reading memory directly would return a stale value.

5) Giving Priority to Read Misses (cont')
Two solutions:
• The simplest solution: on a read miss, wait until the write buffer is empty
• A better solution (priority to read misses): on a read miss, check the contents of the write buffer; if there is no conflict, let the read miss proceed ahead of the pending writes (a sketch follows)

6) Avoiding Address Translations (Hit Time↓)
• For each memory access, we need to check whether the data is in the cache
• To do the checking, we need to do two tasks:
  - Locating the possible data in the cache (by using the index)
  - Comparing the tags
[Figure: VA → translation → PA → cache (hit/miss) → main memory]

Cache with Virtual Address
If virtual addresses are used for the cache, the time for address translation is saved (and translation is a frequent operation!!)
[Figure: VA → virtual cache (hit/miss) → translation → main memory]
A virtually addressed cache would only require address translation on cache misses.

Cache with Virtual Address
Why not use virtual addresses to organize the cache?
What are the main issues of using a virtual cache (a cache organized with virtual addresses)?

Issue 1: Protection
• Multiprogramming allows a computer to be shared by several programs running concurrently
• The OS should guarantee that processes do not interfere with each other's computations
• The computer designer helps the OS provide protection so that one process cannot modify another
• Page-level protection is checked as part of the virtual-to-physical address translation (a sketch follows)
[Figure: page-level protection checked while translating virtual page # → physical page #]

Issue 2: Process Switching
• The processor is shared by different processes
• It is common for a process to be switched out
• Note that each process uses its own virtual address space
• Each time a process is switched, the virtually addressed cache must be flushed!!
[Figure: two processes, each with its own virtual address space]

Issue 3: Synonyms or Aliases
• The OS and processes may use different virtual addresses for the same physical address
• The duplicate addresses, called synonyms or aliases, could result in two copies of the same data in the virtual cache
• This leads to coherence issues: all cache entries with the same physical address must be updated, or the memory becomes inconsistent
[Figure: two virtual address spaces mapping a shared block to the same physical address]

Idea: Overlapping Address Translation and Cache Reading
Goal: to get the best of both virtual and physical caches. Virtually indexed, physically tagged!
• But, how?
• Note that the page offset is identical in both virtual and physical addresses
Approach: overlap the cache access with the TLB access.
This works when (1) the high-order bits of the VA are used to access the TLB, while (2) the low-order bits are used as the index into the cache.

Reducing Translation Time
• Idea: use the page offset to index the cache (the page offset is not used for address translation)
• At the same time:
  - The cache is being read using the index
  - The address is translated
• The tag match still uses physical addresses
[Figure: virtual-memory view (page number | page offset) lined up with the cache view (tag | index | block offset); a bit-level sketch follows]

Limiting the cache size
To keep the TLB out of the critical path, use a virtually indexed, physically tagged cache.
This limits the size of the cache to:
page size × associativity

Limiting the cache size (Cont’)
[Figure: virtual-memory view (page number | page offset) lined up with the cache view (tag | index | block offset)]
Cache Size = Block Size × #Sets × #Ways = 2^(block_offset + index) × #Ways ≤ 2^(page_offset) × #Ways
Note: 2^(page_offset) is the page size!

More: Limitation of virtually indexed, physically tagged approach
Case 1: the index is 9 bits, the block offset is 6 bits, direct-mapped cache
• What is the size of the cache?
• What is the minimum size of the page?
Case 2: the index is 9 bits, the block offset is 6 bits, 8-way set-associative cache
• What is the size of the cache?
• What is the minimum size of the page?
(Worked answers below.)

Reducing Translation Time
[Figure: the virtual page # is translated by the TLB while the page-offset bits (index + block offset) read a 2-way associative cache in parallel; the physical tag compare then signals a cache hit and selects the desired word]

Impact of Cache Performance and Complexity
[Figure: impact of the cache optimizations on performance and complexity]
