
Lect. 4: Coherence Protocols
– Snooping coherence
– Directory coherence
Parallel Architectures – 2019-20 !1


Snooping Coherence Protocol
Snooping coherence on simple shared bus
[Figure: caches on a shared bus, each cache line tagged with a line state. Cache states: 00 = invalid, 01 = shared, 10 = modified]
– “Easy” as all processors and memory controller can observe all transactions
– Bus-side cache controller monitors the tags of the lines involved and reacts if
necessary by checking the contents and state of the local cache
– Bus provides a serialization point (i.e., any transaction A is either before or after
another transaction B)
▪ More complex with split transaction buses
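The reaction of a bus-side controller can be sketched as a small lookup on the line state. This is a minimal sketch assuming a basic three-state (invalid/shared/modified) protocol; the transaction names `BUS_RD` and `BUS_RDX`, the `"flush"` action, and the function `snoop` are illustrative assumptions, not names from the lecture.

```python
# Hypothetical sketch: only the 2-bit state encoding comes from the slide.
INVALID, SHARED, MODIFIED = 0b00, 0b01, 0b10

BUS_RD = "BusRd"    # assumed name: another cache wants to read the line
BUS_RDX = "BusRdX"  # assumed name: another cache wants to write the line


def snoop(state, bus_op):
    """Return (next_state, action) for a line whose tag matched a bus transaction."""
    if state == INVALID:
        return INVALID, None              # nothing cached, nothing to do
    if bus_op == BUS_RD:
        if state == MODIFIED:
            return SHARED, "flush"        # supply the dirty data, keep a shared copy
        return SHARED, None               # already shared, no reaction needed
    if bus_op == BUS_RDX:
        action = "flush" if state == MODIFIED else None
        return INVALID, action            # another writer: give up the local copy
    raise ValueError(f"unknown bus operation: {bus_op}")
```

Because the bus serializes transactions, each controller can apply this function to one transaction at a time.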

Snooping on a simple bus
[Figure: L1 caches of the processors (P2 shown) attached to the bus, each cache line with a line state. Cache states: 00 = invalid, 01 = shared, 10 = modified]
Read/Write miss
– When should memory provide data?
▪ Wait until inhibit is deasserted
▪ If the wired-OR of the snoop lines (sharers, modified) is false, then memory provides the data.
– Write-backs?
▪ Don’t want to wait for writes: use a write-back buffer
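The memory-side decision above can be sketched as follows. This is a rough sketch, assuming each cache controller drives one boolean per signal; the function names and the returned strings are hypothetical.

```python
def wired_or(signals):
    """A wired-OR bus line is asserted if any attached controller asserts it."""
    return any(signals)


def memory_action(inhibit, sharers, modified):
    """Decide what the memory controller does on a read/write miss.

    Each argument is a list of booleans, one per cache controller
    (an assumed representation of the bus lines).
    """
    if wired_or(inhibit):
        return "wait"            # some controller has not finished its snoop yet
    if wired_or(sharers) or wired_or(modified):
        return "stay silent"     # per the rule above, memory does not supply the data
    return "provide data"
```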

Snooping on Simple Bus
“The devil is in the details”, Classic Proverb
▪ Problem: conflict when processor and bus-side controller must
check the cache
[Figure: processor P1 issuing Ld/St to its L1 while the bus-side controller checks the same line state. Cache states: 00 = invalid, 01 = shared, 10 = modified]
Solutions:
– Use dual-ported modules for the tag and state array
– Or, duplicate tag and state array
▪ Both must be kept consistent when one is changed

Snooping on Simple Bus
Fig 6.4 Culler et al.

Snooping on Simple Bus
Problem: even if bus is atomic, state transitions are not instantaneous
and may require several steps → transitions are not atomic
– E.g., read-miss transaction = wait for bus + wait for bus-side controllers to
check cache + data response (or memory response)
– E.g., write-upgrade transaction = wait for bus + wait for bus-side controllers to
invalidate
What to do if there are conflicting requests on the bus to the same cache line?
– E.g., an upgrade request may lose bus arbitration to another processor’s request and may
have to be re-issued as a full write miss (due to the intervening invalidation)
– Introduce transient states to cache lines and the protocol (the I, S, M, etc states
seen in Lecture 3 are then called the stable states)
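The upgrade race described above can be sketched with two transient states. This is a simplified sketch under stated assumptions: the transient state names `S_TO_M` and `I_TO_M`, the class `CacheLine`, and its methods are hypothetical, and the line jumps to M as soon as the bus is granted (a real controller would also wait for the data response).

```python
# Stable states (Lecture 3) plus two transient states; the transient names are made up here.
I, S, M = "I", "S", "M"
S_TO_M = "S->M"   # upgrade requested, waiting for the bus
I_TO_M = "I->M"   # write miss requested, waiting for the bus


class CacheLine:
    def __init__(self, state):
        self.state = state
        self.pending = None              # bus request waiting for arbitration

    def cpu_write(self):
        """Processor writes to the line; a bus request is queued if needed."""
        if self.state == S:
            self.state, self.pending = S_TO_M, "upgrade"
        elif self.state == I:
            self.state, self.pending = I_TO_M, "write miss"

    def snoop_invalidate(self):
        """Another processor won arbitration and invalidated this line."""
        if self.state == S_TO_M:
            # Our copy is gone, so the upgrade must become a full write miss.
            self.state, self.pending = I_TO_M, "write miss"
        elif self.state == S:
            self.state = I

    def bus_granted(self):
        """We won arbitration: issue the pending request and reach stable M."""
        issued, self.pending = self.pending, None
        if issued is not None:
            self.state = M
        return issued
```

For example, a line in S that requests an upgrade and then observes an invalidation ends up issuing `"write miss"` rather than `"upgrade"` when the bus is finally granted.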

Example: Extended MESI Protocol
▪ Transactions originating at this CPU:
[Figure: state diagram with stable and transient states (e.g., Invalid; transient I→S,E). Transition labels: CPU read hit; CPU write hit; CPU read miss & shr.; CPU read miss & no shr.; CPU write miss; bus granted; bus granted & shr.; bus granted & no shr.; bus granted & no conflict]

Snooping with Multi-Level Hierarchies
– Processor interacts with L1 while bus snooping device interacts with L2, and
propagating such operations up or down is not instantaneous
– Note: L2 lines could be bigger than L1 lines
[Figure: two-level cache hierarchies on the bus, with a line state per line at each level. Cache states: 00 = invalid, 01 = shared, 10 = modified]

Snooping with Multi-Level Hierarchies
1. Maintain inclusion property
– Lines in L1 must also be in L2 → no data is found solely in L1, so no risk of
missing a relevant transaction when snooping at L2
– Lines in M state in L1 must also be in M state in L2 → snooping controller at
L2 can identify all data that is modified locally
2. Propagate coherence transactions to L1 as well.
▪ Propagate all transactions to L1, whether relevant or not
▪ Keep extra state in the L2 lines to tell whether the line is also present in L1 or not
(inclusion bits). If it is present in L2, but inclusion bits say it is not present in L1,
no need to propagate transaction to L1.
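The inclusion-bit filter can be sketched as a lookup at L2 that decides whether L1 needs to be disturbed. This is a minimal sketch assuming inclusion holds; the class `L2Line`, the dictionary keyed by tag, and the function `snoop_at_l2` are hypothetical names.

```python
class L2Line:
    def __init__(self, tag, state, in_l1):
        self.tag = tag
        self.state = state      # "I", "S" or "M"
        self.in_l1 = in_l1      # inclusion bit: is the line also cached in L1?


def snoop_at_l2(l2_lines, tag):
    """Return (hit_in_l2, forward_to_l1) for a snooped bus transaction on `tag`.

    `l2_lines` maps tags to L2Line objects (an assumed representation).
    """
    line = l2_lines.get(tag)
    if line is None or line.state == "I":
        return False, False     # inclusion: not in L2 implies not in L1
    return True, line.in_l1     # disturb L1 only when the inclusion bit is set
```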

Snooping with Multi-Level Hierarchies
Maintaining inclusion property
Assume: L1: associativity a1, number of sets n1, block size b1
L2: associativity a2, number of sets n2, block size b2
– Difficulty: Replacement policy (e.g., LRU)
Assume: a1=a2=2; b1=b2; n2=k*n1; lines m1, m2, and m3 map to the same set in L1 and the same set in L2; initially m1 is present in L1 and L2
[Figure: numbered sequence of miss and fill events at L1 and L2 for this example]
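The LRU difficulty can be reproduced with a small simulation of one set per level. This is a sketch under stated assumptions: the access order m2, m1, m3 is assumed (the original figure is not recoverable), L2 only observes L1 misses, and the names `LRUSet` and `run` are hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement; keys are ordered oldest-first."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()

    def access(self, line):
        """Touch `line`; return True on a hit, False on a miss (after filling)."""
        if line in self.lines:
            self.lines.move_to_end(line)    # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = None
        return False


def run(accesses):
    l1, l2 = LRUSet(2), LRUSet(2)           # a1 = a2 = 2
    l1.access("m1")
    l2.access("m1")                         # initially m1 is in both levels
    for line in accesses:
        if not l1.access(line):             # L2 only sees L1 misses
            l2.access(line)
    return set(l1.lines), set(l2.lines)


l1_lines, l2_lines = run(["m2", "m1", "m3"])
print(l1_lines - l2_lines)                  # lines in L1 that are missing from L2
```

The hit on m1 updates the LRU order in L1 only, so on the m3 miss L1 evicts m2 while L2 evicts m1, leaving m1 in L1 but not in L2.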

Snooping with Multi-Level Hierarchies
Maintaining inclusion property
Assume: L1: associativity a1, number of sets n1, block size b1
L2: associativity a2, number of sets n2, block size b2
– Difficulty: Different line sizes
Assume: a1=a2=1; b1=1, b2=2; n1=4, n2=8
Thus, words w0 and w17 can coexist in L1, but not in L2
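The w0/w17 claim can be checked by computing set indices directly. This is a small sketch assuming word addresses, block sizes measured in words, and direct-mapped caches (a1=a2=1); the helper names `set_index` and `conflict` are hypothetical.

```python
def set_index(word, block_size, num_sets):
    """Set index of a word address in a direct-mapped cache (sizes in words)."""
    block = word // block_size
    return block % num_sets


def conflict(w_a, w_b, block_size, num_sets):
    """True if two words lie in different blocks that map to the same set,
    so they cannot coexist in a direct-mapped cache."""
    same_block = w_a // block_size == w_b // block_size
    same_set = set_index(w_a, block_size, num_sets) == set_index(w_b, block_size, num_sets)
    return (not same_block) and same_set


print(conflict(0, 17, block_size=1, num_sets=4))  # L1: b1=1, n1=4
print(conflict(0, 17, block_size=2, num_sets=8))  # L2: b2=2, n2=8
```

In L1, w0 maps to set 0 and w17 to set 1; in L2, w17 falls in block 8, which maps to set 0 just like w0.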

Snooping with Multi-Level Hierarchies
Maintaining inclusion property
– Most combinations of L1/L2 size, associativity, and line size do not
automatically lead to inclusion
– Static solution: One solution is to have a1=1, a2≥1, b1=b2, and n1≤n2
– Dynamic solution: More common solution is to invalidate the L1 line (or
lines, if b1 < b2)