Operating Systems
Distributed Systems
4160 – Distributed and Parallel Computing
Copyright By PowCoder代写 加微信 powcoder
Goal: High Performance Replicated State Machine
• Recall that Raft isn’t for high performance
• All operations go to the one leader
• If for strict correctness (linearizability), client can’t read from a follower but the leader
– Since a follower may not in the majority
• High performance metric:
– Throughput: E.g., 10000 operations/second
• A KV-store that follows Raft (called ZAB there) when write but read any follower
Raft’s performance (back-of-the-envelop)
• 3 servers (1 leader, 2 followers) //same data center
Best case: (leader+ 1follower=majority)
2 roundtrip 2ms
+ 2 disk write 4ms
1 put = 6ms
Throughput = 180 puts/s
AppendEntries()
Zookeeper throughput • Write-only throughput
– 21000 puts/s
• Other things that you can observe:
– For write-only
• More servers, slower
– Needs a longer time to get the majority
– For read-heavy
• More servers, faster
Zookeeper’s magic design choice (tradeoff) • 1)
• 2) Read the follower!
– Fast but might be wrong
Synchronous Put
Asynchronous Put (batching)
àPut(x) ßOk àPut(y) ßOk
àPut(x) àPut (y) …
Get()àcan get either the latest (if reading the committed replica in the majority) or stale value (if reading the replica in the minority)
1st Get(x)àget the latest value (from the updated replica)
2nd Get(x)àget the old value (from the slow replica)
Violate Linearizability
• “Behavelikeasinglemachine”
1. A total order of operations
2. Order match real-time
3. Read op returns latest write
So, they define another notion of ‘correct’
1. Still linearizable write (through leader)
2. FIFO client order
– A client can read its own latest write
– A client can read the others committed prefix
• But once read a prefix, it can’t read an even older prefix the next time
Zookeeper’s twist of Raft
• Client’s put(x) gets the commitIdx of from leader’s log in addition
• Then when client get(y) to a follower f
– Also tells f the leader’s commitIdx that the client knows – So, f won’t reply until its commitIdx has caught up
So, Zookeeper rules out case 1 • But case 0 could still happen
– Say another client that has never written anything tries to get() from another slow follower…
Get()àcan get either the latest (if reading the committed replica in the majority) or stale value (if reading the replica in the minority)
1st Get(x)àget the latest value (from the updated replica)
2nd Get()àget the old value (from the slow replica)
Now correctness burden on application developers
• Since programmers can’t reason the system like a single machine anymore, it would make their life very difficult
• Offer an API from multi-core synchronization to ease reasoning
• E.g., get(x, watch=“on”)
• client.CallBackForWatch() { //zookeeper will call this
function if the value of x changes resume();
In case you know …
• test-and-set and compare-and-swap in multi-core era (atomic instructions)
• Optimistic concurrency control – Versioning
• Their client API will expose the version number in their API
– Value, version = getData(x) //return version of x
– setData(x, version) //only set if version test passes
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com