CS505 DISTRIBUTED SYSTEMS CONSENSUS & CONSISTENCY
Yongle Zhang
More Properties: Consensus
Copyright By PowCoder代写 加微信 powcoder
§ Traditional formulation of consensus: several nodes want to come to agreement about a single value
• In context of total order broadcast: this value is the next message to deliver
• In our lab: this value is “who is primary”
§ Definition
• Termination: All non-faulty processes eventually decide on a value • Agreement: All processes that decide do so on the same value
• Validity: Value decided must have been proposed by some process
System Model / Failure Model
§ System model
• Node model (crashes): correct, fail-stop, fail-recovery, byzantine
• Network model (message loss): reliable, fair-loss, arbitrary • Timing model (latency): synchronous/asynchronous
§ Assume bidirectional point-to-point communication between two nodes, with one of:
• Reliable (perfect) links:
A message is received if and only if it is sent. Messages may be reordered.
• Fair-loss links:
Messages may be lost, duplicated, or reordered.
If you keep retrying, a message eventually gets through.
• Arbitrary links (active adversary):
A malicious adversary may interfere with messages (eavesdrop, modify, drop, spoof, replay).
§ Each node executes a specified algorithm, assuming one of the following:
• Crash-stop (fail-stop):
A node is faulty if it crashes (at any moment). After crashing, it stops executing forever.
• Crash-recovery (fail-recovery):
A node may crash at any moment, losing its in-memory state. It may resume executing sometime later.
• Byzantine (fail-arbitrary):
A node is faulty if it deviates from the algorithm. Faulty nodes may do anything, including crashing or malicious behaviour.
§ A node that is not faulty is called “correct” 5
§ Assume one of the following for network and nodes:
• Synchronous:
Message latency no greater than a known upper bound. Nodes execute algorithm at a known speed.
• Partially synchronous:
The system is asynchronous for some finite (but unknown) periods of time, synchronous otherwise.
• Asynchronous:
Messages can be delayed arbitrarily. Nodes can pause execution arbitrarily. No timing guarantees at all.
§ Note: other parts of computer science use the terms “synchronous” and “asynchronous” differently.
Violations of synchrony in practice
§ Networks usually have quite predictable latency, which can occasionally increase:
• Message loss requiring retry
• Congestion/contention causing queueing
• Network/route reconfiguration
§ Nodes usually execute code at a predictable speed, with
occasional pauses:
• Operating system scheduling issues, e.g. priority inversion
• Stop-the-world garbage collection pauses
• Page faults, swap, thrashing
§ Real-time operating systems (RTOS) provide scheduling
guarantees, but most distributed systems do not use RTOS
Consensus is meaningful wrt. system models
§ Primary backup replication
• One correct node (view server) & other nodes fail-stop, fair-loss
network, partially synchronous timing
§ Paxos, Multi-Paxos, Raft, Viewstamped Replication, Zab
• f+1 correct nodes (e.g., in a total of n=2f+1 nodes) & other nodes fail-
recovery, fair-loss network, partially synchronous timing
• f faulty nodes (n=3f+1) & faulty nodes byzantine, arbitrary network,
partially asynchronous timing
• f+1 correct nodes (n=2f+1) & other nodes byzantine, arbitrary network, asynchronous timing
When & Why Consensus? E.g.
§ Leader Election
• You want one process to play a distinguished role and you want all
other processes to know about it.
§ Total Order Broadcast
• You want all processes to deliver the same messages in the same order.
§ Distributed Mutual Exclusion
• You want processes to take turns getting mutually exclusive access to some shared resource.
§ A family of algorithms
§ Roles of processes • Proposers
• Acceptors
• Learners
§ A process can be any even all of these 3 roles.
Terminology
§ Value: a possible operation to put in the next slot in the operation log (letter values)
§ Proposal: to select a value; proposals are uniquely numbered
§ Accept: of a specific proposal, value
§ Chosen: Proposal/value accepted by a majority
§ Learned: Fact that proposal is chosen is known
§ Know what is the majority § Able to persist data
• remember what it proposed to survive crash recovery § Unique proposal number/ID
§ Single value
A Paxos Run
P1 A1 A2 A4 L1 L2 prepare (5)
A Paxos Run
P1 A1 A2 A4 L1 L2 prepare (5)
promise(5) promise(5)
A Paxos Run
A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,0)
A Paxos Run
P1 A1 A2 A4 L1 L2 prepare (5)
promise(5) promise(5)
accept(5,0)
accepted(5,0) accepted(5,0)
A Paxos Run
P1 A1 A2 A4 L1 L2 prepare (5)
promise(5) promise(5)
accept(5,0)
accepted(5,0) accepted(5,0)
Consensus reached!
A Paxos Run
P1 A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,0)
accepted(5,0) accepted(5,0)
Consensus reached!
Assigning Proposal Numbers
§ Proposal numbers must be unique and infinite § A proposal number server won’t work…
§ Instead, assign each proposer an infinite slice
§ Proposer i of N gets: i, i+N, i+2N, i+3N, … • 0,4,8,12,16,…
• 1,5,9,13,17,…
• 2,6,10,14,18,…
• 3,7,11,15,19,…
Majorities
§ Why does Paxos use majorities?
• Majorities intersect: for any two majorities S and S’, there is some node in both S and S’
1. Proposer
• It sends a “prepare(ID)” to the majority (or all) of Acceptors • Timeout? Retry with higher ID (local)
1. Proposer
• It sends a “prepare(ID)” to the majority (or all) of Acceptors • Timeout? Retry with higher ID (local)
2. Acceptor
• When it gets a prepare(n) message, checks if it has promised to ignore requests with a lower ID
• Yes: ignores it.
• it now promises to ignore any request (prepare/accept) with lower ID • (?) reply to Proposer with a promise(n) message
3. Proposer
• Once it gets promise message for a specific ID (promise(n)) from majority of the Acceptors
• It sends an accept(n, val) to a majority (or all) of Acceptors. • (?) it picks any value it wants
3. Proposer
• Once it gets promise message for a specific ID (promise(n)) from majority of the Acceptors
• It sends an accept(n, val) to a majority (or all) of Acceptors. • (?) it picks any value it wants
4. Acceptor
• Once it receives an accept(n, val) request, check if it promised to ignore requests with ID as n
• Yes: ignores this request.
• No: Reply with accepted(n, val) to Proposer and all Learners.
Milestones
§ If a majority of Acceptors promise, no requests with ID < promised ID can get through
§ If a majority of Acceptors has accepted (n, val), consensus is reached.
§ If a proposer/learner gets accepted(n, val) for a specific ID n, they know that consensus has reached on val.
A Paxos Run
A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,0)
Milestone1
accepted(5,0) accepted(5,0)
Consensus reached!
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
What happens?
accepted(5,0) accepted(5,0)
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
prepare(6)
What happens?
Remember, we want to agree on single value
2. Acceptor
• When it gets a prepare(n) message, checks if it has promised to ignore requests with a lower ID
• Yes: ignores it.
• it now promises to ignore any request (prepare/accept) with lower ID • (?) reply to Proposer with a promise(n) message
2. Acceptor
• When it gets a prepare(n) message, checks if it has promised to ignore requests with a lower ID
• Yes: ignores it.
• it now promises to ignore any request (prepare/accept) with lower ID • Has it accepted anything?
• Yes: (accepted (n-prev, val-prev)) reply with promise(n, (n-prev, val-prev))
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
promise(6, (5,0))
prepare(6) promise(6)
What happens?
Remember, we want to agree on single value
3. Proposer
• Once it gets promise message for a specific ID (promise(n)) from majority of the Acceptors
• It sends an accept(n, val) to a majority (or all) of Acceptors. • (?) it picks any value it wants
3. Proposer
• Once it gets promise message for a specific ID (promise(n)) from majority of the Acceptors
• It sends an accept(n, val) to a majority (or all) of Acceptors.
• Hasitgotanyalreadyacceptedvaluefrompromises?
• No: it picks any value it wants.
• Yes: (e.g., promise(n2, (n1, val1)), promise(n2, (n0, val0))) it picks the value with the highest n-prev
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
prepare(6)
promise(6) accept(6, 0)
promise(6, (5,0))
accepted(6, 0)
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
prepare(6)
promise(6) accept(6, 0)
promise(6, (5,0))
accepted(6, 0)
A Paxos Run
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
accept(5,0)
prepare(4)
promise(4)
accepted(5,0) accepted(5,0)
prepare(6)
promise(6) accept(6, 0)
promise(6, (5,0))
accepted(6, 0)
P1 A1 A2 A4 P2 prepare (5)
promise(5) promise(5)
A1 A2 A4 P2
prepare (5)
prepare(6)
promise(5) promise(5)
promise(6)
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
prepare(6)
prepare (7)
promise(7) promise(7)
promise(6)
A1 A2 A4 P2
prepare (5)
promise(5) promise(5)
prepare(6)
promise(6) prepare(8)
promise(8)
prepare (7)
promise(7) promise(7)
§ Randomized delay, exponential backoff § Leader election (?)
Brief Intro to Multi Paxos
§ Totally Ordered Broadcast boils down to repeated consensus on single value
• Challenges?
• A lot of messages
Brief Intro to Multi Paxos
P1 A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,val1,0) accepted(5,val1,0)
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
Brief Intro to Multi Paxos
A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,val1,0) accepted(5,val1,0)
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
What if another proposer starts proposing?
Brief Intro to Multi Paxos
A1 A2 A4 prepare (5)
promise(5) promise(5)
accept(5,val1,0) accepted(5,val1,0)
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
What if another proposer starts proposing? Resolves the same way as Paxos
Brief Intro to Multi Paxos
A1 A2 A4 prepare (5)
promise(5) promise(5)
What if another proposer starts proposing? Resolves the same way as Paxos
accept(5,val1,0) accepted(5,val1,0)
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
prepare(6)
promise(6)
Brief Intro to Multi Paxos
A1 A2 A4 prepare (5)
promise(5) promise(5)
What if another proposer starts proposing? Resolves the same way as Paxos
accept(5,val1,0) accepted(5,val1,0)
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
prepare(6)
promise(6) promise(6,
Brief Intro to Multi Paxos
P1 A1 A2 A4 P2
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
prepare(6)
promise(6)
promise(6,
accepted(6,val3,2)
Brief Intro to Multi Paxos
P1 A1 A2 A4 P2
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
prepare(6)
promise(6)
promise(6,
accept(6,val1,2)
What happens if P2 proposes a value for a decided pair?
Brief Intro to Multi Paxos
P1 A1 A2 A4 P2
accept(5,val2,1)
accepted(5,val2,1) accepted(5,val2,1)
prepare(6)
promise(6)
promise(6,
accept(6,val1,0)
What happens if P2 proposes a value
for a decided pair?
Cannot happen.
Single-value paxos picks the accepted value with the highest proposal number.
Paxos & Fault Tolerance
§ How many acceptors can crash?
• A minority
• If f is the number of acceptor failures you want to tolerate, you need 2f+1 acceptors
§ How many proposers can crash?
• All but one
• If f is the number of proposer failures you want to tolerate, you need f+1 proposers
Paxos & Fault Tolerance
§ How does Paxos under work omission faults? (message loss) • Some of my message lost?
• If proposer send message to ALL accepters • All my message lost?
Consensus & FLP
§ Properties
• Termination: All non-faulty processes eventually decide on a value
• Agreement: All processes that decide do so on the same value
• Simplest consensus?
• Validity: Value decided must have been proposed by some process (any,
not just their own)
§ Impossible to satisfy all 3! (async, fail-stop) • Which to sacrifice?
§ FLP result (Fischer, Lynch, Paterson)
• There is no deterministic consensus algorithm that is guaranteed to 54 terminate in an asynchronous crash-stop system model.
Paxos & FLP
§ Paxos is always safe–despite asynchrony
§ Once a leader is elected, Paxos is live.
• To be live, Paxos requires a single leader
• “Leader election” is impossible in an asynchronous system
§ Given FLP, Paxos is the next best thing:
• always safe, and live during periods of synchrony
Challenges for Multi Paxos
§ Performance optimization • Leader election
• Eliminate prepare requests
§ Configuration changes
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com