ECS656U/ECS796P
Distributed Systems
What this lecture is about
Copyright By PowCoder代写 加微信 powcoder
Distributed consensus algorithms
• Introduction to consensus algorithms • Paxos
What is that?
Distributed consensus algorithms deal with reaching agreement among a group of processes connected by an unreliable communications network.
What is that?
Let’s go to the pub!
Let’s go to the pub!
What is that?
What is that?
Meanwhile, someone that got the message delayed, or was not listening…
Let’s go to the pub!
Cinema instead?
What is that?
Let’s go to the pub!
instead? Yes!
Someone needs to back off.
What is that?
Let’s go to the pub!
Ok. I prefer Cinema
instead? Yes!
We are in democracy. Majority wins.
What is that?
Let’s go to the pub!
Ok. I prefer Cinema
instead? Yes!
Ok then. Cinema.
So, what is consensus?
• Consensus is agreeing on one result
• Once a majority agrees on a proposal, that is the consensus
• The reached consensus can be eventually known by everyone
• The involved parties want to agree on any result, not on their proposal
• Communication channels may be faulty, that is, messages can get lost
Why do we care?
Example: You transfer money from one person to another. The money went out one account but never arrived the other one because a server or the network itself somewhere had failed.
Why do we care?
We need to be able to reliably reach agreement even though there are failures
A bit of history
• 1985: FLP (Fisher-Lynch-Patterson) impossibility paper.
we cannot guarantee agreement in an asynchronous system where even one host
might fail
The problem of consensus was known to be solvable in a synchronous setting, where processes could proceed in simultaneous steps
• The synchronous solution was resilient to faults: you can easily detect them!
A bit of history
• 1985: FLP (Fisher-Lynch-Patterson) impossibility paper.
we cannot guarantee agreement in an asynchronous system where even one host
might fail
Asynchronous means:
• No upper bound on processing time
• No upper bound on clock drift rate
• No upper bound on networking delay
A bit of history
• 1985: FLP (Fisher-Lynch-Patterson) impossibility paper.
we cannot guarantee agreement in an asynchronous system where even one host
might fail.
We cannot detect reliably failures. We cannot know for sure the difference between a slow host/network and a failed host
A bit of history
• 1985: FLP (Fisher-Lynch-Patterson) impossibility paper.
we cannot guarantee agreement in an asynchronous system where even one host
might fail
The FLP result shows that in an asynchronous setting, where only one processor might crash, there is no distributed algorithm that solves the consensus problem
A bit of history – cont’d
• 1985: FLP impossibility paper
• 1989: Lamport: The Part-Time Parliament
• Distributed Systems are getting more important, thanks to Internet
How? (a bit of history)
• 1985: FLP impossibility paper
• 1989: Lamport: The Part-Time Parliament
• Distributed Systems are getting more important, thanks to Internet
Let’s remember the original WHY we could not guarantee agreement: We cannot detect reliably failures.
Can we categorize failures?
• Non-Byzantine
• Failed nodes stop communicating
with other nodes
• “Clean” failure
• Fail-stop behavior
• Byzantine
• Failed nodes will keep sending
• Incorrect and potentially misleading
• Failed node becomes a traitor
Two types of failures
• Non-Byzantine
• Failed nodes stop communicating
with other nodes
• “Clean” failure
• Fail-stop behavior
• Byzantine
• Failed nodes will keep sending
• Incorrect and potentially misleading
• Failed node becomes a traitor
They are defined as arbitrary deviations of a process from its assumed behavior, e.g., software bug, a hardware malfunction, or a malicious attack.
Two types of failures
• Non-Byzantine
• Failed nodes stop communicating
with other nodes
• “Clean” failure
• Fail-stop behavior
• Byzantine
• Failed nodes will keep sending
• Incorrect and potentially misleading
Two types of failures
• Failed node becomes a traitor Assumption: asynchronous, non-byzantine model
The goal for consensus
• We want agreement between processes (mutable states)
• Processes are concurrent, asynchronous and failure-prone
• Consensus: making a decision (liveness) which is also correct (safety)
Liveness property
• Liveness: guarantee that something good will happen
• Examples:
• Real world: “at least one of the athletes in the 100m final will win gold” is
• Consensus: All processes decide a value
Safety property
• Safety: guarantee that something bad will never happen
• Examples:
• Real world: A peace treaty between nations provide safety as war will never
• Consensus: no two processes decide on different value
Many slides from Ion Stoica presentation: (https://ucbrise.github.io/cs262a-spring2018/)
• Paxos is a family of protocols for solving consensus in a network of unreliable processors
• Paxos is a family of protocols for solving consensus in a network of unreliable processors
• Basic Paxos
• Multi-Paxos
• Cheap Paxos
• Fast Paxos
• Byzantine Paxos
• Generalized Paxos
• Paxos is a family of protocols for solving consensus in a network of unreliable processors
• Basic Paxos
• Multi-Paxos
• Cheap Paxos
• Fast Paxos
• Byzantine Paxos
• Generalized Paxos
Basic Paxos
• (one of the initial core developer of LaTeX!!!)
Basic Paxos
• (one of the initial core developer of LaTeX!!!)
• Deterministic and fault tolerant consensus protocol
Basic Paxos
• (one of the initial core developer of LaTeX!!!)
• Deterministic and fault tolerant consensus protocol
• Named after a Greek Island
(taken from the example Lamport carries on his paper about elections in the island)
Basic Paxos
• (one of the initial core developer of LaTeX!!!)
• Deterministic and fault tolerant consensus protocol
• Named after a Greek Island
• Guarantees consistent results
Does Paxos solve consensus?
• Provides safety and eventual liveness
• Only a value which has been proposed can be chosen
• Only a single value can be chosen
• A process never learns a value unless it was chosen
• Eventual liveness:
• If things go well, at some point in the future, consensus is eventually
reached. However, this is not guaranteed.
Does Paxos solve consensus?
• FLP result still applies: Paxos is not guaranteed to reach consensus
• There is NO time bound
• We talk about eventual liveness
So simple, so obvious
“In fact, it is among the simplest and most obvious of distributed algorithms.”
Simple pseudocode
A political analogy
• A part-time parliament
A political analogy
• Each round
• Phase 1: A leader is elected (election)
• Phase 2: Leader proposes a value (bill), processes acks • Phase 3: Leader multicast final value (law)
• Three types of roles
• Proposer: It receives a request from the client, and attempts to get a
quorum of acceptors to agree on it
• Acceptor: It is a participant in the maintenance of the distributed storage. A state change in a Paxos cluster does not occur until a majority (quorum) of acceptors agree upon it
• Learner: It learns the agreed upon value. They can be later queried to know what the consensus value was
In practice..
• Paxos nodes can take multiple roles, even all of them
• A single node can send proposals to other nodes, they can contribute to
reaching consensus and they learn the final agreed upon value • Paxos nodes must know how many acceptors a majority is
In practice..
• Paxos nodes must be persistent: they cannot forget what they accepted • Even if the communication channel is faulty, they cannot forget
• A Paxos run aims at reaching a single consensus
• Once a consensus is reached, it cannot progress to another consensus
• In order to reach another consensus, a different Paxos run must happen
• Time synchronization not required
• If you are in round j and hear a message from round j+1, abort everything and move to round j+1
• Each round consists of three phases
• Phase 1: A leader is elected (Election)
• Phase 2: Leader proposes a value, processes acks (Bill) • Phase 3: Leader multicasts final value (Law)
• Rounds are asynchronous
Phase 1 – Election
• Potential leader chooses a unique ballot ID, higher than anything it has seen so far • Sends ballot ID to all processes
• Processes respond to highest ballot ID
Please elect me!
Phase 1 – Election
• If majority (i.e., quorum) respond OK then you are the leader
• If no one has majority, start new round
Please elect me!
Phase 2 – Proposal (Bill)
• Leader sends proposal value v to all
• Use v=v’ if some process already decided in a previous round and sent you its decided value v’
• Otherwise propose its own value
• Recipient log on disk, and responds OK
Value v ok?
Please elect me!
Phase 3 – Decision (Law)
• If leader hears OKs from majority, it lets everyone know of the decision
• Recipients receive decisions, log it on disk
Value v ok?
Please elect me!
When is Consensus Achieved?
Value v ok?
Please elect me!
When is Consensus Achieved?
• When a majority of processes hear proposed value and accept it:
Value v ok?
Please elect me!
When is Consensus Achieved?
• When a majority of processes hear proposed value and accept it: • Are about to respond (or have responded) with OK!
• At this point decision has been made even though • Processes or even leader may not know!
• What if leader fails after that? Value v ok?
Please elect me!
Easy right?J
Value v ok?
Please elect me!
Easy right?J
• Let’s have a look in more details now…
Value v ok?
Please elect me!
Basic Paxos Protocol
Phase 1a: “Prepare”
Select proposal number* N and send a prepare(N) request to a quorum of acceptors.
* = record to stable storage
Basic Paxos Protocol
Phase 1a: “Prepare”
Select proposal number* N and send a prepare(N) request to a quorum of acceptors.
Phase 1b: “Promise”
If N > number of any previous promises or acceptances,
* promise to never accept any future proposal less than N,
– send a promise(N, U) response
(where U is the highest-numbered proposal accepted so far (if any))
* = record to stable storage
Basic Paxos Protocol
Phase 1a: “Prepare”
Select proposal number* N and send a prepare(N) request to a quorum of acceptors.
Phase 2a: “Accept!”
If proposer received promise responses from a quorum,
– send an accept(N, W) request to those acceptors
(where W is the value of the highest-numbered proposal among the promise
responses, or any value if no promise contained a proposal)
Phase 1b: “Promise”
If N > number of any previous promises or acceptances,
* promise to never accept any future proposal less than N,
– send a promise(N, U) response
(where U is the highest-numbered proposal accepted so far (if any))
* = record to stable storage
Basic Paxos Protocol
Phase 1a: “Prepare”
Select proposal number* N and send a prepare(N) request to a quorum of acceptors.
Phase 2a: “Accept!”
If proposer received promise responses from a quorum,
– send an accept(N, W) request to those acceptors
(where W is the value of the highest-numbered proposal among the promise
responses, or any value if no promise contained a proposal)
Phase 1b: “Promise”
If N > number of any previous promises or acceptances,
* promise to never accept any future proposal less than N,
– send a promise(N, U) response
(where U is the highest-numbered proposal accepted so far (if any))
Phase 2b: “Accepted”
If N >= number of any previous promise,
* accept the proposal
– send an accepted notification to the learner
* = record to stable storage
Milestones
• If the majority of acceptors promise, no ID < IDp can make it through
• If a majority of acceptors accept (IDp,value), consensus is reached. Consensus is
and will always be on a value
• If a proposer/learner gets the majority of accept for a specific IDp, they know that consensus has been reached on a value
Time 0: P1 wants to propose “A”
prepare(1) prepare(1)
Time 0: P1 wants to propose “A”
promise(1) promise(1)
prepare(1) prepare(1)
Time 0: P1 wants to propose “A”
promise(1) promise(1)
prepare(1) prepare(1)
accept(1, A) accept(1, A)
Time 0: P1 wants to propose “A”
promise(1) promise(1)
prepare(1) prepare(1)
accept(1, A) accept(1, A)
accepted(A) accepted(A)
If the acceptor has accepted something before..
A1 Highest Accept: (5, A) Highest Prepare: 15
Prepare(10)
prepare(10)
Highest Accept: (5, A) Highest Prepare: 8
It needs to reply with PROMISE ID and (accepted ID,
promise(10, (5, A))
A1 Highest Accept: (5, A) Highest Prepare: 15
prepare(10)
prepare(10)
Highest Accept: (5, A) Highest Prepare: 8
Highest Accept: (5, A) Highest Prepare: 10
The proposer needs to pick the value with the highest ID that it got
A1 Highest Accept: (5, A) Highest Prepare: 15
Assume it got promise for 10 from a quorum
Highest Accept: (5, A) Highest Prepare: 10
accept(10, A) accept(10, A)
Recap example: P1 want to propose value A
Recap example: P1 want to propose value A
prom(5) prom(5)
Recap example: P1 want to propose value A
prom(5) prom(5)
accept(5, A) accept(5, A)
accept(5, A)
prom(5) prom(5)
Recap example: P1 want to propose value A
accepted(A)
accept(5, A) accept(5, A)
accept(5, A)
accepted(A)
Recap Example
accepted(D)
accept(5,D) accept(5, D)
accept(5, D)
accepted(D)
prom(5, (2, B))
prom(5, (3, C)) prom(5, (4, D))
What can go wrong? (Liveness)
• Process fails
• still works as long as majority are up
• Leader fails
• Start another round
accept-request
What can go wrong? (Liveness)
• Message dropped
• If too flaky, start another round
• Note that anyone can start a round any time
accept-request
What can go wrong? (Liveness)
• Protocol may never end – tough luck, buddy!
• Impossibility result not violated
• If things go well sometime in the future, consensus reached!
• Example: two or more simultaneous proposer prepare
accept-request
A1: Highest accept; (5, A) Highest prepare: 8
promise(10, (5, A))
prepare(10)
A1: Highest accept; (5, A) Highest prepare: 10
promise(10, (5, A))
Promise(11,(5, A))
prepare(10) prepare(11)
A1: Highest accept; (5, A) Highest prepare: 11
promise(10, (5, A))
promise(11,(5, A))
prepare(10) prepare(11) accept(10, A)
A1: Highest accept; (5, A) Highest prepare: 11
promise(10, (5, A))
promise(11,(5, A))
promise(12, (5, A))
prepare(10) prepare(11) accept(10, A) prepare(12)
A1: Highest accept; (5, A) Highest prepare: 12
promise(10, (5, A))
promise(11,(5, A))
promise(12, (5, A))
promise(13,(5, A))
prepare(10) prepare(11) accept(10, A) prepare(12) prepare(13)
A1: Highest accept; (5, A) Highest prepare: 13
promise(10, (5, A))
promise(11,(5, A))
promise(12, (5, A))
promise(13,(5, A))
prepare(10) prepare(11) accept(10, A) prepare(12) prepare(13) So now?
promise(10, (5, A))
promise(11,(5, A))
promise(12, (5, A))
promise(13,(5, A))
prepare(10) prepare(11) accept(10, A) prepare(12) prepare(13)
This is a hot spot that can stall the Paxos run. A solution is to set an exponential back off in place.
promise(10, (5, A))
promise(11,(5, A))
promise(12, (5, A))
promise(13,(5, A))
prepare(10) prepare(11) accept(10, A) prepare(12) prepare(13) Indeed, if there is enough time....
P1 wants A, and P2 wants B
prom(5) prom(5)
prom(8) prom(8)
prep(8) prep(5)
Example: P1 wants A, and P2 wants B
prom(5) prom(5)
prom(8) prom(8)
accept(5, A) accept(5, A)
accept(5, A)
prep(8) prep(5)
Example: P1 wants A, and P2 wants B
back off time
prom(5) prom(5)
prom(8) prom(8)
accept(8, B)
accept(8, B)
accept(5, A) accept(5, A)
prep(8) prep(5)
accept(5, A)
accept(8, B)
Example: P1 wants A, and P2 wants B
prom(5) prom(5)
prom(8) prom(8)
accepted(8, B)
accept(8, B)
prep(8) prep(5)
accept(5, A)
accept(8, B)
accept(5, A) accept(5, A)
accept(8, B)
accepted(8, B)
• Acceptors might send “NACKs” responses if they are not going to accept a proposal. This would tell the proposer that it can stop its attempt to create consensus with proposal N
• We said the proposal number needs to be strictly increasing and globally unique. How to do it?
• Acceptors might send “NACKs” responses if they are not going to accept a proposal. This would tell the proposer that it can stop its attempt to create consensus with proposal N
• We said the proposal number needs to be strictly increasing and globally unique. How to do it?
Tick: set low-order bits to proposer’s (server) ID
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com