Operating Systems
Conflict Free Replicated Data
4160 – Distributed and Parallel Computing
Copyright By PowCoder代写 加微信 powcoder
Eventual consistency
One type of eventual consistency is about visibility –
users can see intermediate inconsistency
– as long as they eventually see a consistent view
• i.e., end-result-wise it is still (e.g., casual) consistent – But the visible inconsistency makes it very weak
• Many use-cases it might give you heart-attack
– e.g., “hey, you own the bank 1 million”
– 10 hours later, “Oh, sorry, your account is back to normal”
• But acceptable in some use cases:
– collaborative editing (you see ‘fast-forward’ sequences of intermediate
states while others are typing)
– Amazon’s first generation of shopping cart — DynamoDB
Eventual consistency != no consistent • Otherwise:
I like to eat apple
I like to eat apple
banana I like to eat
I like to eat banana
A1) Prepend (“I like to eat”)
B1) Delete
(“apple”) B2) Prepend
(“banana”)
A key to solve the problem: commutative
• As long as the operations are commutative, then they can reach the same final state (despite the intermediate states may diverge)
• E.g., 5 (+1) (+2) (+3)
– Alice ((5 + 1) + 2) + 3 = 11
• user-viewable states: 6, 8, 11 – Bob ((5+3) + 2) + 1 = 8
• user-viewable states: 8, 10, 11
So, the trick is
• Are the operations in your application commutable?
– For algebra, we know what operators are commutable, what not, e.g.,
• +, x commutable
• / not-commutable
– What about
– What is the theory to confirm which operation is
commutable, which one is not?
– If an operation is not, can we do something such that it becomes so?
Eventual Consistency • Order Theory
– http://jtfmumm.com/blog/2015/11/17/crdt-primer-1- defanging-order-theory/
– We basically want operations that are associative, commutative, idempotence such that the state transition graphs form a semi-join lattice
Operational Transform
• If an operation is not commutative, can we do something to make it so?
• Design an operational transform algorithm to wrap an operation to achieve deterministic identical result
– despite the potentially different input orders
• op’ = OT(op)
• Initial: empty
• Alice: add (“pass”, index=0, ts=1)
• Bob: add (“word”, index=0, ts=2)
• OT(add(w, i, t)) – If same index i?
• late comer’s i + s.length – E.g., In Alice:
• add (“word”, index=0, ts=2) from Bob becomes
• add (“word”, index=4)
Operational Transform
• So, here is the lattice part
– A proper OT has to ensure from the beginning state, despite different paths, reach the same end state
*ack: (ESTR4104, year 2021)
Full p2p OT algorithm is hard to implement
– Very complicated logic when >2 clients • If (user-id 1 > user-id 2 > user-id-3) then
– Operation command lost in network
– Peer not responding
– Peer wakes up and resends
• Appended “word” twice to “pass”
Google wave OT algorithm
• Not a peer-to-peer n2 communication algorithm
• All requests send to a centralized server – Then it becomes n-to-1 communications
– Way easier to transform
Conflict-free Replicated Data Type (CRDT)
• CmRDT (operation-based)
– Commutative replicated data types (CRDT ^_^’)
– Abstract OT as class (libraries) for easy re-use – Broadcast the operations
• CvRDT (state-based)
– Convergent replicated data types (CRDT again ^_^”) – Broadcast the states
CmRDT – insert
• Alice & Bob both see “apple” – Alice: insert “s” at index 4
– Bob: insert “!” at index 4
– Since the index 4 is ambigious
• May result in apple!s in Alice and apples! in Bob
– No ambiguity
– Turn a string of (non-unique) characters into a string of unique characters (uuid):
• E.g., “apple” -> “22” “78” “43” “90” “17”
– Alice: insert “s” after character 17
– Bob: insert “!” after character 17
CmRDT – insert
• So, both Alice and Bob got:
– “apple” -> 22 78 43 90 17 (“s”,“!”)
• Next, ensure the CRmDT read() function is deterministic:
– Say, in ascending order – “s” is 89, “!” is 99
CmRDT – delete by setting tombstone
• Alice & Bob both see “apple” (22 78 43 90 17)
– Alice: delete “e” à delete “17” à tombstone “17”
– Bob: insert “s”
• Insert “s” after 17
• Order 1: deleteàinsert
– Delete 17: 22 78 43 90 17
– Insert “s”: 22 78 43 90 17 “s”
• Order 2: insertàdelete
– Insert “s”: 22 78 43 90 17 “s” – Delete 17: 22 78 43 90 17 “s”
CvRDT – convergent replicated data types • Instead of broadcasting the operations
– Broadcast the state
• Receiver merges the states • Still uuid based
• set-based
• A string as a directed graph of uuid nodes:
• “apple”:22->78->43->90->17 – G=(V,E)
• E = {22->78, 78->43, 43->90, 90->17}
• Alice: delete “e” (tombstoning);
Bob: insert “!” (99) after e
– Alice: V = {22, 78, 43, 90, 17}; E = E
– Bob: V = {22, 78, 43, 90, 17, 99}; E+ =E U {17-> 99}
• State-merge using union
CmRDT vs CvRDT
CmRDT (operation-based)
CvRDT (state-based)
Use less bandwidth (as sending operation)
Use more bandwidth (as sending state)
More network traffic (as 1 network send per operation)
and missing one operation would cause trouble
Less network traffic (can skip some intermediate states and send the final state)
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com