The University of Sydney Page 1
COMP3221: Distributed Systems
Course Review
Unit Coordinator
Dr Nguyen Tran
School of Computer Science
Outline
– What to expect in the final exam
– Structure
– Model questions and answers
Final Exam
COMP3221 Course Review
Final Exam
– To pass this course, you must:
– Score at least 50% overall, and
– Score at least 40% in the final exam
– EXAM CONDITIONS:
– Canvas quiz: multiple-choice (MCA) and essay-type questions
– EXAM WRITING TIME: 2 hours
– READING TIME: 10 minutes
– MATERIALS TO BE SUPPLIED TO STUDENTS:
– None
Practice Questions
COMP3221 Course Review
Exercise 1
– 1.1 Give one example of a problem that can occur in real life
when a distributed system that uses message passing suffers
a failure. Even though there might be multiple examples,
give only one. [2 pts]
Exercise 1
– 1.2 What is the general definition of a distributed system used
in this course? [3 pts]
– Sample Answer:
– A collection of independent computers that appears to its users
as a single coherent system.
Exercise 1
– Consider the distributed execution represented in Figure 1.
Indicate the vector clock associated with each event. [6 pts]
Logical time
– Definition of the relation called “happens-before”
– “a happens before b”, denoted by a → b, is true if:
– a and b are events from the same process such that a occurs before b, or
– if a is the event of a message being sent by one process and b is the event
of the same message being received by another process, or
– there exists some event c such that a → c and c → b (i.e., transitive closure)
– If two events, x and y, happen in different processes that do not
exchange messages (not even indirectly via third parties),
– then x → y is not true, but neither is y → x.
– In this case, x and y are said to be concurrent.
Vector clocks
– Goal: to also make the “did-not-happen-before” relation
visible
– Key idea:
– each process should also maintain an approximation of the clock of
other processes
– let’s introduce a vector clock function VC: events ↦ integer vectors (one entry per process)
– Vector clocks algorithm:
1. Each process pa keeps a vector VCa with one integer entry per process k.
2. Before executing an event (i.e. sending a message to another process
or an internal event), pa increments its own entry, i.e. VCa[a] ← VCa[a] + 1.
3. When process pa sends a message m to process pb, it sends VCa as the
timestamp of the message, i.e. ts(m) = VCa.
4. Upon the receipt of a message, process pb adjusts its own local vector
as VCb[k] ← max{VCb[k], ts(m)[k]} for each process k.
5. pb executes step 2 and delivers the message to the application.
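The five steps above can be sketched in Python (an illustrative sketch; the class and method names are our own, not from the slides):

```python
class Process:
    """Vector-clock bookkeeping for one process among n."""
    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n          # step 1: one entry per process

    def internal_event(self):
        self.vc[self.pid] += 1     # step 2: increment own entry

    def send(self):
        self.internal_event()      # sending is itself an event
        return list(self.vc)       # step 3: ship VC as the timestamp

    def receive(self, ts):
        # step 4: component-wise max, then step 2 for the receive event
        self.vc = [max(a, b) for a, b in zip(self.vc, ts)]
        self.internal_event()

p0, p1 = Process(0, 3), Process(1, 3)
m = p0.send()      # p0's clock becomes [1, 0, 0]
p1.receive(m)      # p1's clock becomes [1, 1, 0]
```

This reproduces the first two events of the figure: a at p0 with VC0 = (1,0,0), then b at p1 with VC1 = (1,1,0).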
Vector clocks
Example:
– Processes: p0, p1, p2
– Events: a, b, c, d, e, f
– a → b, a → c, a → d, etc.
– e ↛ f and f ↛ e
VC(a) < VC(b) if and only if a → b
[Figure: three processes p0, p1, p2 with events a–f along their timelines and
the associated vector clocks: p0 has a with VC0 = (1,0,0) and e with VC0 = (2,2,0);
p1 has b with VC1 = (1,1,0) and c with VC1 = (1,2,0); p2 has d with VC2 = (1,2,1)
and f with VC2 = (1,2,2)]
Lamport’s logical clocks
– Goal: to make the “happens-before” relation visible to processes
– Key idea:
– each process should keep track of the order in which events appear to take
place locally
– let’s introduce a function logical clock C: events ↦ integers
– Lamport’s logical clocks algorithm:
1. Each process pa maintains a local counter Ca.
2. Before executing an event (i.e. sending a message to another process or
an internal event), pa increments Ca, i.e. Ca ← Ca + 1.
3. When process pa sends a message m to process pb, it sends Ca as the
timestamp of the message.
4. Upon the receipt of a message, process pb adjusts its own local counter as
Cb ← max{Ca, Cb}.
5. pb executes step 2 and delivers the message to the application.
– Result: if a → b then C(a) < C(b)
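A minimal Python sketch of the algorithm (class and method names are our own, not from the slides):

```python
class LamportProcess:
    """Lamport logical clock for one process."""
    def __init__(self):
        self.c = 0                 # step 1: local counter

    def event(self):
        self.c += 1                # step 2: increment before each event
        return self.c

    def send(self):
        return self.event()        # step 3: counter is the timestamp

    def receive(self, ts):
        self.c = max(self.c, ts)   # step 4: take the max
        return self.event()        # step 5: then apply step 2

pa, pb = LamportProcess(), LamportProcess()
ts = pa.send()             # Ca = 1
pb.event(); pb.event()     # Cb = 2 after two internal events
cb = pb.receive(ts)        # Cb = max(2, 1) + 1 = 3
```

Since the send happens before the receive, C(send) = 1 < C(receive) = 3, as the result above requires.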
Lamport’s logical clocks
Example:
– Processes: p0, p1, p2
– Events: a, b, c, d, e, f, g, h, i, j, k, m, n, q, r
– c → n, j → k, e → h, etc.
– a and b are concurrent, and a and d are concurrent, but b → d
[Figure: three processes p0, p1, p2 (initial clocks C0 = C1 = C2 = 0) with events
a–r along their timelines and the Lamport clock value of each event]
Exercise 1
– In the distributed execution of the totally ordered multicast
represented in Figure 3, where a circle represents message
reception and a square represents reception of the last ack,
write the corresponding timestamp ABOVE the event on the
diagram. [8 pts]
[Figure 3: three processes (initial clocks 3, 2, 4) multicasting messages A, B, C;
circles mark message receptions, squares mark the reception of the last ack]
Multicast
– Goal: broadcast messages so that they are delivered in the same order at all
nodes
– Each process maintains a logical clock and a priority queue of messages
(waiting to be delivered to the top layer)
1. A process pi increments its logical clock and fifo-multicasts a message with the
current logical clock as the message’s timestamp
2. Any incoming message is queued in a priority queue according to its temporary
timestamp, and acknowledged using a fifo-multicast message to every other
process (including the sender)
3. Once all processes have acknowledged, the sender sets the definitive clock to the
current clock it has for this message
4. A message in the queue is delivered once there is a message, acknowledged by
all, from each process with a larger timestamp in the queue
Totally ordered multicast
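The queueing side of this can be illustrated with a small Python sketch. Note it is a simplification of the slide’s rule: it delivers a fully acknowledged message at the head of the priority queue, rather than also checking for a larger-timestamped message from every other process; the class and method names are our own.

```python
import heapq

class TOQueue:
    """Simplified delivery queue for totally ordered multicast.

    A queued message is delivered when it has been acknowledged by all n
    processes AND has the smallest (timestamp, sender) pair in the queue.
    """
    def __init__(self, n):
        self.n = n          # number of processes
        self.heap = []      # (timestamp, sender, payload), ties broken by sender id
        self.acks = {}      # (timestamp, sender) -> set of acknowledging pids

    def enqueue(self, ts, sender, payload):
        heapq.heappush(self.heap, (ts, sender, payload))
        self.acks.setdefault((ts, sender), set())

    def ack(self, ts, sender, from_pid):
        self.acks[(ts, sender)].add(from_pid)

    def deliverable(self):
        """Pop and return every head-of-queue message acknowledged by all."""
        out = []
        while self.heap:
            ts, sender, payload = self.heap[0]
            if len(self.acks[(ts, sender)]) == self.n:
                heapq.heappop(self.heap)
                out.append(payload)
            else:
                break   # head not fully acked: nothing behind it may be delivered
        return out

q = TOQueue(n=3)
q.enqueue(5, 0, 'A')
q.enqueue(4, 1, 'B')
for pid in (0, 1, 2):
    q.ack(4, 1, pid)
# q.deliverable() -> ['B']; A stays queued until fully acknowledged
```

Because delivery always happens at the head of the priority queue, every process that receives the same messages and acks delivers them in the same (timestamp, sender) order.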
Multicast
Totally ordered broadcast algorithm (cont’d)
– Initial clocks are 3, 2, 4 at processes 1, 2, 3 resp.; messages are A, B, C
– Circles represent message receptions
– Squares represent the last-ack reception for a particular message
– Upon the receipt of a message, each process pi adjusts its own local clock Ci ←
max{Ci, ts(m)} and increments its clock by 1
– Upon the receipt of an acknowledgement, pi adjusts its own local clock Ci ←
max{Ci, ts(m)} and does not increment the clock
– Use the notation clock.process to illustrate the clock value after the reception of an ack.
– Example: the value 7.3 means the message was received at process 3 at clock
7; then max{local, received} is taken to reach the new local clock.
Exercise 1
– In what order will the messages A, B, C be delivered in the
execution of Figure 2? State the rule that controls message
delivery. [3 pts]
Exercise 2
– Draw the final routing tables of each node obtained with RIP
on the communication graph depicted in Figure 5 where the
label on each edge is its identifier and each node represents a
process.
[Figure 5: communication graph with nodes A–G and edges labelled 1–8]
Distance vector algorithm
iterative, asynchronous: each local iteration caused by:
– local link cost change
– DV update message from neighbor
distributed:
– each node notifies neighbors only when its DV changes
– neighbors then notify their neighbors if necessary
each node:
wait for (change in local link cost or msg from neighbor)
recompute estimates
if DV to any dest has changed, notify neighbors
[Figure: distance-vector iterations for a 3-node network with link costs
c(x,y) = 2, c(x,z) = 7, c(y,z) = 1; the tables at nodes x, y, z start with only
the direct link costs known and are refined as vectors are exchanged]
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0, 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1, 7+0} = 3
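The computation of x’s distance vector above is a Bellman-Ford update; in Python, for the same 3-node example (an illustrative sketch):

```python
# Link costs at node x and the vectors advertised by its neighbours y and z
c = {('x', 'y'): 2, ('x', 'z'): 7}
Dy = {'x': 2, 'y': 0, 'z': 1}    # y's distance vector
Dz = {'x': 7, 'y': 1, 'z': 0}    # z's distance vector

# Bellman-Ford: Dx(dest) = min over neighbours v of c(x,v) + Dv(dest)
Dx = {dest: min(c[('x', 'y')] + Dy[dest], c[('x', 'z')] + Dz[dest])
      for dest in ('y', 'z')}
Dx['x'] = 0
# Dx == {'y': 2, 'z': 3, 'x': 0}, matching the slide's computation
```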
[Figure (cont’d): after further exchanges of distance vectors, all three tables
converge to Dx = (0, 2, 3), Dy = (2, 0, 1), Dz = (3, 1, 0); the remainder of
the slide repeats the previous figure]
Exercise 2
[Figure 5: communication graph with nodes A–G and edges labelled 1–8]
– RIP routing tables for all nodes
Dijkstra’s algorithm: example
[Figure: graph with edge costs c(u,v)=7, c(u,w)=3, c(u,x)=5, c(v,w)=3, c(v,y)=4,
c(w,x)=4, c(w,y)=8, c(x,y)=7, c(x,z)=9, c(y,z)=2]
Step | N' | D(v),p(v) | D(w),p(w) | D(x),p(x) | D(y),p(y) | D(z),p(z)
0 | u | 7,u | 3,u | 5,u | ∞ | ∞
1 | uw | 6,w | | 5,u | 11,w | ∞
2 | uwx | 6,w | | | 11,w | 14,x
3 | uwxv | | | | 10,v | 14,x
4 | uwxvy | | | | | 12,y
5 | uwxvyz | | | | |
Notes:
• Construct the shortest-path tree by tracing predecessor nodes
• Ties can exist (and can be broken arbitrarily)
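The table can be reproduced with a standard heap-based Dijkstra in Python. The edge costs below are reconstructed from the table’s entries, so treat them as an assumption about the figure:

```python
import heapq

# Edge costs (assumed, reconstructed from the D(.),p(.) entries in the table)
graph = {
    'u': {'v': 7, 'w': 3, 'x': 5},
    'v': {'u': 7, 'w': 3, 'y': 4},
    'w': {'u': 3, 'v': 3, 'x': 4, 'y': 8},
    'x': {'u': 5, 'w': 4, 'y': 7, 'z': 9},
    'y': {'v': 4, 'w': 8, 'x': 7, 'z': 2},
    'z': {'x': 9, 'y': 2},
}

def dijkstra(g, src):
    """Return (distance, predecessor) maps for shortest paths from src."""
    dist, prev = {src: 0}, {}
    pq = [(0, src)]
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float('inf')):
            continue                      # stale heap entry
        for nb, w in g[node].items():
            if d + w < dist.get(nb, float('inf')):
                dist[nb], prev[nb] = d + w, node
                heapq.heappush(pq, (d + w, nb))
    return dist, prev

dist, prev = dijkstra(graph, 'u')
# dist matches the table's final row: v:6, w:3, x:5, y:10, z:12
```

Tracing `prev` back from z gives the shortest-path tree of the notes: z ← y ← v ← w ← u.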
Comparison of LS and DV algorithms
Message complexity
– LS: with n nodes, E links, O(nE)
msgs sent
– DV: exchange between neighbors
only
– convergence time varies
Speed of convergence
– LS: O(n2) algorithm requires O(nE)
msgs
– may have oscillations
– DV: convergence time varies
– may be routing loops
– count-to-infinity problem
Robustness: what happens if
router malfunctions?
LS:
– node can advertise incorrect
link cost
– each node computes only its
own table
DV:
– DV node can advertise incorrect
path cost
– each node’s table is used by
others
• errors propagate through
the network
Exercise 2
– Draw the final routing tables of each node obtained with RIP
on the communication graph depicted in Figure 5 where the
label on each edge is its identifier and each node represents a
process.
[Figure 5 (annotated): communication graph with nodes A–G, edges labelled 1–8,
and the edge costs used to compute the routing tables]
Exercise 3
– For each of the following distributed executions, in which all
variables initially have the value 0, indicate whether it is
linearizable.
time →
P1: R(x,2) W(x,3) W(y,2)
P2: R(y,2) W(x,2)
Exercise 4
– Let process p2 try to synchronise with process p1 by running Cristian’s
algorithm. Assume that both clocks increase at the same rate and that
initially (at time T1, when the query message is sent) the clock of p1 is Cp1(T1) =
10h38mn35s800ms while the clock of p2 is Cp2(T1) = 10h38mn39s750ms.
Consider the example depicted in Figure 10, where the query message
takes 0.3s to be delivered while the response takes 0.4s, and where 0.2s
elapses between p1 receiving the query and sending the response. What is
the value of δp1, the offset of p1 relative to p2?
[Figure 10: query from p2 to p1 (0.3 s), processing at p1 (0.2 s), response back
to p2 (0.4 s)]
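A worked computation for the exercise, using Cristian’s usual estimate (server timestamp plus half the measured round-trip time). The variable names are our own; times are expressed in seconds past 10h38mn.

```python
# p2's view of the exchange
T1_p2 = 39.750              # p2's clock when the query is sent
rtt = 0.3 + 0.2 + 0.4       # round trip as measured by p2: 0.9 s
T4_p2 = T1_p2 + rtt         # p2's clock when the response arrives: 40.650

# p1's timestamp in the response (query delivery 0.3 s + processing 0.2 s)
ts_p1 = 35.800 + 0.3 + 0.2  # 36.300

# Cristian's estimate of p1's current time, then the offset of p1 relative to p2
est_p1_now = ts_p1 + rtt / 2   # 36.750
delta_p1 = est_p1_now - T4_p2  # -3.900 s
```

Because the delays are asymmetric (0.3 s out vs 0.4 s + 0.2 s back), the estimate of −3.900 s differs slightly from the true offset Cp1(T1) − Cp2(T1) = −3.950 s.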
Key Contents of the 2nd part of semester
COMP3221 Course Review
2nd Part of Semester
– Blockchain
– Distributed Machine Learning
– Security
Student Feedback – USS Survey
– Your feedback is very important
– So far only a 14% response rate (~10 students)
– Let’s take 5-10 minutes to fill out the Unit of Study Survey
– https://student-surveys.sydney.edu.au/students/
– Have a chance of winning a range of Apple products, including a
64 GB Apple iPad Air, an Apple Watch, and JB Hi-Fi gift cards.
Blockchain
1) Roles of miners
2) Proof of Work
3) Can the blockchain be attacked? How?
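Proof of work can be illustrated with a toy Python miner. This uses a leading-zero-hex-digit target for simplicity, not Bitcoin’s exact difficulty encoding; the function name is our own:

```python
import hashlib

def mine(block_data: bytes, difficulty: int) -> int:
    """Toy proof of work: find a nonce so that SHA-256(block || nonce)
    starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        h = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if h.startswith('0' * difficulty):
            return nonce
        nonce += 1

nonce = mine(b'block-42', 3)   # ~16^3 = 4096 hashes on average
digest = hashlib.sha256(b'block-42' + str(nonce).encode()).hexdigest()
# digest starts with '000'
```

Finding the nonce is expensive, but anyone can verify it with a single hash — the asymmetry miners exploit and that makes rewriting history (a 51% attack) costly.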
Distributed Machine Learning
1) Linear Regression
2) Distributed Optimization (GD/mini-batch SGD)
3) Logistic Regression
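As a small refresher tying topics 1) and 2) together, here is mini-batch SGD fitting a linear regression in plain Python (the toy data and hyperparameters are our own choices, not from the lectures):

```python
import random

random.seed(0)
# Noiseless toy data on the line y = 3x + 1, with x in [0, 1)
data = [(i / 100, 3.0 * (i / 100) + 1.0) for i in range(100)]

w, b, lr = 0.0, 0.0, 0.05
for step in range(2000):
    batch = random.sample(data, 10)   # mini-batch of 10 examples
    # Gradients of the mean squared error over the batch
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
    gb = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
    w, b = w - lr * gw, b - lr * gb
# w ≈ 3.0, b ≈ 1.0 after training
```

In the distributed setting covered in lectures, the gradient over the batch would be computed across workers and aggregated before the parameter update.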
Security
1) Encryption – RSA Algorithm
2) Authentication
3) Integrity
4) Privacy
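As an RSA refresher, the classic textbook toy example with tiny primes (for intuition only; real deployments use moduli of 2048 bits or more):

```python
# Key generation with textbook-small primes
p, q = 61, 53
n = p * q                   # modulus: 3233
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent via modular inverse: 2753

# Encrypt then decrypt a message m < n
m = 65
c = pow(m, e, n)            # ciphertext: 2790
assert pow(c, d, n) == m    # decryption recovers the plaintext
```

The three-argument `pow` does modular exponentiation, and `pow(e, -1, phi)` (Python 3.8+) computes the modular inverse, so no extra libraries are needed.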
Why this course?
– Distributed computing systems are everywhere!
– Practically, you cannot avoid them.
– Knowledge and experience in distributed systems will be
useful:
– For your final-year thesis project
– To improve your productivity
– To pursue your passion as a hobby
– Just for fun!
– To improve your chances of getting a better job
– COMP3221 is about:
– What is a distributed system?
– How does it work?
– How to run your own?
Thank You!
– Remember – Final Exam schedule
• Check your location:
– Good Luck!