程序代写代做代考 C game graph chain go algorithm Distributed Computing, COMP 4001 1

Distributed Computing, COMP 4001 1
“0
Fault Tolerant Broadcasting
Definition
is
faulty
.
of important
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 2
Outline
•0Broadcasting
• Fault-free Broadcasting
• Fault-Tolerant Broadcasting • Multiple Intitiators
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 3
Broadcasting (1/2) C
• Broadcasting refers to sending a message simultaneously to
many users.
• It is usually initiated by a user in a network.
– number of messages, and
• We are interested in ecient broadcasting, as measured by
– time required
to complete successfully the broadcast.
1k¥
usep#
• Broadcasting uses available communication channels.
– We must specify the communication channels to be used
and in what order.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

llhrelesslhktwork

Ethernet
Flat routing
°

Distributed Computing, COMP 4001 4
Broadcasting (2/2)
• Broadcasting is a preferred routing me#thod because it is flat.
• Broadcasting is used in If I
– Eternet, – sleep – Wireless,
and other networks.
• Eciency of broadcasting depends on (and is also constrained
by) underlying graph G = (V, E).
• Broadcasting in general graphs is multihop.
– Typically, message transmission for broadcasting is based on building a BFS tree from the “broadcast initiating” node.
m%ooo
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 5
General Setting
• Consider a (strongly) connected network of N processes 0…N 1.
– This may be a multi-hop graph.
• Each process i has a “stable (unchanging) value” s(i)
associated with it.
• The goal is to devise an algorithm by which every process i can
broadcast its value s(i) to every other process in this system. – This may require multiple hops. 0
• At the end, each process i will have the set of all possible valuesVi ={s(k):0kN1}.
• Generally, the problem is solved with a so-called “heart beat” algorithm.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001
6
Heartbeat Algorithms
i has
value
sci )
• Initially Vi = {s(i)}.
• To complete the broadcast: in roundsa every process i will
periodically
{1. send its current Vi along each of its outgoing channels,
these types of algorithms are called heartbeat algorithms.
• Two important issues need attention:
– The termination of the algorithm
– The message complexity
aEach round involves Send; Receive; Process;
2. receive whatever values have been received by it along the incoming channels
3. update Vi.i÷¥÷÷v • The operation resembles the pumping of blood in the heart, so
inn
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 7
Heartbeat Algorithm (1/2)
• No need to send Vi, if it has not changed since the last send operation: suces to send the incremental change only
• Each process i is associated with two sets of values:
– Vi denotes the current set of values collected so far,
– Wi will represent the last value of Vi sent along the outgoing channels so far.
• Let (i, j) represent the channel i ! j from i to j.
• The algorithm terminates when no process receives any new value, and every channel is empty.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 o 8 Heartbeat Algorithm (2/2)
u
• The program for process i is given below. u
# steps
diameter of
• Correctness is proved in two steps.
– 1st step: show that when empty(i,k) holds, Wi ✓ Vk.
– 2nd step: show that at the end every process must have received the value s(i) from every other process i.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 9
Fault-Tolerant
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 10
Main Question
• Is broadcasting possible in a network when some nodes (may) fail to transmit messages?
€arjIs
uo←
• Two main issues:
– In which networks?
– What does it mean nodes may fail? 0101110 OOllOl
byzantine :
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 11 Broadcasting
• In the sequel we look at broadcasting and its eciency when the underlying network is Kn (the complete graph on n nodes).
– In this setting, broadcasting is an instance of flat routing.
• Further, the communication model does not allow for “multicasting”, whereby a given node can communicate with specific nodes at a time.
can only send one message at
u no;
a lame. ✓4
Ng
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 12
Deterministic Strategies
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 13
Adamo
Calling
• Alice wants to organize a party for 121 students (including herself ).
• She does not know their email addresses, but0
– she has a list of all 120 students with their phone numbers
which was recently given to by every student.
multi casting model
non-
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 14
-Calling and Broadcasting
• Broadcasting depends on the medium (i.e., channels) used.
– If Alice can shout simultaneously to all of them then it takes only one step (Ethernet uses this idea!).
• However this may cause collisions and in any case it is not an option in our current study.
processor can talk to another processor at a time.
– We will approach tOhe problem imposing “message scheduling and processor coordination”.
•EEE
C
• Here the communication medium allows only phone calls. – This is the so-called point-to-point model: only one
– We also call this the phone call model.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 15
Broadcasting
• In the phone call model, broadcasting itself would require the availability of 120 separate channels to make the phone calls:
– which may not be available.
• Alice could try to do 120 phone calls herself, – which would consume a lot of time.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 16
New Idea: Assisted Broadcasting
• Any other nodesd can assist in the broadcast.
• Not only Alice can call but certain users can call other users.
– Therefore the design would require some form of coordination of who can send to whom and by when.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 17 Alternative Approach (1/2)
• The first strategy that comes to mind is a silent post game resembling whispering:
– Alice just calls the first person on the list and asks him/her to call the next one on the list,
– who will then call the next one and asks him/her to call the
next one on the list,
– and so on,
– until everybody on the list has been reached.
mesg
msg student
Alice : student .
120 0
• The advantage of this strategy is that every student only has to make one call. cost is 1 message per
.
A ans Arg
• To make this work, an order of users must have been agreed on.
St. Evangelos Kranakis, Carleton University, SCS (November 7, 2020)
i
msg
.

Distributed Computing, COMP 4001 18
Alternative Approach (2/2)
• Since the calls have to be performed one after the other, a very long time can go by until all students have been reached.
– Algorithm imposes an underlying “Line Graph’ topology’. – For n students this takes time n 1
• Some issues:
– If just 10% of the students (i.e., 12 students) do not reach the next one on the list within the same day they were called, it takes at least 12 days until everybody has been informed.
– Even worse: if someone does not bother to call the next one on the list, the whole system will break down!
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 19
New Idea: Binary Search
• A master uses two helpers to cut a sorting problem into two smaller sorting problems,
– who themselves use two helpers each to cut their sorting problems into even smaller problems,
– and so on, . . .
– . . . until just one element is left.
• The idea resembles binary search.
• Couldn’t something similar to this also work for the distribution of calls?
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 20
Partitioning Strategy
• Alice could divide the phone list into two halves and call the first person on each of the two halves.
• Each of them will then be asked to cut their list into two further halves and call the first person on these halves.
• This is continued until everybody has been called, i.e., we reach a level in which people are called who just have to take care of an empty list.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 21 Partitioning
• In this way, the students can be reached much quicker. – For n students this takes time O(log n)
• Alice determines that just seven rounds of calls are sucient to reach all 120 fellow students.
• This is much better than 120 rounds of calls!
• However, the strategy sounds very confusing/technical,
– it’s questionable whether the other students can be made to adhere to the rules without errors.

A
• Thus, Alice thinks about an alternative strategy.
[
losing
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 22
Natural Calling Rounds
• Alice calls the first two people on the list, 1 and 2, and asks 1 to call the students at positions 3 and 4 while 2 is asked to call the students at positions 5 and 6, and so on.
– User i calls users 2i+1 and 2i+2.
• General rule: everybody at position i in the list will call the students at positions 2i + 1 and 2i + 2 (if they exist).
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001
23
Example
• Assume n = 31 (Alice included):
• Rule: i!2i+1,2i+2
•Start:Alice!1,2
– 1 ! 3, 4 and 2 ! 5, 6
– 3!7,8and4!9,10and5!11,12and6!13,14
3-7 zp8
• Information spreads as fast as the previous strategy, – calling rule seems much more natural and
Alice “793092 4
– –
O
→ 9
– 7!15,16and8!17,18and9!19,20and10!21,22 and 11 ! 23, 24 and 12 ! 25, 26 and 13 ! 27, 28 and
14 ! 29, 30
– easier to understand.
Use
overtops
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 24
Not Robust Enough
• Alice is not quite happy with this calling strategy.
• What if one of her fellow students does not count right and
calls a wrong pair of students on the list?
• Moreover, there can still be a couple of students who just forget or do not bother calling their pair on the list.
• In this case, some students would not be informed.
• Therefore, Alice thinks about a more robust strategy:
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 25
New Idea: Allow Overlap
• allow overlappnig calls
– to ensure fault tolerance
• allow calls from multiple initiators! – to ensure fault tolerance
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 26
More Robust (1/2)
• One possibility would be that for each i, the person in “list” position i would call the four students at positions
2i+1,2i+2,2i+3,2i+4.
• Thus, for each i,
i ! 2i + 1, 2i + 2, 2i + 3, 2i + 4.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001
27
M→ore Robust (2/2) • In Summary
244 receivedfrom
i ! 2i + 1, 2i + 2, 2i + 3, 2i + 4 i + 1 ! 2i + 3, 2i + 4, 2i + 5, 2i + 6 i + 2 ! 2i + 5, 2i + 6, 2i + 7, 2i + 8
. ! .
• Notice the overlap between consecutive transmissions!
,
int
ith – k, who
24+1+1,21411+2 ,
2kt) -13,24+4+4
has sent
Given
the
message
to k
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 28
Calling Patterns
• Who is going to call user k?
• Assumek=2liseven. C
– k is called by users l C
k£1
1 and l2.
l 2 ! 2(l 2) + 1, 2(l 2) + 2, 2(l 2) + 3, 2(l 2) + 4

41-4
• Assumek=2l+1isodd.
¥ l – 2 at 2 #
→ l 1 ! 2(l 1) + 1, 2(l 1) + 2, 2(l 1) + 3, 2(l 1) + 4
– k is called by users l1 and l.
l 1 ! 2(l 1) + 1, 2(l 1) + 2, 2(l 1) + 3, 2(l 1) + 4
Is
e- O¥
l ! 2l + 1, 2l + 2, 2l + 3, 2l + 4
• All students (except for the first four on the list who will
a
directly be called by Alice) will be called by exactly two students in the ideal case.
*
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

A
node k w ill always receive the message
that itstwosende.rs
fail
provided
of did
send
” limited ”
not
to
.
fault Tolerance that is quantified with the# offaults
one

1. e . system .
a
when
we
study
does then
b
numbfer not exceed
If
the
t h e algorithm
is role at . a
distributed
x

Distributed Computing, COMP 4001 29
f o f-
• Assume n = 17 (Including Alice)a art: Alice ! 1, 2, 3, 4

.
Example (1/2)
With Partition
o Alg
O
• St& 3fa-

l
• Rule: i!2i+1,2i+2,2i+3,2i+4 – 1 ! 3,4,5,6, 2 ! 5,6,7,8,
– 3 ! 7,8,9,10,
– 4 ! 9,10,11,12, – 5 ! 11,12,13,14, – 6 ! 13,14,15,16, – 7!15,16
aNB: user 7 sends only to two users instead of four. To overcome this problem the algorithm can wrap-around to the beginning nodes 1, 2, We will not discuss this issue in detail.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 30
Example (2/2)
• Information spreads as fast as the previous strategy, – calling rule sounds now much more precise, and – the system is now fault tolerant.
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 31
More Robust
• Thus, as long as for each such pair at most one of the students is unreliable (by not being reachable or forgetting to make the call), all of the reliable students will still be informed.
• Intuitively, this can be argued as follows:
– If one can select a caller for each student who works reliably, then everybody who is reliable has a reliable call chain from himself or herself back to Alice.
• This strategy can be made even more robust, so that she can be really sure to reach everybody who is reachable:
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 32
Even More Robust (1/2)
• For some fixed r:a
If every student at position i calls the students at positions
i
then every student (except for the first 2r ones who are directly called by Alice) will be called by exactly r many students in the ideal case.
aThe parameter r is related to the desired fault tolerance.
2i+1,2i+2,…,2i+2r
i s : D: *
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Alice party
fails the whole
w ill not take place!
If
Design algorithms w ith
multiple initiators .
8 OO OO OO
O
OO
O

Distributed Computing, COMP 4001 33
nIodes :IF2
it
i + 1 ! 2i + 3, 2i + 4, . . . , 2i + 2r + 2 i + 2 ! 2i + 5, 2i + 6, . . . , 2i + 2r + 4
Even More Robust (2/2)
i
• In Summary
i!2i+1,2i+2,…,2i+2rH2 :
:p
‘ ,
i
. ! .
i + r 1 ! 2(i + r) 1, 2(i + r), . . . , 2(i + r) + 2r 2
. ! .
• Notice the overlap of consecutive transmissions!
Evangelos Kranakis, Carleton University, SCS (November 7, 2020)

Distributed Computing, COMP 4001 34
Calling Patterns (1/2)
• How many users will call user k?
• Use the Euclidean algorithm to divide k by 2r and let j < 2r be the remainder and q 1 the quotient so that k = 2qr + j • Observe that k = 2qr + j = 2(qr + s) + j 2s, for all s (positive or negative). • Recall the calling rule for i = qr + s: i ! 2i + 1, 2i + 2, . . . , 2i + 2r • Henceuserkwillbecalledbyusersisuchthati=qr+s, provided that 0  j 2s  2r. Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 35 Calling Patterns (2/2) • Therefore, if r was used in the design of broadcast each user is called by r alternate users. • Hence, as long as at most r 1 of these are faulty (e.g., they are not calling) all reliable students will still be reached. • Thus, the algorithm ensures fault tolerance for up to r 1 faults! not If exceed does rI then # of faults strategy - is fault the tolerant . Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 36 Multiple Initiators Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 37 Question: If Alice Fails? • A weakness of the algorithm is that it depends on a single initiator, namely Alice. • Consider the following scenario: – k initiator nodes wake-up at the same time. – A given number among them, say f < k, may fail. • Can we design a fault tolerant algorithm? r - initiators O A OOOO Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 38 Exercisesa 1. Prove the correctness of the Heart-Beat algorithm by proving. (a) show that when empty(i,k) holds, Wi ✓ Vk. (b) show that at the end every process must have received the value s(i) from every other process i. 2. List some of the issues that may arise in the complete network Kn by the simultaneously transmitting of messages by many nodes. 3. Verify the calling patterns discussed in the lecture, when every student at position i calls the students at positions 2i+1,2i+2,...,2i+2r 4. Show that tor n students the partitioning algorithm takes time O(log n). aNot to submit Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 39 5. (?) Design a broadcast algorithm which is fault tolerant under k initiators. 6. (?) Extend the previous broadcast algorithm to be fault tolerant under < r participants. 7. (??) An interesting analysis for broadcasting is the average case. For a given number x (e.g., 10) of unreliable students, who are assumed to be randomly distributed over the list, we want to determine the minimum value of r for which the probability that all reliable students are reached is still above, say, 90%. In other words, Given n participants and a parameter 0 < p < 1, what is the minimum value of r such that Pr[all reliable students are reached] p Give a broadcasting algorithm and analyze its complexity. As a Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 40 hint this can be based on an array A that is defined as follows: (a) N is the total number of students. (b) A: array [1 . . . N ] of integers; A[i] counts, for a reliable student at position i, the number of calls that student would get from other reliable students. (c) For every reliable student, A[i] is initially set to 0. (d) For all unreliable students at position i, A[i] will initially be set to r (so that even after r calls there will not be a positive value in A[i]). In order to determine this r, one can use the algorithm presented below (Below Ste = Alice). Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 41 Simulate this algorithm and test its performance. 8. Design fault tolerant broadcasting algorithms assuming multiple initiators. Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 42 9. Design a probe-echo algorithm to compute the topology of a network whose topology is a strongly connected graph. When the algorithm terminates, the initiator of the algorithm should have knowledge about all the nodes and the links in the network. 10. Design an algorithm to count the total number of processes in a unidirectional ring of unknown size. Note that any process in the ring can initiate this computation, and more than one processes can concurrently run the algorithm. Feel free to use process ids. Evangelos Kranakis, Carleton University, SCS (November 7, 2020) Distributed Computing, COMP 4001 43 References • Christian Scheideler. Broadcasting–How Can I Quickly Disseminate Information? In Algorithms Unplugged. Berthold V ̈ocking H, Alt, M. Dietzfelbinger, R. Reischuk, C. Scheideler, H. Vollmer, D. Wagner (Editors), 2011. • R. Karp, S. Shenker, C. Schindelhauer, and B. V ̈ocking: Randomized Rumor Spreading. In: IEEE Symposium on Foundations of Computer Science (FOCS), pp. 565-574, 2000. Evangelos Kranakis, Carleton University, SCS (November 7, 2020)