Distributed Computing, COMP 4001 1
Introduction
, , SCS (September 3, 2021)
This is the 1st lecture
Distributed Computing, COMP 4001 2
Outline
• Challenges and Goals
– Internet
– Multiprocessor Computers
– Synchronization: Why is it Di�cult?
– Algorithms and Programs
– Concurrent/Distributed Algorithms
• What is solvable?
– Acting as a Unit
– Coordination
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 3
Major Challenges
• One of the most challenging problems in computer science is to
improve the design of systems where communicating devices
interact with one another, and invent new applications that
will take advantage of these new capacities.
• Hardware’s shift to mobile and multicore platforms requires a
fundamental change in how applications are written for
computer networks and multicore computers.
• Concurrencya and synchronization are fundamental issues that
are critical for the design of applications:
– without them Internet, Database systems, Operating
systems, and Concurrent programming would be impossible.
aConcurrency exists also in nature and living organisms at the molecular
level as well as at those of cells, organs, individuals, communities, and ecological
systems.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 4
Goals within Distributed Computing
• Understand (some of) the basic principles and results of
computer science which involve interactions between computing
devices (or processors on the same or di↵erent devices).
• These fundamental principles and results, underlying the
design of algorithms for distributed systems,
– are presented by comparing issues that arise when dealing
with groups of computing devices
– to those that arise when a group of people has to share their
resources and work as a team to solve various problems.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 5
Interacting Networked Entities
• Communicating entities forming a unit.
• No particular structure imposed a priori.
, , SCS (September 3, 2021)
%¥¥ne liar
Distributed Computing, COMP 4001 6
Internet
• Is a vast collection of computer networks which are connected
into a single huge network.
– Designed as a network without the expectation that it
would have a significant role in our daily lives; its scale and
heterogeneity have far surpassed all expectations.
• Many applications of large-scale computer networks require a
high level of reliability and security.
– Today’s economy involves manufacturing, distributing, and
retailing of goods; it also has to do with creating and
disseminating information (book publishing, filmmaking).
• Future economy is likely to be dominated by information.
– The digital revolution is about converting analog to digital
information and use computer networks to move the digital
information around.
, , SCS (September 3, 2021)
Go
Distributed Computing, COMP 4001 7
Multiprocessor Computers
• A processor is the brain of the computer.
– It is a component that interprets and executes computer
program instructions and processes data.
• Until a few years ago mainstream computers were built with a
single processor.
– Now, all computer manufacturers are o↵ering a new
generation of multiprocessor computers where a single
computer includes several processors all executing
concurrently, and interact and collaborate with one another.
• This change in computing architecture requires a fundamental
change in how such computers are programmed.
– Much of the future of computers with multiple processors
will be told by how well programmers can take advantage of
the new concurrent computing architecture.
, , SCS (September 3, 2021)
E.¥ .
sinlxty ) sink’_%ed )
–
–
Distributed Computing, COMP 4001 8
Synchronization
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 9
Synchronization (1/2)
• Synchronization techniques are perceived as essential to design
and support the working activities of groups of computers and
processors.
• Synchronization is used to ensure that only one participant
(and not more) will take a specific action at a given time.
• Computation on computer networks and computation on a
single multiprocessor computer have many aspects in common.
– In both cases we need to understand how separate
computers on the Internet or, similarly, separate processors
within a single computer, interact and synchronize with one
another.
• There are also significant di↵erences between the two!
, , SCS (September 3, 2021)
:¥%É
Distributed Computing, COMP 4001 10
Synchronization (2/2)
• Another type of synchronization has to do with cooperation!
• Many of our daily interactions with other people involve
synchronization.
• You and your partner may have to synchronize on who will buy
the groceries, empty the garbage can, take the kids to school,
which one of you will be the first to take a shower (assuming
you only have only one shower at home), will take the car, or
use the single computer you have.
• Assume that you have a cat and your neighbor has a dog and
you and your neighbor are sharing a yard, then you and your
neighbor might want to coordinate to make sure that not both
pets are in the yard at the same time.
, , SCS (September 3, 2021)
Mutual Exclusion
Distributed Computing, COMP 4001 11
Why is Synchronization Di�cult?
• Synchronization is needed in all systems and environments
where several processors can be active at the same time.
• Without proper synchronization, the integrity of the data may
be destroyed
– if two computers update a common file at the same time,
and as a result, deposits and withdrawals could be lost,
confirmed reservations might have disappeared, etc.
• While achieving synchronization between humans is sometimes
relatively easy, achieving synchronization between computers is
challenging and di�cult.
– The reason is that most computers communicate with each
other in a very restricted way.
, , SCS (September 3, 2021)
¥
”
÷ ¥ ¥” :*
→÷÷÷÷:!
..e) Duplication
of Results f- ( in ) I 2) €3 ) ly > 15 )
2) Receives
old
value
3) Does not receive a value
Distributed Computing, COMP 4001 12
Algorithms and Programs
• An algorithm is just the recipe upon which a problem is solved.
• Originally used in the context of solving mathematical
problems.
– Euclida invented sometime between 400 and 300 B.C., an
algorithm for finding the greatest common divisor of two
possible integers.
– The word algorithm is derived from the name of
Mohammed al-Khowarizmb
• On a computer, an algorithm is expressed as a computer
program which specifies, in the exact syntax of some
programming language, the computation one expects a
computer to perform.
aFamous Greek mathematician who lived in Alexandria.
bPersian mathematician who lived in Baghdad during the 9th century AD.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 13
Concurrent/Distributed Algorithms (1/2)
• A concurrent or distributed algorithm is the recipe upon which
a problem is solved by multiple computing elements.
1. The term distributed algorithms refers to algorithms where
the computing elements are physically far away from each
other and communicate by sending and receiving messages
via a backbone network (as done on the Internet).
2. The term concurrent algorithms refers to algorithms where
the computing elements are physically very close to each
other and communicate by reading from and writing to
shared memory locations (as done inside a multiprocessor
computer).
• Distributed computing is concerned with both types of
algorithms.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 14
Concurrent and Distributed Algorithms (2/2)
• When a processor executes a computer program (such as a web
browser), the execution itself is called a process.
– A process runs on a processor, which is the physical
hardware.
• The physical location of the di↵erent processes or processors
involved in a single concurrent activity can be anywhere from
the same computer to di↵erent computers anywhere.
• There are two main technological underpinnings of the rapid
developments of computer networks and multiprocessor
computers.
1. Advances in the design of faster hardware.
2. The development of e�cient concurrent and distributed
algorithms for supporting complex interactions between
processors and computers.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 15
Acting as a Unit
• The Cloud is not a single thing!
• Gmail, Facebook, Dropbox are made possible by the teamwork
of a large number of physically dispersed components
– a distributed system.
• But we can think of it as a single entity!
• Distributed systems apply whenever we see many small things
working independently but cooperatively to produce the
illusion of a single unified experience.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 16
Illusion of Unity
• Examples
– A large corporation that is able to release new products and
make public announcements as though it were a single
actor, when we know that at a more detailed level it
consists of tens of thousands of employees.
– A massive ant colony engaged in coordinated exploration.
– The neurons of your brain creating your experience of the
present moment.
• The challenge for a distributed system is to achieve the illusion
of a single unified behavior in the face of so much underlying
complexity.
, , SCS (September 3, 2021)
• Some Bees search for new food sources
o they return and communicate what
they found to the rest of the
colony
o
”
wiggle Danse
”
•.
– -→
Distributed Computing, COMP 4001 17
Possibility/Impossibility Results
• Distributed computing also explores the inherent capabilities
and limitations of distributed systems:
– what problems can and cannot be solved in particular
systems?
• Identifying features of a specific distributed system architecture
that make it adequate/inadequate for solving certain problems
is crucial for the design of better systems which can overcome
such limitations.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 18
Impossibility Results
• Help us understand the crucial limitations of real systems, why
certain systems are (computationally) weak while others are
powerful, when should we stop looking for a solution for a
given problem, and how to adjust a problem statement or a
system model to overcome an impossibility result.
• They usually depend on assumptions about:
– how the computing elements communicate with one
another?
– what kinds of failures may occur?
– can randomization be used?
• Impossibility results are usually hard to prove!
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 19
E Pluribus Unum
(One from Many)
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 20
Consistency from Ability to Communicate
• Di↵erent parts of the system can develop views of the world
that are mutually inconsistent.
– If a handheld device doesn’t sync with e-mail, you act
without realizing that there’s already been a reply to a
message.
– If there is no concurrency control, two people across the
country may both reserve seat 5F (same seat) on the same
flight at the same time.
– If an executive in an organization “didn’t get the memo”
she may stray o↵-message.
– If no time control, a platoon may attack too soon and alert
the enemy.
• Inconsistency means each component of a distributed system
has di↵erent views.
, , SCS (September 3, 2021)
%?§?¥
°
µtonswIlktindependent.
Distributed Computing, COMP 4001 21
Attaining Consistency
• Can “attain” consistency by enforcing a single global view of
the world, and requiring all parts of the system to constantly
refer to this global view before acting.
• But this undercuts many of the reasons why one uses a
distributed system in the first place.
• The component providing the global view becomes a massive
bottleneck, and a highly dangerous single point of potential
failure:
– could not function if the President had
to sign o↵ on every decision.
, , SCS (September 3, 2021)
}
Leader
Election
Rendez-vous
Distributed Computing, COMP 4001 22
Co-operate for Consistency (1/3)
• A wealthy pirate has a hidden treasure and five children.
– For convenience, number the children 1, 2, 3, 4, 5
• The children know they will inherit the treasure after the death
of the pirate.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 23
Co-operate for Consistency (2/3)
• The pirate tells the children the treasure is hidden at the
southernmost tip of an unknown circle on the surface of the
earth.
Treasure
• The secret circle is only known to the pirate.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 24
Co-operate for Consistency (3/3)
• The pirate tells each child a di↵erent point on this circle.
Treasure
5
4
3
2
1
• How many children need to co-operate to find the treasure?
, , SCS (September 3, 2021)
⑨ have a
☐
÷①
Three points uniquely determine
a circle
Can you
still solve the
problem if one of the
children is a liar ?
Distributed Computing, COMP 4001 25
The Challenge of “Unum”
• The principles of distributed systems give us a way to reason
about the di�culties inherent in complex systems built from
many interacting parts.
• To the extent that we sometimes are fortunate enough to get
the impression of a unified
– Web
– global banking system, or
– sensory experience
we should think about the myriad challenges involved in
keeping these experiences whole.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 26
Two Lovers Problem
(Coordination)
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 27
Coordination (1/3)
• The problem illustrates the di�culties arising when
communication channels are unreliable.
• Two lovers have to coordinate a time for meeting at a romantic
restaurant for dinner.
• If they simultaneously arrive at the restaurant, they are
assured to end up marrying and live happily ever after.
• If only one arrives, their relationship will come to an end.
– So, it is important they arrive simultaneously
• As a result, neither lover wants to arrive without a guarantee
that the other will arrive at the same time.
– In particular, neither lover will arrive without
communicating first with the other.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 28
Coordination (2/3)
• The lovers can communicate only by sending messages.
– However, every time a message is sent it stands some chance
of getting lost, in other words there is a reliability issue.
• The problem is to find an algorithm that allows the lovers to
coordinate a time for a meeting even though some messages
may get lost.
, , SCS (September 3, 2021)
Unreliable Communication !
Distributed Computing, COMP 4001 29
Coordination (3/3)
• To prevent a situation where both lovers, fearing that their
relationship will come to an end, simply refrain from arriving, it
is required that if everything goes smoothly and no message is
lost, the lovers must be able to coordinate a time for a meeting.
• If enough messages are lost, however, the lovers may refrain
from arriving, but both must do so.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 30
Applications
• Consider two computers (the two lovers) that are trying to
perform a database transaction over an unreliable
communication line, and need to decide whether to commit or
abort the transaction.
– Example: transferring $ 100 between two bank accounts
which reside in di↵erent banks. Two clerks (the lovers), who
are responsible for the accounts, need to update that
balance in the two accounts simultaneously. They should do
it, even though their computers are connected by an
unreliable communication channel.
• The di�culty of solving computer network problems when the
communication channels are unreliable (i.e., the channels can
lose messages) emphasizes how important a reliable
communication channel is (e.g., TCP).
, , SCS (September 3, 2021)
$XIII
9
TransmissionG¥Il
Distributed Computing, COMP 4001 31
Impossibility Result
• Theorem 1 There is no solution for the two lovers problem,
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 32
Impossibility Result (1/3)
• Neither lover will decide to arrive unless s/he is sure that the
other will arrive with her/him.
• Let’s assume that everything goes smoothly and the messages
do not get lost.
• How long does it take the two lovers to coordinate a time for a
meeting?
• Let’s call the two lovers Alice and Bob.
• If Alice decides to arrive at a certain moment in the future, she
can send a message to alert Bob.
• Would this be enough?
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 33
Impossibility Result (2/3)
• As the message could be lost on its way, Alice cannot be sure
that Bob will arrive with her and hence will not try to arrive.
• Thus Bob, in his turn, has to send another message back to
inform Alice that the message has been delivered.
• Would this be enough for them?
• Well, now Bob needs to know whether his message has been
delivered, to make sure that he will not arrive alone.
• Hence, Bob will not try to arrive until his message is
acknowledged.
, , SCS (September 3, 2021)
acanuseaholherenttyc §÷÷÷§A¥B
→ p
x x
: :
Distributed Computing, COMP 4001 34
Impossibility Result (3/3)
• To solve this problem Alice, when she gets the message from
Bob, has to send another message back to inform Bob that the
message has been delivered, and so on.
• This scenario will never bring enough information to Alice and
Bob, and hence we reach the following surprising conclusion:
• Hence, there is no solution for the two lovers problem!
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 35
Mathematical Proof of Theorem 1
• Assume that a solution to the two lovers problem exists.
• Among all such possible solutions, let P be a solution that, in
scenarios where no message is lost, sends the minimum number
of messages.
• Now, suppose the last message sent in P gets lost.
• Then either this message is useless or one of the lovers does not
get a needed message.
• By the minimality of P , the last message is not useless so
exactly one of the lovers does not arrive if the last sent message
is lost.
• This means that P does not solve the problem as assumed.
• This contradiction proves that there is no solution to the two
lovers problem.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 36
The Fundamental Issue
• Common knowledge plays an important role in coordination
problems which are central in decision making.
• Another example of this is the “e-mail game” which provides a
rather useful illustration of the logic by which higher-order
beliefs and knowledge might influence decision outcomes.
• It seems that in order to analyze coordination in the face of
uncertainty, the model has to specify the participants’ beliefs,
their beliefs about the beliefs of the opponents, beliefs about
their opponents’ beliefs about their own beliefs, and so on.
• In practice it is di�cult to make precise a specification of the
participants’ higher-order beliefs.a
aA. Rubinstein (1989): “The Electronic Mail Game: the Strategic Behavior
under Almost Common Knowledge,” American Economic Review, 79, 385-391.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 37
Exercises
a
1. Can the solution be adapted so that it will work even if one of
the pirate’s children is a liar? Give details of how this done.
2. Formulate a version of the treasure problem discussed in the
lecture in which the secrets are planes in 3 dimensional space
and show that your scheme is correct.
3. Suppose the only possible meeting times are either 21:00 or
22:00. Is the two lovers problem solvable in this case?
4. Is the two lovers problem solvable when it is required that the
lovers must be able to coordinate a time for a meeting only
when no message is lost, and in all other cases they both
should not show up?
5. (Must know TCP) Is it possible to solve the two lovers problem
if there is a reliable TCP channel between the two lovers?
aDo not hand in!
, , SCS (September 3, 2021)
}
lec-dcooa.pe/-@cla-dcOOa.pdf@
Distributed Computing, COMP 4001 38
6. The Two Generals’ Problemb illustrate the challenges of
attempting to coordinate an action by communicating over an
unreliable channel. Two generals are only able to communicate
with one another by sending a messenger through enemy
territory. How can they reach an agreement on the time to
launch an attack, while knowing that any messenger they send
could be captured?
(a) What are the similarities and di↵erences with the two lovers
problem discussed in class?
(b) Is there a deterministic algorithm to solve the problem?
(c) Is there a non-deterministic algorithm to solve the problem?
(d) Can the problem be solved if the general could send
multiple messages of variable length?
bAlso known as Byzantine Generals Problem.
, , SCS (September 3, 2021)
Distributed Computing, COMP 4001 39
Sources
• E. W. Dijkstra. Co-operating sequential processes. In F.
Genuys, Ed., Programming Languages, pages 43?112.
Academic Press, , 1968.
• , E Pluribus Unum. “This Will Make You
Smarter: Concepts to Improve Your Thinking”
(J. Brockman, editor), 2012.
• J. F. Kurose and K. W. Ross. Computer Networking: A
Top-Down Approach, 7th ed., Pearson, 2016.
• , “How to share a secret”, Communications of the
ACM 22 (11): 612-613, 1979.
• , Distributed Computing Pearls, Synthesis
Lectures on Distributed Computing Theory May 2018, 123
pages,
, , SCS (September 3, 2021)