COMP3221: Distributed Systems
Introduction
Dr Nguyen Tran
School of Computer Science
The University of Sydney
Outline
– Why this course?
– What is this course about?
– Definitions, Examples and Challenges of Distributed Systems
– Course Logistics
– Lectures/Tutorials
– Assessments
– Expectations and Outcomes
– Resources
Why this course?
COMP3221: Distributed Systems
What is a Distributed System?
“A collection of independent computers that appears to its users as a single coherent system.”
Cloud computing
• “Anything-as-a-Service”
• Software-as-a-Service
• Platform-as-a-Service
• Infrastructure-as-a-Service
• Mobile Backend as-a-Service
Cloud computing
https://cloud.google.com/about/locations/#regions-tab
Cluster
CSIRO Bracewell
114 PowerEdge C4130 servers with Nvidia Tesla P100 GPUs, Nvlink, dual Intel Xeon processors, and 100Gbps EDR InfiniBand interconnect. 1,634,304 Cuda compute cores, 3192 Xeon compute cores, and 29TB of RAM, and runs both Linux and Windows.
https://www.csiro.au/en/Research/Technology/Scientific-computing/Bracewell
USYD Artemis
ACCESS TO ARTEMIS
https://sydney.edu.au/research_support/hpc/access/index.shtml
Social Networks
Chip Multiprocessors
Sensor Networks
Personal Area Networks
Blockchain
https://www.pwc.com/us/en/industries/financial-services/fintech/bitcoin-blockchain-cryptocurrency.html
Blockchain applications
Distributed Machine Learning
Why this course?
– Distributed computing systems are everywhere!
– In practice, you cannot avoid them.
– Knowledge and experience in distributed systems will be useful:
– For your final-year thesis project
– To improve your productivity
– To pursue your passion as a hobby
– Just for fun!
– To improve your chances of getting a better job
– COMP3221 is about:
– What is a distributed system?
– How does it work?
– How do you run your own?
COMP3221 – Course Description
– This unit provides a broad introduction to the principles of distributed systems and their design. It gives students the fundamental knowledge required to analyse and construct various types of distributed systems; explains the common architectural principles and approaches used in the design of networks at different scales (e.g. shared medium access and routing); introduces the programming skills required for developing distributed applications, including the use of Python class libraries and APIs; and covers common approaches and techniques in distributed resource management (e.g. task scheduling, machine learning).
What is a Distributed System?
“A collection of independent computers that appears to its users as a single coherent system.”
– Transparency helps the users observe a single coherent system
– The different forms of transparency in a distributed system
Challenges of Distributed Systems
Understanding these challenges is how you learn how a distributed system works and how to run your own.
– Network communication
– Scalability
– Consistency
– Fault-tolerance
– Machine Learning
– Security
Scalability
– Scalability of a distributed system: the ability of the system to preserve some properties as it grows in
• the number of requests or participants,
• the distance between resources and users, or
• the heterogeneity.
Scalability Example – DNS
Hostname to IP address translation
Why not centralize DNS?
– single point of failure
– traffic volume
– distant centralized database
– maintenance
A: It doesn’t scale!
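The alternative to a centralized database is to split the namespace into zones and delegate. The sketch below is a toy model of that idea, not the real DNS protocol: the `Zone` class, the zone contents and the address (taken from the 192.0.2.0/24 range reserved for documentation) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """A toy name server responsible for one part of the namespace."""
    name: str
    records: dict = field(default_factory=dict)      # hostname -> IP address
    delegations: dict = field(default_factory=dict)  # suffix -> child Zone

    def resolve(self, hostname, path=None):
        """Return (ip, list of zones consulted) for hostname."""
        path = (path or []) + [self.name]
        if hostname in self.records:
            return self.records[hostname], path
        # Hand the query to the child zone with the longest matching suffix.
        for suffix in sorted(self.delegations, key=len, reverse=True):
            if hostname == suffix or hostname.endswith("." + suffix):
                return self.delegations[suffix].resolve(hostname, path)
        raise KeyError(f"{hostname} not found")


# Hypothetical zones; 192.0.2.10 is from a range reserved for documentation.
example = Zone("example.org", records={"www.example.org": "192.0.2.10"})
org = Zone("org", delegations={"example.org": example})
root = Zone(".", delegations={"org": org})

ip, path = root.resolve("www.example.org")
print(ip, "via", " -> ".join(path))
```

Each zone stores only its own records and pointers to its children, so no single server has to hold the whole database or answer every query.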
Scalability Example – Twitter
– Burst of load:
– 456 tweets per second (TPS) when Michael Jackson died (June 25, 2009).
– 6,939 TPS after midnight in Japan on New Year’s Day 2011.
– Increase in participation:
– +182%: increase in the number of mobile users in 2010.
– >500,000 new accounts created in a single day.
Source: http://blog.twitter.com/2011/03/numbers.html
Consistency
– Consistency: a property of a collection of data items that are accessed by distributed participants.
– Example of an inconsistency: as a participant, I observe that Djokovic lost against Nadal, but then that Djokovic won against Federer, in the 2012 French Open.
Consistency Example
[Figure sequence: commodity clusters connected by networks]
What happened during the 2012 French Open?
1. Djokovic defeats Federer
2. Nadal defeats Djokovic
3. Nadal defeats Djokovic
4. Djokovic defeats Federer
Did Djokovic win or lose?
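The French Open example can be reproduced with two replicas that receive the same two updates in different orders. This is a minimal sketch under stated assumptions: the function names, the tuple layout and the sequence numbers are hypothetical, and ordering by sequence number is only one of several ways to make replicas agree.

```python
def apply_in_arrival_order(updates):
    """Replica that keeps whichever result for a player arrived last."""
    state = {}
    for _seq, player, result in updates:
        state[player] = result
    return state


def apply_in_sequence_order(updates):
    """Replica that orders updates by sequence number before applying them."""
    return apply_in_arrival_order(sorted(updates))


# (sequence number, player, latest result); the sequence numbers are assumed.
semi_final = (1, "Djokovic", "won against Federer")
final = (2, "Djokovic", "lost against Nadal")

replica_a = [semi_final, final]   # messages arrive in the order they happened
replica_b = [final, semi_final]   # the network delays the first message

print(apply_in_arrival_order(replica_a))
print(apply_in_arrival_order(replica_b))
print(apply_in_sequence_order(replica_a) == apply_in_sequence_order(replica_b))
```

With arrival order, the two replicas end up disagreeing about Djokovic's latest result. Once both sort by sequence number, they converge on the same state.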
Fault-Tolerance
– Fault-tolerance of a distributed system: the ability of the system to recover from partial failures.
– How to keep the distributed system up and running, thereby appearing as a single running system to its users?
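One common way to hide a partial failure from users is to replicate a service and fail over to a working copy. The sketch below simulates this in a single process. The replica names, the `ReplicaDown` exception and the request string are hypothetical, and a real system would also need failure detection and state replication.

```python
class ReplicaDown(Exception):
    """Raised by a replica that has crashed (simulated)."""


def make_replica(name, alive):
    """Return a hypothetical request handler for one replica."""
    def handle(request):
        if not alive:
            raise ReplicaDown(name)
        return f"{name} served {request}"
    return handle


def call_with_failover(replicas, request):
    """Try each replica in turn; fail only if every replica is down."""
    failed = []
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as err:
            failed.append(str(err))  # mask the partial failure and move on
    raise RuntimeError(f"all replicas failed: {failed}")


replicas = [make_replica("r1", alive=False),
            make_replica("r2", alive=False),
            make_replica("r3", alive=True)]
print(call_with_failover(replicas, "GET /score"))
```

The caller gets an answer even though two of the three replicas are down, so the system still appears as a single running system.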
Distributed Machine Learning
COMP3221: Resources
– Canvas: https://canvas.sydney.edu.au/
– Login using Unikey and password
– Link to Units website: https://sydney.edu.au/units/
• Official schedule, list of learning outcomes, etc.
– Copies of slides
– Lab instructions
– Assignment instructions
– Lecture videos
• We will record the lectures, but the technology is not reliable
– Submit official assignment work here
– See your grades, etc.
– Discussion forum: on Edstem, link from Canvas site
Prerequisites
– Python Programming
– (INFO1103 or INFO1113) OR (INFO1105 or INFO1905)
– Algorithms and Data Structures
– COMP2123 OR COMP2823
– Prohibitions
– COMP2121 (older version of this course)
– You need to apply for Special Permission to enrol if you do not meet the above requirements
COMP3221 – Schedule
Week  Lectures                         Labs/Tutorials
1     Introduction                     –
2     Architecture & Processes         Multithreading
3     Communication (Routing)          Routing
4     Communication (TCP) & Naming     Client – Server
5     Synchronization                  Time
6     Consistency                      Consistency
7     Blockchain                       Mid-term Quiz
8     Fault tolerance                  Consensus
9     Distributed Linear Regression    Linear Regression
10    Distributed Optimization         Distributed Optimization
11    Distributed Logistic Regression  Logistic Regression
12    Security                         Security
13    Course Review                    –
The schedule may change.
Assessment
Task           When                   Marks
Mid-term Quiz  Week 07 Tutorial Time  15%
Final Exam     TBA                    50%
– The mid-semester quiz and final exam will test students’ understanding of the theoretical material and concepts, and their ability to apply them in the appropriate context when solving problems.
– What I hear, I forget;
– What I read, I remember;
– What I do, I understand.
— Confucius
Academic Dishonesty & Plagiarism
– Academic Integrity
– Plagiarism: NO
– Outsourcing: NO
– “The University of Sydney is unequivocally opposed to, and intolerant of, plagiarism and academic dishonesty.
– Academic dishonesty means seeking to obtain or obtaining academic advantage for oneself or for others (including in the assessment or publication of work) by dishonest or unfair means.
– Plagiarism means presenting another person’s work as one’s own work by presenting, copying or reproducing it without appropriate acknowledgement of the source.” [from site below]
– Submitted work is compared against other work (from students, the internet, etc.)
– Turnitin for textual tasks (through eLearning), other systems for code
– Penalties for academic dishonesty or plagiarism can be severe
– University Policy:
– https://sydney.edu.au/students/academic-dishonesty.html
Resources
Textbook
Distributed Systems: Principles and Paradigms, by Tanenbaum and Van Steen, 2nd Edition.
– This and other relevant works can be found in the university library and in the bookshop.
Resources
– Canvas UoS website: https://canvas.sydney.edu.au/
– Login using Unikey and password
– Submit official assignment work here or on PASTA
– Copies of slides and tutorials
– Assignment instructions
– Lecture recording
– Discussion forum is linked on the elearning website (invitations sent):
– Ed
– Post questions online (on this forum)
– Everyone is welcome to answer and rate answers
Expectations
– Students attend scheduled classes, and devote an extra 6-9 hours per week to
– doing assessments
– preparing and reviewing for classes
– revising and integrating the ideas
– practice and self-assessment
– Prerequisites
– All programming will be done in Python; knowledge of and experience in Python programming are expected
Expectations
– Students are responsible learners
– Participate in classes, constructively
• Respect for one another (criticize ideas, not people)
• Humility: none of us knows it all; each of us knows valuable things
– Check Ed and Canvas sites at least once a week!
– Notify academics whenever there are difficulties
– Notify group partners honestly and promptly about difficulties
Get help!
– Consultation
– By appointment
– Tutors:
– Check on Canvas
Advice
– Lecture notes are there to help
– You should understand the material in depth
– Practice your reasoning by re-doing the examples at home
– Think about implications, ask questions
– Re-read your notes or the lecture notes at home after class to help you memorize the material
What’s Next?
– Time management
– Watch the due dates
– Start work early, submit early
– Networking and community-formation
– Make friends and discuss ideas with them
– Know your tutor, lecturer, coordinator
– Keep them informed, especially if you fall behind
• Don’t wait to get help
– Enjoy the learning!