CS计算机代考程序代写 scheme javascript Java file system distributed system concurrency cache COMP3221: Distributed Systems

COMP3221: Distributed Systems
Architectures & Processes
Dr Nguyen Tran
School of Computer Science
The University of Sydney
Page 1

Previously…
– Basic definition of a distributed system
– “A collection of independent computers that appears to its users as a single
coherent system.”
– Real-world examples for distributed systems
– Transparency helps the users observe a single coherent system
The University of Sydney Page 2

Outline – Architectures
– Software Architectures
– System Architectures
– Centralized architectures
• The Client-Server Model
• The Layered Organization
– Decentralized architectures
• The Peer-to-Peer Organization
– Hybrid architectures
• Edge server systems
• Collaborative distributed systems
The University of Sydney
Page 3

Software Architectures
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 4

Software Architectures
Logical organization of components in distributed systems
– Component: A modular unit with well-defined interfaces that is replaceable
– Connector: a mechanism that mediates communication, coordination among
components.
Layer-based Architectures vs. Object-based Architectures
– In (a), requests (resp. responses) go downward (resp. upward)
– In (b), objects communicate through Remote Procedure Calls (RPCs)
The University of Sydney Page 5

Software Architectures
Event-based Architectures vs. Data-centered Architectures
– –
(a) Event-based architecture: communication through events, that optionally carry data (subscribers get their desired events delivered)
(b) Data-centered architecture: through a shared repository, that contains data (e.g., files in a distributed file system, web-based distributed system, Twitter)
The University of Sydney Page 6

Centralized Architectures
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 7

Centralized Architectures
The basic client-server model
– The client/server requests/provides the services
– The client and the server can be hosted on different machines.
– The communication follows a request-reply model.
– Examples: ??
The University of Sydney Page 8

The Client-Server Model
Stateless vs Stateful server
?
Hi, I’m comp1, may I have the lines 21-40 of file 5? Sure, you have the credentials, attached are the lines
Client
Stateless server
The University of Sydney
Page 9

The Client-Server Model
Stateless vs Stateful server
Hi, I’m comp1
I know, how are you doing today?
May I have the next 20 lines of my file Sure, here they are
Client
Stateful server
The University of Sydney
Page 10

The Client-Server Model
Stateless vs Stateful server
– Stateless server: does not record the state of its clients
– Stateful server: maintains persistent information about its clients
(client->file)
Stateless server
Stateful server
State
No info kept
Persistent info
Request
Self-contained
Can be split, generally faster
Upon failure
No recovery needed
State recovery needed (explicit deletion)
Example
Network file system (NFSv3)
Andrew file system (AFS)
The University of Sydney Page 11

The Layered Organization
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 12

Application layering
Traditional three-layered view:
1. Theuserinterfacelayer
– contains the feature to control the application
2. The processing layer
– contains the function of the application
3. The data layer
– contains the data of the application
The University of Sydney
Page 13

Application layering (cont’d)
Example: a search engine request spanning the traditional three layers
The University of Sydney Page 14

Multi-tiered architectures
Physical two-tiered architecture
The University of Sydney Page 15

Multi-tiered architectures (cont’d)
A single machine can act both as a client and a server
Example: Cloud computing
– Cloud computing: the delivery of computation or storage as a service to end-users.
– The servers handle most of the computation upon request and sends back the results to the client
– One server asks data to another server – Another does the computation
–…
The University of Sydney Page 16

Decentralized Architectures
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 17

The peer-to-peer model
Every machine acts similarly
– Every machine is both a client and a server
– No centralized control: the responsibility is distributed evenly – Even the program executing on each machine is similar
– Forms an overlay network,
– nodes are processors,
– links possible communication channels.
The University of Sydney
Page 18

Structured Peer-to-Peer Model
The overlay network is constructed using a deterministic procedure
– Chord is an example of a Distributed Hash Table (DHT)
As a node:
– I have a successor peer
– I have a predecessor peer
– I have some shortcuts to other nodes to speedup delivery of requests
– I am responsible of a subset of the system data items (based on my unique identifier)
The University of Sydney
Page 19

Hybrid Architectures
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 20

Edge-Server Systems
– Servers are placed at the edge of the network.
– Edge servers serve content, possibly after applying filtering
and transcoding functions
– One edge server acts as an origin server for all content, while taking advantage of other servers for optimized delivery of content, e.g. through replication.
The University of Sydney Page 21

Collaborative Distributed Systems
– How do you first get started ?
– Often a traditional client-server models is used.
– Once joined fully decentralized schemes can be used.
– Example: BitTorrent, a file sharing application
– 35% of world-wide internet traffic in 2015.
– Used for Linux distribution, software patches, distributing movies
– Goal: quickly replicate large files to large number of clients
The University of Sydney
Page 22

BitTorrent
– Web server hosts a .torrent file (w/ file length, hash, tracker’s URL…)
– A tracker (server or a DHT) tracks downloaders/owners of a file
– Files are divided into chunks (256kb-1MB)
– Downloaders download chunks from themselves (and owners)
– Tit-for-tat: the more one shares (server), the faster it can
download (client)
The University of Sydney Page 23

Conclusion
– Client and server are used to identify the role of communication participant
– Client and server roles may run on:
– The same machine
– Distinct machines with very different resources – Distinct machines with similar resources
– Most moderns distributed systems are hybrid architectures The University of Sydney
Page 24

Processes
Architecture & Processes Week 2, COMP3221
Dr. Nguyen Tran
School of Computer Science
The University of Sydney
Page 25

Introduction
– Previously: a distributed system gives the illusion of a single system to a user thanks to transparency
– Now: a single system gives the illusion of multiple resources to multiple users thanks to concurrency
The University of Sydney Page 26

Outline – Processes
– A Very Brief History
– UNIX Processes and Threads – Multi-threading
– Multi-threaded Server
The University of Sydney
Page 27

A Very Brief History
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 28

Evolution of Operating Systems (OSes)
– The hardware evolved
– Hardware was expensive, humans were cheap (e.g., Multics 1960’s)
– Hardware was cheaper, humans were expensive (e.g., desktop computer 1980’s)
– Hardware are very cheap, humans are very expensive (e.g., handheld devices 2000’s)
– The OS interaction had to adapt
– Batch: one execution at a time
– Multiprogramming: multiple program executions simultaneously
– Timeshare: split time into slots allocated for different program execution – GUI: multiple interfaces to access a system from different end-points
– Ubiquitous devices: each user possesses her own computational device
The University of Sydney Page 29

Uniprogramming vs. Multiprogramming
– Uniprogramming: one program execution at a time – MS/DOS
– early Macintosh
– Batch processing
– No longer acceptable!
– Multiprogramming: more than one program execution at a time
– Multics, UNIX/Linux, Windows NT, Mac OS X
– Many program executions on personal computers: – Browsing the web
– Sending emails
– Typing a letter
– Burning a DVD The University of Sydney
Page 30

Concurrency
– Assume a single central processing unit (CPU)
– Problem: How to give the illusion to the users of multiple CPUs?
CPU
exec1
exec2
exec3
CPU1
CPU2
CPU3
Time
– Solution: scheduling program executions, one after the other during small fractions of the time
CPU
exec1.1
exec2.1
exec3.1
exec1.2
exec2.2
exec3.2
The University of Sydney Time Page 31

Concurrency
– Multi/many-cores on a single machine
CPU1
cache
CPU2
cache
CPU3
cache
Network on chip
shared memory
– Distributed operating systems
– This is a single system image, the system maintains a single copy of the resources
Distributed applications
Distribution service
Kernel
Network
Kernel
Kernel
The University of Sydney
Page 32

Concurrency
– Network operating systems – Machines provide resources to
other machines (e.g., UNIX rlogin)
– Middleware
– A layer over the network services providing general services to applications in a very transparent manner (systems can differ)
stributed applications
Dist. service
Kernel
Network
stributed applications Middleware
Network services
Kernel
Dist. service
Kernel
Di
Dist. service
Kernel
Di
Network services
Kernel
Network services
Kernel
The University of Sydney
Page 33
Network

Processes and Threads
Architectures & Processes Week 2, COMP3221
The University of Sydney Page 34

Processes
Process: defined as a program in execution
– Operating system creates a virtual CPU for each process – OS makes sure that processes cannot maliciously or
inadvertently affect each other’s behaviour, i.e. Secure ! – This concurrent transparency comes at a very high cost:
– For every process (virtual CPU), OS must create a completely independent address space.
The University of Sydney
Page 35

Address space
Address space: A unit of management of a process’s virtual memory – Process address space:
– Stack: temporary data extensible towards lower virtual addresses
– Heap: memory allocated dynamically extensible to higher virtual addresses
– Data section: global variable
– Text region: program code
Stack
Heap
Data
Text
The University of Sydney
Page 36

Process
Process: an abstraction representing a program execution – A process switches from a state to another:
exec1
exec1.1
exec2.1
– When there is timer interrupt goes off, explicit yield, I/O, etc.
Time
new admitted ready
interrupt
exit terminated
running
I/O or event wait
I/O or event completion
Scheduler dispatch waiting
The University of Sydney
Page 37

Process management
Process creation:
– Process allocation: the act of choosing the host of the process
– The system provides command to execute a process at an idle workstation – The system chooses a host from a pool of processors to execute it
– Execution environment: an address space w/ initialized content & open files
– Static: program text, heap and stack regions created from a list
– Dynamic: UNIX fork shares program text and copies stack and heap
regions
The University of Sydney Page 38

Process management
CPU
Time
exec1.1
exec2.1
exec3.1
exec1.2
exec2.2
exec3.2
– Switching from one process to another has become one of the most expensive operations due to Context-Switches
– CPU context: register values, program counter, stack pointer, etc.
– Modify registers of MMU (Memory management unit)
– Invalidate address translation caches such as in TLB (Translation lookaside buffer)
Inter-Process Communication (IPC):
– Pipes/Message queues/Shared memory segments – Requires costly context switches
The University of Sydney
Page 39

Context-switch
IPC requires kernel intervention
– S1. Process A switches from user to kernel mode
– Changing the memory map in the MMU (memory management unit). – Flushing the TLB (translation look aside buffer)
– S2. Process context-switch (swapping processes from A to B) – S3. Process B switches from kernel to user mode
– Again requires changing the MMU map and flushing the TLB
The University of Sydney
Page 40

Thread
Threads: smallest unit of CPU utilization – Multiple threads per process
– Portable OS Interface (POSIX) Threads
– Current activity: program counter
– Stack: temporary data
– No data, code, files (but it shares them with other threads)
code data files
registers
stack
code data files
reg reg reg
stack stack stsatacckk
The University of Sydney
Page 41
Single-threaded process
Multithreaded process
thread

Thread
– Portable OS Interface (POSIX) Threads
– Current activity: program counter
– Stack: temporary data
– No data, code, files (but it shares them with other threads)
– Communication between threads always through memory
– Does not need IPC
– No context switches required (when purely at user-level)
The University of Sydney Page 42

Thread Implementation
User-level threads library vs. kernel-scheduled thread

User-level thread library:
1. Cheap to create and destroy
2. Blocking system call freezes the entire process of the thread (and other
threads)
3. Cheap context-switch:
• FewinstructionstostoreandreloadCPUregistervalues • NoneedtochangememorymapinMMU
• NoneedtoflushtheTLB
Kernel-scheduled threads:
1. Costly to create and destroy
2. Do not block the current process (and other threads) upon blocking system calls
(I/O)
3. Costly context-switch (similar to process context switch)

The University of Sydney
Page 43


NeedstochangememorymapinMMU NeedstoflushtheTLB

Processes vs. threads
– Processes
– Isolated: prevents one process from interfering with another
– Inefficient: starting/terminating a process and context switches are costly
– Threads
– Non-isolated: avoiding incorrect interferences makes
programming harder
– Efficient: a thread is a lightweight version of a process – Multiple threads of control per process
The University of Sydney
Page 44

Client multi-threading
Example: The Google Chrome web browser
– Multithreading: each tab runs its own thread
– A JavaScript error does not crash the main Chrome process:
Only one tab freezes, the user can close it independently of others
– Takes benefit of multicore architectures, each thread runs on a separate core
The University of Sydney Page 45
one distinct thread running each tab

Conclusion
– Concurrency has been used for decades since hardware was powerful enough to address multiple human needs
– Processes are heavy but protected whereas threads are lightweight but non-protected
– Threads require synchronization to be protected from each other
The University of Sydney
Page 46

What’s Next ?
– Tutorial on Wednesday. – Multithreading
– Read Chapter 1, 2 and 3 of the textbook
– Next week: How do we communicate in distributed systems ? – See you all next week !
The University of Sydney
Page 47