Intro to Parallel Computing
Topic 3 – Parallel Concepts and Threads
COSC 407: Intro to Parallel Computing
Previous pre-recorded lecture (student-led Q&As):
More on C programming:
– Pointers (intro, memory allocation, 2D arrays, functions)
– Error handling, string processing
– struct, typedef
– Preprocessors, compiling C programs
Today’s topics:
– Intro to parallel computing
– Intro to POSIX threads
Next Lecture:
– More POSIX threads
Sequential Systems
A system in which each activity must complete execution before another activity can start.
Benefit: simplicity of design and use.
Cost: might be slower in many cases:
– one job at a time (not making use of the multiple cores in the system)
– the system must wait for its slower components if it needs to do more than one job. For example, reading from a hard drive is slow; even if you want to do some processing that is not related to the data being read, you will still have to wait!
[Timeline: Job 1 runs, issues an I/O request, then sits idle until the I/O completes before resuming.]
Concurrency vs. Parallelism
§ Concurrency
– Multiple operations are making progress during the same time period (but not necessarily running at the same instant)
– Interleaving the execution steps of each task via time-sharing slices
• e.g., two programs making progress on a single-core processor
§ Parallelism
– Multiple operations are making progress at the exact same time
• e.g., two threads on a dual-core machine can be executed at the same time
Concurrent Systems
– A system that can run several activities at the same time.
• Benefit: efficient use of resources, and probably faster.
• Cost: complexity of hardware and software.
[Timeline: two jobs overlap; while one waits on an I/O request, the other runs.]
– Two types of concurrent systems:
1. Parallel systems
2. Pseudo-parallel systems
Concurrent Systems, cont’d
1. Parallel systems
• More than one processor that can actually carry out several activities simultaneously
2. Pseudo-parallel systems
• Share processor time between a number of activities (e.g., if we have two activities running on a single processor, only one is actually running at any given time)
What is a “Process”?
§ An instance of a computer program that is being executed
§ Components of a process:
– The executable machine language program
– A block of memory
– Descriptors of resources the OS has allocated to the process
– Security information
– Information about the state of the process
Multitasking
§ Gives the illusion that a single-processor system is running multiple programs simultaneously
§ Each process takes turns running for a time slice
– After its time is up, it waits until it has a turn again
– A process may also block (e.g., waiting for I/O): nothing proceeds in this program until the next time it has a turn
Threads
§ Threads are contained within processes
§ They allow programmers to divide their programs into (more or less) independent tasks
– A thread is a stream of instructions that can be scheduled to run independently from its main program
§ The hope is that when one thread blocks because it is waiting on a resource, another will have work to do and can run
[Figure: a multithreaded process. Attribution: Cburnett, Creative Commons Attribution-Share Alike 3.0 Unported, https://upload.wikimedia.org/wikipedia/commons/a/a5/Multithreaded_process.svg]
Process and Threads
§ A process begins as a single “master” thread (the heavy-weight process)
§ Starting a new thread is called forking (threads are light weight)
§ Terminating a thread is called joining
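Below is a minimal sketch of forking and joining with Pthreads (introduced formally at the end of this lecture). The worker function and its messages are illustrative, assuming a POSIX system:

#include <stdio.h>
#include <pthread.h>

/* Work carried out by the forked (light-weight) thread. */
void* worker(void* arg) {
    printf("worker thread running\n");
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);  /* fork: start the thread      */
    pthread_join(t, NULL);                   /* join: wait for it to finish */
    printf("master thread done\n");
    return 0;
}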
Processes vs. Threads
§ Threads exist within a process
– every process has at least one thread
– a process is multithreaded when it contains more than one thread of execution
§ Both threads and processes are units of execution (tasks)
– both are items that can be scheduled for execution
§ But:
– Processes will, by default, not share memory.
– Threads of the same process will, by default, have access to the same shared memory (the process memory), as the sketch below illustrates
• Data can be shared or private
• Private data is available only to the thread that owns it
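As a sketch of shared vs. private data (the names are illustrative), each thread below gets its own copy of the stack variable, but both see the same global:

#include <stdio.h>
#include <pthread.h>

int shared_counter = 0;           /* global: shared by all threads of the process */

void* worker(void* arg) {
    int private_value = 42;       /* stack variable: private to this thread       */
    shared_counter++;             /* both threads read/write the same location
                                     (unsynchronized here; see data races later)   */
    printf("private=%d shared=%d\n", private_value, shared_counter);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}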
[Figure: the memory used by a process is shared by its threads]
Multi-thingys
§ Multithreaded process: a process having more than one thread
– A thread of execution is a sequence of instructions that can be managed independently (executed, stopped, etc.)
§ Multitasking: the ability of a platform to run more than one task (thread or process) concurrently
– Two kinds of multitasking operating systems:
1. Pre-emptive multitasking
– The OS decides when a task should give way to another to allow sharing of resources (e.g., Windows 95 and later)
2. Cooperative multitasking
– Each process is coded such that it decides when to allow other processes to run from time to time (e.g., Windows 3.1 and prior)
How are Things Shared?
§ Shared Memory
– In general, allows processors to have access to a global address space
– Multiple processors/processes can operate independently but share the same memory resources
– Changes in a memory location effected by one processor are visible to all other processors
https://hpc.llnl.gov/training/tutorials/introduction-parallel-computing-tutorial#MemoryArch
How are Things Shared?
§ UMA (uniform memory access)
– Time to access all the memory locations is the same for all the cores
§ NUMA (non-uniform memory access)
– A memory location a core is directly connected to can be accessed faster than a memory location that must be accessed through another chip
How are Things Shared?
§ Distributed memory (more on this later)
– Clusters (most popular), i.e., a group of machines:
• a collection of commodity systems connected by a commodity interconnection network
• nodes of a cluster are individual computation units joined by a communication network
Back to Threads.. Thread Model
§ POSIX / OpenMP
§ A type of shared-memory programming
§ The program a.out is scheduled to run by the native operating system and acquires all of the necessary system and user resources to run
– a “heavy weight” process
§ It does some serial work, and then creates a number of tasks (threads) that can be scheduled and run by the operating system concurrently
§ The programmer is responsible for determining the parallelism
https://hpc.llnl.gov/training/tutorials/introduction-parallel-computing-tutorial
Back to Threads.. Thread Model
§ Each thread has local data and shares the entire resources of a.out
– This saves the overhead associated with replicating a program’s resources for each thread (“light weight”)
– Each thread also benefits from a global memory view because it shares the memory space of a.out
§ Threads communicate with each other through global memory (updating address locations)
§ This requires synchronization to ensure that no more than one thread is updating the same global address at any time
§ Threads can come and go, but a.out remains present to provide the necessary shared resources until the application has completed
https://hpc.llnl.gov/training/tutorials/introduction-parallel-computing-tutorial
Task Scheduling
§ A scheduler is a computer application that uses a scheduling policy to decide which process should run next.
– The scheduler uses a selection function to make the decision.
– The selection function considers several factors such as:
• resources the process requires (and whether they are available)
• time processes have been waiting and/or been executing
• the process’s priority
§ The scheduling policy should try to optimize several characteristics such as:
– Responsiveness of interactive processes
– Turnaround: the time the user waits for the process to finish
– Resource utilization (e.g., keep the CPU busy)
– Fairness: dividing the CPU time fairly
• The opposite is ‘starvation’, where a process is not given the chance to run for a long time
Some Scheduling Policies
Non-pre-emptive policies (each task runs to completion before the next one can run):
– First-In-First-Out (FIFO)
• Tasks are placed in a queue as they arrive.
– Shortest-Job-First (SJF)
• The process that requires the least execution time is picked next.
Pre-emptive policies:
– Round-Robin (RR)
• Each task is assigned a fixed time to run before it is required to give way to the next task and move back to the queue (e.g., at the end of the queue).
– Earliest-Deadline-First (EDF)
• The process with the closest deadline is picked next.
– Shortest-Remaining-Time-First (SRTF)
• The process with the shortest remaining time is picked next.
Shortest-Job-First
§ Consider the SJF scheduling policy on the processes in the table below and find the finish and turnaround times according to the given values.
– Turnaround time is the total amount of time spent by the process from entering the ready state for the first time until its completion (Turnaround time = Exit time – Arrival time).
[Gantt chart: tasks T1–T5 on a time axis from 0 to 160. Table columns: Arrival time, Service time, Finish time, Turnaround; the original values did not survive extraction.]
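Since the original values are unavailable, here is an illustrative set of values (assumed, not the slide’s originals) worked through under non-pre-emptive SJF. At each completion, the shortest job among those that have already arrived is picked next, so the run order is T1, T3, T2, T5, T4:

Task   Arrival time   Service time   Finish time   Turnaround
T1     0              40             40            40
T2     10             20             70            60
T3     20             10             50            30
T4     30             30             120           90
T5     40             20             90            50

For example, T3 arrives at 20 but must wait until T1 finishes at 40; it is then the shortest waiting job, runs from 40 to 50, and its turnaround is 50 − 20 = 30.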
Some Key Terms
§ Shared resource: a resource available to (shared by) all processes in a concurrent program
– e.g., a shared data structure
§ Critical section: a section of code within a process that requires access to shared resources [1]
– It cannot be executed while another process is in a corresponding section of code
– e.g., updating a shared resource
– Critical sections should be executed as serial code
§ Mutual exclusion: the requirement that when one process is in a critical section that accesses a shared resource, no other process may be in a critical section that accesses any of those shared resources [1]
– a mechanism to ensure that no two concurrent processes access a critical section at the same time (see the sketch below)
[1] Stallings, William (2005). Operating Systems: Internals and Design Principles. Upper Saddle River, N.J.: Pearson Prentice Hall.
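A minimal sketch of a critical section protected by a Pthreads mutex (the names are illustrative): the update of the shared balance is the critical section, and the lock enforces mutual exclusion:

#include <stdio.h>
#include <pthread.h>

long balance = 0;                               /* shared resource        */
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;  /* lock guarding balance  */

void* deposit(void* arg) {
    pthread_mutex_lock(&m);     /* enter critical section: mutual exclusion */
    balance += 100;             /* only one thread at a time runs this      */
    pthread_mutex_unlock(&m);   /* leave critical section                   */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, deposit, NULL);
    pthread_create(&t2, NULL, deposit, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("balance = %ld\n", balance);  /* always 200 with the lock in place */
    return 0;
}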
Some Key Terms
§ Condition synchronization: a mechanism to make sure that a process does not proceed (i.e., it is blocked) until a certain condition is satisfied
– For example, a consumer process cannot read from a buffer unless some data has been written to the buffer; similarly, a producer process cannot write to a full buffer
– Usually built on a synchronization lock mechanism (see the car-park sketch below)
§ Deadlock: a situation in which two or more processes are unable to proceed because each is waiting for one of the other processes to do something [1]
§ Livelock: a situation in which two or more processes continuously change their state in response to changes in the other processes without making any progress (no useful work) [1]
[1] Stallings, William (2005). Operating Systems: Internals and Design Principles. Upper Saddle River, N.J.: Pearson Prentice Hall.
Some Key Terms
§ Race Condition: a situation where multiple threads/processes read/write a shared data item and the result depends on the relative timing of their execution
§ Starvation: a situation in which a runnable process is overlooked indefinitely by the scheduler; although it is able to proceed, it is never chosen [1]
– Occurs when a thread is unable to gain access to a shared resource and thus cannot make progress
[1] Stallings, William (2005). Operating Systems: Internals and Design Principles. Upper Saddle River, N.J.: Pearson Prentice Hall.
Example (Synchronization / Mutual Exclusion)
§ Assume we have an object named carParkControl:
– attributes: capacity and spaces
– methods: depart() and arrive()
§ Now assume three threads are using carParkControl:
– T1 first invokes the depart method
– then T2 and T3 both invoke the arrive method
– assume the car park is initially empty (numCars = 0), i.e., there are no cars to depart!
§ Two aspects are important to ensure the code works properly:
– condition synchronization and mutual exclusion
[Diagram: threads T1, T2, T3 invoking methods on carParkControl]
A possible scenario
1. T1 invokes the depart method and acquires the synchronization lock.
2. T1 examines the condition and determines that the data is not in the desired state.
3. T1 invokes a wait method, which frees the lock; T1 is in WAITING state now.
4. Scheduler allows T2 to run (RUNNING state)
5. T2 invokes the arrive method, acquiring the synchronization lock.
6. If T3 gets a chance to run, and attempts to invoke arrive, it’ll end up in the BLOCKED state waiting for the lock.
7. T2 continues to run, checks the condition and finds that it can continue without having to call wait. It then updates the data.
8. T2 finishes the arrive method, frees the lock, and notifies all threads it is done.
9. Both T1 and T3 receive the notification and become RUNNABLE (not running yet)
10. T1 and T3 now both try to get hold of the lock.
11. Assume T1 obtains the lock now; it checks the condition and finds that the condition is satisfied, so it processes the data.
12. T1 exits the depart method and gives up the lock, then notifies everyone it is done.
13. T3 now has a chance to run, acquires the lock and checks the condition, which is satisfied. It changes the data and sends a notification and completes the method.
14. The lock is freed up.
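Below is a minimal C sketch of how carParkControl might be implemented with a Pthreads mutex and condition variable, following the scenario above. Only the names carParkControl, capacity, spaces, arrive and depart come from the slides; the struct layout and everything else is an assumption:

#include <pthread.h>

typedef struct {
    int capacity;             /* total spaces in the car park       */
    int spaces;               /* free spaces left                   */
    pthread_mutex_t lock;     /* the synchronization lock           */
    pthread_cond_t  changed;  /* signalled whenever spaces changes  */
} carParkControl;

/* A car arrives: wait until a space is free, then take it. */
void arrive(carParkControl* c) {
    pthread_mutex_lock(&c->lock);                 /* acquire the lock         */
    while (c->spaces == 0)                        /* condition not satisfied? */
        pthread_cond_wait(&c->changed, &c->lock); /* free lock, go WAITING    */
    c->spaces--;                                  /* update the data          */
    pthread_cond_broadcast(&c->changed);          /* notify waiting threads   */
    pthread_mutex_unlock(&c->lock);               /* free the lock            */
}

/* A car departs: wait until the park is not empty, then free a space. */
void depart(carParkControl* c) {
    pthread_mutex_lock(&c->lock);
    while (c->spaces == c->capacity)              /* empty park: no car to depart */
        pthread_cond_wait(&c->changed, &c->lock);
    c->spaces++;
    pthread_cond_broadcast(&c->changed);
    pthread_mutex_unlock(&c->lock);
}

int main(void) {
    static carParkControl c = { 4, 4, PTHREAD_MUTEX_INITIALIZER,
                                PTHREAD_COND_INITIALIZER };
    arrive(&c);   /* takes one of the 4 spaces */
    depart(&c);   /* frees it again            */
    return 0;
}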
Dead or Alive(lock)….
§ Concurrent programs must satisfy two properties:
– Safety: the program doesn’t enter a bad state (e.g., wrong output)
• i.e., it performs according to its specification
– Liveness: the program must make progress
§ Two problems related to liveness:
Deadlock (see key terms):
A process is waiting for a resource that will never be available. This resource may be held by another process that waits for the first process to finish.
– e.g., the dining philosophers problem
Livelock (see key terms):
A process is continuously running but without making any progress.
– e.g. 1: process P1 acts in response to P2, and then P2 reverses the action of P1, bringing both back to their initial state before the action
– e.g. 2: two people meeting in a corridor and continuously stepping sideways without making progress
Four conditions must hold for a deadlock to happen:
1. Mutual exclusion
– The program involves a shared resource protected by mutual exclusion (only one process can hold that resource)
2. Hold while waiting
– A process can hold a resource while it is waiting for other resources
3. No pre-emption
– The OS cannot force a process to deallocate the resource it holds (the process must deallocate it voluntarily)
4. Circular wait
– P1 is waiting for a resource held by P2, and P2 is waiting for a resource held by P1 (the sketch below shows such a case)
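A minimal sketch of circular wait (the names are illustrative): P1 holds lock a while waiting for b, and P2 holds b while waiting for a, so with unlucky timing neither can ever proceed:

#include <pthread.h>

pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

void* p1(void* arg) {
    pthread_mutex_lock(&a);   /* hold a ...                             */
    pthread_mutex_lock(&b);   /* ... while waiting for b                */
    pthread_mutex_unlock(&b);
    pthread_mutex_unlock(&a);
    return NULL;
}

void* p2(void* arg) {
    pthread_mutex_lock(&b);   /* hold b ...                             */
    pthread_mutex_lock(&a);   /* ... while waiting for a: circular wait */
    pthread_mutex_unlock(&a);
    pthread_mutex_unlock(&b);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, p1, NULL);
    pthread_create(&t2, NULL, p2, NULL);
    pthread_join(t1, NULL);   /* may never return if the threads deadlock */
    pthread_join(t2, NULL);
    return 0;
}

Prevention here is simple: make both threads acquire the locks in the same order, which breaks the circular-wait condition.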
Dealing with Deadlock
1. Prevention
– Prevent deadlock by preventing one of the four deadlock conditions from occurring
2. Detection
– Numerous algorithms exist which enable the detection of deadlock
– A data structure, such as a table, is used to record:
• all objects in the system, the requests for locks on objects, and the allocations
• Through analysis of this data it is possible to detect whether or not deadlock exists
• If deadlock is detected, the processes involved in the deadlock can be terminated and restarted
3. Ignoring deadlock
– An option if deadlocks rarely happen and the data loss incurred each time is tolerable
Data Race and Race Condition
§ Data race (see key terms)
– Occurs when ALL of the following three conditions are satisfied:
• two or more concurrent threads are accessing the same memory location
• at least one thread is writing to the shared location
• there is no synchronization that enforces any particular order among these accesses
– The computation may give different results from run to run.
– The discovery of data races can be automated.
§ Race condition:
– The program or output state depends on the ordering of events (e.g., the order in which the scheduler runs the threads)
– Most race conditions are caused by data races, but this is not necessary:
• you can have a race condition without a data race (see the discussion below)
Discussion
§ Assume x is a shared resource initially equal to 10. Two threads access x using the following code (note that we are enforcing mutual exclusion when accessing x). Is there a data race? Discuss.
(The identifier beside lock/unlock indicates that both threads use the same lock l to ensure mutual exclusion on x.)
Thread 1:          Thread 2:
  lock(l)            lock(l)
  [access x]         [access x]
  unlock(l)          unlock(l)
What do you mean by data race??
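A compilable version of the discussion snippet is sketched below. The exact statements accessing x are an assumption (thread 1 increments x, thread 2 doubles it), not the slide’s. Because both accesses happen under the same lock, the third data-race condition fails (the lock enforces an order between the accesses), so there is no data race. There is still a race condition, though: the final value is 22 if thread 1 runs first and 21 if thread 2 runs first.

#include <stdio.h>
#include <pthread.h>

int x = 10;                                     /* shared, initially 10 */
pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER;  /* one lock for both    */

void* thread1(void* arg) {
    pthread_mutex_lock(&l);
    x = x + 1;                 /* assumed statement: not from the slide */
    pthread_mutex_unlock(&l);
    return NULL;
}

void* thread2(void* arg) {
    pthread_mutex_lock(&l);
    x = x * 2;                 /* assumed statement: not from the slide */
    pthread_mutex_unlock(&l);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, thread1, NULL);
    pthread_create(&b, NULL, thread2, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("x = %d\n", x);     /* 22 or 21, depending on scheduling order */
    return 0;
}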
POSIX Threads
§ Historically, hardware vendors implemented their own proprietary versions of threads
– These implementations differed substantially from each other, making it difficult for programmers to develop portable threaded applications
§ To take full advantage of the capabilities provided by threads, a standardized programming interface was required
– For UNIX systems, this interface has been specified by the IEEE POSIX 1003.1c standard (1995)
– Implementations adhering to this standard are referred to as POSIX threads, or Pthreads
– Most hardware vendors now offer Pthreads in addition to their proprietary APIs
§ The POSIX standard has continued to evolve and undergo revisions, including the Pthreads specification
§ Pthreads are defined as a set of C language programming types and procedure calls
– the pthread.h header/include file and a thread library (Pthreads is not native on Windows, where separate ports exist)
– this library is linked in at compile time (e.g., gcc -pthread)
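A classic first Pthreads program is sketched below (the thread count and messages are illustrative); on a POSIX system it can be compiled with: gcc -pthread hello.c -o hello

#include <stdio.h>
#include <pthread.h>

#define NUM_THREADS 4

void* hello(void* arg) {
    long id = (long)arg;                    /* thread number passed by main */
    printf("Hello from thread %ld\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long i = 0; i < NUM_THREADS; i++)  /* fork NUM_THREADS threads     */
        pthread_create(&threads[i], NULL, hello, (void*)i);
    for (int i = 0; i < NUM_THREADS; i++)   /* join: wait for all to finish */
        pthread_join(threads[i], NULL);
    return 0;
}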