High Performance Computing
Course Notes
Shared Memory Parallel Programming – II
Dr Ligang He
Computer Science, University of Warwick
Techniques
Thread creation and execution in OpenMP are implemented
by calling the multithreading APIs of POSIX operating
systems
For example, if the OpenMP compiler determines during compilation
that threads need to be generated, it inserts
instructions that invoke the relevant API provided by the OS
In this part, we are going to learn about the multithreading
support in POSIX operating systems, including:
Multiprocessing
Multithreading: user- and kernel-space multithreading
Multiprocessing
What is a process?
A process is a running instance of a program
A process records the running state of the program
How a new process is created
Generate a new process: the Fork function
fork() is used to create a child process
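Input and output of fork(): it takes no arguments and returns 0 in the
child, the child's process ID in the parent, or -1 in the parent if no
child could be created
The child process is a copy of the parent, except for the
value returned by fork() (and its own process ID)
Use the returned value to make the parent and the child do different tasks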
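The way parent and child diverge on the value returned by fork() can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name run_child_task and the idea of passing a small result back through the child's exit status are invented for this example and are not part of the course code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child "computes" value + 1 and reports it through
 * its exit status; the parent waits and returns that status (-1 on error). */
int run_child_task(int value) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                      /* fork failed: no child was created */
    }
    if (pid == 0) {
        /* fork() returned 0: this branch runs only in the child */
        _exit((value + 1) & 0xff);      /* an exit status keeps only 8 bits */
    }
    /* fork() returned the child's PID: this branch runs only in the parent */
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

A caller could write run_child_task(6) and would receive the value produced by the child, provided it fits in the 8 bits of an exit status.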
Load a new program after fork
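#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int status;
    pid_t pid = fork();                 /* generate a child process */
    if (pid == 0) {                     /* run by the new (child) process */
        /* load a new program; argv[0] is "ls", the list ends with NULL */
        execl("/bin/ls", "ls", "/", (char *)NULL);
        perror("execl");                /* reached only if execl fails */
        _exit(1);
    } else if (pid > 0) {               /* run by the parent process */
        /* perform whatever operation */
        wait(&status);                  /* wait for the child process to exit */
    } else {
        perror("fork");                 /* no child was created */
        return 1;
    }
    return 0;
}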
Scheduling Processes
When to switch processes (timing for scheduling)
the time slice runs out
a system call
a trap
The overhead of switching processes is relatively high,
because the following information has to be saved and loaded, for
example:
the three segments (code, data and stack)
open file descriptors
signal handler table
program counter
Each entry in the process table (in kernel space) contains the
following:
Process ID
Parent process ID
Real and effective user and group IDs
State
Pending signals
Code segment
Data segment (static data)
Stack segment (temporary data)
User area
Signal handler table
Open file descriptors
Recent CPU usage
Hardware register contents (unless running)
Page table
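A few of these fields can be read back by a process about itself through standard POSIX calls. The sketch below is only illustrative: the struct proc_ids and the function query_proc_ids are hypothetical names made up for this example and do not reflect the kernel's real process table layout.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for this example; not a kernel data structure. */
struct proc_ids {
    pid_t pid;    /* process ID */
    pid_t ppid;   /* parent process ID */
    uid_t ruid;   /* real user ID */
    uid_t euid;   /* effective user ID */
    gid_t rgid;   /* real group ID */
    gid_t egid;   /* effective group ID */
};

/* Ask the OS for the identity fields of the calling process. */
struct proc_ids query_proc_ids(void) {
    struct proc_ids ids;
    ids.pid  = getpid();
    ids.ppid = getppid();
    ids.ruid = getuid();
    ids.euid = geteuid();
    ids.rgid = getgid();
    ids.egid = getegid();
    assert(ids.pid > 0);   /* a running process always has a positive PID */
    return ids;
}
```

The remaining entries (segments, page table, register contents) are managed by the kernel and are not exposed through such simple calls.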
User Space Thread
Create a new thread
Thread – short for thread of execution/control
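#include <pthread.h>
#include <stdint.h>

void *Calc(void *);

int main(void) {
    int ret, param = 1;
    pthread_t tid;
    /* create a new thread */
    ret = pthread_create(&tid, NULL, Calc, (void *)(intptr_t)param);
    if (ret != 0)
        return ret;                     /* the thread could not be created */
    /* continue to do other tasks */
    /* wait for the new thread to return */
    ret = pthread_join(tid, NULL);
    return ret;
}

void *Calc(void *param) {
    int a = 0, b;
    b = a + (int)(intptr_t)param;       /* recover the integer argument */
    (void)b;
    return NULL;
}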
Two key functions in C: pthread_create
and pthread_join
create – creates a new thread of execution that runs the specified procedure
with the specified arguments
join – waits for the specified thread to return
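How an argument travels into a thread and a result comes back out through pthread_create and pthread_join can be sketched as below. The struct calc_args and the functions calc and run_calc are hypothetical names chosen for this illustration; passing a pointer to a struct is one common convention, not the only one.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical argument block shared between creator and thread. */
struct calc_args {
    int a;
    int param;
    int result;
};

/* Thread procedure: computes a + param and hands the block back. */
static void *calc(void *arg) {
    struct calc_args *args = arg;
    args->result = args->a + args->param;
    return args;                        /* becomes the value seen by pthread_join */
}

/* Create one thread, wait for it, and return what it computed. */
int run_calc(int a, int param) {
    struct calc_args args = { a, param, 0 };
    pthread_t tid;
    void *ret = NULL;

    int rc = pthread_create(&tid, NULL, calc, &args);
    assert(rc == 0);

    /* the creating thread could do other work here */

    rc = pthread_join(tid, &ret);       /* blocks until calc returns */
    assert(rc == 0 && ret == &args);
    return args.result;
}
```

The args block lives on the stack of run_calc, which is safe here only because run_calc joins the thread before returning.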
Scheduling User Space Threads
• User threads exist within a process; they are managed
by the process
• Unlike process switching, there is no time slice for
each thread; a thread has to call the thread-switching
routine explicitly
• When a thread makes an explicit switch call, another thread
gets the opportunity to run
• A thread that never makes the call can hog the CPU and starve
the other threads
• Switching between user space threads is usually fast
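Explicit switching between user space threads can be sketched with the ucontext routines (getcontext, makecontext, swapcontext). This is an illustration only, assuming Linux with glibc, where these routines are available; it is not how any particular thread library is implemented, and the names worker, thread_yield and run_cooperative_threads are made up for the example.

```c
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)
#define N_THREADS  2
#define STEPS      3

static ucontext_t sched_ctx;                 /* context of the scheduler loop */
static ucontext_t thr_ctx[N_THREADS];        /* one context per user thread */
static char stacks[N_THREADS][STACK_SIZE];   /* one private stack per thread */
static int finished[N_THREADS];
static int trace[N_THREADS * STEPS];         /* order in which threads ran */
static int trace_len;

/* The explicit switch: save this thread and resume the scheduler. */
static void thread_yield(int id) {
    swapcontext(&thr_ctx[id], &sched_ctx);
}

static void worker(int id) {
    for (int step = 0; step < STEPS; step++) {
        trace[trace_len++] = id;             /* do one unit of "work" */
        thread_yield(id);                    /* without this, nobody else runs */
    }
    finished[id] = 1;                        /* returning resumes uc_link */
}

/* Round-robin scheduler; returns the number of work units executed. */
int run_cooperative_threads(void) {
    trace_len = 0;
    for (int i = 0; i < N_THREADS; i++) {
        finished[i] = 0;
        int rc = getcontext(&thr_ctx[i]);
        assert(rc == 0);
        thr_ctx[i].uc_stack.ss_sp = stacks[i];
        thr_ctx[i].uc_stack.ss_size = STACK_SIZE;
        thr_ctx[i].uc_link = &sched_ctx;     /* where to go when worker returns */
        makecontext(&thr_ctx[i], (void (*)(void))worker, 1, i);
    }
    int remaining = N_THREADS;
    while (remaining > 0) {
        for (int i = 0; i < N_THREADS; i++) {
            if (finished[i])
                continue;
            swapcontext(&sched_ctx, &thr_ctx[i]);   /* run thread i until it yields */
            if (finished[i])
                remaining--;
        }
    }
    return trace_len;
}
```

No timer or kernel involvement appears anywhere in the sketch: if worker dropped its call to thread_yield, it would run all of its steps before the other thread got a turn.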
Kernel Space (OS-managed) Thread
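OS-supported Multithreading
#include <pthread.h>
#include <stdint.h>

void *Calc(void *);

int main(void) {
    int ret, param = 1;
    pthread_t tid;
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    /* request a thread scheduled by the OS (system contention scope) */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    ret = pthread_create(&tid, &attr, Calc, (void *)(intptr_t)param);
    if (ret == 0)
        ret = pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return ret;
}

void *Calc(void *param) {
    int a = 0, b;
    b = a + (int)(intptr_t)param;
    (void)b;
    return NULL;
}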
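Every pthread call used here reports failure through its return value, so a more defensive variant might check each step and read the scope back with pthread_attr_getscope. The following is a sketch under assumptions: the names noop and run_system_scope_thread are hypothetical, and the platform is assumed to accept PTHREAD_SCOPE_SYSTEM (as Linux does).

```c
#include <assert.h>
#include <pthread.h>

/* Trivial thread body used only for this illustration. */
static void *noop(void *arg) {
    return arg;
}

/* Returns 0 on success, or the error number of the first failing call. */
int run_system_scope_thread(void) {
    pthread_attr_t attr;
    pthread_t tid;
    int scope = -1;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_attr_getscope(&attr, &scope);   /* read the setting back */
    if (rc == 0) {
        assert(scope == PTHREAD_SCOPE_SYSTEM);
        rc = pthread_create(&tid, &attr, noop, NULL);
    }
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

On a system that does not support the requested scope, pthread_attr_setscope returns a non-zero error number and the sketch returns it without creating a thread.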
Scheduling OS-supported Threads
Threads are switched on:
Time slicing
System calls
Traps
The switching overhead lies between that of processes
and that of user space threads
Dealing with Multithreading
Problems with concurrency
Race conditions – threads trying to update the same
data at the same time
    thread 1          thread 2
    A writes X        B writes X
Deadlock – to avoid race conditions threads lock
data, and each thread can then end up waiting for data the other has locked
    thread 1          thread 2          (time runs downwards)
    A locks X         B locks Y
    A requests Y      B requests X
    A unlocks X       B unlocks Y
Starvation – low priority threads don’t get scheduled
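The race condition case can be made concrete with two threads that update the same variable without any protection. This is a sketch for illustration only: the names shared_x, writer and run_racy_writers are hypothetical, and unsynchronised access of this kind is formally undefined behaviour in C, so the only thing that can be said about the result is what is typically observed.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS 100000

/* Deliberately unprotected shared data ("X" in the slide). */
static volatile long shared_x;

/* Each thread repeatedly reads X, adds one, and writes X back. */
static void *writer(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        shared_x = shared_x + 1;        /* read-modify-write, not atomic */
    return NULL;
}

/* Run two writers at the same time and return the final value of X. */
long run_racy_writers(void) {
    pthread_t a, b;
    shared_x = 0;

    int rc = pthread_create(&a, NULL, writer, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_x;
}
```

On a multicore machine the returned value is typically below 2 * INCREMENTS, because updates made by one thread between the other thread's read and write are lost; the exact value varies from run to run.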
Coordination among threads
In parallel computing, threads run
asynchronously (i.e., at their own pace),
but they synchronise at certain points, for
example when accessing global shared memory
locations.
Two types of synchronisation: mutual exclusion
and cooperation
Techniques for synchronisation
Mutex (semaphore, lock) to address mutual
exclusion
wait and notify to address cooperation
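Cooperation through wait and notify can be sketched with a POSIX condition variable, where pthread_cond_wait plays the role of wait and pthread_cond_signal the role of notify. The names producer, wait_for_value, ready and shared_value are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready;            /* condition the waiting thread depends on */
static int shared_value;     /* data handed from one thread to the other */

/* Publishes a value, then notifies the waiting thread. */
static void *producer(void *arg) {
    int value = *(int *)arg;
    pthread_mutex_lock(&lock);
    shared_value = value;
    ready = 1;
    pthread_cond_signal(&ready_cond);       /* notify */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts a producer and waits until it has published its value. */
int wait_for_value(int value) {
    pthread_t tid;
    ready = 0;                              /* no other thread exists yet */

    int rc = pthread_create(&tid, NULL, producer, &value);
    assert(rc == 0);

    pthread_mutex_lock(&lock);
    while (!ready)                          /* re-check: wake-ups may be spurious */
        pthread_cond_wait(&ready_cond, &lock);   /* wait (releases the lock) */
    int seen = shared_value;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return seen;
}
```

The waiting thread tests the ready flag in a loop rather than once, so the sketch is also correct when the producer signals before the waiter reaches pthread_cond_wait.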
Mutual exclusion
• Critical section – a section of code that accesses global
shared data and therefore should be executed by only one
thread at a time
• Mutex (or lock) – used to enforce mutual exclusion of
threads in a critical section
• A mutex has two states: locked and unlocked.
Initially it is unlocked; a thread locks the mutex when entering
the critical section and unlocks it when exiting
If another thread attempts to lock a mutex that is already locked, it
blocks until the mutex is unlocked
o so only one thread can be in the critical section at a
time
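The lock and unlock steps around a critical section can be sketched with a POSIX mutex protecting a shared counter. This is a minimal sketch, not the course's own example; the names counter_mutex, add_to_counter and run_locked_counter, and the thread and iteration counts, are assumptions made for the illustration.

```c
#include <assert.h>
#include <pthread.h>

#define N_THREADS  4
#define INCREMENTS 10000

static pthread_mutex_t counter_mutex;   /* guards counter */
static long counter;                    /* global shared data */

static void *add_to_counter(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_mutex);     /* enter critical section */
        counter++;                              /* one thread at a time here */
        pthread_mutex_unlock(&counter_mutex);   /* leave critical section */
    }
    return NULL;
}

/* Runs N_THREADS threads that all update counter; returns the final value. */
long run_locked_counter(void) {
    pthread_t tids[N_THREADS];
    counter = 0;

    int rc = pthread_mutex_init(&counter_mutex, NULL);  /* starts unlocked */
    assert(rc == 0);

    for (int i = 0; i < N_THREADS; i++) {
        rc = pthread_create(&tids[i], NULL, add_to_counter, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&counter_mutex);
    return counter;
}
```

Because every increment happens with the mutex held, no update is lost and the result is always N_THREADS * INCREMENTS.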
pthread_mutex_t my_mutex; /* declared in global area */
void *Calc(void *);
int main(void) {
pthread_mutex_init(&my_mutex, NULL); /* before pthread_create */
for(ith=0; ith