UNIS Template
Concurrency with Threads and Synchronization
Shuaiwen
Future System Architecture Lab (FSA)
Sep 17th, 2021
To learn what a thread is
To understand the difference between processes and threads
To learn about programming with threads using the pthread library
Data Race and Race condition
Protecting shared resource: Synchronization, Atomic Operations, Immutable Data
Synchronization: Semaphores, Mutex, Conditional Variables, read-write locks, spin locks, etc.
Objectives
Clean Sharing Model
Global variables (address spaces) are not shared
File tables (file descriptors) are shared
Simple and straightforward
Forking a child process is easy
Reaping child processes is simple
Processes are isolated for protection
Process Pros
Forking a process is expensive
Need to clone parent’s memory space
Need to setup a separate context for child
Process Cons
Note: the concept of context switch and why it has high overhead: https://en.wikipedia.org/wiki/Context_switch#:~:text=In%20computing%2C%20a%20context%20switch,of%20a%20multitasking%20operating%20system.
Context switch concept
4
Forking a process is expensive
Need to clone parent’s memory space
Need to setup a separate context for child
Non-trivial to share data between processes
Need to use files/pipes to share data
Can use shared memory
Lots of overhead!
~20K cycles to create and reap a process
Process Cons
Process = context + code, data, and stack
shared libraries
0
Program context: Data registers Condition codes Stack pointer (SP)
Program counter (PC)
Kernel context:
VM structures Descriptor table
Code, data, and stack
run-time heap read/write data read-only code/data
stack
SP
PC
Process context
View of a Process
Process = context + code, data, and stack
shared libraries
0
Program context: Data registers Condition codes Stack pointer (SP)
Program counter (PC)
Kernel context:
VM structures Descriptor table
Code, data, and stack
run-time heap read/write data read-only code/data
stack
SP
PC
Process context
View of a Process
0
Thread context: Data registers Condition codes
Stack pointer (SP) Program counter (PC)
run-time heap read/write data read-only code/data
stack
SP
PC
Process = thread + code, data, and kernel context
Thread (main thread) Code and Data
shared libraries
Kernel context:
VM structures Descriptor table
Alternate View of a Process
Registers: for record the call stack, hat stores the address of the last program request in a stack
Small register: to record the address of the currently executed instruction . Will increase by 1 everytime an instruction is feteched.
8
Multiple threads can be associated with a process
– Each thread has its own logical control flow
shared libraries
0
Thread 1 context: Data registers Condition codes SP1
PC1
Shared code and data
run-time heap read/write data read-only code/data
stack 1
Thread 1 (main thread)
Kernel context:
VM structures Descriptor table
Thread 2 context: Data registers Condition codes SP2
PC2
stack 2
Thread 2 (peer thread)
A Process With Multiple Threads
Multiple threads can be associated with a process
Each thread has its own logical control flow
Each thread shares the same code, data, and kernel context
shared libraries
0
Thread 1 context: Data registers Condition codes SP1
PC1
Shared code and data
run-time heap read/write data read-only code/data
stack 1
Thread 1 (main thread)
Kernel context:
VM structures Descriptor table
Thread 2 context: Data registers Condition codes SP2
PC2
stack 2
Thread 2 (peer thread)
A Process With Multiple Threads
Multiple threads can be associated with a process
Each thread has its own logical control flow
Each thread shares the same code, data, and kernel context
Each thread has its own stack for local variables
shared libraries
0
Thread 1 context: Data registers Condition codes SP1
PC1
Shared code and data
run-time heap read/write data read-only code/data
stack 1
Thread 1 (main thread)
Kernel context:
VM structures Descriptor table
Thread 2 context: Data registers Condition codes SP2
PC2
stack 2
Thread 2 (peer thread)
A Process With Multiple Threads
Multiple threads can be associated with a process
Each thread has its own logical control flow
Each thread shares the same code, data, and kernel context
Each thread has its own stack for local variables
but not protected from other threads
shared libraries
0
Thread 1 context: Data registers Condition codes SP1
PC1
Shared code and data
run-time heap read/write data read-only code/data
stack 1
Thread 1 (main thread)
Kernel context:
VM structures Descriptor table
Thread 2 context: Data registers Condition codes SP2
PC2
stack 2
Thread 2 (peer thread)
A Process With Multiple Threads
Multiple threads can be associated with a process
Each thread has its own logical control flow
Each thread shares the same code, data, and kernel context
Each thread has its own stack for local variables
but not protected from other threads
Each thread has its own thread id (TID)
0
Thread 1 context: Data registers Condition codes SP1
PC1
run-time heap read/write data read-only code/data
stack 1
Thread 1 (main thread)
Kernel context:
VM structures Descriptor table
Thread 2 context: Data registers Condition codes SP2
PC2
stack 2
Shared code and data Thread 2 (peer thread)
shared libraries
A Process With Multiple Threads
(typically functions)
Process
Processes
Represent an entire program
Threads
Represents smaller chunks of code
Threads
OS: Concept of a Thread
(typically functions)
Process
Processes
Represent an entire program
High Overhead context switch
Threads
Represents smaller chunks of code
Lower Overhead context switch
Threads
OS: Concept of a Thread
Threads associated with process form a pool of peers
– Unlike processes which form a tree hierarchy
P0
P1
sh
sh
sh
foo
bar
T1
Process hierarchy
Threads associated with process foo
T2
T4
T5
T3
shared code, data and kernel context
Logical View of Threads
How threads and processes are similar
Each has its own logical control flow
Each can run concurrently with others
Each is context switched
Threads vs. Processes
How threads and processes are similar
Each has its own logical control flow
Each can run concurrently with others
Each is context switched
How threads and processes are different
Threads share code and some data
Processes (typically) do not
Threads vs. Processes
How threads and processes are similar
Each has its own logical control flow
Each can run concurrently with others
Each is context switched
How threads and processes are different
Threads share code and some data
Processes (typically) do not
Threads are somewhat less expensive than processes
Process control is twice as expensive as thread control
Linux numbers:
~20K cycles to create and reap a process
~10K cycles (or less) to create and reap a thread
Threads vs. Processes
19
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Time
Thread A
Thread B
Thread C
Concurrent Threads
Two threads are concurrent if their flows overlap in time
Otherwise, they are sequential
Examples:
Concurrent: A & B, A&C
Sequential: B & C
Time
Thread A
Thread B
Thread C
Concurrent Threads
Single Core Processor
– Simulate parallelism by time slicing
Time
Thread A Thread B Thread C
Multi-Core Processor
– Can have true parallelism
Thread A Thread B Thread C
Run 3 threads on 2 cores
Concurrent Threads
Right: Memory stalls
28
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Posix Threads (Pthreads) Interface
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Creating and reaping threads
pthread_create()
pthread_join()
Posix Threads (Pthreads) Interface
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Creating and reaping threads
pthread_create()
pthread_join()
Can only wait for a specific thread to terminate
Posix Threads (Pthreads) Interface
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Creating and reaping threads
pthread_create()
pthread_join()
Determining your thread ID
pthread_self()
Posix Threads (Pthreads) Interface
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Creating and reaping threads
pthread_create()
pthread_join()
Determining your thread ID
pthread_self()
Terminating threads
pthread_cancel()
Posix Threads (Pthreads) Interface
Pthreads: Standard interface for ~60 functions that manipulate threads from C programs
Creating and reaping threads
pthread_create()
pthread_join()
Determining your thread ID
pthread_self()
Terminating threads
pthread_exit() [terminates current threads]
exit() [terminates all threads]
Posix Threads (Pthreads) Interface
/*
* hello.c – Pthreads “hello, world” program
*/
#include “csapp.h”
void *thread(void *vargp); int main() {
pthread_t tid;
thread,
NULL);
Pthread_create(&tid, NULL, Pthread_join(tid, NULL); exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
The Pthreads “hello, world” Program
program
/*
* hello.c – Pthreads “hello, world”
*/
#include “csapp.h”
void *thread(void *vargp);
int main() { pthread_t tid;
thread,
NULL);
Pthread_create(&tid, NULL, Pthread_join(tid, NULL); exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
thread routine that
the new thread executes
The Pthreads “hello, world” Program
/*
* hello.c – Pthreads “hello, world” program
*/
#include “csapp.h”
void *thread(void *vargp);
int main() { pthread_t tid;
thread,
NULL);
Pthread_create(&tid, NULL, Pthread_join(tid, NULL); exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
thread id
The Pthreads “hello, world” Program
/*
* hello.c – Pthreads “hello, world” program
*/
#include “csapp.h”
void *thread(void *vargp);
int main() { pthread_t tid;
Pthread_create(&tid, NULL, thread, NULL); Pthread_join(tid, NULL);
exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
Thread attributes (usually NULL)
The Pthreads “hello, world” Program
/*
* hello.c – Pthreads “hello, world” program
*/
#include “csapp.h”
void *thread(void *vargp);
int main() { pthread_t tid;
Pthread_create(&tid, NULL, thread, NULL); Pthread_join(tid, NULL);
exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
Thread arguments (void *p)
The Pthreads “hello, world” Program
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
program
/*
* hello.c – Pthreads “hello, world”
*/
#include “csapp.h”
void *thread(void *vargp);
int main() { pthread_t tid;
thread,
NULL);
Pthread_create(&tid, NULL, Pthread_join(tid, NULL); exit(0);
}
return value
(void **p)
The Pthreads “hello, world” Program
/*
* hello.c – Pthreads “hello, world” program
*/
#include “csapp.h”
void *thread(void *vargp); int main() {
pthread_t tid;
thread,
NULL);
Pthread_create(&tid, NULL, Pthread_join(tid, NULL); exit(0);
}
/* thread routine */
void *thread(void *vargp) { printf(“Hello, world!\n”); return NULL;
}
The Pthreads “hello, world” Program
main thread
Execution of Threaded“hello, world”
main thread
call Pthread_create()
Execution of Threaded“hello, world”
main thread
call Pthread_create()
peer thread
Execution of Threaded“hello, world”
main thread
call Pthread_create()
peer thread
call Pthread_join()
main thread waits for peer thread to terminate
Execution of Threaded“hello, world”
main thread
call Pthread_create()
peer thread
main thread waits for peer thread to terminate
call Pthread_join()
printf() return NULL;
(peer thread terminates)
Execution of Threaded“hello, world”
main thread
call Pthread_create()
peer thread
main thread waits for peer thread to terminate
call Pthread_join()
printf() return NULL;
(peer thread terminates)
Pthread_join() returns
Execution of Threaded“hello, world”
main thread
call Pthread_create()
peer thread
main thread waits for peer thread to terminate
call Pthread_join()
printf() return NULL;
(peer thread terminates)
Pthread_join() returns
exit()
terminates main thread and any peer threads
Execution of Threaded“hello, world”
int main()
{
int i; pthread_t tid;
for (i = 0; i < 2; i++)
pthread_create(&tid, NULL, thread, (void *)&i); pthread_exit(NULL);
}
void *thread(void *vargp)
{
int myid = (int)vargp; printf("hello from [%d]\n", myid); return NULL;
}
Passing argument
int main()
{
int i;
pthread_t tid;
for (i = 0; i < 2; i++)
pthread_create(&tid, NULL, thread, (void *)&i); pthread_exit(NULL);
}
void *thread(void *vargp)
{
int myid = *((int *)vargp); printf("hello from [%d]\n", myid); return NULL;
}
Passing argument
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg); printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int) tid); pthread_join(tid, NULL);
printf("Main thread joined with peer thread.\n");
return 0;
}
void *thread (void *vargp) { int arg = *((int*) vargp);
printf("hello from thread %X!\n", arg); return NULL;
}
Passing argument
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg); printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int) tid); pthread_join(tid, NULL);
printf("Main thread joined with peer thread.\n");
return 0;
}
void *thread (void *vargp) { int arg = *((int*) vargp);
printf("hello from thread %X!\n", arg); return NULL;
}
Passing argument
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg); printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int) tid); pthread_join(tid, NULL);
printf("Main thread joined with peer thread.\n");
return 0;
}
void *thread (void *vargp) { int arg = *((int*) vargp);
printf("hello from thread %X!\n", arg); return NULL;
}
Passing argument
int main (int argc, char *argv[]) { pthread_t tid;
int ret_value;
pthread_create(&tid, NULL, thread, NULL); pthread_join(tid, (void **)(&ret_value)); printf("Main thread joined with peer thread: %X.\n",
ret_value);
return 0;
}
void *thread (void *vargp) { return (int *) 0xDEADBEEF;
}
Returning value
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
void *thread (void *vargp) { int arg = *((int*)vargp);
printf("hello from thread: %X!\n", arg); int *ret = (int*)malloc(sizeof(int));
*ret = arg; return ret;}
%X.\n",
What if I don’t use
malloc?
Returning pointer
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, &thread_arg);
printf("Main thread waiting for peer thread %u to
complete...\n", (unsigned int)(tid)); pthread_join(tid, (void **)(&ret_value));
printf("Main thread joined with peer thread:%X.\n",
*ret_value); free(ret_value); return 0; }
i-clicker: What is the output?
void *thread (void *vargp) { int arg = *((int*)vargp); return &arg;}
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
pthread_create(&tid, NULL, thread, &thread_arg); pthread_join(tid, (void **)(&ret_value)); printf("%X\n", *ret_value);
return 0; }
A. 0xDEADBEEF
Segfault
0
-1
Returning pointer
i-clicker solution: What is the output?
void *thread (void *vargp) { int arg = *((int*)vargp); return &arg;}
int main (int argc, char *argv[]) { pthread_t tid;
int thread_arg = 0xDEADBEEF; int* ret_value;
pthread_create(&tid, NULL, thread, &thread_arg); pthread_join(tid, (void **)(&ret_value)); printf("%X\n", *ret_value);
return 0; }
A. 0xDEADBEEF
Segfault
0
-1
Returning pointer
void *thread (void *vargp) { pthread_detach(pthread_self()); printf("hello from thread world!\n");
}
int main (int argc, char *argv[]) {
// This is used to record the thread id: pthread_t tid;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, NULL);
pthread_exit(NULL); return 0;
}
Detached mode
void *thread (void *vargp) { pthread_detach(pthread_self()); printf("hello from thread world!\n");
}
int main (int argc, char *argv[]) {
// This is used to record the thread id: pthread_t tid;
printf("Main thread creating peer thread.\n"); pthread_create(&tid, NULL, thread, NULL);
pthread_exit(NULL); return 0;
}
Detached mode
Detached mode
Run thread in “detached” mode.
Runs independently of other threads
Reaped automatically (by kernel) when it terminates
without the need for another thread to join with the terminated thread
void *thread (void *vargp) { pthread_detach(pthread_self()); printf("hello from thread world!\n");
}
Detached mode
Ohterwise, parent thread exits while child threads are still executing. This will report errors.
66
Easy to share global data structure
Easy to implement a cache that all the threads can use
Concurrent Programming is Nice!
The human mind tends to be sequential
The notion of time is often misleading
Concurrent Programming is Hard!
The human mind tends to be sequential
The notion of time is often misleading
Thinking about all possible sequences of events in a computer system is at least error prone and frequently impossible
Concurrent Programming is Hard!
Classical problem classes of concurrent programs:
Races: outcome depends on arbitrary scheduling decisions elsewhere in the system
Example: who gets the last seat on the airplane?
Concurrent Programming is Hard!
Threads share the heap
Threads have separate stacks
But, C has pointers
which can point to anywhere, even the stack!
What if…
I pass a pointer to a peer thread that points to a location on the stack
Unintended Sharing!
Unintended Sharing
int main (int argc, char *argv[]) { pthread_t tid;
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
pthread_exit(NULL); return 0;
}
void *thread (void *vargp) { pthread_detach(pthread_self()); int arg = *((int*) vargp);
printf("hello from thread %d!\n", arg); return NULL;
}
Unintended sharing
int main (int argc, char *argv[]) { pthread_t tid;
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
pthread_exit(NULL); return 0;
}
void *thread (void *vargp) { pthread_detach(pthread_self()); int arg = *((int*) vargp);
printf("hello from thread %d!\n", arg); return NULL;
}
Unintended sharing
main thread
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
Main thread stack
i
Potential Form of Unintended Sharing
main thread
peer1
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
i
Main thread stack
vargp
Peer1 stack
i = i1
Potential Form of Unintended Sharing
main thread
peer1
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
i
Main thread stack
vargp
Peer1 stack
i = i1
peer2
i = i2
Peer2 stack
vargp
Potential Form of Unintended Sharing
main thread
peer1
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
i
Main thread stack
vargp
Peer1 stack
i = i1
peer2
i = i2
Peer2 stack
vargp
*vargp
Potential Form of Unintended Sharing
main thread
peer1
for(int i=0; i<10; i++) { pthread_create(&tid, NULL, thread, &i);
}
i
Main thread stack
vargp
Peer1 stack
i = i1
peer2
i = i2
vargp
Peer2 stack
*vargp
*vargp
Potential Form of Unintended Sharing
int main (int argc, char *argv[]) { pthread_t tid;
int *argp;
for(int i=0; i<10; i++) {
argp = (int*)malloc(sizeof(int));
*argp = i;
pthread_create(&tid, NULL, thread, argp);
}
pthread_exit(NULL); return 0;
}
void *thread (void *vargp) { pthread_detach(pthread_self()); int arg = *((int*) vargp);
printf("hello from thread %d!\n", arg); free(vargp); return NULL;}
Fix unintended sharing
A data race occurs when two or more threads can access shared data and one of them is a write. Because the thread scheduling algorithm can swap between threads at any time, you don’t know the order in which the threads will attempt to access the shared data.
Data Race
Race Condition
81
A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering
of events that leads to erroneous program behavior.
Consider the following simple example where x is a shared variable:
Tools that can help you with Synchronization Problems
82
Given the non-deterministic, (i.e. extremely hard) nature of multithreaded applications, people have developed interesting tools to detect errors and potential pitfalls in concurrent code. Projects like Google's TSan or Helgrind are just a few of them.
83
Global variables
Def: Variable declared outside of a function
Virtual memory contains exactly one instance of any global variable
Local variables
Def: Variable declared inside function without static attribute
Each thread stack contains one instance of each local variable
Local static variables
Def: Variable declared inside function with the static attribute
Virtual memory contains exactly one instance of any local static variable.
Mapping Variable Instances to Memory
char **ptr; /* global */
int main()
{
"Hello "Hello
from from
int i; pthread_t tid;
char *msgs[2] = {
foo", bar"
};
ptr = msgs;
for (i = 0; i < 2; i++) Pthread_create(&tid,
NULL,
thread, (void *)i);
Pthread_exit(NULL);
}
/* thread routine */
void *thread(void *vargp)
{
int myid = (int)vargp; static int cnt = 0;
printf("[%d]: %s (svar=%d)\n", myid, ptr[myid], ++cnt);
}
Global var: 1 instance (ptr [data])
Local vars: 1 instance (i.m, msgs.m)
Local var: 2 instances (
myid.p0 [peer thread 0’s stack],
myid.p1 [peer thread 1’s stack]
)
Local static var: 1 instance (cnt [data])
84
Mapping Variable Instances to Memory
Is this program 100% correct?
I and myid are not shared.
84
85
Shared variables are handy...
…but introduce the possibility of nasty synchronization
errors.
Synchronizing Threads
Data Race
86
/* Global shared variable */
long cnt = 0; /* Counter */
int main(int argc, char **argv)
{
long niters; pthread_t tid1, tid2;
niters = atoi(argv[1]); pthread_create(&tid1, NULL,
thread, &niters); pthread_create(&tid2, NULL,
thread, &niters); pthread_join(tid1, NULL); pthread_join(tid2, NULL);
/* Thread routine */ void *thread(void *vargp)
{
long i, niters =
*((long *)vargp);
for (i = 0; i < niters; i++) cnt++;
return NULL;
/* Check result */
if (cnt != (2 * niters)) printf("BOOM! cnt=%ld\n", cnt);
else
printf("OK cnt=%ld\n", cnt); exit(0);
badcnt.c
}
}
linux> ./main 1000000 BOOM! cnt=1069894
What went wrong?
cnt should equal 2,000,000.
for (i = 0; i < niters; i++) cnt++;
C code for counter loop in thread i
cnt(%rip),%rdx
$1, %rdx
%rdx, cnt(%rip)
$1, %rax
%rcx, %rax
.L3
movq (%rdi), %rcx testq %rcx,%rcx jle .L2
movl $0, %eax
.L3:
movq addq movq addq cmpq jne
.L2:
Hi : Head
i
T : Tail
Li : Load cnt Ui : Update cnt Si : Store cnt
Asm code for thread i
87
Assembly Code for Counter Loop
movq addq movq
88
cnt(%rip),%rdx
$1, %rdx
%rdx, cnt(%rip)
Suppose that cnt starts with value 0, and that two threads each execute the code below once. What are the possible values for cnt afterward?
Only 2
1 or 2
0 or 1 or 2
None of the above
Question
movq addq movq
89
cnt(%rip),%rdx
$1, %rdx
%rdx, cnt(%rip)
Suppose that cnt starts with value 0, and that two threads each execute the code below once. What are the possible values for cnt afterward?
Only 2
1 or 2
0 or 1 or 2
None of the above
Question
90
Key idea: In general, any sequentially consistent interleaving is possible, but some give an unexpected result!
Concurrent Execution
atomicity — if your code contains instructions that operate on data shared across multiple threads, an unregulated concurrent access to that data might trigger a data race. The code segment that contains those instructions is called critical section. You want to make sure that critical sections are executed atomically: an atomic operation can't be broken into smaller ones, so that while a thread is executing it no other thread can slip through;
ordering — sometimes you want two or more threads to perform their job in a predictable order, or put a restriction on how many threads can access a specific resource. Normally you don't have control over these things, which might be the root cause of race conditions. With synchronization you can orchestrate your threads to perform according to a plan.
91
We must synchronize the execution of the threads so that they can never have overlap critical sections.
i.e., need to guarantee mutually exclusive access for each critical section.
Enforcing Mutual Exclusion
92
We must synchronize the execution of the threads so that they can never have overlap critical sections.
i.e., need to guarantee mutually exclusive access for each critical section.
Solutions:
Semaphores
Mutex locks
Condition variables
Atomic operations
Enforcing Mutual Exclusion
93
Semaphore: non-negative global integer synchronization variable.
Unlike mutex (a locking mechanism), it is a signaling mechanism. It guarantees both ordering and atomicity.
Unlike a mutex, any thread can release a semaphore, not only the one that has acquired it in the first place.
Semaphores
94
A semaphore controls the access to a shared resource: the counter determines the maximum number of threads that can simultaneously access it. At the beginning of your program, when the semaphore gets initialized, you choose that number according to your needs. Then, a thread that wants to access a shared resource calls acquire:
if the counter is greater than zero the thread can proceed. The counter gets reduced by one right away, then the current thread starts doing its job. When done, it calls release which in turn increases the counter by one.
if the counter is equal to zero the thread cannot proceed: other threads have already filled up the available space. The current thread is put to sleep by the operating system and will wake up when the semaphore counter becomes greater than zero again (that is when any other thread calls release once its job is done).
Unlike a mutex, any thread can release a semaphore, not only the one that has acquired it in the first place.
Semaphore: non-negative global integer synchronization variable.
semaphore
95
Semaphores
Semaphore: non-negative global integer synchronization variable.
semaphore
2
96
Semaphores
Semaphore: non-negative global integer synchronization variable.
semaphore
1
97
Semaphores
Semaphore: non-negative global integer synchronization variable.
semaphore
0
98
Semaphores
Semaphore: non-negative global integer synchronization variable.
semaphore
1
99
Semaphores
100
Semaphore: non-negative global integer synchronization variable. Manipulated by P and V operations.
P(s) ---- sem_wait ()
If s is nonzero, then decrement s by 1 and return immediately.
Test and decrement operations occur atomically (indivisibly)
If s is zero, then suspend thread until s becomes nonzero and the thread is restarted by a V operation.
After restarting, the P operation decrements s and returns control to the caller.
V(s) ---- sem_post()
Increment s by 1.
Increment operation occurs atomically
If there are any threads blocked in a P operation waiting for s to become non- zero, then restart exactly one of those threads, which then completes its P operation by decrementing s.
Semaphore invariant: (s >= 0)
Semaphores
C Semaphore Operations
Pthreads functions:
#include
int sem_init(sem_t *s, 0, unsigned int val);} /* s = val */
101
int sem_wait(sem_t *s); /* P(s) */
int sem_post(sem_t *s); /* V(s) */
/* Global shared variable */
long cnt = 0; /* Counter */
int main(int argc, char **argv)
{
long niters; pthread_t tid1, tid2;
niters = atoi(argv[1]); Pthread_create(&tid1, NULL,
thread, &niters); Pthread_create(&tid2, NULL,
thread, &niters); Pthread_join(tid1, NULL); Pthread_join(tid2, NULL);
/* Check result */
if (cnt != (2 * niters)) printf(“BOOM! cnt=%ld\n”, cnt);
else
printf(“OK cnt=%ld\n”, cnt); exit(0);
}
/* Thread routine */ void *thread(void *vargp)
{
long i, niters =
*((long *)vargp);
for (i = 0; i < niters; i++) cnt++;
return NULL;
}
102
How can we fix this using semaphores?
Improper Synchronization
103
Basic idea:
Associate a unique semaphore mutex, initially 1, with each shared variable (or related set of shared variables).
Surround corresponding critical sections with P and V operations.
Using Semaphores for Mutual Exclusion
Terminology:
Binary semaphore: semaphore whose value is always 0 or 1
Mutex: binary semaphore used for mutual exclusion
P operation: “locking” the mutex
V operation: “unlocking” or “releasing” the mutex
“Holding” a mutex: locked and not yet unlocked.
Counting semaphore: used as a counter for set of available resources.
Proper Synchronization
Define and initialize a mutex for the shared variable cnt:
long cnt = 0; /* Counter */
sem_t mutex; /* Semaphore that protects cnt */
sem_init(&mutex, 0, 1); /* mutex = 1 */
Surround critical section with P and V:
for (i = 0; i < niters; i++) { sem_wait(&mutex);
cnt++; sem_post(&mutex);
} goodcnt.c
104
linux> ./goodcnt 1000000 OK cnt=1000000
linux> ./goodcnt 1000000 OK cnt=1000000
Binary Semaphore
A semaphore whose counter is restricted to the values 0 and 1 is called binary semaphore: only one thread at a time can access the shared resource. Wait: this is basically a mutex protecting a critical section! You can actually replicate the mutex behavior with a binary semaphore. However there are two important points to keep in mind:
a mutex can be unlocked only by thread that had locked it first, while a semaphore can be released from any other thread. This could lead to confusion and subtle bugs if what you want is just a locking mechanism;
semaphores are signaling mechanisms that orchestrate threads, while mutexes are locking mechanisms that protects shared resources. You should not use semaphores to protect shared resources, nor mutexes for signaling: your intent will be more clear to you and to anyone who will read your code.
105
Producer-Consumer with Semaphore
The producer-consumer scenario imposes the following constraints:
The producer thread must not overwrite the shared buffer when the previous task has not been picked up by a consumer thread.
The consumer threads must not pick up tasks until there is something present in the shared buffer.
Individual consumer threads should pick up tasks one at a time.
106
Producer
Producer
Producer
Buffer Queue
Consumer
Consumer
Consumer
Producer-Consumer with Semaphore
107
sem_t empty; // empty slot in buffer
sem_t occupied; // occupied slot in buffer
sem_t p_lock; // mutex for producer function
sem_t c_lock; // mutex for consumer function
void *producer(void *producer_thread_data) {
while (!done()) {
sem_wait(&empty);
sem_wait(&p_lock);
insert_into_queue();
sem_post(&p_lock);
sem_post(&occupied);
}
}
void *consumer(void *consumer_thread_data) {
while (!done()) {
sem_wait(&occupied);
sem_wait(&c_lock);
my_task = extract_from_queue();
sem_post(&c_lock);
sem_post(&empty);
process_task(my_task);
}
}
void main() {
/* declarations and initializations */
sem_init(&p_lock, 0, 1); /* mutex = 0 */
sem_init(&c_lock, 0, 1); /* mutex = 0 */
sem_init(&empty, 0, BUFFER_SIZE); /* mutex = 0 */
sem_init(&occupied, 0, 0); /* mutex = 0 */
……
}
Thread-safe
107
107
Programmers need a clear model of how variables are shared by threads.
Variables shared by multiple threads must be protected to ensure mutually exclusive access.
Semaphores are a fundamental mechanism for enforcing mutual exclusion.
Summary
/docProps/thumbnail.jpeg