ECS 150 – Synchronization
Prof. Joël Porquet-Lupine
UC Davis – 2020/2021
Copyright © 2017-2021 Joël Porquet-Lupine – CC BY-NC-SA 4.0 International License /
1 / 36
Threads (recap)
Memory sharing
Private processor registers Private stack
Shared global memory
Type of sharing Independent
Cooperating
Threads work on same areas of shared data
Threads work on distinct areas of shared data
int a[N], b[N], c[N], d[N], e[N];
do_mult1()
{
for (i = 0; i < n; i++)
a[i] += b[i] * c[i];
} do_mult2()
{
for (i = 0; i < n; i++)
a[i] += d[i] * e[i];
}
int main(void)
{
...
thread_create(do_mult1);
thread_create(do_mult2);
...
}
int a[N], b[N], c[N], d[N], e[N];
do_mult(p, m)
{
for (i = p; i < m; i++)
a[i] += b[i] * c[i] + d[i] * e[i];
}
int main(void)
{
...
thread_create(do_mult, 0, N/2);
thread_create(do_mult, N/2, N);
...
}
2 / 36
/
Concurrency issues
Example
Execution
Typical output:
Also possible (yet probably very rare)...
int x = 0;
void thread_a(void)
{
x = x + 1;
}
void thread_b(void)
{
x = x + 2;
}
int main(void)
{
thread_create(thread_a); thread_create(thread_b); thread_join(thread_a); thread_join(thread_b); printf("%d\n", x); return 0;
}
$ ./a.out 3
$ ./a.out 2
$ ./a.out 1
3 / 36
/
Concurrency issues
Indeterministic interleavings Thread scheduling
Indeterministic scheduling
A B
A B
Sequential execution, concurrent A
execution, parallelism, etc.
Instruction reordering
Compiler instruction reordering Hardware instruction reordering
B
void thread1(void) { ...
p = init();
init = true;
...
}
void thread2(void) { ...
while (init == false); q = docomputation(p); ...
}
Multi-word operations
Multi-word operations are not atomic
uint64_t num = 2; struct {
char c;
short int b; } s = { 'j', 23 };
4 / 36
/
Concurrency issues
Race conditions Definition
Race condition when the output of a concurrent program depends on the order of operations between threads.
Difficulties
Number of possible "interleavings" can be huge Some interleavings are good
Vast majority of them usually are Some interleavings are bad
They may even be extremely rare...
Solution?
Synchronization!
void thread_a(void)
{
x = x + 1;
}
void thread_b(void)
{
x = x + 2;
}
5 / 36
/
Too Much Milk!
Example
Roommate 1
Roommate 2
Arrive home
Look in fridge, out of milk
Leave for store
Arrive home
Arrive at store
Look in fridge, out of milk
Buy milk
Leave for store
Arrive home, put milk away
Buy milk
Arrive home, put milk away
Oh, no!
Required correctness properties
Safety: at most one person buys milk at a time Liveness: someone buys milk if needed
6 / 36
/
Too Much Milk!
1. Leaving a note
Roommate 1 (thread 1) Roommate 2 (thread 2)
if (note == 0) {
if (milk == 0) {
note = 1;
milk++;
note = 0;
} }
if (note == 0) {
if (milk == 0) {
note = 1;
milk++;
note = 0;
} }
Safety and liveness
Not safe if threads are descheduled right after the two tests
They would both put a note and get milk, resulting in two bottles of milk!
7 / 36
/
Too Much Milk!
2. Using two notes
Roommate 1 (thread 1) Roommate 2 (thread 2)
note1 = 1;
if (note2 == 0) {
if (milk == 0) { milk++;
} }
note1 = 0;
note2 = 1;
if (note1 == 0) {
if (milk == 0) { milk++;
} }
note2 = 0;
Safety and liveness
Not live if threads are descheduled right after setting their personal notes
They both think the other is on their way getting milk but no does eventually does
8 / 36
/
Too Much Milk!
3. Using asymmetric notes Roommate 1 (thread 1)
Roommate 2 (thread 2)
note1 = 1;
while (note2 == 1)
;
if (milk == 0) {
milk++; }
note1 = 0;
note2 = 1;
if (note1 == 0) {
if (milk == 0) { milk++;
} }
note2 = 0;
Safety and liveness
Yes, safe and live!
9 / 36
/
Too Much Milk!
3. Using asymmetric notes: explained
Roommate 1 (thread 1)
Roommate 2 (thread 2)
note1 = 1;
if (milk == 0) { milk++;
}
note1 = 0;
while (note2 == 1) ;
note2 = 1;
if (note1 == 0) {
if (milk == 0) { milk++;
} }
note2 = 0;
Issues
1. Way too over-engineered! 2. Asymmetrical, non-scalable 3. Involves busy-waiting
Peterson's algorithm
Share a single resource using only shared memory
Symmetric and scalable But still quite complex... More here
10 / 36
/
Critical section
Definition
Piece of code where the shared resource is accessed Needs to be properly protected to avoid race conditions Cannot be executed by more than one thread at a time
Thread 1
Thread 2
...
CS_enter();
Critical section
CS_exit();
...
...
CS_enter();
Critical section
CS_exit();
...
Correctness properties
1. Safety
2. Liveness
3. Bounded waiting 4. Failure atomicity
Mutual exclusion
Property of concurrency control
Requirement that only one thread can enter critical section at a time
Active thread excludes its peers
11 / 36
/
Critical section
Formalizing "Too Much Milk!"
Shared variable Safety property Liveness property
Roommate 1 (thread 1)
Roommate 2 (thread 2)
note1 = 1; /*
while (note2 == 1) * CS_enter()
; */ if (milk == 0) { /*
milk++; * CS
} */
note1 = 0; /* CS_exit() */
note2 = 1; /* CS_enter()
if (note1 == 0) { */
if(milk==0){ /* milk++; * CS
} */ }
note2 = 0; /* CS_exit() */
12 / 36
/
ECS 150 - Synchronization
Prof. Joël Porquet-Lupine
UC Davis - 2020/2021
Copyright © 2017-2021 Joël Porquet-Lupine - CC BY-NC-SA 4.0 International License /
13 / 36
Recap
Race condition
Output of concurrent program depends on order of operations between threads
Critical section
void thread_1(void) {
note1 = 1; // v
while (note2 == 1) // CS_enter()
;
if (milk == 0) {
milk++; }
note1 = 0;
}
void thread_2(void) {
// ^
// v
// CS
// ^
// CS_exit()
}
milk++; }
}
note2 = 0;
// CS // ^
// CS_exit()
note2 = 1; if(note1==0){ //^
// CS_enter() if(milk==0){ //v
void thread_a(void) void thread_b(void) {{
x = x + 1; x = x + 2; }}
$ ./a.out $ ./a.out 32
$ ./a.out 1
Indeterministic concurrent execution
Thread scheduling
A B
A B
A B
Instruction reordering Multi-word operations
Shared variable Safety property Liveness property Mutual exclusion
14 / 36
/
Locks
Definition
A lock is a synchronization variable that provides mutual exclusion Two states: locked and free (initial state is generally free)
API
lock() or acquire()
Wait until lock is free, then grab it
unlock() or release()
Unlock, and allow one of the threads waiting in acquire to proceed
int milk;
int main(void)
{
thread_create(roommate_fcn);
thread_create(roommate_fcn);
...
}
void roommate_fcn(void)
{
...
lock();
/* Critical section */
if (!milk) milk++;
unlock();
... }
15 / 36
/
Locks
Simple uniprocessor implementation
Race conditions are coming from indeterministic scheduling
Breaks atomicity of instruction sequence
Caused by preemption (i.e. timer interrupt) Solution: disable the interrupts!
void lock() {
disable_interrupts();
}
void unlock() {
enable_interrupts();
}
Issues
Only works on uniprocessor systems Dangerous to have unpreemptable code Cannot be used by user applications
16 / 36
/
Multiprocessor spinlocks
Hardware support
Test-and-set hardware primitive to provide mutual exclusion a.k.a Read-modify-write operation
Typically relies on a multi-cycle bus operation that atomically reads and updates a memory location
Multiprocessor support
/* Equivalent of a test&set hardware tion in software */
ATOMIC int test_and_set(int *mem)
{
int oldval = *mem; *mem = 1;
return oldval;
}
instruc
CPU
CPU
Mem
Implementation
void spinlock_lock(int *lock)
{
while (test_and_set(lock) == 1);
}
void spinlock_unlock(int *lock)
{
*lock = 0;
}
17 / 36
/
Multiprocessor spinlocks
Revisiting "Too Much Milk!"
int lock = 0; Thread 1
Thread 2
spinlock_lock(&lock); if (milk == 0) {
milk++; }
spinlock_unlock(&lock);
spinlock_lock(&lock); if (milk == 0) {
milk++; }
spinlock_unlock(&lock);
Thread 3
Thread 4
spinlock_lock(&lock); if (milk == 0) {
milk++; }
spinlock_unlock(&lock);
spinlock_lock(&lock); if (milk == 0) {
milk++; }
spinlock_unlock(&lock);
18 / 36
/
Multiprocessor spinlocks
Issue
Busy-waiting wastes cpu cycles
Only to reduce latency
Solution
"Cheap" busy-waiting
Yield/sleep when unable to get the lock, instead of looping
Better primitives
Block waiting threads until they can proceed
void spinlock_lock(int *lock)
{
while (test_and_set(lock) == 1);
}
void lock(int *lock)
{
while (test_and_set(lock) == 1)
}
yield(); //or, sleep(N);
void lock(int *lock)
{
while (test_and_set(lock) == 1)
block(lock);
}
Cons
Yielding still wastes cpu cycles Sleeping impacts latency as well
Examples
Semaphores
Mutexes (equivalent to binary semaphore with the notion of ownership)
19 / 36
/
Semaphores
Definition
Invented by Dijkstra in the 60's
A semaphore is a generalized lock
Used for different types of synchronization (including mutual exclusion) Keeps track an arbitrary resource count
Queue of threads waiting to access resource
One resource to share
Multiple resources to share
20 / 36
/
Semaphores
API
Initial count value (but not a maximum value) sem = sem_create(count);
down() or P()
Decrement by one, or block if already 0
up() or V()
Increment by one, and wake up one of the waiting threads if any
Possible implementation:
void sem_down(sem)
{
spinlock_lock(sem->lock); while (sem->count == 0) {
/* Block self */
… }
sem->count -= 1;
spinlock_unlock(sem->lock);
}
void sem_up(sem)
{
spinlock_lock(sem->lock);
sem->count += 1;
/* Wake up first in line */
/* (if any) */
…
spinlock_unlock(sem->lock);
}
21 / 36
/
Semaphores
Binary semaphore
Semaphore which count value is either 0 or 1 Can be used similarly as a lock
But no busy waiting, waiting thread are blocked until they can get the lock Guarantees mutually exclusive access to a shared resource
Initial value is generally 1 (ie free)
Example
sem = sem_create(1);
Thread 1 Thread 2
…
down(sem);
Critical section
up(sem);
…
…
down(sem);
Critical section
up(sem);
…
Mutexes are a similar concept except that they also have the notion of ownership /
22 / 36
Semaphores
Counted semaphore
Semaphore which count value can be any positive integer Represents a resource with many “units” available
Initial count is often the number of initial resources (if any)
Allows a thread to continue as long as enough resources are available Used for synchronization
Example
sem_packet = sem_create(0); Thread 1
Thread 2
while (1) {
x = get_network_packet(); enqueue(packetq, x); up(sem_packet);
}
while (1) { down(sem_packet);
x = dequeue(packetq); process_contents(x);
}
23 / 36
/
ECS 150 – Synchronization
Prof. Joël Porquet-Lupine
UC Davis – 2020/2021
Copyright © 2017-2021 Joël Porquet-Lupine – CC BY-NC-SA 4.0 International License /
24 / 36
Recap
Locks
Semaphores
Internal count
down() decrements count by one, or blocks if count is 0
up() increments count by one, and wakes up first blocked thread if any
Binary semaphore
void thread(void) { lock();
/* Critical section */
…
unlock();
}
Atomic spinlocks
Based on atomic test-and-set instruction
Compatible with multiprocessor systems
Accessible to user processes
sem = sem_create(1); void thread(void) {
sem_down(sem);
/* Critical section */
…
sem_up(sem);
}
void spinlock_lock(int *lock) {
while (test_and_set(lock) == 1); }
Counted semaphore
sem = sem_create(0); void thread1(void) {
x = get_packet();
enqueue(q, x);
up(sem);
}}
void thread2(void) { down(sem);
x = dequeue(q);
process_packet(x);
void spinlock_unlock(int *lock) {
*lock = 0; }
But based on busy-waiting
25 / 36
/
Producer-consumer problem
Definition
Two or more threads communicate through a circular data buffer: some threads produce data that others consume.
P0 C0
P1 C1
PN CN
Bounded buffer of size N Producers write data to buffer
Write at in and moves rightwards
Don’t write more than the amount of available space Consumers read data from buffer
Read at out and moves rightwards
Don’t consume if there is no data
Allow for multiple producers and consumers
0
N-1
in
out
26 / 36
/
…
…
Producer-consumer problem
Solution 1: no protection
int buf[N], in, out;
void produce(int item) {
buf[in] = item;
in = (in + 1) % N;
}
int consume(void) {
int item = buf[out]; out = (out + 1) % N; return item;
}
Issues
Unprotected shared state
Race conditions on all shared variables
No synchronization between consumers and producers
27 / 36
/
Producer-consumer problem
Solution 2: Lock semaphores
Add protection of share state
Mutual exclusion around critical sections
Guarantees one producer and one consumer at a time
int buf[N], in, out;
sem_t lock_prod = sem_create(1), lock_cons = sem_create(1);
void produce(int item) {
sem_down(lock_prod);
buf[in] = item;
in = (in + 1) % N;
sem_up(lock_prod);
}
int consume(void) {
sem_down(lock_cons);
int item = buf[out]; out = (out + 1) % N; sem_up(lock_cons); return item
}
28 / 36
/
Producer-consumer problem
Solution 3: Communication semaphores
Add synchronization between producers and consumers
Producers wait if buffer is full Consumers wait if buffer is empty
int buf[N], in, out;
sem_t lock_prod = sem_create(1), lock_cons = sem_create(1); sem_t empty = sem_create(N), full = sem_create(0);
void produce(int item)
{
sem_up(full); //new item avail
}
sem_down(empty); //need empty spot
sem_down(lock_prod);
buf[in] = item;
in = (in + 1) % N;
sem_up(lock_prod);
int consume(void)
{
}
sem_down(full); //need new item sem_down(lock_cons);
int item = buf[out];
out = (out + 1) % N; sem_up(lock_cons); sem_up(empty); //empty slot avail return item
29 / 36
/
Readers-writers problem
Definition
Multiple threads access the same shared resource, but differently
Many threads only read from it Few threads only write to it
Readers vs writers
Multiple concurrent readers at a time Or single writer at a time
Examples
Airline ticket reservation File manipulation
Reader
Reader
43 seats avail
Airline booking
SFO -> Paris 43 seats available
Airline booking
Buy a seat!
Writer
Reader
SFO -> Paris 42 seats available
42 seats avail
Airline booking
SFO -> Paris 42 seats available
30 / 36
/
Readers-writers problem
Solution 1: Protect resource
sem_t rw_lock = sem_create(1);
void writer(void) {
sem_down(rw_lock);
…
/* perform write */
…
sem_up(rw_lock);
}
int reader(void) {
sem_down(rw_lock);
…
/* perform read */
…
sem_up(rw_lock);
}
Analysis
Mutual exclusion between readers and writers: Yes
Only one writer can access the critical section: Yes
Multiple readers can access the critical section at the same time: No!
31 / 36
/
Readers-writers problem
Solution 2: Enable multiple readers
int rcount = 0;
sem_t rw_lock = sem_create(1);
void writer(void) {
sem_down(rw_lock);
…
/* perform write */
…
sem_up(rw_lock);
}
int reader(void) {
sem_down(rw_lock);
…
/* perform read */
…
sem_up(rw_lock);
}
rcount++;
if (rcount == 1)
rcount–;
if (rcount == 0)
Issue
Race condition between readers on variable rcount!
32 / 36
/
Readers-writers problem
Solution 3: Protect multiple readers
int rcount = 0;
sem_t rw_lock = sem_create(1), count_mutex = sem_create(1);
void writer(void) {
sem_down(rw_lock);
…
/* perform write */
…
sem_up(rw_lock);
}
int reader(void) {
sem_down(count_mutex);
rcount++;
if (rcount == 1)
sem_down(rw_lock);
sem_up(count_mutex);
…
/* perform read */
…
sem_down(count_mutex);
rcount–;
if (rcount == 0)
sem_up(rw_lock);
sem_up(count_mutex);
}
Analysis
Correct solution
But suffers from potential starvation of writers
33 / 36
/
Semaphores
Concluding notes
Semaphores considered harmful (Dijkstra, 1968)
Simple algorithms can require more than one semaphore
Increase of complexity to manage them all Semaphores are low-level primitives
Easy to make programming mistakes (e.g. down() followed by down()) Programmer must keep track of the order of all semaphores operations
Avoid deadlocks
Semaphores are used for both mutual exclusion and synchronization between threads
Difficult to determine which meaning a given semaphore has
Need for another abstraction
Clear distinction between mutual exclusion and synchronization aspects Concept of monitor developed in early 70’s
34 / 36
/
Synchronization barriers
Concept
Enables multiple threads to wait until all threads have reached a particular point of execution before being able to proceed further
void main(void)
{
barrier_t b = barrier_create(3); thread_create(thread_func, b); thread_create(thread_func, b); thread_create(thread_func, b);
}
void thread_func(barrier_t b)
{
while (/* condition */) {
/* … do some computation … */
…
/* Wait for the other threads */
barrier_wait(b);
}
}
T3 T2 T1
Implementation
Using semaphores
time
Using condition variables and the broadcast() feature
35 / 36
/
Synchronization: the big picture
Semaphores
Locks
Concurrent applications
Shared Objects
Synchronization Variables
Atomic Instructions
Hardware
Bounded buffer
Barrier
Monitor
Condition variables
Interrupt disabling
Hardware interrupts Multiprocessors
Basic Threads Programming:Standards and Strategy
The 12th Commandments of Synchronization (Cornell University, 2011)
Test-and-set instructions
Best practices
36 / 36
/