CS代考计算机代写 compiler algorithm concurrency ECS 150 – Synchronization

ECS 150 – Synchronization
Prof. Joël Porquet-Lupine
UC Davis – 2020/2021
Copyright © 2017-2021 Joël Porquet-Lupine – CC BY-NC-SA 4.0 International License /
1 / 36

Threads (recap)
Memory sharing
Private processor registers Private stack
Shared global memory
Type of sharing Independent
Cooperating
Threads work on same areas of shared data
Threads work on distinct areas of shared data
int a[N], b[N], c[N], d[N], e[N];
do_mult1()
{
for (i = 0; i < n; i++) a[i] += b[i] * c[i]; } do_mult2() { for (i = 0; i < n; i++) a[i] += d[i] * e[i]; } int main(void) { ... thread_create(do_mult1); thread_create(do_mult2); ... } int a[N], b[N], c[N], d[N], e[N]; do_mult(p, m) { for (i = p; i < m; i++) a[i] += b[i] * c[i] + d[i] * e[i]; } int main(void) { ... thread_create(do_mult, 0, N/2); thread_create(do_mult, N/2, N); ... } 2 / 36 / Concurrency issues Example Execution Typical output: Also possible (yet probably very rare)... int x = 0; void thread_a(void) { x = x + 1; } void thread_b(void) { x = x + 2; } int main(void) { thread_create(thread_a); thread_create(thread_b); thread_join(thread_a); thread_join(thread_b); printf("%d\n", x); return 0; } $ ./a.out 3 $ ./a.out 2 $ ./a.out 1 3 / 36 / Concurrency issues Indeterministic interleavings Thread scheduling Indeterministic scheduling A B A B Sequential execution, concurrent A execution, parallelism, etc. Instruction reordering Compiler instruction reordering Hardware instruction reordering B void thread1(void) { ... p = init(); init = true; ... } void thread2(void) { ... while (init == false); q = docomputation(p); ... } Multi-word operations Multi-word operations are not atomic uint64_t num = 2; struct { char c; short int b; } s = { 'j', 23 }; 4 / 36 / Concurrency issues Race conditions Definition Race condition when the output of a concurrent program depends on the order of operations between threads. Difficulties Number of possible "interleavings" can be huge Some interleavings are good Vast majority of them usually are Some interleavings are bad They may even be extremely rare... Solution? Synchronization! void thread_a(void) { x = x + 1; } void thread_b(void) { x = x + 2; } 5 / 36 / Too Much Milk! Example Roommate 1 Roommate 2 Arrive home Look in fridge, out of milk Leave for store Arrive home Arrive at store Look in fridge, out of milk Buy milk Leave for store Arrive home, put milk away Buy milk Arrive home, put milk away Oh, no! Required correctness properties Safety: at most one person buys milk at a time Liveness: someone buys milk if needed 6 / 36 / Too Much Milk! 1. Leaving a note Roommate 1 (thread 1) Roommate 2 (thread 2) if (note == 0) { if (milk == 0) { note = 1; milk++; note = 0; } } if (note == 0) { if (milk == 0) { note = 1; milk++; note = 0; } } Safety and liveness Not safe if threads are descheduled right after the two tests They would both put a note and get milk, resulting in two bottles of milk! 7 / 36 / Too Much Milk! 2. Using two notes Roommate 1 (thread 1) Roommate 2 (thread 2) note1 = 1; if (note2 == 0) { if (milk == 0) { milk++; } } note1 = 0; note2 = 1; if (note1 == 0) { if (milk == 0) { milk++; } } note2 = 0; Safety and liveness Not live if threads are descheduled right after setting their personal notes They both think the other is on their way getting milk but no does eventually does 8 / 36 / Too Much Milk! 3. Using asymmetric notes Roommate 1 (thread 1) Roommate 2 (thread 2) note1 = 1; while (note2 == 1) ; if (milk == 0) { milk++; } note1 = 0; note2 = 1; if (note1 == 0) { if (milk == 0) { milk++; } } note2 = 0; Safety and liveness Yes, safe and live! 9 / 36 / Too Much Milk! 3. Using asymmetric notes: explained Roommate 1 (thread 1) Roommate 2 (thread 2) note1 = 1; if (milk == 0) { milk++; } note1 = 0; while (note2 == 1) ; note2 = 1; if (note1 == 0) { if (milk == 0) { milk++; } } note2 = 0; Issues 1. Way too over-engineered! 2. Asymmetrical, non-scalable 3. Involves busy-waiting Peterson's algorithm Share a single resource using only shared memory Symmetric and scalable But still quite complex... More here 10 / 36 / Critical section Definition Piece of code where the shared resource is accessed Needs to be properly protected to avoid race conditions Cannot be executed by more than one thread at a time Thread 1 Thread 2 ... CS_enter(); Critical section CS_exit(); ... ... CS_enter(); Critical section CS_exit(); ... Correctness properties 1. Safety 2. Liveness 3. Bounded waiting 4. Failure atomicity Mutual exclusion Property of concurrency control Requirement that only one thread can enter critical section at a time Active thread excludes its peers 11 / 36 / Critical section Formalizing "Too Much Milk!" Shared variable Safety property Liveness property Roommate 1 (thread 1) Roommate 2 (thread 2) note1 = 1; /* while (note2 == 1) * CS_enter() ; */ if (milk == 0) { /* milk++; * CS } */ note1 = 0; /* CS_exit() */ note2 = 1; /* CS_enter() if (note1 == 0) { */ if(milk==0){ /* milk++; * CS } */ } note2 = 0; /* CS_exit() */ 12 / 36 / ECS 150 - Synchronization Prof. Joël Porquet-Lupine UC Davis - 2020/2021 Copyright © 2017-2021 Joël Porquet-Lupine - CC BY-NC-SA 4.0 International License / 13 / 36 Recap Race condition Output of concurrent program depends on order of operations between threads Critical section void thread_1(void) { note1 = 1; // v while (note2 == 1) // CS_enter() ; if (milk == 0) { milk++; } note1 = 0; } void thread_2(void) { // ^ // v // CS // ^ // CS_exit() } milk++; } } note2 = 0; // CS // ^ // CS_exit() note2 = 1; if(note1==0){ //^ // CS_enter() if(milk==0){ //v void thread_a(void) void thread_b(void) {{ x = x + 1; x = x + 2; }} $ ./a.out $ ./a.out 32 $ ./a.out 1 Indeterministic concurrent execution Thread scheduling A B A B A B Instruction reordering Multi-word operations Shared variable Safety property Liveness property Mutual exclusion 14 / 36 / Locks Definition A lock is a synchronization variable that provides mutual exclusion Two states: locked and free (initial state is generally free) API lock() or acquire() Wait until lock is free, then grab it unlock() or release() Unlock, and allow one of the threads waiting in acquire to proceed int milk; int main(void) { thread_create(roommate_fcn); thread_create(roommate_fcn); ... } void roommate_fcn(void) { ... lock(); /* Critical section */ if (!milk) milk++; unlock(); ... } 15 / 36 / Locks Simple uniprocessor implementation Race conditions are coming from indeterministic scheduling Breaks atomicity of instruction sequence Caused by preemption (i.e. timer interrupt) Solution: disable the interrupts! void lock() { disable_interrupts(); } void unlock() { enable_interrupts(); } Issues Only works on uniprocessor systems Dangerous to have unpreemptable code Cannot be used by user applications 16 / 36 / Multiprocessor spinlocks Hardware support Test-and-set hardware primitive to provide mutual exclusion a.k.a Read-modify-write operation Typically relies on a multi-cycle bus operation that atomically reads and updates a memory location Multiprocessor support /* Equivalent of a test&set hardware tion in software */ ATOMIC int test_and_set(int *mem) { int oldval = *mem; *mem = 1; return oldval; } instruc CPU CPU Mem Implementation void spinlock_lock(int *lock) { while (test_and_set(lock) == 1); } void spinlock_unlock(int *lock) { *lock = 0; } 17 / 36 / Multiprocessor spinlocks Revisiting "Too Much Milk!" int lock = 0; Thread 1 Thread 2 spinlock_lock(&lock); if (milk == 0) { milk++; } spinlock_unlock(&lock); spinlock_lock(&lock); if (milk == 0) { milk++; } spinlock_unlock(&lock); Thread 3 Thread 4 spinlock_lock(&lock); if (milk == 0) { milk++; } spinlock_unlock(&lock); spinlock_lock(&lock); if (milk == 0) { milk++; } spinlock_unlock(&lock); 18 / 36 / Multiprocessor spinlocks Issue Busy-waiting wastes cpu cycles Only to reduce latency Solution "Cheap" busy-waiting Yield/sleep when unable to get the lock, instead of looping Better primitives Block waiting threads until they can proceed void spinlock_lock(int *lock) { while (test_and_set(lock) == 1); } void lock(int *lock) { while (test_and_set(lock) == 1) } yield(); //or, sleep(N); void lock(int *lock) { while (test_and_set(lock) == 1) block(lock); } Cons Yielding still wastes cpu cycles Sleeping impacts latency as well Examples Semaphores Mutexes (equivalent to binary semaphore with the notion of ownership) 19 / 36 / Semaphores Definition Invented by Dijkstra in the 60's A semaphore is a generalized lock Used for different types of synchronization (including mutual exclusion) Keeps track an arbitrary resource count Queue of threads waiting to access resource One resource to share Multiple resources to share 20 / 36 / Semaphores API Initial count value (but not a maximum value) sem = sem_create(count); down() or P() Decrement by one, or block if already 0 up() or V() Increment by one, and wake up one of the waiting threads if any Possible implementation: void sem_down(sem) { spinlock_lock(sem->lock); while (sem->count == 0) {
/* Block self */
… }
sem->count -= 1;
spinlock_unlock(sem->lock);
}
void sem_up(sem)
{
spinlock_lock(sem->lock);
sem->count += 1;
/* Wake up first in line */
/* (if any) */

spinlock_unlock(sem->lock);
}
21 / 36
/

Semaphores
Binary semaphore
Semaphore which count value is either 0 or 1 Can be used similarly as a lock
But no busy waiting, waiting thread are blocked until they can get the lock Guarantees mutually exclusive access to a shared resource
Initial value is generally 1 (ie free)
Example
sem = sem_create(1);
Thread 1 Thread 2

down(sem);
Critical section
up(sem);


down(sem);
Critical section
up(sem);

Mutexes are a similar concept except that they also have the notion of ownership /
22 / 36

Semaphores
Counted semaphore
Semaphore which count value can be any positive integer Represents a resource with many “units” available
Initial count is often the number of initial resources (if any)
Allows a thread to continue as long as enough resources are available Used for synchronization
Example
sem_packet = sem_create(0); Thread 1
Thread 2
while (1) {
x = get_network_packet(); enqueue(packetq, x); up(sem_packet);
}
while (1) { down(sem_packet);
x = dequeue(packetq); process_contents(x);
}
23 / 36
/

ECS 150 – Synchronization
Prof. Joël Porquet-Lupine
UC Davis – 2020/2021
Copyright © 2017-2021 Joël Porquet-Lupine – CC BY-NC-SA 4.0 International License /
24 / 36

Recap
Locks
Semaphores
Internal count
down() decrements count by one, or blocks if count is 0
up() increments count by one, and wakes up first blocked thread if any
Binary semaphore
void thread(void) { lock();
/* Critical section */

unlock();
}
Atomic spinlocks
Based on atomic test-and-set instruction
Compatible with multiprocessor systems
Accessible to user processes
sem = sem_create(1); void thread(void) {
sem_down(sem);
/* Critical section */

sem_up(sem);
}
void spinlock_lock(int *lock) {
while (test_and_set(lock) == 1); }
Counted semaphore
sem = sem_create(0); void thread1(void) {
x = get_packet();
enqueue(q, x);
up(sem);
}}
void thread2(void) { down(sem);
x = dequeue(q);
process_packet(x);
void spinlock_unlock(int *lock) {
*lock = 0; }
But based on busy-waiting
25 / 36
/

Producer-consumer problem
Definition
Two or more threads communicate through a circular data buffer: some threads produce data that others consume.
P0 C0
P1 C1
PN CN
Bounded buffer of size N Producers write data to buffer
Write at in and moves rightwards
Don’t write more than the amount of available space Consumers read data from buffer
Read at out and moves rightwards
Don’t consume if there is no data
Allow for multiple producers and consumers
0
N-1
in
out
26 / 36
/

Producer-consumer problem
Solution 1: no protection
int buf[N], in, out;
void produce(int item) {
buf[in] = item;
in = (in + 1) % N;
}
int consume(void) {
int item = buf[out]; out = (out + 1) % N; return item;
}
Issues
Unprotected shared state
Race conditions on all shared variables
No synchronization between consumers and producers
27 / 36
/

Producer-consumer problem
Solution 2: Lock semaphores
Add protection of share state
Mutual exclusion around critical sections
Guarantees one producer and one consumer at a time
int buf[N], in, out;
sem_t lock_prod = sem_create(1), lock_cons = sem_create(1);
void produce(int item) {
sem_down(lock_prod);
buf[in] = item;
in = (in + 1) % N;
sem_up(lock_prod);
}
int consume(void) {
sem_down(lock_cons);
int item = buf[out]; out = (out + 1) % N; sem_up(lock_cons); return item
}
28 / 36
/

Producer-consumer problem
Solution 3: Communication semaphores
Add synchronization between producers and consumers
Producers wait if buffer is full Consumers wait if buffer is empty
int buf[N], in, out;
sem_t lock_prod = sem_create(1), lock_cons = sem_create(1); sem_t empty = sem_create(N), full = sem_create(0);
void produce(int item)
{
sem_up(full); //new item avail
}
sem_down(empty); //need empty spot
sem_down(lock_prod);
buf[in] = item;
in = (in + 1) % N;
sem_up(lock_prod);
int consume(void)
{
}
sem_down(full); //need new item sem_down(lock_cons);
int item = buf[out];
out = (out + 1) % N; sem_up(lock_cons); sem_up(empty); //empty slot avail return item
29 / 36
/

Readers-writers problem
Definition
Multiple threads access the same shared resource, but differently
Many threads only read from it Few threads only write to it
Readers vs writers
Multiple concurrent readers at a time Or single writer at a time
Examples
Airline ticket reservation File manipulation
Reader
Reader
43 seats avail
Airline booking
SFO -> Paris 43 seats available
Airline booking
Buy a seat!
Writer
Reader
SFO -> Paris 42 seats available
42 seats avail
Airline booking
SFO -> Paris 42 seats available
30 / 36
/

Readers-writers problem
Solution 1: Protect resource
sem_t rw_lock = sem_create(1);
void writer(void) {
sem_down(rw_lock);

/* perform write */

sem_up(rw_lock);
}
int reader(void) {
sem_down(rw_lock);

/* perform read */

sem_up(rw_lock);
}
Analysis
Mutual exclusion between readers and writers: Yes
Only one writer can access the critical section: Yes
Multiple readers can access the critical section at the same time: No!
31 / 36
/

Readers-writers problem
Solution 2: Enable multiple readers
int rcount = 0;
sem_t rw_lock = sem_create(1);
void writer(void) {
sem_down(rw_lock);

/* perform write */

sem_up(rw_lock);
}
int reader(void) {
sem_down(rw_lock);

/* perform read */

sem_up(rw_lock);
}
rcount++;
if (rcount == 1)
rcount–;
if (rcount == 0)
Issue
Race condition between readers on variable rcount!
32 / 36
/

Readers-writers problem
Solution 3: Protect multiple readers
int rcount = 0;
sem_t rw_lock = sem_create(1), count_mutex = sem_create(1);
void writer(void) {
sem_down(rw_lock);

/* perform write */

sem_up(rw_lock);
}
int reader(void) {
sem_down(count_mutex);
rcount++;
if (rcount == 1)
sem_down(rw_lock);
sem_up(count_mutex);

/* perform read */

sem_down(count_mutex);
rcount–;
if (rcount == 0)
sem_up(rw_lock);
sem_up(count_mutex);
}
Analysis
Correct solution
But suffers from potential starvation of writers
33 / 36
/

Semaphores
Concluding notes
Semaphores considered harmful (Dijkstra, 1968)
Simple algorithms can require more than one semaphore
Increase of complexity to manage them all Semaphores are low-level primitives
Easy to make programming mistakes (e.g. down() followed by down()) Programmer must keep track of the order of all semaphores operations
Avoid deadlocks
Semaphores are used for both mutual exclusion and synchronization between threads
Difficult to determine which meaning a given semaphore has
Need for another abstraction
Clear distinction between mutual exclusion and synchronization aspects Concept of monitor developed in early 70’s
34 / 36
/

Synchronization barriers
Concept
Enables multiple threads to wait until all threads have reached a particular point of execution before being able to proceed further
void main(void)
{
barrier_t b = barrier_create(3); thread_create(thread_func, b); thread_create(thread_func, b); thread_create(thread_func, b);
}
void thread_func(barrier_t b)
{
while (/* condition */) {
/* … do some computation … */

/* Wait for the other threads */
barrier_wait(b);
}
}
T3 T2 T1
Implementation
Using semaphores
time
Using condition variables and the broadcast() feature
35 / 36
/

Synchronization: the big picture
Semaphores
Locks
Concurrent applications
Shared Objects
Synchronization Variables
Atomic Instructions
Hardware
Bounded buffer
Barrier
Monitor
Condition variables
Interrupt disabling
Hardware interrupts Multiprocessors
Basic Threads Programming:Standards and Strategy
The 12th Commandments of Synchronization (Cornell University, 2011)
Test-and-set instructions
Best practices
36 / 36
/