代写代考 ITER 1000000000

Thread Synchronization

• Lock-based Thread Synchronization
§ Semaphores

Copyright By PowCoder代写 加微信 powcoder

• Potential problems
§ Deadlocks
• Lock contention & lock granularity
§ Livelocks § Starvation
• Examples
• More Lock-based Thread Synchronization § Barriers
§ Conditionvariables
• Interleavings of Threads § Race conditions

Example: Bank account
void withdraw_money(int withdrawal) {
int total = GetTotalFromAccount(); /* total funds in account */
/* check whether the user has enough funds in account */
if (total < withdrawal) error("You do not have that much money!"); /* OK, the user has enough money, deduct the withdrawal amount from the total */ total -= withdrawal; update_total_funds(total); Bank account void withdraw_money(int withdrawal) { int total = GetTotalFromAccount(); /* total funds in account */ /* check whether the user has enough funds in account */ if (total < withdrawal) error("You do not have that much money!"); /* OK, the user has enough money, deduct the withdrawal amount from the total */ withdraw_money(100) Total = 205; Total = 205-100=105; Funds = 105; total -= withdrawal; update_total_funds(total); withdraw_money(10) Total = 105; Total = 105-10=95; Funds = 95; Serialized execution of thread 1 and 2 yields correct result. Bank account (cont.) void withdraw_money(int withdrawal) { int total = GetTotalFromAccount(); /* total funds in account */ /* check whether the user has enough funds in account */ if (total < withdrawal) error("You do not have that much money!"); /* OK, the user has enough money, deduct the withdrawal amount from the total */ withdraw_money(100) Total = 205; Total = 205-100=105; Funds = 105; withdraw_money(10) Total = 205; Total = 205-10=195; Funds = 195; total -= withdrawal; update_total_funds(total); Race condition: Interleaved execution of process 1 and 2 yields incorrect result because of unsynchronized access to the “Funds” variable. Mutual Exclusion Only one thread must be in the critical section at any time. • Once a thread is in the critical section, no other thread can enter that critical section until the first thread has left that critical section. • Serializes access to critical section. Race Conditions Race condition: “a situation in which multiple threads read and write a shared data item and the final result depends on the relative timing of their execution”. critical section void withdraw_money(int withdrawal) { int total = GetTotalFromAccount(); /* total funds in account */ /* check whether the user has enough funds in account */ if (total < withdrawal) error("You do not have that much money!"); /* OK, the user has enough money, deduct the withdrawal amount from the total */ total -= withdrawal; update_total_funds(total); } • Critical section: two or more code-parts that access and manipulate shared data (aka a shared resource). • To prevent races, it must not happen that 2 threads concurrently execute within a critical section (mutual exclusion between threads). The Smallest Shared Resource • Modifying a shared variable such as a single bit or an integer is already a critical section. • Example: i++; The compiler translates this statement to machine code, resembling the following pseudo code: 1) read the current value of i from memory and copy it into a register 2) add 1 to the value stored in the register 3) write back the new value of i from the register to memory • Executing the critical section concurrently leads to a race condition (read-modify-write). § See example on next slides. § Note: processors like the Intel x86 cannot directly modify data in memory. Instead, they have to load data from memory to a register, modify it, and write it back to memory. § Called “load/store architectures” The Smallest Shared Resource (cont.) • Serialized execution yields the correct result (9): Thread 1: i++ get i (7) increment i (7à8) write back i (8) Thread 2: i++ - get i (8) increment i (8à9) write back i (9) The Smallest Shared Resource (cont.) • Interleaved execution (both threads simultaneously executing the critical section) results in i=8 instead of i=9 because of race condition: Thread 1: i++ get i (7) increment i (7à8) - write back i (8) Thread 2: i++ get i (7) increment i (7à8) write back sum (8) • Race conditions are hard to find, because they are hard to reproduce! Another example... #define MAX 2 #define MAX_ITER 1000000000 long counter = 0; void * thread_function(void * arg) { long l; § Here the critical section is the statement where variable counter is read and written. § If both threads read the value of counter at the same time, they will both add 1 to this value. § Writing back the value to counter, one thread overwrites the result of the other thread. § Thereby one counter increment gets lost. § This is again a race condition. for (l = 0; l < MAX_ITER; l++) counter = counter + 1; // critical section int main(void) { pthread_t mythread[MAX]; int t = 0; for (t = 0; t < MAX; t++) pthread_create( &mythread[t], NULL, thread_function, NULL); for (t = 0; t < MAX; t++) pthread_join ( mythread[t], NULL); printf("Expected counter value: %lld, observed counter value: %lld\n", MAX_ITER * MAX, counter); printf("You experienced %lld race conditions, or %d%% of all accesses to the shared variable.\n", MAX_ITER * MAX - counter, 100 - 100 * counter / (MAX_ITER * MAX)); Another example... (cont.) • If you run the code from the previous slide you can observe race conditions. • Access to variable counter is not synchronized. § The expected counter value with 2 threads is 2000000000. § The real counter value is always much smaller, because of race conditions between the two threads simultaneously updating the counter. pthread1]$ gcc -lpthread parallucpu0count.c pthread1]$ ./a.out Expected counter value: 2000000000, observed counter value: 1010509326 You experienced 989490674 race conditions, or 50% of all accesses to the shared variable. pthread1]$ ./a.out Expected counter value: 2000000000, observed counter value: 1018887394 You experienced 981112606 race conditions, or 50% of all accesses to the shared variable. pthread1]$ ./a.out Expected counter value: 2000000000, observed counter value: 1025761570 You experienced 974238430 race conditions, or 49% of all accesses to the shared variable. • Lock-based Thread Synchronization § Semaphores • Potential problems § Lockcontention&lockgranularity § Deadlocks § Livelocks § Starvation • Examples • Interleavings of Threadsü § Race conditionsü #define MAX_ITER 1000000000 long counter = 0; pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER; void * thread_function(void * arg) { long l; for (l=0; l < MAX_ITER; l++) { pthread_mutex_lock(&mylock); counter = counter + 1; //critical section pthread_mutex_unlock(&mylock); § The Pthread library provides mutexes to synchronize access to shared global variables (shared between threads). § (Variable counter in the above example is a shared global variable.) § In line 3, a mutex named “mylock” is declared. § Before a thread enters the critical section, it locks the mutex (Line 8). § When a thread leaves the critical section, it unlocks the mutex (Line 10). § The mutex ensures that at any time, only one thread will be allowed into the critical section. All other threads have to wait until the critical section becomes free again. counter = counter + 1;14 Mutexes (cont.) § 1 pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;§ Line 1: Static declaration of a mutex. Mutexes have attributes. Statically declared mutexes can be initialized to default attributes using PTHREAD_MUTEX_INI TIALIZER. 23 void * 4 thread_function(void * arg) { ... § pthread_mutex_lock(&mylock); counter = counter + 1; //critical section pthread_mutex_unlock(&mylock); § A mutex protects shared data (by allowing only 1 thread into critical section) § Initially the mutex is free (or unlocked): T1 Critical section counter += 1; § By calling pthread_mutex_lock(), a thread can acquire the lock: Critical section §mylock is locked. §Thread T1 holds mylock. counter += 1; Mutexes (cont.) § By calling pthread_mutex_unlock(), a thread can relinquish the mutex: Critical section Critical section counter += 1; counter += 1; pthread_mutex _unlock() § While a mutex is locked, other threads that call pthread_mutex_lock() are blocked. § The call to pthread_mutex_lock() will not return before the calling thread was given the lock. § Blocked threads are kept in a waiting queue. They have to wait until the thread in the critical section relinquishes the lock. Critical section counter += 1; waiting queue Mutexes (cont.) § When the thread in the critical section relinquishes the lock, a thread from the waiting queue will be allowed into the critical section. § The Pthread library does not guarantee fairness: § Not necessarily the thread being blocked for the longest time will be selected. Critical section Thread T1 holds mylock. counter += 1; waiting queue waiting queue Critical section Thread T4 holds mylock. counter += 1; Mutexes (cont.) Assume we do not want a thread to block at a mutex: § pthread_mutex_trylock() attempts to acquire the lock. § If the lock is free, then the calling thread will acquire it. § If the lock is not free, then the call to pthread_mutex_trylock() will return immediately. § Instead of block in the waiting queue. § The return value of pthread_mutex_trylock() is used to distinguish the above cases (see Line 5 in above example). pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER; thread_function(void * arg) { if ( 0 != pthread_mutex_trylock(&mylock) ) { printf("Unsuccessful attempt to acquire lock\n"); }else{ // critical section goes here: ... pthread_mutex_unlock(&mylock); Mutexes (cont.) thread_function(void * arg) { if ( 0 != pthread_mutex_trylock(&mylock) ) { printf("Unsuccessful attempt to acquire lock\n"); pthread_mutex_unlock(&mylock); // Error: mutex not acquired! // critical section goes here: ... pthread_mutex_unlock(&mylock); Only the thread that owns a mutex should unlock it! § If a thread unlocks a mutex that it does not own, the behavior is undefined (anything can happen, don’t do it!). § The thread could unlock the mutex for the thread which is currently in the critical section! (a race condition may arise if another thread acquires the mutex). § Unlocking an unlocked mutex also results in undefined behavior. § A frequent error is to unlock a mutex after an unsuccessful pthread_mutex_trylock() attempt (see Line 5 above). Dynamic creation of mutexes Mutexes can also be created at run-time. § In Line 3, memory is allocated to hold a variable of type phtread_mutex_t. § pthread_mutex_init() is used to initialize the mutex. § The second parameter is NULL, meaning that we use default attributes. § Sufficient for our use. § When the mutex is no longer used: § pthread_mutex_destroy() destroys the mutex. § The memory allocated for the mutex must be free-ed by the programmer (Line 7)! 20 6 pthread_mutex_destroy(mylock); 7 free (mylock); pthread_mutex_t * mylock; mylock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t)); pthread_mutex_init(mylock, NULL); Dynamic creation of mutexes (cont.) • Assume a linked list dynamic data structure: • Assume a list of nodes where each node represents a counter, and multiple threads may access the same counter. We need to synchronize access to counters. § Assume the following list node struct and node allocation: struct Node { pthread_mutex_t node_lock; // mutex to lock this list node int counter; // data in this list node struct Node * next; // pointer to next node in the list }; struct Node * root = NULL; // declare a pointer to a node root = malloc (sizeof(struct Node)); // create the node root->counter = 0; // set data in node to 0
root->next = NULL; // set the next pointer in node to NULL pthread_mutex_init (&root->node_lock, NULL); // init the mutex

Programming with Mutexes
void * thread_function_1(void * arg) { …
pthread_mutex_lock(&mylock);
counter += 1; //critical section
pthread_mutex_unlock(&mylock);
void * thread_function_2(void * arg) { …
counter += 1; //critical section
Here the programmer forgot to protect the critical section by the corresponding mutex!
• Mutexes are advisory and voluntary.
§ Problem: Nothing prevents a programmer to access a shared data-structure
without acquiring the mutex!
§ Such bad practice will again result in race conditions and corruption of shared data.
• Especially a problem with large programs like the Linux kernel
• Many developers work on kernel code, e.g., drivers.
• Easy to get locking wrong when accessing other people’s data structures!
§ Example: in thread_function_2() above, the programmer forgot to protect the critical section by mutex mylock.
§ One way around this problem is to encapsulate access to shared data-structures in a so-called monitor (can be class in C++ or module in C, or a language-primitive).
• (see next slide)

Programming with Mutexes (cont.)
static long counter = 0;
pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
void increment_ctr(void) {
pthread_mutex_lock(&mylock);
counter += 1; //critical section
pthread_mutex_unlock(&mylock);
long read_ctr(void) {
long ret; pthread_mutex_lock(&mylock);
ret = counter; //critical section pthread_mutex_unlock(&mylock); return ret;
Counter-component in C,
aka MONITOR MONITOR = ADT + Synchronization
Concept of monitor:
§At any time, at most one thread can execute any monitor function (mutual exclusion!).
§Pthread library does not support monitors, so we made our own using mutexes.
§Modern programming languages support monitors: Java, C#, Ruby…
• This solution hides the counter variable from direct access by the programmer.
• The only way to access the counter is via the provided functions increment_ctr()
and read_ctr().
§ Problem: Now every access is via a function call, which is less efficient than direct access.
• Optimizing compilers might be able to inline the above functions, thereby reducing the function call overhead.
• Locking-responsibility is shifted from the user of the counter to the maintainer (programmer) of the counter-component. 23

Serialization
unsignedlongcounter=0;
void * thread_function(void * arg) { long l;
for (l=0; l < MAX_ITER; l++) { pthread_mutex_lock(&mylock); counter = counter + LongComputation(); pthread_mutex_unlock(&mylock); § At any one time, only one thread can be in critical section. § All other threads have to wait (block) until critical section becomes free. § Threads execute critical section one after the other, no parallelism! § This is called serialization. thread executing critical section thread blocked counter = counter + LongComputation(); Serialization (cont.) unsignedlongcounter=0; void * thread_function(void * arg) { long l, tmp; for (l=0; l < MAX_ITER; l++) { tmp = LongComputation(); pthread_mutex_lock(&mylock); counter=counter+tmp;//crit.sect. pthread_mutex_unlock(&mylock); § We can place computations that do not access shared data outside of the critical section. § Reduces serialization, increases the amount of parallel computation. § Example: tmp=LongComputation(); § Note: this is not possible if LongComputation()accesses the shared counter! thread executing LongComputation() thread executing critical section thread blocked Programming with Mutexes • What data needs protection? Ask those questions: § Is data shared between several threads ? • A race condition happens when two threads access a variable simultaneously, and (at least) one access is a write. § Thus: simultaneous reads of a variable are not a problem! int tmp1; tmp1 = ctr; ctr = tmp1 + 1; tmp2 = ctr; ctr = tmp2 + 1; int ctr; // shared data serial access (correct result by luck) simultaneous access (race condition, incorrect result) tmp1 = ctr; ctr = tmp1 + 1; tmp2 = ctr; ctr = tmp2 + 1; int tmp1; tmp1 = ctr; ctr = tmp1 + 1; int tmp2; tmp2 = ctr; ctr = tmp2 + 1; Acquiring multiple mutexes • For a given programming problem a thread might have to acquire more than one mutex. • Example: splitting the counter value between a node S and his predecessor node P § Node S, plus Node P (the node before S) must be locked! • Otherwise we have a race condition if a thread tries to update S and/or P in parallel! § Pseudocode to split S’s counter value: 1) lock(pPtr->nlock);
2) lock(sPtr->nlock);
3) pPtr->ctr+=sPtr->ctr/2; 4) sPtr->ctr-=sPtr->ctr/2; 5) unlock(pPtr->nlock);
6) unlock(sPtr->nlock);
struct Node {
pthread_mutex_t nlock;
struct Node * next; };
§ Acquiring multiple mutexes can result in deadlock. § See example on next slide.

• Lock-based Thread Synchronization
§ Mutexesü
§ Semaphores
• Potential problems
§ Lockcontention&lockgranularity § Deadlocks
§ Livelocks
§ Starvation next
• Examples
• Interleavings of Threadsü § Race conditionsü

• A deadlock is a condition involving one or more threads and one or more resources (e.g., mutexes), such that each thread is waiting for one of the resources, but all the resources are already heldàsystem locks, no more progress possible.

Self-Deadlock: a thread attempts to acquire a mutex that it already holds. While waiting for the mutex, it does not release it à deadlock.
pthread_mutex_lock(&mylock); // Acquire mylock pthread_mutex_lock(&mylock); // Try to acquire mylock again!
// Get blocked and wait for mylock // to become available…
ABBA-Deadlock: 2 threads attempting to acquire 2 mutexes A and B. One in the order AàB, the other in the order BàA.
acquire mutex A
try to acquire mutex B wait (block) for mutex B
acquire mutex B
try to acquire mutex A wait (block) for mutex A

Multiple mutexes must always be obtained in the same order!
• Assume 3 mutexes: cat, dog, fox that secure three data- structures.
• Assume a function that needs to work on all three data- structures at the same time.
§ This function must acquire mutexes cat, dog and fox.
• If this function acquires the mutexes in the order catàdogà fox, then every other function must obtain these locks (or a subset of them) in the SAME ORDER.
• Otherwise a deadlock may result: Thread 0:
acquire mutex cat acquire mutex dog
try to acquire mutex fox wait for mutex fox
acquire mutex fox
try to acquire mutex dog wait for mutex dog

Necessary conditions for a deadlock There are four necessary conditions for a deadlock:
1. Mutual exclusion: a resource can be assigned to at most one thread. § Example: a chopstick is a resource. It can be assigned to at most one
philosopher at any time.
2. Hold and wait: threads both hold resources and request other resources.
§ Example: a philosopher holds the left chopstick and requests the right chopstick.
3. No preemption: a resource can only be released by the thread that holds it. § Example: it is not possible to forcibly take away a chopstick from a philosopher.
4. Circular wait: a cycle exists in which each thread waits for a resource that is assigned to another thread.
§ Example (case of 2 dining philosophers, ABBA-deadlock):

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com