More on Thread-Safe and Synchronization Strategies and Problems
Shuaiwen
Pthread tutorials
https://hpc-tutorials.llnl.gov/posix/
https://www.cs.cmu.edu/afs/cs/academic/class/15492-f07/www/pthreads.html
2
Data Race
3
/* Global shared variable */
long cnt = 0; /* Counter */

int main(int argc, char **argv)
{
    long niters;
    pthread_t tid1, tid2;

    niters = atoi(argv[1]);
    pthread_create(&tid1, NULL, thread, &niters);
    pthread_create(&tid2, NULL, thread, &niters);
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);

    /* Check result */
    if (cnt != (2 * niters))
        printf("BOOM! cnt=%ld\n", cnt);
    else
        printf("OK cnt=%ld\n", cnt);
    exit(0);
}

/* Thread routine */
void *thread(void *vargp)
{
    long i, niters = *((long *)vargp);
    for (i = 0; i < niters; i++)
        cnt++;
    return NULL;
}

badcnt.c

linux> ./main 1000000
BOOM! cnt=1069894
What went wrong?
cnt should equal 2,000,000.
movq cnt(%rip), %rdx
addq $1, %rdx
movq %rdx, cnt(%rip)
4
Suppose that cnt starts with value 0, and that two threads each execute the code below once. What are the possible values for cnt afterward?
Only 2
1 or 2
0 or 1 or 2
None of the above
Question
5
We must synchronize the execution of the threads so that their critical sections can never overlap.
i.e., need to guarantee mutually exclusive access for each critical section.
Solutions:
Semaphores
Mutex locks
Condition variables
Atomic operations
Barrier
Spinlocks
Read-write locks
Deadlocks
Outline
6
Semaphore: non-negative global integer synchronization variable.
Unlike mutex (a locking mechanism), it is a signaling mechanism. It guarantees both ordering and atomicity.
Unlike a mutex, any thread can release a semaphore, not only the one that has acquired it in the first place.
Semaphores
7
A semaphore controls the access to a shared resource: the counter determines the maximum number of threads that can simultaneously access it. At the beginning of your program, when the semaphore gets initialized, you choose that number according to your needs. Then, a thread that wants to access a shared resource calls acquire:
if the counter is greater than zero the thread can proceed. The counter gets reduced by one right away, then the current thread starts doing its job. When done, it calls release which in turn increases the counter by one.
if the counter is equal to zero the thread cannot proceed: other threads have already filled up the available space. The current thread is put to sleep by the operating system and will wake up when the semaphore counter becomes greater than zero again (that is when any other thread calls release once its job is done).
13
Semaphore: non-negative global integer synchronization variable. Manipulated by P and V operations.
P(s): sem_wait()
If s is nonzero, then decrement s by 1 and return immediately.
Test and decrement operations occur atomically (indivisibly)
If s is zero, then suspend thread until s becomes nonzero and the thread is restarted by a V operation.
After restarting, the P operation decrements s and returns control to the caller.
V(s): sem_post()
Increment s by 1.
Increment operation occurs atomically
If there are any threads blocked in a P operation waiting for s to become nonzero, then restart exactly one of those threads, which then completes its P operation by decrementing s.
Semaphore invariant: (s >= 0)
Semaphores
C Semaphore Operations
Pthreads functions:
#include <semaphore.h>
int sem_init(sem_t *s, 0, unsigned int val); /* s = val */
14
int sem_wait(sem_t *s); /* P(s) */
int sem_post(sem_t *s); /* V(s) */
/* Global shared variable */
long cnt = 0; /* Counter */

int main(int argc, char **argv)
{
    long niters;
    pthread_t tid1, tid2;

    niters = atoi(argv[1]);
    Pthread_create(&tid1, NULL, thread, &niters);
    Pthread_create(&tid2, NULL, thread, &niters);
    Pthread_join(tid1, NULL);
    Pthread_join(tid2, NULL);

    /* Check result */
    if (cnt != (2 * niters))
        printf("BOOM! cnt=%ld\n", cnt);
    else
        printf("OK cnt=%ld\n", cnt);
    exit(0);
}

/* Thread routine */
void *thread(void *vargp)
{
    long i, niters = *((long *)vargp);
    for (i = 0; i < niters; i++)
        cnt++;
    return NULL;
}
15
How can we fix this using semaphores?
Improper Synchronization
16
Basic idea:
Associate a unique semaphore mutex, initially 1, with each shared variable (or related set of shared variables).
Surround corresponding critical sections with P and V operations.
Using Semaphores for Mutual Exclusion
Terminology:
Binary semaphore: semaphore whose value is always 0 or 1
Mutex: binary semaphore used for mutual exclusion
P operation: “locking” the mutex
V operation: “unlocking” or “releasing” the mutex
“Holding” a mutex: locked and not yet unlocked.
Counting semaphore: used as a counter for set of available resources.
Proper Synchronization
Define and initialize a mutex for the shared variable cnt:
long cnt = 0; /* Counter */
sem_t mutex; /* Semaphore that protects cnt */
sem_init(&mutex, 0, 1); /* mutex = 1 */
Surround critical section with P and V:
for (i = 0; i < niters; i++) {
    sem_wait(&mutex);
    cnt++;
    sem_post(&mutex);
}
goodcnt.c
17
linux> ./goodcnt 1000000 OK cnt=1000000
linux> ./goodcnt 1000000 OK cnt=1000000
Binary Semaphore
A semaphore whose counter is restricted to the values 0 and 1 is called a binary semaphore: only one thread at a time can access the shared resource. Wait: this is basically a mutex protecting a critical section! You can indeed replicate mutex behavior with a binary semaphore. However, there are two important points to keep in mind:
a mutex can be unlocked only by the thread that locked it, while a semaphore can be released by any other thread. This can lead to confusion and subtle bugs if all you want is a locking mechanism;
semaphores are signaling mechanisms that orchestrate threads, while mutexes are locking mechanisms that protect shared resources. You should not use semaphores to protect shared resources, nor mutexes for signaling: your intent will be clearer to you and to anyone who reads your code.
18
Producer-Consumer with Semaphore
The producer-consumer scenario imposes the following constraints:
The producer thread must not overwrite the shared buffer when the previous task has not been picked up by a consumer thread.
The consumer threads must not pick up tasks until there is something present in the shared buffer.
Individual consumer threads should pick up tasks one at a time.
19
Producer
Producer
Producer
Buffer Queue
Consumer
Consumer
Consumer
Producer-Consumer with Semaphore
20
sem_t empty; // empty slot in buffer
sem_t occupied; // occupied slot in buffer
sem_t p_lock; // mutex for producer function
sem_t c_lock; // mutex for consumer function
void *producer(void *producer_thread_data) {
while (!done()) {
sem_wait(&empty);
sem_wait(&p_lock);
insert_into_queue();
sem_post(&p_lock);
sem_post(&occupied);
}
}
void *consumer(void *consumer_thread_data) {
while (!done()) {
sem_wait(&occupied);
sem_wait(&c_lock);
my_task = extract_from_queue();
sem_post(&c_lock);
sem_post(&empty);
process_task(my_task);
}
}
void main() {
/* declarations and initializations */
sem_init(&p_lock, 0, 1);          /* p_lock = 1 */
sem_init(&c_lock, 0, 1);          /* c_lock = 1 */
sem_init(&empty, 0, BUFFER_SIZE); /* empty = BUFFER_SIZE */
sem_init(&occupied, 0, 0);        /* occupied = 0 */
……
}
Thread-safe
Examples
21
sem_t s;
sem_init(&s, 0, 4);

void *wait()
{
    ……
    sem_wait(&s);
}

void *post()
{
    sem_post(&s);
    ……
}

Are these sequences of operations possible?
WWWWWPPWW
WPWPWWWWW
PPWPWPWWWWWW
Exercise
22
sem A1Done = 0; sem B1Done = 0;

// Thread A       // Thread B
A1                B1
wait(B1Done)      wait(A1Done)
post(A1Done)      post(B1Done)
A2                B2

Consider two threads A and B that perform two operations each. Let the operations of thread A be A1 and A2; let the operations of thread B be B1 and B2. We require that threads A and B each perform their first operation before either can proceed to the second operation. That is, we require that A1 run before B2 and B1 before A2. Consider the above solution based on semaphores (the code run by threads A and B is shown in two columns next to each other). Explain whether the solution is correct or not. If it is incorrect, you must also point out why. (The wait operation waits on a semaphore; the post operation increments a semaphore.)
Deadlock
Deadlock is when two or more tasks never make progress because each is waiting for some resource held by another process/thread.
23
Deadlock
Deadlock can arise if four conditions hold simultaneously:
Mutual exclusion
Only one process at a time can use a resource
Hold and wait
A process holding at least one resource is waiting to acquire additional resources held by other processes
No preemption
A resource can be released only voluntarily by the process holding it, after that process has completed its task
Circular wait
There is a set {P0, P1, …, Pn} of waiting processes/threads such that
P0 → P1, P1 → P2, …, Pn–1 → Pn, and Pn → P0
25
Handling Deadlocks
Deadlock Prevention
Ensure that at least one of four necessary conditions cannot hold
Deadlock Avoidance
Deny any resource request that has the potential to lead to a deadlock
Requires advance info of all requests
Deadlock Detection
Always allow resource requests
Periodically check for deadlocks
If a deadlock exists → Recover from it
Ignore
Makes sense if the likelihood is very low, say once per year
Cheaper than prevention, avoidance or detection
Used by most common OS
26
Programmers need a clear model of how variables are shared by threads.
Variables shared by multiple threads must be protected to ensure mutually exclusive access.
Semaphores are a fundamental mechanism for enforcing mutual exclusion.
Summary
28
Solutions:
Semaphores
Mutex locks
Condition variables
Atomic operations
Barrier
Spinlocks
Read-write locks
Deadlocks
Outline
Mutex Locks
Critical sections in Pthreads can be given mutually exclusive access using mutex locks.
Mutex locks have two states: locked and unlocked. At any point in time, only one thread can hold a mutex lock. Locking is an atomic operation.
A thread entering a critical section first tries to acquire the lock. It proceeds once the lock is granted.
A thread releases the lock when leaving the critical section.
The API provides the following functions for handling mutex-locks:
Hint: Attempting to initialize an already initialized mutex results in undefined behavior.
29
int pthread_mutex_init (pthread_mutex_t *mutex_lock, const pthread_mutexattr_t *lock_attr);
int pthread_mutex_lock ( pthread_mutex_t *mutex_lock);
int pthread_mutex_unlock (pthread_mutex_t *mutex_lock);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
Pthreads supports three types of mutexes.
A normal mutex deadlocks if a thread that already has a lock tries a second lock on it.
A recursive mutex allows a single thread to lock a mutex as many times as it wants. It simply increments a count on the number of locks. A lock is relinquished by a thread when the count becomes zero.
An error check mutex reports an error when a thread with a lock tries to lock it again (as opposed to deadlocking in the first case, or granting the lock, as in the second case).
The type of the mutex can be set in the attributes object before it is passed at time of initialization.
Initialize a recursive mutex:
30
pthread_mutex_t Mutex;
pthread_mutexattr_t Attr;
pthread_mutexattr_init(&Attr);
pthread_mutexattr_settype(&Attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(&Mutex, &Attr);
With a normal mutex, the second lock blocks the thread, and no one can wake it up.
Recursive locks must be locked and unlocked the same number of times.
Attributes Objects for Mutexes
Initialize the attributes object using the function:
pthread_mutexattr_init (pthread_mutexattr_t *attr);
The function pthread_mutexattr_settype can be used for setting the type of mutex specified by the mutex attributes object.
pthread_mutexattr_settype (pthread_mutexattr_t *attr, int type);
Here, type specifies the type of the mutex and can take one of:
PTHREAD_MUTEX_NORMAL
PTHREAD_MUTEX_RECURSIVE
PTHREAD_MUTEX_ERRORCHECK
31
Proper Synchronization using Mutex Lock
32
pthread_mutex_t lock;
volatile long cnt = 0; /* Counter */
int main(int argc, char **argv)
{
long niters; pthread_t tid1, tid2;
if (pthread_mutex_init(&lock, NULL) != 0) {
printf("\n mutex init has failed\n");
return 1;
}
niters = atoi(argv[1]);
pthread_create(&tid1, NULL,thread, &niters);
pthread_create(&tid2, NULL,thread, &niters);
pthread_join(tid1, NULL);
pthread_join(tid2, NULL);
/* Check result */
if (cnt != (2 * niters)) printf("BOOM! cnt=%ld\n", cnt);
else printf("OK cnt=%ld\n", cnt);
pthread_mutex_destroy(&lock);
return 0;
}
void *thread(void *vargp)
{
long i, niters = *((long *)vargp);
pthread_mutex_lock(&lock);
for (i = 0; i < niters; i++) cnt++;
pthread_mutex_unlock(&lock);
return NULL;
}
Overheads of Locking
Locks represent serialization points since critical sections must be executed by threads one after the other.
Encapsulating large segments of the program within locks can lead to significant performance degradation.
It is often possible to reduce the idling overhead associated with locks using
int pthread_mutex_trylock (pthread_mutex_t *mutex_lock);
which attempts to lock mutex_lock, but if unsuccessful, will return immediately with a "busy" error code.
33
Condition Variables
A condition variable allows a thread to block itself until a specified condition becomes true.
When a thread executes pthread_cond_wait(condition_variable), it is blocked until another thread executes one of the following:
pthread_cond_signal(condition_variable) or
pthread_cond_broadcast(condition_variable)
pthread_cond_signal () is used to unblock one of the threads blocked waiting on the condition variable.
pthread_cond_broadcast() is used to unblock all the threads blocked waiting on the condition_variable.
If no threads are waiting on the condition variable, then a pthread_cond_signal() or pthread_cond_broadcast() will have no effect.
34
Condition Variables
A condition variable always has a mutex associated with it. A thread locks this mutex before issuing a wait, a signal or a broadcast on the condition variable.
While the thread is waiting on a condition variable, the mutex is automatically unlocked, and when the thread is signaled, the mutex is automatically locked again.
Pthreads provides the following functions for condition variables
int pthread_cond_init (pthread_cond_t *cond, const pthread_condattr_t *attr);
int pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal (pthread_cond_t *cond);
int pthread_cond_broadcast (pthread_cond_t *cond);
int pthread_cond_destroy (pthread_cond_t *cond);
35
Condition Variables
Condition variables should be used as a place to wait and be notified.
General usage pattern:
On the other hand, a thread, signaling the condition variable, typically looks like
36
// safely examine the condition, prevent other threads from
// altering it
pthread_mutex_lock (&lock);
while ( SOME-CONDITION is false)
pthread_cond_wait (&cond, &lock);
// Do whatever you need to do when condition becomes true
do_stuff();
pthread_mutex_unlock (&lock);
// ensure we have exclusive access to whatever comprises the condition
pthread_mutex_lock (&lock);
ALTER-CONDITION
// Wakeup at least one of the threads that are waiting on the condition (if any)
pthread_cond_signal (&cond);
// allow others to proceed
pthread_mutex_unlock (&lock);
Producer-Consumer Using Condition Variables
37
pthread_cond_t queue_not_full, queue_not_empty;
pthread_mutex_t queue_cond_lock;
pthread_t p, c;
/* other data structures here */
void main() {
/* declarations and initializations */
task_available = 0;
pthread_mutex_init(&queue_cond_lock, NULL);
pthread_create(&p, NULL, producer, NULL);
pthread_create(&c, NULL, consumer, NULL);
pthread_cond_init(&queue_not_empty, NULL);
pthread_cond_init(&queue_not_full, NULL);
pthread_join(p, NULL);
pthread_join(c, NULL);
/* create and join producer and consumer threads */
}
Main function:
Producer-Consumer Using Condition Variables
38
void *producer(void *producer_thread_data) {
while (!done()) {
pthread_mutex_lock(&queue_cond_lock);
while (task_available == BUFSIZE)
pthread_cond_wait(&queue_not_full, &queue_cond_lock);
task_available++;
insert_into_queue();
pthread_cond_signal(&queue_not_empty);
pthread_mutex_unlock(&queue_cond_lock);
}
}
void *consumer(void *consumer_thread_data) {
while (!done()) {
pthread_mutex_lock(&queue_cond_lock);
while (task_available == 0)
pthread_cond_wait(&queue_not_empty, &queue_cond_lock);
task_available--;
my_task = extract_from_queue();
pthread_cond_signal(&queue_not_full);
pthread_mutex_unlock(&queue_cond_lock);
process_task(my_task);
}
}
Wake up the other thread.
Once a thread is blocked, the outer mutex lock is released for other threads to grab.
Every time it is woken up, the thread has to perform a context switch; this overhead increases with the number of waiting threads.
Similar to a semaphore, a condition variable works together with a mutex.
Synchronization (barriers)
What is the output of the following Pthread program?
39
int main(int argc, char **argv) {
double A[5] , B[5], C[5]; /* global, shared variables*/
for (i = 0; i < 5 ; i++) A[i] = B[i] = 1;
for (i = 0; i < 4 ; i++)
pthread_create( … , DoStuff, int i );
…
for (i = 0; i < 4 ; i++) pthread_join (… , DoStuff, …);
Print the values of C;
}
void DoStuff (int threadID) {
int k;
B[threadID+1] = 2 * A[threadID];
Barrier
C[threadID] = 2 * B[threadID];
}
Barriers - a composite synchronization construct
By design, Pthreads provide support for a basic set of operations.
Higher level constructs can be built using basic synchronization constructs.
We discuss one such construct: barriers.
A barrier holds a thread until all threads participating in the barrier have reached it.
Barriers can be implemented using a counter, a mutex and a condition variable.
A single integer (counter) is used to keep track of the number of threads that have reached the barrier.
If the count is less than the total number of threads, the threads execute a condition wait.
The last thread entering (and setting the count to the number of threads) wakes up all the threads using a condition broadcast and resets the count to zero (to prepare for the next barrier).
40
Defining your own barrier construct
41
typedef struct {
    pthread_mutex_t count_lock;
    pthread_cond_t ok_to_proceed;
    int count;
} mylib_barrier_t;

void mylib_init_barrier(mylib_barrier_t *b) {
    b->count = 0;
    pthread_mutex_init(&(b->count_lock), NULL);
    pthread_cond_init(&(b->ok_to_proceed), NULL);
}

void mylib_barrier(mylib_barrier_t *b, int thread_count) {
    pthread_mutex_lock(&(b->count_lock));
    b->count++;
    if (b->count == thread_count) {
        b->count = 0;
        pthread_cond_broadcast(&(b->ok_to_proceed));
    }
    else
        pthread_cond_wait(&(b->ok_to_proceed), &(b->count_lock));
    pthread_mutex_unlock(&(b->count_lock));
}
Using the defined barrier
42
mylib_barrier_t my_barrier ; /*declare a barrier */
int main(int argc, char **argv) {
mylib_init_barrier (&my_barrier) ; /* initialize the barrier */
double A[5] , B[5], C[5]; /* global, shared variables*/
for (i = 0; i < 5 ; i++) A[i] = B[i] = 1;
for (i = 0; i < 4 ; i++)
pthread_create( … , DoStuff, int i );
…
for (i = 0; i < 4 ; i++) pthread_join (… , DoStuff, …);
Print the values of C;
}
void DoStuff (int threadID) {
int k;
B[threadID+1] = 2 * A[threadID];
mylib_barrier ( &my_barrier, 4) ; /* call the barrier */
C[threadID] = 2 * B[threadID];
}
Atomic Operations in C
Recall the cnt++ operation in Lecture04:
movq cnt(%rip), %rdx
addq $1, %rdx
movq %rdx, cnt(%rip)
Need a lock to ensure exclusive access to the critical section.
What if the cnt++ operation were atomic?
Cannot be divided
No half-done state
All or nothing
Could "critical sections" still overlap? Would we still need locks to prevent race conditions?
45
Atomic Operations in C
Concurrent access using atomic operations is guaranteed not to cause data races,
allowing for lockless concurrent programming.
46
Atomic Operations in C
Let's revisit the cnt counting program (improper synchronization)
47
volatile long cnt = 0; /* Counter */
int main(int argc, char **argv)
{
long niters; pthread_t tid1, tid2;
niters = atoi(argv[1]);
pthread_create(&tid1, NULL, thread, &niters);
pthread_create(&tid2, NULL, thread, &niters);
pthread_join(tid1, NULL);
pthread_join(tid2, NULL);
/* Check result */
if (cnt != (2 * niters)) printf("BOOM! cnt=%ld\n", cnt);
else
printf("OK cnt=%ld\n", cnt); exit(0);
}
void *thread(void *vargp)
{
long i, niters = *((long *)vargp);
for (i = 0; i < niters; i++) cnt++;
return NULL;
}
How can we fix this program using atomic operations?
./main 10000000
BOOM! cnt=13739449
Atomic Operations in C
Fixing the cnt program using an atomic operation
49
volatile long cnt = 0; /* Counter */
#include
int main(int argc, char **argv)
{
long niters; pthread_t tid1, tid2;
niters = atoi(argv[1]);
pthread_create(&tid1, NULL, thread, &niters);
pthread_create(&tid2, NULL, thread, &niters);
pthread_join(tid1, NULL);
pthread_join(tid2, NULL);
/* Check result */
if (cnt != (2 * niters)) printf("BOOM! cnt=%ld\n", cnt);
else
printf("OK cnt=%ld\n", cnt); exit(0);
}
void *thread(void *vargp)
{
long i, niters = *((long *)vargp);
for (i = 0; i < niters; i++)
    __atomic_add_fetch(&cnt, 1, __ATOMIC_SEQ_CST);
return NULL;
}

./main 10000000
OK cnt=20000000
49

Atomic Operations in C
type __atomic_add_fetch (type *ptr, type val, int memorder)
*ptr = *ptr + val
The object ptr points to must be of integer or pointer type.
memorder specifies the memory ordering.
__atomic_add_fetch() returns the value after the add;
__atomic_fetch_add() returns the value before the add.
50

Atomic Operations in C
Other built-in functions:
void __atomic_load (type *ptr, type *ret, int memorder)
void __atomic_store (type *ptr, type *val, int memorder)
void __atomic_exchange (type *ptr, type *val, type *ret, int memorder)
bool __atomic_compare_exchange_n (type *ptr, type *expected, type desired, bool weak, int success_memorder, int failure_memorder)
type __atomic_{OP}_fetch (type *ptr, type val, int memorder)
OP can be {add, sub, and, or, xor, nand}
bool __atomic_test_and_set (void *ptr, int memorder)
void __atomic_clear (bool *ptr, int memorder)
bool __atomic_always_lock_free (size_t size, void *ptr)
51

Atomic Operations in C
C programmers should have enough flexibility to get "close to the machine" when the need arises.
Atomic operations provide facilities for low-level synchronization that commonly reduce to one or two CPU instructions.
Lock vs atomic:
Low-level implementation
Atomic operations leverage processor support; no locks at all.
Locks are more OS-dependent.
Context switch
Locks suspend thread execution, freeing up CPU resources for other threads, but incur obvious context-switching overhead.
Atomic operations incur no context-switching overhead, but suffer from busy-waiting.
54

Spinlock
A spinlock is a simple synchronization object providing mutual exclusion.
A thread trying to acquire the lock simply waits in a loop ("spins"), repeatedly checking whether the lock is available.
Pseudo-code:
void spin_lock(spinlock *lock) {
    while (lock.try_get_lock());
}
55

Spinlock
With a mutex, both the lock and unlock operations incur a context switch, which is a relatively slow operation.
Lock: transfers control from the executing thread to the operating system when the thread blocks, allowing another thread to run.
Unlock: transfers control back to the current thread.
If the lock-hold time is expected to be short (i.e., busy waiting will not consume too much CPU), we can consider using a spinlock instead of a mutex.
56

Spinlock
Compare-and-swap (CAS) instruction:
CAS is an atomic instruction for synchronization.
It compares the contents of a memory location with a given value:
If they are the same, it replaces the contents with a new given value.
If not, nothing happens.
We can implement a spinlock using the CAS instruction.
Example:
// pseudo code
memory = 1
CAS(memory, expected = 1, new_val = 10)
// Swapped: 10 is in memory now.
// pseudo code
memory = 1
CAS(memory, expected = 2, new_val = 10)
// memory != expected: nothing happened.
57

Spinlock
bool __atomic_compare_exchange_n (type *ptr, type *expected, type desired, bool weak, int success_memorder, int failure_memorder)
This compares the contents of *ptr with the contents of *expected:
If equal, desired is written to *ptr; returns true.
If not, the current contents of *ptr are written into *expected; returns false.
Again, CAS is an atomic instruction.
#define LOCKED 1
#define UNLOCKED 0
void spin_lock(int *lock) {
    int expected = UNLOCKED;
    while (!__atomic_compare_exchange_n(lock, &expected, LOCKED, true, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
        expected = UNLOCKED;   /* CAS failure overwrote expected; reset it */
}
void spin_unlock(int *lock) {
    __atomic_store_n(lock, UNLOCKED, __ATOMIC_SEQ_CST);
}
58

Pseudo Implementation
Assume the lock is held when lock == 1. Can I implement the spinlock as any of the following?
while (CAS(&lock, 0, 1));
while (!CAS(&lock, 0, 1));
while (CAS(&lock, 1, 0));
while (!CAS(&lock, 1, 0));
59