CONCURRENCY: DEADLOCK + Review
Andrea Arpaci-Dusseau CS 537, Fall 2019
ADMINISTRIVIA
Project 5 available now (xv6 Memory)
– Topic of Discussion Sections tomorrow
– Due Monday 11/4 (5 pm)
– Request new project partner if desired (web form)
– Turn in any of 3 versions:
– v0 (alloc alternating pages, all marked as UNKNOWN PID)
– v1 (alternating pages, some marked UNKNOWN, most known PIDs)
– v2 (contiguous allocations when possible, some marked UNKNOWN, most known PIDs
Midterm 2: Nov 11/6 (Wed) from 7:30-9:30pm
– Practice exams available
– Rooms on Canvas
– Mostly Concurrency
+ Some Virtualization (usually repeated from Midterm1)
RECAP
Concurrency Objectives
Mutual exclusion (e.g.,A and B don’t run at same time) solved with locks
Ordering (e.g., B runs after A does something) solved with condition variables and semaphores
Condition Variables
wait(cond_t *cv, mutex_t *lock)
– assumes the lock is held when wait() is called
– puts caller to sleep + releases the lock (atomically) – when awoken, reacquires lock before returning
signal(cond_t *cv)
– wake a single waiting thread (if >= 1 thread is waiting) – if there is no waiting thread, just return, doing nothing
cv A B C D lock D A
signal(cv) – what happens? release(lock) – what happens?
Motivation
Monitors
– Users can inadvertently misuse locks and semaphores (e.g., never unlock a mutex)
Idea
– Provide language support to automatically lock and unlock monitor lock when in critical section
• Lock is added implicitly; never seen by user
– Provide condition variables for scheduling constraints (zero or more)
Examples
– Mesa language from Xerox
– Java: Acquire monitor lock when call synchronized methods in class
synchronized deposit(int amount) { // language adds lock.acquire()
balance += amount;
// language adds lock.release() }
INTRODUCING Semaphores
Condition variables have no state (other than waiting queue) – Programmer must track additional state
Semaphores have state: track integer value
– State cannot be directly accessed by user program,
but state determines behavior of semaphore operations
SUMMARY: Semaphores
Semaphores are equivalent to locks + condition variables
– Can be used for both mutual exclusion and ordering
Semaphores contain state
– How they are initialized depends on how they will be used – Init to 0: Join (1 thread must arrive first, then other)
– Init to N: Number of available resources
sem_wait(): Waits until value > 0, decrement value
sem_post(): Increment value, if value > 0, wake a single waiter (atomic) Can use semaphores in producer/consumer and for reader/writer locks
CONCURRENCY BUGS
75 60 45 30 15
0
Concurrency Study
Atomicity
Order
Deadlock
Other
My SQL
Lu etal. [ASPLOS 2008]:
Ap ach e
Mo zil l a
Ope nOff ic e
For four major projects, search for concurrency bugs among >500K bug reports. Analyze small sample to identify common types of concurrency bugs.
Bugs
Deadlock Theory
Deadlocks can only happen with these four conditions: 1. mutual exclusion
2. hold-and-wait
3. no preemption
4. circular wait
Can eliminate deadlock by eliminating any one condition
B
D
A
C
STOP
STOP
STOP
STOP
1. Mutual Exclusion
Problem: Threads claim exclusive control of resources that they require Strategy: Eliminate locks!
Try to replace locks with atomic primitive:
int CompAndSwap(int *addr, int expected, int new) Returns 0 fail, 1 success
Wait-Free Algorithms
void add (int *val, int amt)
{
Mutex_lock(&m);
*val += amt;
Mutex_unlock(&m);
}
void add (int *val, int amt) { do {
int old = *val;
} while(!CompAndSwap(val, old, old+amt);
}
Wait-Free Algorithms
void add (int *val, int amt) { do {
int old = *val;
} while(!CompAndSwap(val, old, old+amt);
}
Old = 10;
Val = 10;
T1: add(&val, 2);
T2: add(&val, 3);
Old = 10;
* Val == 10 à *val = 13; return true;
*Val != 10àreturn false;
Old = 13;
*Val == 13 -> *val=15; return true
Wait-Free Algorithm: Linked List Insert
void insert (int val) {
node_t *n = Malloc(sizeof(*n)); n->val = val;
lock(&m);
n->next = head;
head = n;
unlock(&m);
void insert (int val) {
node_t *n = Malloc(sizeof(*n)); n->val = val;
do {
n->next = head;
} while (!CompAndSwap(&head,
n->next, n));
}}
Wait-Free Algorithm: Linked List Insert
void insert (int val) {
node_t *n = Malloc(sizeof(*n));
n->val = val; h do {
n->next = head;
} while (!CompAndSwap(&head, n->next, n));
}
*Head != n->next so return false *Head == n->next so *head=n; return true *Head == n->next so *head=n; return true
5
null
T1: insert(2); T2: insert(3);
3
2
3
null
3
2
null
2
2. Hold-and-Wait
Problem:Threads hold resources while waiting for additional resources
Strategy: Acquire all locks atomically once. Can release locks over time, but cannot acquire any again until all have been released
How? Use a meta lock:
lock(&meta);
lock(&L1); lock(&L2); lock(&L3);
… unlock(&meta); // CS1 unlock(&L1); // CS 2 Unlock(&L2);
lock(&meta);
lock(&L2); lock(&L1); unlock(&meta);
// CS1
unlock(&L1);
// CS2
Unlock(&L2);
lock(&meta);
lock(&L1);
unlock(&meta);
// CS1
unlock(&L1);
Disadvantages?
2. Hold-and-Wait
Must know ahead of time which locks will be needed Must be conservative (acquire any lock possibly needed) Degenerates to just having one big lock
lock(&meta);
lock(&L1); lock(&L2); lock(&L3);
… unlock(&meta); // CS1 unlock(&L1); // CS 2 Unlock(&L2);
lock(&meta);
lock(&L2); lock(&L1); unlock(&meta);
// CS1
unlock(&L1);
// CS2
Unlock(&L2);
lock(&meta);
lock(&L1);
unlock(&meta);
// CS1
unlock(&L1);
3. No preemption
Problem: Resources (e.g., locks) cannot be forcibly removed from threads that are Strategy: if thread can’t get what it wants, release what it holds
top:
lock(A);
if (trylock(B) == -1) {
unlock(A);
Disadvantages?
goto top; }
…
Livelock:
No processes make progress, but the state of involved processes constantly changes
Classic solution: Exponential back-off
4. Circular Wait
Circular chain of threads such that each thread holds a resource (e.g., lock) being requested by next thread in the chain.
Strategy:
– decide which locks should be acquired before others – if A before B, never acquire A if B is already held!
– document this, and write code accordingly
Works well if system has distinct layers
Lock Ordering in Xv6
Creating a file requires simultaneously holding: • a lock on the directory,
• a lock on the new file’s inode,
• a lock on a disk block buffer,
• idelock,
• ptable.lock
Always acquires locks in order listed Linux has similar rules…
Summary
When in doubt about correctness, better to limit concurrency (i.e., add unnecessary locks, one big lock)
Concurrency is hard, encapsulation makes it harder!
Have a strategy to avoid deadlock and stick to it
Choosing a lock order is probably most practical for reasonable performance
Review: Easy Piece 1
Context Switch
CPU Process
Virtualization
Schedulers
Dynamic Relocation
TLBs
Segmentation
Memory Address Space
Multilevel
Swapping
Paging
Review: Easy Piece 2
Locks
Concurrency
Threads
Semaphores
Mutual Exclusion
Synchronization Techniques
Monitors
Condition Variables
Implementation
Ordering
Classic Synchronization Problems
Semaphores
Atomic HW Instr
Spin vs Block
Dining Philosophers
Producer/Consumer
Reader/Writer Locks
int a = 0;
int main() { fork();
a++;
fork();
a++;
if (fork() == 0) {
How many times will “Hello!\n” be displayed?
Review: Processes vs Threads
printf(“Hello!\n”); 4 } else {
printf(“Goodbye!\n”); }
a++;
printf(“a is %d\n”, a); }
What will be the final value of “a” as displayed by the final line of the program?
3
Review: Processes vs Threads
volatile int balance = 0; void *mythread(void *arg) {
int result = 0;
result = result + 200;
balance = balance + 200; printf(“Result is %d\n”, result); printf(“Balance is %d\n”, balance); return NULL;
}int main(int argc, char *argv[]) {
pthread_t p1, p2;
pthread_create(&p1, NULL, mythread, “A”); pthread_create(&p2, NULL, mythread, “B”); pthread_join(p1, NULL);
pthread_join(p2, NULL);
printf(“Final Balance is %d\n”, balance); }
How many total threads are part of this process?
3
When thread p1 prints “Result is %d\n”,
what value of result will be printed?
200. ‘result’ is a local variable allocated on the stack; each thread has its own private copy which only it increments, therefore there are no race conditions.
When thread p1 prints “Balance is %d\n”, what value of balance will be printed?
Unknown. balance is allocated on the heap and shared between the two threads that are each accessing it without locks; there is a race condition.
Sample Homework: HW-THREADSIntro
./x86.py -p looping-race-nolock.s -t 2 -r -i 3
# assumes %bx has loop count in it
.main
.top
mov 2000, %ax # get the value at the address
add $1, %ax # increment it
mov %ax, 2000 # store it back
# see if we’re still looping
sub $1, %bx
test $0, %bx
jgt .top
halt
Looping-race-nolocks.s (addr 2000 has 0)
Walking through code behavior
After the instruction stream “P” (i.e., after scheduler runs one line from parent), which line of the parent’s will execute when it is scheduled again?
p2. P will have acquired lock and finished mutex_lock(). Assume the scheduler continues on with “C” (the full instruction stream is PC).
Which line will child execute when it is scheduled again?
C1. C must wait to acquire the lock since it is currently held by P.
After PPP (full is PCPPP), which line for parent next?
p3. Since done = 0, P will execute p2 and p3; it is stuck in p3 until signaled.
After CCC (full is PCPPPCCC), which line for child next?
c4. C finishes c1, c2, c3. Next time it executes c4. CV signaled but mutex still locked
After PP (full is PCPPPCCCPP), which line for parent next?
p3. P cannot return from cond_wait() until it acquires lock held by C; P stays in p3
After CC (full is PCPPPCCCPPCC, which line for child next?
Beyond. C executes c4 and then code beyond c4.
After PPP (full PCPPPCCCPPCCPPP), which line for parent next?
Beyond. P finishes p3, rechecks p2, then p4. Next time beyond p4.
typedef struct __lock_t { int ticket;
int turn; }
void lock_init(lock_t *lock) { lock->ticket = 0; lock->turn = 0;
}
Ticket Lock with yield
void acquire(lock_t *lock) {
int myturn = FAA(&lock->ticket);
while (lock->turn != myturn)
yield();
void release(lock_t *lock) {
FAA(&lock->turn);
}
Remember: yield() voluntarily relinquishes CPU for remainder of timeslice, but process remains READY
}
Block when Waiting: FINAL correct LOCK
typedef struct {
bool lock = false;
bool guard = false;
queue_t q;
} LockT;
void acquire(LockT *l) {
while (TAS(&l->guard, true));
if (l->lock) {
qadd(l->q, tid);
setpark(); // notify of plan
l->guard = false;
park(); // unless unpark()
} else {
l->lock = true;
l->guard = false;
} }
void release(LockT *l) {
while (TAS(&l->guard, true));
if (qempty(l->q)) l->lock=false;
else unpark(qremove(l->q));
l->guard = false;
}
setpark() fixes race condition
Park() does not block if unpark() occurred after setpark()
sem_t mayEat[5]; // how to initialize? sem_t mutex; // how to init?
int state[5] = {THINKING}; take_chopsticks(int i) {
wait(&mutex); // enter critical section state[i] = HUNGRY;
testSafetyAndLiveness(i); // check if I can run signal(&mutex); // exit critical section wait(&mayEat[i]);
}
put_chopsticks(int i) {
wait(&mutex); // enter critical section state[i] = THINKING;
test(i+1 %5); // check if neighbor can run now test(i+4 %5);
signal(&mutex); // exit critical section }
testSafetyAndLiveness(int i) { if(state[i]==HUNGRY&&state[i+4%5]!=EATING&&state[i+1%5]!=EATING) {
state[i] = EATING;
signal(&mayEat[i]); }}
READER/Writer Locks
13
14
15
16
17
18
19
21
22
23
24
25
26
27 }
29 rwlock_acquire_writelock(rwlock_t *rw) { sem_wait(&rw->writelock); } 31 rwlock_release_writelock(rwlock_t *rw) { sem_post(&rw->writelock); }
void rwlock_acquire_readlock(rwlock_t *rw) {
}
sem_wait(&rw->lock); rw->readers++;
if (rw->readers == 1)
T1: acquire_readlock() T2: acquire_readlock() T3: acquire_writelock() T4: acquire_readlock()
sem_wait(&rw->writelock); sem_post(&rw->lock);
// what happens?
void rwlock_release_readlock(rwlock_t *rw) {
sem_wait(&rw->lock); rw->readers–;
if (rw->readers == 0)
sem_post(&rw->writelock); sem_post(&rw->lock);
Acquire_readlock() {
Sem_wait(&mutex);
If (ActiveWriters +
WaitingWriters==0) {
sem_post(OKToRead);
ActiveReaders++;
} else WaitingReaders++;
Sem_post(&mutex);
Sem_wait(OKToRead);
Release_readlock() { Sem_wait(&mutex);
ActiveReaders–;
If (ActiveReaders==0 &&
WaitingWriters > 0) {
ActiveWriters++;
WaitingWriters–;
Sem_post(OKToWrite);
Sem_post(&mutex);
}
() {
(&mutex);
Acquire_writelock
Sem_wait
ActiveWriters
ActiveWriters
sem_post
} else
Sem_post
If (
==0) {
+
++;
ActiveReaders
+
WaitingWriters
(
WaitingWriters
OKToWrite
); ++;
(&mutex);
Release_writelock() {
Sem_wait(&mutex);
ActiveWriters–;
If (WaitingWriters > 0) {
ActiveWriters++;
WaitingWriters–;
Sem_post(OKToWrite);
(
T1: acquire_readlock() T2: acquire_readlock() T3: acquire_writelock() T4: acquire_readlock()
// what happens?
… release_readlock() x 2 T3: release_writelock()
Sem_wait
);
OKToWrite
}
}
}
} else while(WaitingReaders>0) { ActiveReaders++; WaitingReaders–; sem_post(OKToRead);
}
Sem_post(&mutex);