
[537] Locks and Condition Variables

Locks and
Condition Variables
Questions answered in this lecture:
How can threads block instead of spin-waiting while waiting for a lock?
When should a waiting thread block and when should it spin?
How can threads enforce ordering across operations?
How can thread_join() be implemented?
How can condition variables be used to support producer/consumer apps?
CSE 2431
Introduction to Operating Systems
Based on slides by Andrea C. Arpaci-Dusseau
Remzi H. Arpaci-Dusseau

Lock Evaluation
How to tell if a lock implementation is good?

Fairness:
Do threads acquire the lock in the same order they requested it?

Performance:
Two scenarios:
– low contention (few threads; lock usually available)
– high contention (many threads per CPU, all contending)

[Two timelines, ticks 0–160:
no yield: A acquires the lock and is preempted; B, C, and D each spin through an entire time slice before A runs again to unlock, so most of the timeline is wasted spinning.
yield: each waiting thread yields the CPU as soon as it sees the lock held, so A gets the CPU back quickly and the lock/unlock and next acquire complete much earlier.]
Why is yield useful?
Why doesn’t yield solve all performance problems?

Lock Implementation:
Block when Waiting
Lock implementation removes waiting threads from scheduler ready queue (e.g., park() and unpark())
Scheduler runs any thread that is ready
Good separation of concerns

[Slide animation, ticks 0–160, tracking scheduler state (RUNNABLE / RUNNING / WAITING):
1. All threads ready: RUNNABLE: A, B, C, D
2. A runs and acquires the lock, then is preempted
3. B runs, tries the lock, and sleeps via park() -> WAITING: B
4. C runs; then D runs, tries the lock, and sleeps -> WAITING: B, D
5. A and C alternate on the CPU; B and D consume no CPU while blocked
6. A runs again, unlocks, and unparks the next waiter -> B becomes RUNNABLE
7. B runs and acquires the lock; D is unparked the same way later]

Lock Implementation:
Block when Waiting
typedef struct {
    bool lock;    // initially false
    bool guard;   // initially false
    queue_t q;
} LockT;

void acquire(LockT *l) {
    while (TAS(&l->guard, true))
        ;                       // spin briefly for the guard
    if (l->lock) {
        qadd(l->q, tid);        // add self to wait queue
        l->guard = false;
        park();                 // blocked
    } else {
        l->lock = true;         // acquired the lock
        l->guard = false;
    }
}

void release(LockT *l) {
    while (TAS(&l->guard, true))
        ;                       // spin briefly for the guard
    if (qempty(l->q))
        l->lock = false;        // no waiters: lock becomes free
    else
        unpark(qremove(l->q));  // pass lock directly to next waiter
    l->guard = false;
}
(a) Why is guard used?
(b) Why okay to spin on guard?
(c) In release(), why not set lock=false when unpark?
(d) What is the race condition?

Race Condition

Thread 1 (in acquire):              Thread 2 (in release):
if (l->lock) {
    qadd(l->q, tid);
    l->guard = false;
                                    while (TAS(&l->guard, true));
                                    if (qempty(l->q)) // false!!
                                    else unpark(qremove(l->q));
                                    l->guard = false;
park(); // block — sleeps forever

Problem: guard is not held when park() is called
The unlocking thread may call unpark() before the waiting thread calls park(), so the wakeup is lost

Block when Waiting:
Final Correct Lock
typedef struct {
    bool lock;    // initially false
    bool guard;   // initially false
    queue_t q;
} LockT;

void acquire(LockT *l) {
    while (TAS(&l->guard, true))
        ;
    if (l->lock) {
        qadd(l->q, tid);
        setpark();              // notify system of plan to park
        l->guard = false;
        park();                 // sleeps, unless unpark() already ran
    } else {
        l->lock = true;
        l->guard = false;
    }
}

void release(LockT *l) {
    while (TAS(&l->guard, true))
        ;
    if (qempty(l->q))
        l->lock = false;
    else
        unpark(qremove(l->q));
    l->guard = false;
}

setpark() fixes the race condition: if another thread calls unpark() between this thread's setpark() and park(), the park() returns immediately instead of sleeping.

Spin-Waiting vs Blocking
Each approach is better under different circumstances

Uniprocessor
Waiting thread is scheduled -> thread holding the lock isn't running, so spinning can never succeed within the quantum
Waiting thread should always relinquish the processor
Associate a queue of waiters with each lock (as in the previous implementation)

Multiprocessor
Waiting thread is scheduled -> thread holding the lock might be running on another CPU
Spin or block depends on how long, t, before the lock is released
Lock released quickly -> spin-wait
Lock released slowly -> block
Quick and slow are relative to the context-switch cost, C

When to Spin-Wait?
When to Block?
If we knew how long, t, it will be before the lock is released, we could determine the optimal behavior
How much CPU time is wasted when spin-waiting?

How much is wasted when we block?

What is the best action when t < C? When t > C?

Problem:
Requires knowledge of the future; too much overhead to do any special prediction

[Diagram: t < C -> spin-wait; t > C -> block]

Two-Phase Waiting
Theory: bound worst-case performance — the ratio of actual cost to optimal cost
When does worst-possible performance occur?
Pure spin-waiting for a very long time t >> C: ratio t/C (unbounded)

Algorithm: spin-wait for C, then block -> within a factor of 2 of optimal
Two cases:
– t < C: optimal spin-waits for t; we spin-wait for t too
– t > C: optimal blocks immediately (cost C); we spin for C, then block (cost 2C); 2C/C = 2
-> a 2-competitive algorithm; an example of competitive analysis

Condition Variables

Concurrency Objectives
Mutual exclusion (e.g., A and B don’t run at same time)
– solved with locks

Ordering (e.g., B runs after A does something)
– solved with condition variables and semaphores

Ordering Example: Join
pthread_t p1, p2;
Pthread_create(&p1, NULL, mythread, "A");
Pthread_create(&p2, NULL, mythread, "B");
// join waits for the threads to finish
Pthread_join(p1, NULL);
Pthread_join(p2, NULL);
printf("main: done\n [balance: %d]\n [should: %d]\n",
       balance, max*2);
return 0;
how to implement join()?

Condition Variables
Condition Variable: queue of waiting threads
B waits for a signal on CV before running
wait(CV, …)
A sends signal to CV when time for B to run
signal(CV, …)

Condition Variables
wait(cond_t *cv, mutex_t *lock)
– assumes the lock is held when wait() is called
– puts caller to sleep + releases the lock (atomically)
– when awoken, reacquires lock before returning

signal(cond_t *cv)
– wake a single waiting thread (if >= 1 thread is waiting)
– if there is no waiting thread, just return, doing nothing

Join Implementation:
Attempt 1
void thread_exit() {
    Mutex_lock(&m);      // a
    Cond_signal(&c);     // b
    Mutex_unlock(&m);    // c
}

void thread_join() {
    Mutex_lock(&m);      // x
    Cond_wait(&c, &m);   // y
    Mutex_unlock(&m);    // z
}

Example schedule (Parent is (P); Child is (C)):
(P) x  (P) y  (C) a  (C) b  (C) c  (P) z
Works!

Join Implementation:
Attempt 1
void thread_exit() {
    Mutex_lock(&m);      // a
    Cond_signal(&c);     // b
    Mutex_unlock(&m);    // c
}

void thread_join() {
    Mutex_lock(&m);      // x
    Cond_wait(&c, &m);   // y
    Mutex_unlock(&m);    // z
}

Can you construct an ordering that does not work?

Example broken schedule (Parent is (P); Child is (C)):
(C) a  (C) b  (C) c  (P) x  (P) y

Parent waits forever!

Rule of Thumb 1
Keep state in addition to CVs!

CVs are used to signal threads when state changes

If the state is already as needed, the thread doesn't wait for a signal!

Join Implementation:
Attempt 2
void thread_exit() {
    done = 1;               // a
    Cond_signal(&c);        // b
}

void thread_join() {
    Mutex_lock(&m);         // w
    if (done == 0)          // x
        Cond_wait(&c, &m);  // y
    Mutex_unlock(&m);       // z
}

Fixes the previous broken ordering (Parent is (P); Child is (C)):
(C) a  (C) b  (P) w  (P) x  (P) z

Join Implementation:
Attempt 2
void thread_exit() {
    done = 1;               // a
    Cond_signal(&c);        // b
}

void thread_join() {
    Mutex_lock(&m);         // w
    if (done == 0)          // x
        Cond_wait(&c, &m);  // y
    Mutex_unlock(&m);       // z
}

Can you construct an ordering that does not work?

Example broken schedule (Parent is (P); Child is (C)):
(P) w  (P) x  (C) a  (C) b  (P) y  … sleep forever …

The child's signal at (b) fires after the parent has checked done but before it waits, so the wakeup is lost and (P) y sleeps forever.

Join Implementation:
Correct
void thread_exit() {
    Mutex_lock(&m);         // a
    done = 1;               // b
    Cond_signal(&c);        // c
    Mutex_unlock(&m);       // d
}

void thread_join() {
    Mutex_lock(&m);         // w
    if (done == 0)          // x
        Cond_wait(&c, &m);  // y
    Mutex_unlock(&m);       // z
}

Example schedule (Parent is (P); Child is (C)):
(P) w  (P) x  (P) y  (C) a  (C) b  (C) c  (C) d  (P) z

Use the mutex to ensure there is no race between checking/updating state and wait/signal

Producer/Consumer Problem

Producer/Consumer Problem
Producers generate data

Consumers grab data and process it

Producer/consumer problems are frequent in systems (e.g., web servers)

General strategy: use condition variables to
– make producers wait when buffers are full
– make consumers wait when there is nothing to consume

Producer/Consumer Example
Start with easy case:
1 producer thread
1 consumer thread
1 shared buffer to fill/consume (max = 1)

numfull = number of buffers currently filled
Examine slightly broken code to begin…

void *consumer(void *arg) {
    while (1) {
        Mutex_lock(&m);
        while (numfull == 0)
            Cond_wait(&cond, &m);
        int tmp = do_get();
        Cond_signal(&cond);
        Mutex_unlock(&m);
        printf("%d\n", tmp);
    }
}
void *producer(void *arg) {
    for (int i = 0; i < loops; i++) {
        Mutex_lock(&m);
        while (numfull == max)
            Cond_wait(&cond, &m);
        do_fill(i);
        Cond_signal(&cond);
        Mutex_unlock(&m);
    }
}

Signal vs. Broadcast
signal(cond_t *cv)
– wake a single waiting thread (if >= 1 thread is waiting)
– if there is no waiting thread, just return, doing nothing

broadcast(cond_t *cv)
– wake all waiting threads (if >= 1 thread is waiting)
– if there are no waiting threads, just return, doing nothing
any disadvantage?

How to wake the right thread?
One solution: wake all the threads (broadcast)!

Better solution (usually): use two condition variables

Producer/Consumer:
Two CVs
void *producer(void *arg) {
    for (int i = 0; i < loops; i++) {
        Mutex_lock(&m);            // p1
        if (numfull == max)        // p2
            Cond_wait(&empty, &m); // p3
        do_fill(i);                // p4
        Cond_signal(&fill);        // p5
        Mutex_unlock(&m);          // p6
    }
}

void *consumer(void *arg) {
    while (1) {
        Mutex_lock(&m);
        if (numfull == 0)
            Cond_wait(&fill, &m);
        int tmp = do_get();
        Cond_signal(&empty);
        Mutex_unlock(&m);
    }
}

Is this correct? Can you find a bad schedule?
1. consumer1 waits because numfull == 0
2. producer increments numfull, wakes consumer1
3. before consumer1 runs, consumer2 runs, grabs the entry, sets numfull = 0
4. consumer1 then reads bad data

Good Rule of Thumb 3
Whenever a lock is acquired, recheck assumptions about state!
It is possible for another thread to grab the lock in between signal and wakeup from wait

Producer/Consumer:
Two CVs and WHILE
void *producer(void *arg) {
    for (int i = 0; i < loops; i++) {
        Mutex_lock(&m);             // p1
        while (numfull == max)      // p2
            Cond_wait(&empty, &m);  // p3
        do_fill(i);                 // p4
        Cond_signal(&fill);         // p5
        Mutex_unlock(&m);           // p6
    }
}

void *consumer(void *arg) {
    while (1) {
        Mutex_lock(&m);
        while (numfull == 0)
            Cond_wait(&fill, &m);
        int tmp = do_get();
        Cond_signal(&empty);
        Mutex_unlock(&m);
    }
}

Is this correct? Can you find a bad schedule?
Correct!
- no concurrent access to shared state
- every time the lock is acquired, assumptions are reevaluated
- a consumer will get to run after every do_fill()
- a producer will get to run after every do_get()

Summary: rules of thumb for CVs
1. Keep state in addition to CVs
2. Always do wait/signal with the lock held
3. Whenever a thread wakes from waiting, recheck state