CS计算机代考程序代写 data structure compiler assembly Summary of Lecture 5

Summary of Lecture 5

• Three ways to get from user code to kernel code:
– System calls, interrupts, exceptions
– All mediated through the Interrupt Descriptor Table

• Interrupt handler and other code shares various stuff:
– Register, EFLAGS (push on stack)
– Memory (use a separate stack)
– Shared state:

• Data structures (linked list example)
• Device state (VGA example)
• Memory (is_device_locked)

© 2021 Yih-Chun Hu ECE391

Summary of Lecture 5

• Solving these issues:
– CLI / STI – disable interrupts
– Define shared variables as volatile
– Prevent compiler and ISA reorderings

• Google: “memory barrier” and “serializing instruction”

© 2021 Yih-Chun Hu ECE391

ECE391

• Multiprocessors and locks
• Spin locks

Lecture Topics

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• We solved the critical section problem for uniprocessors

• What about multiprocessors?
– CLI … critical section …. STI

• What is a multiprocessor?
– usually a symmetric multiprocessor (SMP)

– symmetric aspect: all processors have equal latency to all
memory banks

– multicore processors are similar from our perspective

Multiprocessors and Locks

P P P P
bus

I/O
I/O bus

M M

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Some non-uniform memory architecture (NUMA)
machines were built

Multiprocessors and Locks (cont.)

P P P P

bus
I/O

I/O bus

M M

com PPPP

bus
I/O

I/O bus

MM

com

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Multithreaded code is not protected by IF
• Why haven’t we solved the atomicity problem on

multiprocessors?
– interrupts are masked if IF is cleared!
– answer: IF is not cleared on other processors!
– just tell other processors to clear IF, too?

• too slow

• requires an interrupt!

• We need to use shared memory to synchronize…

Multiprocessors and Locks (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Logically, we use a lock
– when we want to access a piece of shared data

we first look at the lock
– if it’s locked, we wait until it’s unlocked

– once it’s unlocked
• we lock it
• access the data
• then unlock it

Multiprocessors and Locks (cont.)

shared
data

(locked state is
often represented

as 1)
shared
data

(unlocked state is
often represented

as 0)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Locking must be atomic with respect to other
processors!

INCL (%EAX)

read EAX
add 1
put it back

Multiprocessors and Locks (cont.)

PP

M read
write

It’s MY BUS!

LOCK: Prefix – execute
following instruction with bus
locked

LOCK INCL (%EAX)
© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• The simplest lock
– spin lock
– lock op

do {

try to change lock variable from 0 to 1
} while (attempt failed);

– other work to do? ignore it!
– spin in a tight loop on the lock (hence the name)

• Once successful, program/interrupt handler owns the lock

Spin Locks

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Only the owner can unlock
– How?

• (change lock variable to 0)
– Need to be atomic?

• (no, only owner can change when locked)
– What about memory op reordering?

• (must be avoided!)

• Why keep asking the last question?
– reordering leads to subtle bugs
– unlikely to happen often
– unlikely to be repeatable
– very hard to find!

Spin Locks (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Static initialization

static spinlock_t a_lock = SPIN_LOCK_UNLOCKED;

• Dynamic initialization

spin_lock_init (&a_lock);

• When is dynamic initialization safe?
– lock must not be in use (race condition!)
– other synchronization method must prevent use

Linux Spin Lock API

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

void spin_lock (spinlock_t* lock);

void spin_unlock (spinlock_t* lock);

Linux Spin Lock API – Basic Functions

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

int spin_is_locked (spinlock_t* lock);

returns 1 if held, 0 if not, but beware of races!

int spin_trylock (spinlock_t* lock);

make one attempt; returns 1 on success, 0 on failure

void spin_unlock_wait (spinlock_t* lock);

wait until available (race condition again!)

Linux Spin Lock API – Testing Functions

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Is spinlock enough to protect a critical section?

Linux Spin Lock API (cont.)

P1 P2

try to get lock
(forced to wait)

interrupt (P2)

code

lock

gets lock

crit. sect.
unlock

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• What about ?

• Still need CLI/STI
• Which is first, CLI or lock?

– CLI first
– interrupt may occur between them, leading to scenario

above

Linux Spin Lock API (cont.)

P1

try to get lock
(waits forever!)

interrupt (P1)

code

lock

called a
deadlock

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

static spinlock_t the_lock = SPIN_LOCK_UNLOCKED;

unsigned long flags;

spin_lock_irqsave (&the_lock, flags);
/* the critical section */
spin_unlock_irqrestore (&the_lock, flags);

asm volatile (”
PUSHFL
POPL %0
CLI

” : “=g” (flags) /* outputs */
: /* inputs */
: “memory” /* clobbers */

);
spin_lock (&the_lock);

Linux’ Lock/CLI Combo

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

spin_unlock (&the_lock);

asm volatile (”

PUSHL %0

POPFL

” : /* outputs */

: “g” (flags) /* inputs */

: “memory”, “cc” /* clobbers */

);

Linux’ Lock/CLI Combo (cont)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Notice that spin_lock_irqsave changes the flags argument (it’s a macro)
• The “memory” argument

– tells compiler that all memory is written by assembly block
– prevents compiler from moving memory ops across assembly

• The “cc” argument
– condition codes change

– can lead to subtle bugs if left out!

• spin_lock and spin_unlock calls
– become NOPs on uniprocessors

– in that case, calls just change IF

• Restore rationale: may have had IF=0 on entry; if so, STI is unsafe

Comments on code

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Synch issues

• Conservative synchronization design
• Semaphores
• Reader/writer synchronization

• Selecting a synchronization mechanism

Lecture Topics

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Synchronization issues
– five hungry philosophers
– five chopsticks

• Protocol
– take left chopstick (or wait)
– take right chopstick (or wait)
– eat
– release right chopstick

– release left chopstick
– digest
– repeat

problems? deadlock!

Another Philosophy Lesson

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• How about the following protocol?
– take left chopstick (or wait)
– if right chopstick is free, take it
– else release left chopstick and start over
– eat
– release right
– release left
– digest
– repeat

• Does this work?

Another Philosophy Lesson (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• What if all philosophers act in lock-step (same speed)?

left left left left left
release release release release release

left left left left left
release release release release release

left left left left left
(ad infinitum)

• Called a livelock

Another Philosophy Lesson (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• To solve the problem, need (partial) lock ordering
– e.g., call chopsticks #1 through #5
– protocol: take lower-numbered, then take higher-numbered
– two philosophers try to get #1 first
– can’t form a loop of waiting philosophers
– thus someone will be able to eat

Another Philosophy Lesson (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Getting synchronization correct can be hard
– it’s the focus of several research communities

• On uniprocessor
– mentally insert handler code

between every pair of adjacent
instructions

– ask whether anything bad can
happen

– if so, prevent with CLI/STI

Conservative Synchronization Design

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• On a multiprocessor
– consider all possible interleavings

of instructions
– amongst all pieces of (possibly

concurrent) code
– ask whether anything bad can

happen
– if so, use a lock to force

serialization
– good luck!

Conservative Synchronization Design (cont)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• What does “bad” mean, anyway?

• A conservative but systematic definition

if any data written by one piece of code

are also read or written by another piece of code

these two pieces must be atomic with respect to each other

Conservative Synchronization Design (cont)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• What variables are shared?
• step 0: ignore the parts that don’t touch shared data
• step 1: calculate read & write sets
• step 2: check for R/W, W/W relationships

– must be atomic!
• step 3: add lock(s) to guarantee atomic execution (pick order if > 1 locks)
• step 4: optimize if desired

Conservative Synchronization Design (cont)
not shared

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Read/write sets
– move reads position, velocity

writes position
– dist reads position
– speed reads velocity
– momentum reads mass, velocity
– stop writes velocity
– change_mass writes mass

• Edges in graph imply need for atomicity

Conservative Synchronization Design –
Example Code Analysis

momentum

dist

speed

move

stop change_mass

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Each lock = circle in graph

• All edges must be contained in some circle

• One lock suffices, but prevents parallelism (performance)

• Could use three (as shown above);
then MUST pick a lock order!

Conservative Synchronization Design –
Example Code Analysis (cont)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Recall our philosophical friends

• Five philosophers
• Three pairs of chopsticks

(one “lock” per pair)
• Problem: how do you get a pair?

• Option 1: walk around the table until you find a pair free
– lots of walking
– other people may cut in front of you

Role of Semaphores

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Option 2: pick a target pair and wait for it to be free
– other pairs may be on the table
– but you’re still waiting hungrily for your chosen pair

• Instead, use a semaphore!
– an atomic counter
– proberen (P for short, meaning test/decrement)
– verhogen (V for short, meaning increment)
– Dutch courtesy of E. Dijkstra

Role of Semaphores

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• When are semaphores useful?
• Fixed number of resources to be allocated dynamically

– physical devices
– virtual devices
– entries in a static array
– logical channels in a network

• Linux semaphores have a critical difference from
Linux spin locks
– can block (sleep) and let other programs run

(spin locks do not block)
– thus must not be used by interrupt handlers
– used for system calls, particularly with long critical sections

Semaphores

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

void sema_init (struct semaphore* sem, int val);

Initialize dynamically to a specified value

void init_MUTEX (struct semaphore* sem);

Initialize dynamically to value one (mutual exclusion)

void down (struct semaphore* sem);

Wait on the semaphore (P)

void up (struct semaphore* sem);

Signal the semaphore (V)

Linux Semaphore API

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• If critical section needs both semaphores and spin locks
– Get all semaphores first!
– Linux expects not to be preempted while holding spin locks
– Semaphore code voluntarily relinquishes processor

Linux Semaphore API (cont.)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Philosophers like to read newspaper
– each philosopher reads frequently, taking short breaks between
– multiple philosophers may read at same time

• different sections
• or just over another’s shoulder

• Paper carrier delivers new paper
– once per day (infrequently)
– must change all sections at once

• Reader/writer synchronization supports this style
– allows many (in theory infinite) readers simultaneously
– at most one writer (and never at same time as readers)

Reader/Writer Problem:
The Philosophers and Their Newspaper

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• What if newspaper is always being read? starvation!

• Linux provides two types of reader/writer synchronization
– reader/writer spin locks (rwlock_t)

• extension of spin locks

• use for short critical sections
• ok to use in interrupt handlers
• admit writer starvation (you must ensure that this not happen)

– reader/writer semaphores (struct rw_semaphore)
• extension of semaphores
• use only in system calls
• do not admit writer starvation

(new readers wait if writer is waiting)

Reader/Writer Problem:
The Philosophers and Their Newspaper (cont)

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• General questions for synchronization of shared resources
– What are the operations?
– What are their frequencies?
– example:

• walk list and find min. age?
• update min. age every time list changes?

– What are the shared data?

• Goals
– short critical sections
– low contention for data

Selecting a Synchronization Mechanism

© Steven Lumetta, Zbigniew Kalbarczyk ECE391

• Safe selection (when in doubt…)

Selecting a Synchronization Mechanism (cont)

mutual exclusion reader/writer

data shared by
interrupt handlers

spin_lock_irqsave
spin_unlock_irqrestore

read/write__lock_irqsave
read/write_unlock_irqrestore

data shared only
by system calls

down
up

down_read/write
up_read/write

© Steven Lumetta, Zbigniew Kalbarczyk ECE391