I/O Management Intro
Chapter 5 – 5.3
1

Learning Outcomes
• A high-level understanding of the properties of a variety of I/O devices.
• An understanding of methods of interacting with I/O devices.
2

3

I/O Devices
• There exists a large variety of I/O devices:
– Many of them with different properties
– They seem to require different interfaces to manipulate and manage them
• We don’t want a new interface for every device
• Diverse but similar interfaces lead to code duplication
• Challenge:
– Uniform and efficient approach to I/O
4

Device Drivers
• Logical position of device drivers is shown here
• Drivers (originally) compiled into the kernel
– Including OS/161
– Device installers were technicians
– Number and types of devices rarely changed
• Nowadays they are dynamically loaded when needed
– Linux modules
– Typical users (device installers) can’t build kernels
– Number and types vary greatly
• Even while OS is running (e.g. hot-plug USB devices)
5

Device Drivers
• Drivers classified into similar categories
– Block devices and character (stream of data) devices
• OS defines a standard (internal) interface to the different classes of devices
– Example: USB Human Input Device (HID) class specifications
• Human input devices follow a set of rules, making it easier to design a standard interface.
6

USB Device Classes
Base Class   Descriptor Usage   Description
00h          Device             Use class information in the Interface Descriptors
01h          Interface          Audio
02h          Both               Communications and CDC Control
03h          Interface          HID (Human Interface Device)
05h          Interface          Physical
06h          Interface          Image
07h          Interface          Printer
08h          Interface          Mass Storage
09h          Device             Hub
0Ah          Interface          CDC-Data
0Bh          Interface          Smart Card
0Dh          Interface          Content Security
0Eh          Interface          Video
0Fh          Interface          Personal Healthcare
10h          Interface          Audio/Video Devices
DCh          Both               Diagnostic Device
E0h          Interface          Wireless Controller
EFh          Both               Miscellaneous
FEh          Interface          Application Specific
FFh          Both               Vendor Specific
7

I/O Device Handling
• Data rate
– May be differences of several orders of magnitude between the data transfer rates
– Example: Assume 1000 cycles/byte I/O
• Keyboard needs 10 kHz processor to keep up
• Gigabit Ethernet needs 100 GHz processor…
8
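The arithmetic behind the data-rate example can be sketched as below. This is only an illustration: the function name is made up, and the input rates are assumptions (about 10 bytes/s for a keyboard, which is what the 10 kHz figure implies, and 125 MB/s for Gigabit Ethernet, i.e. 1 Gb/s divided by 8).

```c
#include <assert.h>
#include <stdint.h>

/* Cycles the CPU spends per byte of I/O: the figure assumed on the slide. */
#define CYCLES_PER_BYTE 1000ULL

/*
 * Clock rate (in Hz) a processor would need to keep up with a device
 * delivering bytes_per_sec, if every byte costs CYCLES_PER_BYTE cycles.
 * (Hypothetical helper, for illustration only.)
 */
uint64_t required_hz(uint64_t bytes_per_sec)
{
    /* Guard against overflow of the multiplication below. */
    assert(bytes_per_sec <= UINT64_MAX / CYCLES_PER_BYTE);
    return bytes_per_sec * CYCLES_PER_BYTE;
}
```

With these assumptions, required_hz(10) gives 10,000 Hz (10 kHz) and required_hz(125000000) gives 125 GHz, the same order of magnitude as the 100 GHz quoted above.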

Sample Data Rates
USB 3.0          625 MB/s (5 Gb/s)
Thunderbolt      2.5 GB/s (20 Gb/s)
PCIe v3.0 x16    16 GB/s
9

Device Drivers
• Device driver’s job
– Translate requests made through the device-independent standard interface (open, close, read, write) into the appropriate sequence of commands (register manipulations) for the particular hardware
– Initialise the hardware at boot time, and shut it down cleanly at shutdown
10
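One common way to realise a device-independent interface in C is a table of function pointers that each driver fills in. The sketch below assumes this approach; all names (dev_ops, dev, the RAM-disk driver, dev_copy_through) are hypothetical and are not taken from OS/161 or any real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dev;

/* Device-independent interface: one table of operations per driver. */
struct dev_ops {
    int  (*open)(struct dev *d);
    int  (*close)(struct dev *d);
    long (*read)(struct dev *d, void *buf, size_t len);
    long (*write)(struct dev *d, const void *buf, size_t len);
};

struct dev {
    const struct dev_ops *ops;  /* supplied by the driver */
    void *data;                 /* driver-private state */
};

/* Hypothetical RAM-disk "driver": the device-dependent part. */
struct ramdisk { char store[64]; size_t used; int is_open; };

static int rd_open(struct dev *d)
{
    ((struct ramdisk *)d->data)->is_open = 1;
    return 0;
}

static int rd_close(struct dev *d)
{
    ((struct ramdisk *)d->data)->is_open = 0;
    return 0;
}

static long rd_write(struct dev *d, const void *buf, size_t len)
{
    struct ramdisk *rd = d->data;
    if (!rd->is_open) return -1;
    if (len > sizeof rd->store) len = sizeof rd->store;  /* truncate */
    memcpy(rd->store, buf, len);
    rd->used = len;
    return (long)len;
}

static long rd_read(struct dev *d, void *buf, size_t len)
{
    struct ramdisk *rd = d->data;
    if (!rd->is_open) return -1;
    if (len > rd->used) len = rd->used;
    memcpy(buf, rd->store, len);
    return (long)len;
}

const struct dev_ops ramdisk_ops = { rd_open, rd_close, rd_read, rd_write };

/* Device-independent code: works with any driver that fills in dev_ops. */
long dev_copy_through(struct dev *d, const void *in, void *out, size_t len)
{
    long n;
    assert(d != NULL && d->ops != NULL);
    if (d->ops->open(d) != 0) return -1;
    n = d->ops->write(d, in, len);
    if (n >= 0) n = d->ops->read(d, out, (size_t)n);
    d->ops->close(d);
    return n;
}
```

Swapping in a different driver only means pointing dev.ops at another table; dev_copy_through itself does not change.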

Device Driver
• After issuing the command to the device, the device either
– Completes immediately and the driver simply returns to the caller
– Or, device must process the request and the driver usually blocks waiting for an I/O complete interrupt.
• Drivers must be thread-safe, as they can be called by another process while a process is already blocked in the driver.
– Thread-safe: Synchronised…
11

Device-Independent I/O Software
• There is commonality between drivers of similar classes
• Divide I/O software into device-dependent and device-independent I/O software
• Device-independent software includes
– Buffer or Buffer-cache management
– TCP/IP stack
– Managing access to dedicated devices
– Error reporting
12

Driver ⇔ Kernel Interface
• Major issue is uniform interfaces to devices and kernel
– Uniform device interface for kernel code
• Allows different devices to be used the same way
– No need to rewrite file-system to switch between SCSI, IDE or RAM disk
• Allows internal changes to device driver without fear of breaking kernel code
– Uniform kernel interface for device code
• Drivers use a defined interface to kernel services (e.g. kmalloc, install IRQ handler, etc.)
• Allows kernel to evolve without breaking existing drivers
– Together, both uniform interfaces avoid a lot of programming effort spent implementing new interfaces
• Retains compatibility as drivers and kernels change over time.
13

Accessing I/O Controllers
a) Separate I/O and memory space
– I/O controller registers appear as I/O ports
– Accessed with special I/O instructions
b) Memory-mapped I/O
– Controller registers appear as memory
– Use normal load/store instructions to access
c) Hybrid
– x86 has both ports and memory mapped I/O
14
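With memory-mapped I/O, register access in C typically looks like ordinary pointer access through a volatile pointer, so the compiler does not cache or reorder the loads and stores. The sketch below assumes a made-up register layout and uses a plain array (fake_regs) as a stand-in for the controller's register window; on real hardware the base address would come from the platform. Port-based I/O is not shown, since it needs special instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout (word offsets) of an illustrative controller. */
enum { REG_STATUS = 0, REG_SECTOR = 1, REG_COUNT = 2 };

/* 'volatile' forces a real load/store on every access. */
static inline uint32_t mmio_read32(volatile uint32_t *base, unsigned reg)
{
    return base[reg];
}

static inline void mmio_write32(volatile uint32_t *base, unsigned reg,
                                uint32_t val)
{
    base[reg] = val;
}

/* Stand-in for the register window; an assumption made for this sketch. */
static uint32_t fake_regs[REG_COUNT];

/* Write the sector register with a normal store, then read it back. */
uint32_t select_sector(volatile uint32_t *base, uint32_t sector)
{
    assert(base != 0);
    mmio_write32(base, REG_SECTOR, sector);
    return mmio_read32(base, REG_SECTOR);
}
```

The lhd_rdreg and lhd_wreg calls in the OS/161 driver excerpt later in these slides play the same role as mmio_read32 and mmio_write32 here.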

Bus Architectures
(a) A single-bus architecture
(b) A dual-bus memory architecture
15

Intel IXP420
16

Interrupts
• Devices connected to an Interrupt Controller via lines on an I/O bus (e.g. PCI)
• Interrupt Controller signals interrupt to CPU and is eventually acknowledged.
• Exact details are architecture specific.
17

I/O Interaction
18

Programmed I/O
• Also called polling, or busy waiting
• I/O module (controller) performs the action, not the processor
• Sets appropriate bits in the I/O status register
• No interrupts occur
• Processor checks status until operation is complete
– Wastes CPU cycles
19
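A polling loop can be sketched as follows. The controller here is simulated (fake_ctrl, with a status word and a countdown), and the status bit values are invented for the example; the point is only that the processor does nothing useful while it repeatedly reads the status register.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_BUSY 0x1u   /* hypothetical status bits */
#define STATUS_DONE 0x2u

/* Simulated controller: stays busy for 'ticks_left' status reads. */
struct fake_ctrl {
    uint32_t status;
    unsigned ticks_left;
};

/* Read the status register. In this simulation each read also
 * advances the fake device by one step. */
static uint32_t read_status(struct fake_ctrl *c)
{
    if (c->ticks_left > 0 && --c->ticks_left == 0)
        c->status = STATUS_DONE;
    return c->status;
}

/* Issue a command that the fake device takes 'duration' steps to finish. */
void start_operation(struct fake_ctrl *c, unsigned duration)
{
    assert(duration > 0);
    c->status = STATUS_BUSY;
    c->ticks_left = duration;
}

/* Busy-wait until the device reports completion.
 * Returns how many status reads were needed. */
unsigned poll_until_done(struct fake_ctrl *c)
{
    unsigned polls = 0;
    do {
        polls++;
    } while (!(read_status(c) & STATUS_DONE));
    return polls;
}
```

Every iteration of the loop in poll_until_done is processor time that could have gone to another process, which is the cost the last bullet refers to.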

Interrupt-Driven I/O
• Processor is interrupted when I/O module (controller) is ready to exchange data
• Processor is free to do other work
• No needless waiting
• Consumes a lot of processor time because every word read or written passes through the processor
20

Direct Memory Access
• Transfers data directly between Memory and Device
• CPU not needed for copying
[Figure: two configurations. DMA controller in device: CPU, Memory, Device (containing the DMA Controller). Separate DMA controller: CPU, Memory, DMA Controller, Device.]
21

Direct Memory Access
• Transfers a block of data directly to or from memory
• An interrupt is sent when the task is complete
• The processor is only involved at the beginning and end of the transfer
22
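The difference in processor involvement between interrupt-driven I/O and DMA can be illustrated with a small simulation. Everything here is a simplified model with hypothetical names: the "device" is an array, the DMA controller's work is modelled by memcpy, and interrupts are just counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* How often the simulated CPU is interrupted, and how much it copies. */
struct xfer_stats {
    unsigned interrupts;
    unsigned words_copied_by_cpu;
};

/* Interrupt-driven I/O: one interrupt per word, and the handler
 * (i.e. the CPU) moves each word itself. */
void interrupt_driven_read(const uint32_t *dev, uint32_t *mem,
                           size_t nwords, struct xfer_stats *st)
{
    size_t i;
    for (i = 0; i < nwords; i++) {
        st->interrupts++;            /* device signals "word ready" */
        mem[i] = dev[i];             /* CPU copies the word */
        st->words_copied_by_cpu++;
    }
}

/* DMA: the CPU programs the transfer, the controller moves the whole
 * block, and a single interrupt reports completion. */
void dma_read(const uint32_t *dev, uint32_t *mem,
              size_t nwords, struct xfer_stats *st)
{
    assert(nwords > 0);
    /* Models the DMA controller's copy; the CPU is not involved. */
    memcpy(mem, dev, nwords * sizeof *mem);
    st->interrupts++;                /* one completion interrupt */
}
```

For a four-word block, the interrupt-driven path takes four interrupts and four CPU copies, while the DMA path takes one interrupt and no CPU copies.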

DMA Considerations
✓ Reduces number of interrupts
– Fewer (expensive) context switches or kernel entry-exits
✗ Requires contiguous regions (buffers)
– Copying
– Some hardware supports “Scatter-gather”
• Synchronous/Asynchronous
• Shared bus must be arbitrated (hardware)
– CPU cache reduces (but does not eliminate) the CPU’s need for the bus
[Figure: CPU, Memory, Device]
23

The Process to Perform DMA Transfer
24

I/O Management Software
Chapter 5 – 5.3
25

Learning Outcomes
• An understanding of the structure of I/O related software, including interrupt handlers.
• An appreciation of the issues surrounding long running interrupt handlers, blocking, and deferred interrupt handling.
• An understanding of I/O buffering and buffering’s relationship to a producer-consumer problem.
26

Operating System Design Issues
• Efficiency
– Most I/O devices slow compared to main memory (and the CPU)
• Use of multiprogramming allows for some processes to be waiting on I/O while another process executes
• Often I/O still cannot keep up with processor speed
• Swapping may be used to bring in additional Ready processes
– More I/O operations
• Optimise I/O efficiency – especially Disk & Network I/O
27

Operating System Design Issues
• The quest for generality/uniformity:
– Ideally, handle all I/O devices in the same way
• Both in the OS and in user applications
– Problem:
• Diversity of I/O devices
• Especially, different access methods (random access versus stream based) as well as vastly different data rates.
• Generality often compromises efficiency!
– Hide most of the details of device I/O in lower-level routines so that processes and upper levels see devices in general terms such as read, write, open, close.
28

I/O Software Layers
Layers of the I/O Software System
29


Interrupt Handlers
• Can execute at (almost) any time
– Raise (complex) concurrency issues in the kernel
– Can propagate to userspace (signals, upcalls), causing similar issues
• Generally structured so I/O operations block until interrupts notify them of completion
– Example: kern/dev/lamebus/lhd.c
30

Interrupt Handler Example
static int
lhd_io(struct device *d, struct uio *uio)
{
        …
        /* Loop over all the sectors we were asked to do. */
        for (i=0; i<len; i++) {
                /* Wait until nobody else is using the device. */
                P(lh->lh_clear);
                …
                /* Tell it what sector we want… */
                lhd_wreg(lh, LHD_REG_SECT, sector+i);
                /* and start the operation. */
                lhd_wreg(lh, LHD_REG_STAT, statval);
                /* Now wait until the interrupt handler tells us we’re done. */
                P(lh->lh_done);         /* SLEEP */
                /* Get the result value saved by the interrupt handler. */
                result = lh->lh_result;
                …
        }
        …
}

static void
lhd_iodone(struct lhd_softc *lh, int err)
{
        lh->lh_result = err;
        V(lh->lh_done);                 /* wake the thread sleeping in lhd_io */
}

void
lhd_irq(void *vlh)
{
        struct lhd_softc *lh = vlh;
        uint32_t val;

        val = lhd_rdreg(lh, LHD_REG_STAT);
        switch (val & LHD_STATEMASK) {
            case LHD_IDLE:
            case LHD_WORKING:
                break;
            case LHD_OK:
            case LHD_INVSECT:
            case LHD_MEDIA:
                lhd_wreg(lh, LHD_REG_STAT, 0);
                lhd_iodone(lh, lhd_code_to_errno(lh, val));
                break;
        }
}
31









Interrupt Handler Steps
• Save registers not already saved by hardware interrupt mechanism
• (Optionally) set up context for interrupt service procedure
– Typically, handler runs in the context of the currently running process
• No expensive context switch
• Set up stack for interrupt service procedure
– Handler usually runs on the kernel stack of current process
– Or “nests” if already in kernel mode running on kernel stack
• Ack/Mask interrupt controller, re-enable other interrupts
– Implies potential for interrupt nesting.
32


Interrupt Handler Steps
• Run interrupt service procedure
– Acknowledges interrupt at device level
– Figures out what caused the interrupt
• Received a network packet, disk read finished, UART transmit queue empty
– If needed, it signals blocked device driver
• In some cases, will have woken up a higher priority blocked thread
– Choose newly woken thread to schedule next.
– Set up MMU context for process to run next
– What if we are nested?
• Load new/original process’ registers
• Re-enable interrupt; start running the new process
33

Sleeping in Interrupts
• An interrupt generally has no context (runs on current kernel stack)
– Unfair to sleep on interrupted process (deadlock possible)
– Where to get context for long running operation?
– What goes into the ready queue?
• What to do?
– Top and Bottom Half
– Linux implements with tasklets and workqueues
– Generically, in-kernel thread(s) handle long running kernel operations.
34

Top Half/Bottom Half
[Figure: software layers, from top: Higher Software Layers, Bottom Half, Top Half (Interrupt Handler)]
• Top half
– Interrupt handler – remains short
• Bottom half
– Is preemptable by top half (interrupts)
– Performs deferred work (e.g. IP stack processing)
– Is checked prior to every kernel exit
– Signals blocked processes/threads to continue
• Enables low interrupt latency
• Bottom half can’t block
35

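Stack Usage
1. Higher-level software
2. Interrupt processing (interrupts disabled)
3. Deferred processing (interrupts re-enabled)
4. Interrupt while in bottom half
[Figure: kernel stack contents at each step, bottom to top, where H = higher-level software, T = top half, B = bottom half: (1) H; (2) H, T; (3) H, B; (4) H, B, T]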
36
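The top-half/bottom-half split can be sketched in a few lines. This is a single-threaded model with hypothetical names, not how any particular kernel implements it: the top half only records that work is pending, and the deferred work is run on the way out of the kernel. In a real kernel the top half can interrupt the bottom half, so the shared queue would need protection (for example, briefly disabling interrupts), which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8

/* Work recorded by the top half for later processing. */
static int pending[MAX_PENDING];
static unsigned npending;
static int processed_sum;   /* stand-in for the result of deferred work */

/* Top half: runs with interrupts disabled, so it stays short and
 * only queues the work. Returns false if the queue is full. */
bool top_half(int packet_id)
{
    if (npending == MAX_PENDING)
        return false;
    pending[npending++] = packet_id;
    return true;
}

/* Bottom half: performs the deferred work with interrupts enabled.
 * Returns the number of items handled. */
unsigned bottom_half(void)
{
    unsigned handled = 0;
    unsigned i;
    for (i = 0; i < npending; i++) {
        processed_sum += pending[i];   /* e.g. IP stack processing */
        handled++;
    }
    npending = 0;
    return handled;
}

/* Kernel exit path: check for deferred work before returning. */
unsigned kernel_exit(void)
{
    assert(npending <= MAX_PENDING);
    return npending ? bottom_half() : 0;
}
```

Two calls to top_half followed by kernel_exit handle both items in one pass; a second kernel_exit finds nothing left to do.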

Deferring Work on In-kernel Threads
• Interrupt handler defers work onto in-kernel thread
• In-kernel thread handles deferred work (DW)
– Scheduled normally
– Can block
• Both low interrupt latency and blocking operations
[Figure: normal process/thread stack holding the interrupt handler (IH); in-kernel thread stack holding the deferred work (DW)]
37

Buffering
38

Device-Independent I/O Software
(a) Unbuffered input
(b) Buffering in user space
(c) Single buffering in the kernel followed by copying to user space
(d) Double buffering in the kernel
39

No Buffering
• Process must read/write a device a byte/word at a time
– Each individual system call adds significant overhead
– Process must wait until each I/O is complete
• Blocking/interrupt/waking adds to overhead.
• Many short runs of a process are inefficient (poor CPU cache temporal locality)
40

User-level Buffering
• Process specifies a memory buffer that incoming data is placed in until it fills
– Filling can be done by interrupt service routine
– Only a single system call, and block/wakeup per data buffer
• Much more efficient
41

User-level Buffering
• Issues
– What happens if the buffer is paged out to disk?
• Could lose data while the unavailable buffer is paged in
• Could lock the buffer in memory (needed for DMA); however, many processes doing I/O reduce the RAM available for paging. Can cause deadlock, as RAM is a limited resource
– Consider the write case
• When is the buffer available for re-use?
– Either the process must block until a potentially slow device drains the buffer
– or deal with asynchronous signals indicating the buffer has drained
42

Single Buffer
• Operating system assigns a buffer in kernel’s memory for an I/O request
• In a stream-oriented scenario
– Used a line at a time
– User input from a terminal is one line at a time with carriage return signaling the end of the line
– Output to the terminal is one line at a time
43

Single Buffer
• Block-oriented
– Input transfers made to buffer
– Block copied to user space when needed
– Another block is written into the buffer
• Read ahead
44

Single Buffer
– User process can process one block of data while next block is read in
– Swapping can occur since input is taking place in system memory, not user memory
– Operating system keeps track of assignment of system buffers to user processes
45

Single Buffer Speed Up
• Assume
– T is transfer time for a block from device
– C is computation time to process incoming block
– M is time to copy kernel buffer to user buffer
• Computation and transfer can be done in parallel
• Speed up with buffering
$$\frac{\text{No Buffering Cost}}{\text{Single Buffering Cost}} = \frac{T + C}{\max(T, C) + M}$$
46

Single Buffer
• What happens if the kernel buffer is full, because
– the user buffer is swapped out, or
– the application is slow to process the previous buffer,
and more data is received?
=> We start to lose characters or drop network packets
47

Double Buffer
• Use two system buffers instead of one
• A process can transfer data to or from one buffer while the operating system empties or fills the other buffer
48

Double Buffer Speed Up
• Computation and Memory copy can be done in parallel with transfer
• Speed up with double buffering
$$\frac{\text{No Buffering Cost}}{\text{Double Buffering Cost}} = \frac{T + C}{\max(T, C + M)}$$
• Usually M is much less than T, giving a favourable result
49
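The cost formulas for no buffering, single buffering and double buffering are easy to evaluate for concrete numbers. The helper functions below are a minimal sketch; the function names are invented, and T, C and M are in whatever time unit you choose.

```c
#include <assert.h>

static double max2(double a, double b) { return a > b ? a : b; }

/* T: device transfer time per block, C: computation time per block,
 * M: time to copy the kernel buffer to the user buffer. */
double no_buffering_cost(double T, double C)
{
    return T + C;
}

double single_buffering_cost(double T, double C, double M)
{
    return max2(T, C) + M;
}

double double_buffering_cost(double T, double C, double M)
{
    return max2(T, C + M);
}

/* Speed-up of a buffered scheme relative to no buffering. */
double speedup(double unbuffered, double buffered)
{
    assert(buffered > 0.0);
    return unbuffered / buffered;
}
```

For example, with T = 10, C = 5 and M = 1 the costs are 15 (none), 11 (single) and 10 (double), so double buffering gives a speed-up of 1.5. With T = 1, C = 1 and M = 5, single buffering costs 6 against 2 unbuffered, so the speed-up drops below 1 and the copy dominates.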

Double Buffer
• May be insufficient for really bursty traffic
– Lots of application writes between long periods of computation
– Long periods of application computation while receiving data
– Might want to read-ahead more than a single block for disk
50

Circular Buffer
• More than two buffers are used
• Each individual buffer is one unit in a circular buffer
• Used when I/O operation must keep up with process
51

Important Note
• Notice that buffering, double buffering, and circular buffering are all Bounded-Buffer Producer-Consumer Problems
52
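A circular buffer viewed as a bounded buffer can be sketched as below. The names and the slot count are illustrative, and each slot holds a single int rather than a whole I/O buffer. The sketch is single-threaded: a real producer (device or interrupt handler) and consumer (process) would need synchronisation, for example semaphores, around these operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4   /* illustrative size: more than two buffers */

struct ring {
    int slot[RING_SLOTS];
    size_t head;    /* next slot to consume */
    size_t tail;    /* next slot to fill */
    size_t count;   /* slots currently in use */
};

/* Producer side. Fails when the ring is full: this is the point at
 * which incoming data would be lost. */
bool ring_put(struct ring *r, int v)
{
    assert(r->count <= RING_SLOTS);
    if (r->count == RING_SLOTS)
        return false;
    r->slot[r->tail] = v;
    r->tail = (r->tail + 1) % RING_SLOTS;   /* wrap around */
    r->count++;
    return true;
}

/* Consumer side. Fails when the ring is empty. */
bool ring_get(struct ring *r, int *out)
{
    if (r->count == 0)
        return false;
    *out = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;   /* wrap around */
    r->count--;
    return true;
}
```

Items come out in the order they went in, and a slot freed by ring_get is reused by the next ring_put once the indices wrap.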

Is Buffering Always Good?
Single: $\dfrac{T + C}{\max(T, C) + M}$
Double: $\dfrac{T + C}{\max(T, C + M)}$
• Can M be similar to or greater than C or T?
53

Buffering in Fast Networks
• Networking may involve many copies
• Copying reduces performance
– Especially if copy costs are similar to or greater than computation or transfer costs
• Super-fast networks put significant effort into achieving zero-copy
• Buffering also increases latency
54

I/O Software Summary
Layers of the I/O system and the main functions of each layer
55