CS计算机代考程序代写 file system cache arm algorithm Announcements

Announcements

Persistence:
I/O devices
Reading: Chapters 35, 36 and 37
Questions answered in this lecture:
How does the OS interact with I/O devices (check status, send data+control)?
What is a device driver?
What are the components of a hard disk drive?
How can you calculate sequential and random throughput of a disk?
What algorithms are used to schedule I/O requests?
CSE 2431
Systems 2
Based on slides by Andrea C. Arpaci-Dusseau
Remzi H. Arpaci-Dusseau

Motivation
What good is a computer without any I/O devices?
– keyboard, display, disks

We want:
– H/W that will let us plug in different devices
– OS that can interact with different combinations

CPU
RAM

Graphics

Memory Bus
General I/O Bus
(e.g., PCI)
Peripheral I/O Bus
(e.g., SCSI, SATA, USB)
Why use hierarchical buses?
Hardware support for I/O

Canonical Device

Status
COMMAND
DATA

OS reads/writes to these
Device Registers:

Canonical Device

Status
COMMAND
DATA
Device Registers:

OS reads/writes to these
Hidden Internals:

???

Canonical Device

Status
COMMAND
DATA
Device Registers:

OS reads/writes to these
Hidden Internals:

Microcontroller (CPU+RAM)
Extra RAM
Other special-purpose chips

Example Write Protocol
while (STATUS == BUSY)
; // spin/poll
Write data to DATA register
Write command to COMMAND register
while (STATUS == BUSY)
; // spin/poll

Status
COMMAND
DATA

Microcontroller (CPU+RAM)
Extra RAM
Other special-purpose chips

while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;
CPU:
Disk:

A
CPU:
Disk:
C
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

A
CPU:
Disk:
C

A wants to do I/O
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1
A
CPU:
Disk:
C
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1

2
A
CPU:
Disk:
A
C
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1

2
A
CPU:
Disk:
A
C

3
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1

2

4

3
A
CPU:
Disk:
C
A
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1

2

4

3
A
CPU:
Disk:
C
A
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;

1

2

4

3
A
B
CPU:
Disk:
C
A
while (STATUS == BUSY) // 1
;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
;
how to avoid spinning/polling?
interrupts!

1

2

4

3
A
B
CPU:
Disk:
C
A
how to avoid spinning?
interrupts!
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
wait for interrupt; // 4

2

3,4
A
B
CPU:
Disk:
C
A
how to avoid spinning?
interrupts!
B
B
A
A

1
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
wait for interrupt; // 4

Interrupts vs. Polling
Are interrupts ever worse than polling?
Fast device: Better to spin than take interrupt overhead
Device time unknown? Hybrid approach (spin then use interrupts)
Flood of interrupts arrive
Can lead to livelock (always handling interrupts)
Better to ignore interrupts while make some progress handling them
Other improvement
Interrupt coalescing (batch together several interrupts)

Protocol Variants
Status checks: polling vs. interrupts

Data: PIO vs. DMA

Control: special instructions vs. memory-mapped I/O

Status
COMMAND
DATA

Microcontroller (CPU+RAM)
Extra RAM
Other special-purpose chips

2

3,4
A
B
CPU:
Disk:
C
A
what else can we optimize?
B
B
A
A

1
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
wait for interrupt; // 4

data transfer!

Programmed I/O vs.
Direct Memory Access
PIO (Programmed I/O):
CPU directly tells device what the data is (puts data in device’s data register)

DMA (Direct Memory Access):
CPU leaves data in memory
Device reads data directly from memory

2

3,4
A
B
CPU:
Disk:
C
A
B
B
A
A

1
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
wait for interrupt; // 4

2

3,4
A
B
CPU:
Disk:
C
A
B
B
A
A

1
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
wait for interrupt; // 4

A
CPU:
Disk:
C
A
B
B
A

1

3,4
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write address of data in register
Write command to COMMAND register // 3
wait for interrupt; // 4

Protocol Variants
Status checks: polling vs. interrupts

Data: PIO vs. DMA

Control: special instructions vs. memory-mapped I/O

Status
COMMAND
DATA

Microcontroller (CPU+RAM)
Extra RAM
Other special-purpose chips

A
CPU:
Disk:
C
A
B
B
A

1

3,4
how does OS read and write registers?
while (STATUS == BUSY) // 1
wait for interrupt;
Write data to DATA register // 2
Write command to COMMAND register // 3
while (STATUS == BUSY) // 4
wait for interrupt;

Special Instructions vs.
Mem-Mapped I/O
Special instructions
each device has a port
in/out instructions (x86) communicate with device

Memory-Mapped I/O
H/W maps registers into address space
loads/stores sent to device’s address space (which functions as “registers”)

Doesn’t matter much (both are used)

Protocol Variants
Status checks: polling vs. interrupts

Data: PIO vs. DMA

Control: special instructions vs. memory-mapped I/O

Status
COMMAND
DATA

Microcontroller (CPU+RAM)
Extra RAM
Other special-purpose chips

Variety is a Challenge
Problem:
many, many devices
each has its own protocol
How can we avoid writing a slightly different OS for each H/W combination?
Write device driver for each device! To use different combinations of devices, only the device drivers have to change. Only device driver code has to know about characteristics of device; rest of OS interacts with device driver through common interface presented by device driver.
Drivers are 70% of Linux source code

Storage Stack

application

file system

scheduler

driver
hard drive

build common interface
on top of all HDDs

Hard Disks

Basic Interface
Disk has a sector-addressable address space
Appears as an array of sectors

Sectors are typically 512 bytes or 4096 bytes.

Main operations: reads + writes to sectors

Mechanical (slow) nature makes management “interesting”

Platter
Disk Internals

Platter is covered with a magnetic film.

Spindle

Surface

Surface

Many platters may be bound to the spindle.

Each surface is divided into rings called tracks.
A stack of tracks (across platters) is called a cylinder.

The tracks are divided into numbered sectors.
1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Heads on a moving arm can read from each surface.
1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

spin

Spindle/platters rapidly spin.

Disk Terminology

spindle
platter
surface
track
cylinder
sector
read/write head

43

Hard Drive Demo

Let’s Read 12!

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Positioning
Drive servo system keeps head on track
How does the disk head know where it is?
Platters not perfectly aligned, tracks not perfectly concentric (runout) — difficult to stay on track
More difficult as density of disk increase
More bits per inch (BPI), more tracks per inch (TPI)
Use servo burst:
Record placement information every few (3-5) sectors
When head crosses servo burst, figure out location and adjust as needed

46

Let’s Read 12!

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Seek to right track.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Seek to right track.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Seek to right track.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Wait for rotation.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Transfer data.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Transfer data.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Transfer data.

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Yay!

1
2
3
0
6
5
4
7
8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Time to Read/write
Three components:
Time = seek + rotation + transfer time

Seek, Rotate, Transfer
Seek cost: Function of cylinder distance from current head position
Not purely linear cost
Must accelerate, coast, decelerate, settle
Settling alone can take 0.5 – 2 ms
Entire seeks often takes several milliseconds
4 – 10 ms
Approximate average seek distance = 1/3 max seek distance

Seek, Rotate, Transfer
Depends on rotations per minute (RPM)
7200 RPM is common, 15000 RPM is high end.
With 7200 RPM, how long to rotate around?
1 / 7200 RPM =
1 minute / 7200 rotations =
1 second / 120 rotations =
8.3 ms / rotation

Average rotation time cost?
8.3 ms / 2 = 4.15 ms

Seek, Rotate, Transfer
Pretty fast — depends on RPM and sector density.

100+ MB/s is typical for maximum transfer rate

How long to transfer 512-bytes?

512 bytes * (1s / 100 MB) = 5 us

Workload Performance
So…
– seeks are slow
– rotations are slow
– transfers are fast

What kind of workload is fastest for disks?
Sequential: access sectors in order (transfer dominated)
Random: access sectors arbitrarily (seek+rotation dominated)

Other Improvements
Track Skew

Zones

Cache

8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Imagine sequential reading,
how should sectors numbers be laid out on disk?

8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

When reading 16 after 15, the head won’t settle
quick enough, so we need to do a rotation.

8
9
10
11
15
14
13
12
23
23
16
17
21
20
19
18

8
9
10
11
15
14
13
12
23
23
16
17
21
20
19
18
enough time to settle now

Other Improvements
Track Skew

Zones

Cache

ZBR (Zoned bit recording): More sectors on outer tracks

Other Improvements
Track Skew

Zones

Cache

Drive Cache
Drives may cache both reads and writes.
OS caches data too

What advantage does caching in drive have for reads?

What advantage does caching in drive have for writes?

8
9
10
11
15
14
13
12
16
17
18
19
23
22
21
20

Buffering
Disks contain internal memory (2MB-16MB) used as cache
Read-ahead: “Track buffer”
Read contents of entire track into memory during rotational delay
Write caching with volatile memory
Immediate reporting: Claim written to disk when not
Data could be lost on power failure
Tagged command queueing
Have multiple outstanding requests to the disk (do not do writes immediately)
Disk can reorder (schedule) requests for better performance

78

I/O Schedulers

I/O Schedulers
Given a stream of I/O requests, in what order should they be served?

Much different than CPU scheduling

Position of disk head relative to request position matters more than length of job

FCFS
(First-Come-First-Serve)
Assume seek+rotate = 10 ms for random request

How long (roughly) does the below workload take?
Requests are given in sector numbers

300001, 700001, 300002, 700002, 300003, 700003
~60ms

FCFS
(First-Come-First-Serve)
Assume seek+rotate = 10 ms for random request

How long (roughly) does the below workload take?
Requests are given in sector numbers

300001, 700001, 300002, 700002, 300003, 700003 ~60ms

300001, 300002, 300003, 700001, 700002, 700003

~20ms

Schedulers

OS
Disk
Scheduler
Scheduler
Where should the
scheduler go?

SPTF (Shortest Positioning Time First)
Strategy: always choose request that requires least positioning time (time for seeking and rotating)
Greedy algorithm (just looks for best NEXT decision)
How to implement in disk?

How to implement in OS?
Use Shortest Seek Time First (SSTF) instead;
We will only consider SSTF; this is equivalent to finding the request in the disk queue at the closest cylinder to the current head position.
Disadvantages?
Easy for far away requests to starve

SCAN
Elevator Algorithm:
Sweep back and forth, from one end of disk other, serving requests as pass that cylinder
Sorts by cylinder number; ignores rotation delays

Pros/Cons?

Better: C-SCAN (circular scan)
Only sweep in one direction

LOOK
Another algorithm works like SCAN, BUT it “peeks” or looks, at the disk I/O queue after each transfer.
It determines if there are any other transfers at cylinders between the current cylinder and the last cylinder in the direction the head is moving.
If not, it reverses direction, and starts servicing requests moving in the opposite direction.
This eliminates “wasted” head movement after the last request moving a particular direction.
What is the cost, though?
There is also a circular version of this algorithm: C-LOOK.

I/O Device Summary
Overlap I/O and CPU whenever possible!
– use interrupts, DMA

Storage devices provide common block interface

On a disk: Never do random I/O unless you must!
– e.g., Quicksort is a terrible algorithm on disk

Spend time to schedule on slow, stateful devices