CS计算机代考程序代写 file system cache arm Hive Storage Systems

Storage Systems

Storage Systems

DSCI 551

Wensheng Wu

1

Storage hierarchy

2

Primary
(Cache, Memory)

Secondary
(Hard disks, SSD)

Tertiary
(Tape, Optical Disk)

S
p

e
e

d

C
o

st, E
n

e
rg

y
C

o
n

su
m

p
tio

n

Access times

LEVEL ACCESS TIME TYPICAL SIZE

Registers “instantaneous” under 1KB

Level 1 Cache 1-3 ns 64KB per core

Level 2 Cache 3-10 ns 256KB per core

Level 3 Cache 10-20 ns 2-20 MB per chip

Main Memory 30-60 ns 4-32 GB per system

Hard Disk 3,000,000-10,000,000 ns over 1TB

3

Resource: https://arstechnica.com/information-
technology/2012/06/inside-the-ssd-revolution-how-
solid-state-disks-really-work/

SSD: 25,000 ns

Time taken before drive is ready to transfer data

https://arstechnica.com/information-technology/2012/06/inside-the-ssd-revolution-how-solid-state-disks-really-work/

CRUD

• Basic functions of a storage device

• CRUD:

– (C)reate/write

– (R)ead

– (U)pdate/overwrite

– (D)elete

4

Characterizing a storage device

• Capacity (bytes)
– How much data it can hold

• Cost ($$$)
– Price per byte of storage

• Bandwidth (bytes/sec)
– Number of bytes that can be transferred per second

– Note that read and write bandwidth may be different

• Latency (seconds)
– Time elapsed, waiting for response/delivery of data

5

Time to complete an operation

• Time to complete an operation depends on
both bandwidth and latency

– CompletionTime = Latency + Size/Bandwidth

• The time for a workload may depend on

– Technology, e.g., hard drive/SSD

– Operation type, e.g., read/write

– Number of operations in the workload

– Access pattern (random vs. sequential)

6

Access pattern

• Sequential

– Data to be accessed are located next to each other
or sequentially on the device

• Random

– Access data located randomly on storage device

7

Magnetic recording

• Write head

– Applies electrical current to write head

– Changes direction of magnetic field under head

8

Reading

• Read head senses direction of magnetic field

9

Road map

• Tapes

• Hard disk

• Solid state drive

10

Linear tape

• Data recorded on parallel tracks that span the
length of the tape

11

Tapes

• Current technology is LTO
– Linear Tape-Open (an open standard)

• Characteristics
– Capacity up to 6.25TB per tape (LTO-7)
– Drive cost ~ $2000
– Tape cost ~ $60 for 6TB tape

• Tape access time (~ minute)
– Time to mount the tape
– Time to wind the tape to correct position

• Data transmission rates ~ 250MB/sec

12

$60
300MB/s
6TB
LTO-7

Performance characteristics

• High latency/low cost makes tape most
appropriate for “archival” storage

– Low frequency of (mostly sequential) reads

– Very large data objects

• Random access will be slow due to latency

– Sequential reads will be fast

13

Linear tape file system

• Two partitions on tape

– First contains metadata and directories. Tape
reader can find and load this very quickly

– Second contains blocks for data

• Directory structure coded in XML

– Self describing file format…

14

15

Tape Cartridge

16

17

12.8TB

6.25TB

A tape library

18

Inside a robotic tape library

• https://www.youtube.com/watch?v=nYfTtvpQ
778

19

Road map

• Tapes

• Hard disk

• Solid state drive

20

Hard disk drives

• Perhaps the most pervasive form of storage

• Basic Idea:

– One or more spinning magnetic platters

• Typically two surfaces per platter

– Disk arm positions over the radial position
(tracks) where data are stored

• It swings across tracks (but do not extend/shrink)

– Data is read/written by a read/write head as
platter spins

21

22

Internal of hard disk

23

Actuator Spindle

Disk head

Platter

https://youtu.be/4iaxOUYalJU

Disk arm and platter

24

Disk arm & head close-up

25

Disk head close-ups

26

27

Disk head movement

• Hard disk head movement while copying files
between two folders (e.g., partition c to d)

– https://www.youtube.com/watch?v=BlB49F6ExkQ

28

2GB Storage in 1980s ($250,000!)

29

Physical characteristics

• 3.5″ (diameter, common in desktops)

• 2.5″ (common in laptops)

• Rotational speed

– 4,800 RPM

– 5,400 RPM

– 7,200 RPM

– 10,000 RPM

• Between 5-7 platters

• Current capacity up to 10TB (Western Digital)
30

Disk organization

• Each platter consists of a number of tracks

• Each track is divided into N fixed size sectors

– Typical sector size is 512 bytes (old) or 4KB (new)

– Sectors can be numbered from 0 to N-1

– Entire sector is written “atomically”

• All or nothing

31

CHS (cylinder-head-sector)

• Early way to address a sector

– Now LBA (Logical Block Addressing) more
common

32

https://en.wikipedia.org/wiki/Cylinder-head-sector

https://en.wikipedia.org/wiki/Logical_block_addressing
https://en.wikipedia.org/wiki/Cylinder-head-sector

CHS (Wikipedia)

33

Example

• # cylinders: 256

• # heads: 16 (i.e., 8 platters, 2 heads/platter)

• # sectors/track: 64

• Sector size = 4KB

34
What is the capacity of the drive?

A simple disk drive (one track only)

35

track

sector

Rotational latency

• Waiting for the right sector to rotate under
the head

– On average: about ½ of time of a full rotation

– Worst case?

– Best case?

36

Rotation time

• Assume 10,000 RPM (rotations per minute)

60000 𝑚𝑠

1000 𝑟𝑜𝑡𝑎𝑡𝑖𝑜𝑛𝑠
=

6 𝑚𝑠

𝑟𝑜𝑡𝑎𝑡𝑖𝑜𝑛

37

Multiple tracks: add seek times

38

Average seek time is about 1/3 max seek time
(see reading: Chapter 37, page 9 for more details)

Transfer time

• Assume that we transfer 512KB

• Assume 128 MB/sec transmission bandwidth

• Transfer time:

512KB/128MB * 1000ms = 4ms

39

Completion time

▪ T = Tseek + Trotation + Ttransfer
▪ Tseek : Time to get the disk head on right track

▪ Trotation :Time to wait for the right sector to rotate
under the head

▪ Ttransfer: Time to actually transfer the data

40

Example

• Capacity 4TB
• # platters: 4
• # heads: 8
• Bytes per sector: 4096

• Transmission bandwidth: 100MB/sec
• Maximum seek time: 12ms

• RPM: 10,000

41

Time to transfer a file

• The file occupies 100 sectors (sequentially)

• Avg. seek time =?

• Avg. rotational latency =?

• Transfer time = ?

42

Sector vs. block

• Block has 1 or more sectors

• Disk typically transfers one block at a time

• We will assume one block = one sector

– Unless stated otherwise

43

Sequential operations

• May assume all sectors involved are on same
track

– We may need to seek to the right track or rotate
to the first sector

• But no rotation/seeking needed afterward

44

Actual bandwidth

• Consider a workload w

– E.g., w = sequential access of 100 blocks of data

– Denote size (# of bytes) of data in w as |w|

– E.g., w = 400KB (100 blocks, 4KB/block)

• Suppose completion time for w = t

• Actual bandwidth (with respect to w) = |w|/t

45

Sequential vs. random

• Consider disk with 7ms avg seek, 10,000 RPM
platter speed and 50 MB/sec transfer rate,
4KB/block

• Sequential access of 10 MB

– Completion time = 7 + 3 + 10/50*1000 = 210ms

– Actual bandwidth = 10MB/210ms = 47.62 MB/s

• Random access of 10 MB (2,500 blocks)

– Completion time = 2500 * (7 + 3 + 4/50) = 25.2s

– Actual bandwidth = 10MB / 25.2s = .397 MB/s
46

Road map

• Tapes

• Hard disk

• Solid state drive

47

Solid State Drive

48

Memory chips

49

Solid State Drives

• All electronic, made from flash memory

• Lower energy consumption than hard drive

• More expensive, less capacity

– 3 times or more expensive

• Limited lifetime, can only write a limited
number of times.

– E.g., 100K program/write cycles for SLC (single-
level cell) memory

50

SSD vs. Hard Drive (price)

51

Solid State Drives

• Same form-factor and control interface as
magnetic disks

• Significantly better latency

– No seek or rotational delay

• Much better performance on random
workload:

– Benefits from improved latency

– However, write has much higher latency (but see
next slide)

52

Speed comparison (YouTube)

53

Due to buffered writes

Read vs write: significant difference in SSD vs marginal in HDD

Writing to SSD is complicated

• Can not overwrite a page

– Need to erase its block (at a certain point) instead

• SSD controllers take care of all these details

54

SSD

• Contains a number of flash memory chips

– Chip -> dies -> planes -> blocks -> pages (rows) ->
cells

– Cells are made of floating-gate transistors

• Page is the smallest unit of data transfer
between SSD and main memory

– Much like a block in hard disk

55

Die Layout

56Understanding Flash: Blocks, Pages and Program / Erases

http://flashdba.com/2014/06/20/understanding-flash-blocks-pages-and-program-erases/http:/flashdba.com/2014/06/20/understanding-flash-blocks-pages-and-program-erases/

Dies, planes, block, and pages

• Typically, a chip may have 1, 2, or 4 dies

• A die may have 1 or 2 planes

• A plane has a number of blocks

– Block is the smallest unit that can be erased

• A block has a number of pages

– Page is the smallest unit that can be read,
programmed/written

57

Typical page and block sizes

• Common page sizes: 2K, 4K, 8K, and 16K

• A block typically has 128 to 256 pages

=> Block size: 256KB to 4MB

58

Normal transistor (MOSFET)

• When current applied to gate terminal

– semi-conducting region (purple) becomes
conductive

59

Floating gate transistor

• Contain an additional gate: floating gate

– Floating, since isolated by oxide layer

– (thus not connected to other components)

60

Floating gate transistor

• By applying high positive/negative voltage to
control gate, electrons can be attracted to or
repelled from floating gate

61

Floating gate transistor

• State = 1, if no electrons in the floating gate

• State = 0, if there are electrons (negative
charges)

– Electrons stuck there even when power is off

– So state is retained

62

Read operations

• Electrons on the floating gate affect the
threshold voltage for the floating gate
transistor to conduct

• Higher voltage needed when gate has
electrons

63

Read operations

64

See here for more details

Floating gate is charged
i.e., containing electrons

Typically, .7v

1.5v

1.8v

Understanding Flash: Floating Gates and Wear

Read operations

• Apply Vint (intermediate voltage)

• If the current is detected, gate has no
electrons

=> bit = 1

• If no current, gate must have electrons

=> bit = 0

65

NAND flash layout

• Transistors are strung together in a series

– Similar to the transistors in an NAND gate

66

Output F = 0 only when both a & b = 1
I.e., F will be grounded only when both a and b conduct

NAND reading

67

Apply Von to all others so that
they all conduct, no mater
they are charged or not.

Bit line

NAND flash

68

Write and erase

• Write: 1 => 0

– Apply high POSITVE voltage (>> voltage for read)
to the control gate

– Attract electrons from channel to floating gate
(through quantum tunneling)

• Erase: 0 => 1

– Need to apply much higher NEGATIVE voltage

– Get rid of electrons from floating gate

– May stress surrounding cells

– So dangerous to do on individual pages
69

Read/write units

• Page is the smallest unit for read and write
(write is also called program, 1->0)

• Block is the smallest unit for erase (0->1)

– i.e., make cells “empty” (i.e., no electrons)

70

Example

71

Latencies: read, write, and erase

72

P/E cycle

• P: program/write; E: erase

• Every write & erase damages oxide layer
surrounding the floating-gate to some extent

• P/E cycle:

– Data are written to cells (P): cell value from 1 -> 0

– Then erased (E): 0 -> 1

73

Multi-level cell (MLC)

74

http://www.cactus-tech.com/resources/blog/details/solid-
state-drive-primer-2-slc-mlc-and-tlc-nand-flash

Note different
levels of electrons

Solid State Drive Primer # 2 – SLC, MLC and TLC NAND Flash

MLC reading

• 2 bits, 3 intermediate voltages

75

0010 0111

V2 V3V1

Current vs voltage

SLC compared with MLC

• SLC:

– Less complex

– Faster

– More reliable

– Less storage

– More costly

76

Read more

• Solid-state revolution: in-depth on how SSDs
really work

• How do SSDs work?

– http://www.extremetech.com/extreme/210492-
extremetech-explains-how-do-ssds-work

77

http://arstechnica.com/information-technology/2012/06/inside-the-ssd-revolution-how-solid-state-disks-really-work/3/
http://www.extremetech.com/extreme/210492-extremetech-explains-how-do-ssds-work
http://www.extremetech.com/extreme/210492-extremetech-explains-how-do-ssds-work

References

• How Flash Memory Works

– https://www.youtube.com/watch?v=msi5GDz9JIw

• Floating Gate Basics

– http://www.cse.scu.edu/~tschwarz/coen180/LN/fl
ash.html

• Friend of Flash

– http://www.nnc3.com/mags/LM10/Magazine/Arc
hive/2008/86/040-041_logfs/article.html

78

https://www.youtube.com/watch?v=msi5GDz9JIw
http://www.cse.scu.edu/~tschwarz/coen180/LN/flash.html
http://www.nnc3.com/mags/LM10/Magazine/Archive/2008/86/040-041_logfs/article.html

References

• Understanding Flash: Floating Gates and Wear

– https://flashdba.com/2015/01/09/understanding-
flash-floating-gates-and-wear/

• From Transistors to Functions

– http://www.cs.bu.edu/~best/courses/modules/Tr
ansistors2Gates/

79

Understanding Flash: Floating Gates and Wear
http://www.cs.bu.edu/~best/courses/modules/Transistors2Gates/

References

• Solid State Drive Primer

– https://www.cactus-
tech.com/resources/blog/details/solid-state-drive-
primer-1-the-basic-nand-flash-cell

• How Does a Transistor Work?

– https://www.youtube.com/watch?v=IcrBqCFLHIY
&feature=youtu.be

80

Solid State Drive Primer # 1 – The Basic NAND Flash Cell

References

• How Flash Memory Works

– https://www.youtube.com/watch?v=s7JLXs5es7I

81

https://www.youtube.com/watch?v=s7JLXs5es7I