程序代写代做代考 database arm assembly file system SQL concurrency cache Hive PowerPoint Presentation

PowerPoint Presentation

Storing Data:
Disks and Buffers

1

Big picture:
Architecture of a DBMS

Architecture of a DBMS: SQL Client
Last few lectures: SQL
Next:
How is a SQL query executed?
Database Management
System
Database
SQL Client

3

DBMS: Parsing & Optimization
Purpose:
Parse, check, and verify the SQL
SELECT S.sid, S.sname, R.bid
FROM Sailors R, Reserves R
WHERE S.sid = R.sid and S.age > 30
GROUP BY age
And translate into an efficient
relational query plan
Database Management
System
Database
Query Parsing & Optimization
SQL Client

DBMS: Relational Operators
Purpose: Execute a dataflow by
operating on records and files
Heap Scan Reserves
Indexed Scan Sailors
Indexed Join

GroupBy (Age)
Database Management
System
Database
Query Parsing & Optimization
Relational Operators
SQL Client

DBMS: Files and Index Management
Purpose: Organize tables and
Records as groups of pages in
a logical file
SSN Last Name First Name Age Salary
123 Adams Elmo 31 $400
443 Grouch Oscar 32 $300
244 Oz Bert 55 $140
134 Sanders Ernie 55 $400

Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
SQL Client

6

DBMS: Buffer Management
RAM
Page 1
Page 2
Page 3
Disk Space Management

Purpose:
Provide the illusion of operating
in memory
Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
SQL Client

7

DBMS: Disk Space Management
Purpose: Translate page requests into physical bytes on one or more device(s)

Block 1
Block 2
Block 3

Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client

Architecture of a DBMS
Organized in layers
Each layer abstracts the layer below
Manage complexity
Performance assumptions
Example of good systems design
Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client

DBMS: Concurrency & Recovery
Two cross-cutting issues related to storage and memory management:
Concurrency Control
Recovery

Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client

Context
Completed

You are Here

Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client

Context, cont.

Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client
Completed
We’ll visit here
You are Here

Before We Begin:
Storage Media

Disks
Most database systems were originally designed for magnetic disks
Disk are a mechanical anachronism!
Instilled design ideas that apply to using solid state disks as well
Major implications!
No “pointer derefs”. Instead, an API:
READ: transfer “page” of data from disk to RAM.
WRITE: transfer “page” of data from RAM to disk.
Both API calls are very, very slow!
Plan carefully!
An explicit API can be a good thing
Minimizes the kind of pointer errors you see in C

Economics
$1000 at NewEgg 2018:
Mag Disk: ~40TB for $1000
SSD: ~2.3TB for $1000
RAM: 80GB for $1000

15

TB/$1000

Series 1 RAM SSD Hard Drive 0.08 2.2999999999999998 40 Column1 RAM SSD Hard Drive Column2 RAM SSD Hard Drive

Storage Hierarchy
Small, Fast
Big, Slow
For currently used Data
Varies by deployment
Database and backup/logs Secondary & tertiary storage
Registers
L1 Cache
L2 Cache
RAM
SSD
Disk

Hierarchy – Storage Latencies
.5 ns – L1 cache reference
7.0 ns – L2 cache reference
100.0 ns – main memory reference
1,000,000.0 ns – to read 1MB sequentially
20,000,000.0 ns to read 1MB sequentially
Registers
L1 Cache
L2 Cache
RAM
SSD
Disk

How Far Away is the Data?
Note: From Turing Award Winner Jim Gray

18

Components of a Disk, Pt. 1
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”

Platters

Spindle

Disk head
Arm assembly

Tracks

Disk Platters

19

Components of a Disk, Pt. 2
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”

Platters

Spindle

Disk head
Arm assembly

Tracks

Components of a Disk, Pt. 3
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”

Platters

Spindle

Disk head
Arm assembly

Tracks

Arm movement

Components of a Disk, Pt. 4
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”

Platters

Spindle

Disk head
Arm assembly

Tracks

Components of a Disk, Pt. 5
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”
Only one head reads/writes at any one time

Platters

Spindle

Disk head
Arm assembly

Tracks

Components of a Disk, Pt. 6
Platters spin (say 15000 rpm)
Arm assembly moved in or out to position a head on a desired track
Tracks under heads make a “cylinder”
Only one head reads/writes at any one time
Block/page size is a multiple of (fixed) sector size

Platters

Spindle

Disk head
Arm assembly

Tracks

Arm movement

Sector

Accessing a Disk page
Time to access (read/write) a disk block:
seek time (moving arms to position disk head on track)
~2-3 ms on average
rotational delay (waiting for block to rotate under head)
~0-4 ms (15000 RPM)
transfer time (actually moving data to/from disk surface)
~0.25 ms per 64KB page
Key to lower I/O cost: reduce seek/rotational delays

Content Break

Notes on Flash (SSD)
Issues in current generation (NAND)
Fine-grain reads (4-8K reads), coarse-grain writes (1-2 MB writes)
Only 2k-3k erasures before failure,
so keep moving hot write units around (“wear leveling”)
Write amplification: big units, need to reorg for wear & garbage collection

Notes on Flash (SSD), Pt. 2
So… read is fast and predictable
Single read access time: 0.03 ms
4KB random reads: ~500MB/sec
Sequential reads: ~525MB/sec
64K: 0.48 ms

Notes on Flash (SSD), cont
But… write is not! Slower for random
Single write access time: 0.03 ms
4KB random writes: ~120 MB/sec
Sequential writes: ~480 MB/sec

Is Flash Faster than Disk?
Why of course it is…it’s called “flash”!
Can be 1-10x the bandwidth (bytes/sec) of ideal HDD #s
Note: Ideal HDD #s hard to achieve.
Expect 10-100x bandwidth for non-sequential read.

Image

Is Flash Faster Than Disk Pt 2.
Locality” matters for both
Reading/writing to “far away” blocks on disk requires slow seek/rotation delay
Writing 2 “far away” blocks on SSD can require writing multiple much larger units
High-end flash drives are getting much better at this
And don’t forget:
Disk offers about 10x the capacity per $

Storage Pragmatics & Trends
Many significant DBs are not big.
Size in GB

Series 1 Daily weather (1929-2009) Us Census 2009 English Wikipedia 20 200 14 Column1 Daily weather (1929-2009) Us Census 2009 English Wikipedia Column2 Daily weather (1929-2009) Us Census 2009 English Wikipedia

Storage Trends Pt. 2
But data sizes grow faster than Moore’s Law
“Big Data” is real
Boeing 787 generates ½ TB of data per flight
Walmart handles 1M transactions/hour,
maintains 2.5 PetaByte data warehouse
So…what is the role of disk, flash, RAM
The subject of some debate!

Airplane Cash Register

Bottom Line (last few years)
Very large DBs: relatively traditional
Disk still the best cost/MB by a lot
SSDs improve performance and performance variance
Smaller DB story is changing quickly
Entry cost for disk is not cheap, so flash wins at the low end
Many interesting databases fit in RAM

Bottom Line Pt. 2
Change brewing on the HW storage tech side
Mixed answers on the SW/usage side
Big Data: Can generate and archive data cheaply and easily
Small Data: Many rich data sets have (small) fixed size
People will continue to worry about magnetic disk for some time yet, typically at large scale

Content Break 2

Disk Space Management

Disks and Files
Recall, most DBMSs stores information on Disks and SSDs.
Disk are a mechanical anachronism (slow!)
SSDs faster, slow relative to memory, costly writes

Disk Platters

Block Level Storage
Read and Write large chunks of sequential bytes
Sequentially: “Next” disk block is fastest
Maximize usage of data per Read/Write
“Amortize” seek delays (HDDs) and writes (SSDs):
if you’re going all the way to Pluto, pack the spaceship full!
Predict future behavior
Cache popular blocks
Pre-fetch likely-to-be-accessed blocks
Buffer writes to sequential blocks
More on these as we go

A Note on Terminology
Block = Unit of transfer for disk read/write
64KB – 128KB is a good number today
Book says 4KB
Page: a common synonym for “block”
In some texts, “page” = a block-sized chunk of RAM

We’ll treat “block” and “page” as synonyms

Arranging Blocks on Disk
‘Next’ block concept:
sequential blocks on same track, followed by
blocks on same cylinder, followed by
blocks on adjacent cylinder
Arrange file pages sequentially by ‘next’ on disk
minimize seek and rotational delay.
For a sequential scan, pre-fetch
several blocks at a time!
Read large consecutive blocks

Spindle

Disk head

Arm movement
Arm assembly

Tracks

Disk Space Management, cont
Lowest layer of DBMS, manages space on disk
Purpose:
Map pages to locations on disk
Load pages from disk to memory
Save pages back to disk & ensuring writes
Higher levels call upon this layer to:
Read/write a page
Allocate/de-allocate logical pages
Database Management
System
Database
Query Parsing & Optimization
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
SQL Client

Disk Space Management: Requesting Pages
Request for a sequence of pages best satisfied by pages stored sequentially on disk
Physical details hidden from higher levels of system
Higher levels may “safely” assume Next Page is fast, so they will simply expect sequential runs of pages to be quick to scan.

Disk Space Management: Implementation
Proposal 1: Talk to the storage device directly
Could be very fast if you knew the device well
What happens when devices change?

Disk Space Management: Implementation 2
Proposal 2: Run over filesystem (FS)
Allocate single large “contiguous” file on a nice empty disk,
and assume sequential/nearby byte access are fast
Most FS optimize disk layout for sequential access
Gives us more or less what we want if we start with an empty disk
DBMS “file” may span multiple FS files on multiple disks/machines

Using Local Filesystem
Disk Space Management
Get Page 4
Get Page 5

File System
File System
File System
Page 1
Page 2
Page 3
Big File 1

Page 4
Page 5
Page 6
Big File 2

Page 7
Page 8
Page 9
Big File 3

Page 4
Page 5

Summary: Disk Space Management
Provide API to read and write pages to device
Pages: block level organization of bytes on disk
Provides “next” locality and abstracts FS/device details

Disk Space Management
Page 1
Page 2
Page 3
Page 4
Page 5
Page 6

Content Break 3

Disks and Files: Summary
Magnetic (hard) disks and SSDs
Basic HDD mechanics
SSD write amplification
Concept of “near” pages and how it relates to cost of access
Relative cost of
Random vs. sequential disk access (10x)
Disk (pluto) vs RAM (sacramento) vs. registers (your head)
Big, big differences!

Files: Summary Pt 2
DB File storage
Typically over FS file(s)
Disk space manager loads and stores pages
Block level reasoning
Abstracts device and file system; provides fast “next”

/docProps/thumbnail.jpeg