程序代写代做代考 database algorithm scheme chain SQL concurrency PowerPoint Presentation

PowerPoint Presentation

Recovery
R&G – Chapter 20

Review: The ACID properties
Atomicity: All actions in the Xact happen, or none happen.
Consistency: If the DB starts consistent before the Xact…
it ends up consistent after.
Isolation: Execution of one Xact is isolated from that of other Xacts.
Durability: If a Xact commits, its effects persist.
Recovery Manager
Atomicity & Durability
Also to rollback transactions that violate Consistency

Motivation
Atomicity:
Transactions may abort (“Rollback”).
Durability:
What if DBMS stops running?
Desired state after system restarts:
T1 & T3 should be durable.
T2, T4 & T5 should be aborted (effects not seen).

Questions:
Why do transactions abort?
Why do DBMSs stop running?

crash!
T1
T2
T3
T4
T5

Abort
Commit
Commit

Atomicity: Why Do Transactions Abort?
User/Application explicitly aborts
Failed Consistency check
Integrity constraint violated
Deadlock
System failure prior to successful commit

Transactions and SQL
You don’t need SQL to want transactions and vice versa
But they often go together
SQL Basics
BEGIN
COMMIT
ROLLBACK

SQL Savepoints
Savepoints
SAVEPOINT
RELEASE SAVEPOINT
Makes it as if the savepoint never existed
ROLLBACK TO SAVEPOINT
Statements since the savepoint are rolled back
BEGIN;
INSERT INTO table1 VALUES (‘yes1’);
SAVEPOINT sp1;
INSERT INTO table1 VALUES (‘yes2’);
RELEASE SAVEPOINT sp1;
SAVEPOINT sp2;
INSERT INTO table1 VALUES (‘no’);
ROLLBACK TO SAVEPOINT sp2;
INSERT INTO table1 VALUES (‘yes3’);
COMMIT;

6

Example of SQL Integrity Constraints
Constraint violation rolls back transaction

7

Durability: Why Do Databases Crash?
Operator Error
Trip over the power cord
Type the wrong command
Configuration Error
Insufficient resources: disk space
File permissions, etc.
Software Failure
DBMS bugs, security flaws, OS bugs
Hardware Failure
Media or Server

8

Assumptions for Our Recovery Discussion
Concurrency control is in effect.
Strict 2PL, in particular.
Updates are happening “in place”.
i.e. data is modified in buffer pool and pages in DB are overwritten
Transactions are not done on “private copies” of the data.

Exercise in Simplicity
Devise a simple scheme (requiring no logging) for Atomicity & Durability
Questions:
What is happening during the transaction?
What happens at commit for Durability?
How do you rollback on abort?
How is Atomicity guaranteed?
Any limitations/assumptions?
Buffer Pool

Database

Exercise in Simplicity, cont
Devise a simple scheme (requiring no logging) for Atomicity & Durability
Example:
Dirty buffer pages stay pinned in the buffer pool
Can’t be “stolen” by replacement policy
Page-level locking to ensure 1 transaction per page
At commit, we:
Force dirty pages to disk
Unpin those pages
Then we commit
Unfotunately, this doesn’t work!

Buffer Pool

Database

12

Problems with Our Simplistic Solution
All dirty pages stay pinned in the buffer pool
What happens if buffer pool fills up?
Not scalable!
At commit, we:
Force dirty pages to disk
Unpin those pages
Then we commit
What if DBMS crashes halfway through step a?
Not atomic!

Buffer Pool

Database

Buffer Management Plays a Key Role
NO STEAL policy – don’t allow buffer-pool frames with uncommited
updates to be replaced (or otherwise flushed to disk).
Useful for achieving atomicity without UNDO logging.
But can cause poor performance (pinned pages limit buffer replacement)
FORCE policy: make sure every update is “forced” onto the DB disk
before commit.
Provides durability without REDO logging.
But, can cause poor performance (lots of random I/O to commit)
Our simple idea was NO STEAL/FORCE
And even that didn’t really achieve atomicity

Buffer Pool

Database

Preferred Policy: Steal/No-Force
Most complicated, but highest performance.
NO FORCE (complicates enforcing Durability)
Problem: System crash before dirty buffer page of a committed transaction is flushed to DB disk.
Solution: Flush as little as possible, in a convenient place, prior to commit. Allows REDOing modifications.
STEAL (complicates enforcing Atomicity)
What if a Xact that flushed updates to DB disk aborts?
What if system crashes before Xact is finished?
Must remember the old value of flushed pages
(to support UNDOing the write to those pages).
This is a dense slide … and the crux of the lecture.
Read it over carefully, and return to it later!

Buffer Management summary

Force
No
Force
No Steal
Steal

Slowest
Fastest
Performance
Implications

No Steal
Steal

No REDO
No UNDO
UNDO
No REDO
UNDO
REDO
No UNDO
REDO
Logging/Recovery
Implications
Force
No
Force

Basic Idea: Logging
For every update, record info to allow REDO/UNDO in a log.
Sequential writes to log (on a separate disk).
Minimal info written to log: pack multiple updates in a single log page.
Log: An ordered list of log records to allow REDO/UNDO
Log record contains:

and additional control info (which we’ll see soon).

DB

Log

7

Write-Ahead Logging (WAL)
The Write-Ahead Logging Protocol:
Must force the log record for an update before the corresponding data page gets to the DB disk.
Must force all log records for a Xact before commit.
I.e. transaction is not committed until
all of its log records including its “commit” record are on the stable log.
#1 (with UNDO info) helps guarantee Atomicity.
#2 (with REDO info) helps guarantee Durability.
This allows us to implement Steal/No-Force

DB

Log

WAL & the Log
Log: an ordered file, with a write buffer (“tail”) in RAM.
Each log record has a Log Sequence Number (LSN).
LSNs unique and increasing.

LSNs
flushedLSN

RAM
Log records flushed to disk

time

time
Log tail

WAL & the Log, Pt 2
Log: an ordered file, with a write buffer (“tail”) in RAM.
Each log record has a Log Sequence Number (LSN).
LSNs unique and increasing.
flushedLSN tracked
in RAM

Log records flushed to disk

time

time
Log tail
flushedLSN

WAL & the Log, Pt 3
Each data page in the DB contains a pageLSN.
A “pointer” into the log
The LSN of the most recent log record for an update to that page.
Log records flushed to disk

time

time
Log tail
flushedLSN

DB

pageLSN

LSNs
pageLSNs
flushedLSN

DB

RAM

WAL & the Log, Pt 4
WAL: Before page i is flushed to DB, log must satisfy:
pageLSNi £ flushedLSN
Log records flushed to disk

time

DB

time
Log tail
flushedLSN

Buffer Pool

LSNs
pageLSNs
flushedLSN

DB

RAM

WAL & the Log, Pt 5
WAL: Before page i is flushed to DB, log must satisfy:
pageLSNi £ flushedLSN
Log records flushed to disk

time

DB

time
Log tail
flushedLSN

Buffer Pool


LSNs
pageLSNs
flushedLSN

DB

RAM

WAL & the Log, Pt 6
WAL: Before page i is written to DB, log must satisfy:
pageLSNi £ flushedLSN
Log records flushed to disk

time

time
Log tail
flushedLSN

Buffer Pool

DB

LSNs
pageLSNs
flushedLSN

DB

RAM

WAL & the Log, Pt 7
WAL: Before page i is written to DB, log must satisfy:
pageLSNi £ flushedLSN
Don’t need to steal buffer frame if page is hot
can write back later

Log records flushed to disk

time

DB

time
Log tail
flushedLSN

Buffer Pool

LSNs
pageLSNs
flushedLSN

DB

RAM

Summary
WAL: Before page i is written to DB, log must satisfy:
pageLSNi £ flushedLSN
Exactly how is logging (and recovery!) done?
We’ll look at the
ARIES algorithm
from IBM.

LSNs
pageLSNs
flushedLSN

DB

RAM
Log records flushed to disk

time

DB

time
Log tail
flushedLSN

Buffer Pool

ARIES Log Records
prevLSN is the LSN of the previous log record written by this XID
So records of an Xact form a linked list backwards in time

prevLSNs

LSNs
pageLSNs
flushedLSN

DB

RAM
prevLSNs

Log Records, Pt 2
prevLSN is the LSN of the previous log record written by this XID
So records of an Xact form a linked list backwards in time
Possible log record types:
Update, Commit, Abort
Checkpoint (for log maintainence)
Compensation Log Records (CLRs)
(for UNDO actions)
End (end of commit or abort)

LSN
prevLSN
XID
type
length
pageID
offset
before-image
after-image
LogRecord fields:

update
records
only

LSNs
pageLSNs
flushedLSN

DB

RAM
prevLSNs

Log Records, Pt 3
Update records contain sufficient information for REDO and UNDO
Our “physical diff” to the left works fine.
There are other encodings that can be more space-efficient

LSN
prevLSN
XID
type
length
pageID
offset
before-image
after-image
LogRecord fields:

update
records
only

LSNs
pageLSNs
flushedLSN

DB

RAM
prevLSNs

Other Log-Related State
Two in-memory tables:
Transaction Table
One entry per currently active Xact.
removed when Xact commits or aborts
Contains:
XID
Status (running, committing, aborting)
lastLSN (most recent LSN written by Xact).
Dirty Page Table
One entry per dirty page currently in buffer pool.
Contains recLSN
LSN of the log record which first caused the page to be dirty.
XID Status lastLSN
1 R 33
2 C 42

PageID recLSN
46 11
63 24

Transaction Table
Dirty Page Table

ARIES Big Picture: What’s Stored Where

Data pages
each with a
pageLSN
Master record

DB

LSN
prevLSN
XID
type
length
pageID
offset
before-image
after-image
LogRecords

Xact Table
xid
lastLSN
status

Dirty Page Table
pid
recLSN

Log tail
flushedLSN

Buffer pool

RAM

LOGGING

Normal Execution of an Xact
Series of reads & writes, followed by commit or abort.
For our discussion, the recovery manager sees page-level reads/writes
We will assume that disk write is atomic.
In practice, kind of tricky!
STEAL, NO-FORCE buffer management, with Write-Ahead Logging.
Update, Commit, Abort log records written to log tail as we go
Transaction Table and Dirty Page Table being kept current
PageLSNs updated in buffer pool
Log tail flushed to disk periodically in background
And flushedLSN changed as needed
Buffer manager stealing pages subject to WAL

Transaction Commit
Write commit record to log.
All log records up to Xact’s commit record are flushed to disk.
Guarantees that flushedLSN ³ lastLSN.
Note that log flushes are sequential, synchronous writes to disk.
Many log records per log page.
Commit() returns.
Write end record to log.

Simple Transaction Abort
For now, consider an explicit abort of a Xact.
No crash involved.
We want to “play back” the log in reverse order, UNDOing updates.
Get lastLSN of Xact from Xact table.
Write an Abort log record before starting to rollback operations
Can follow chain of log records backward via the prevLSN field.
Write a “CLR” (compensation log record) for each undone operation.

Note: CLRs are a different type of log record we glossed over before

Abort, cont.
To perform UNDO, must have a lock on data!
No problem!
Before restoring old value of a page, write a CLR:
You continue logging while you UNDO!!
CLR has one extra field: undonextLSN
Points to the next LSN to undo
i.e. the prevLSN of the record we’re currently undoing
CLR contains REDO info
CLRs never Undone
Undo needn’t be idempotent (>1 UNDO won’t happen)
But they might be Redone when repeating history
(=1 UNDO guaranteed)
At end of all UNDOs, write an “end” log record.

Currently Undoing
PrevLsn=1234
lastLSN(CLR)
undoNextLSN = 1234
Idempotent: can be applied multiple times without changing the result beyond the initial application

Checkpointing
Conceptually, keep log around for all time.
Performance/implementation problems…
Periodically, the DBMS creates a checkpoint
Minimizes recovery time after crash. Write to log:
begin_checkpoint record: Indicates when chkpt began.
end_checkpoint record: Contains current Xact table DPT
. A “fuzzy checkpoint”: Other Xacts continue to run;
So all we know is that these tables are after the time of the begin_checkpoint record.
Store LSN of most recent chkpt record in a safe place
(master record, often block 0 of the log file).

Currently Undoing
PrevLsn=1234
lastLSN(CLR)
undoNextLSN = 1234
Xact Table, DPT

CRASH RECOVERY

39

Crash Recovery: Big Picture
Start from a checkpoint
found via master record.
Three phases. Need to do:
Analysis – Figure out which Xacts committed since checkpoint, which failed.
REDO all actions.
(repeat history)
UNDO effects of failed Xacts.

Oldest log rec. of Xact active at crash
Smallest recLSN in dirty page table after Analysis
Last chkpt
CRASH

A
R
U

Recovery: The Analysis Phase
Re-establish knowledge of state at checkpoint.
via transaction table and dirty page table stored in the checkpoint
Scan log forward from checkpoint.
End record:
Remove Xact from Xact table
Update record:
If page P not in Dirty Page Table, Add P to DPT, set its recLSN=LSN.
!End record:
Add Xact to Xact table
set lastLSN=LSN
change Xact status on commit.
At end of Analysis…
For any Xacts in the Xact table in Committing state,:
Write a corresponding END log record
…and Remove Xact from Xact table.
Now, Xact table says which xacts were active at time of crash.
DPT says which dirty pages might not have made it to disk

Phase 2: The REDO Phase
We Repeat History to reconstruct state at crash:
Reapply all updates (even of aborted Xacts!), redo CLRs.
Scan forward from log rec containing smallest recLSN in DPT.
Q: why start here?
For each update log record or CLR with a given LSN, REDO the action unless:
Affected page is not in the Dirty Page Table, or
Affected page is in D.P.T., but has recLSN > LSN, or
pageLSN (in DB) >= LSN. (this last case requires I/O)
To REDO an action:
Reapply logged action.
Set pageLSN to LSN. No additional logging, no forcing!

Scenarios When We Do Not REDO
Given an update log record…
Affected page is not in the Dirty Page Table. How did that happen?
This page was flushed to DB, removed from DPT before checkpoint
Then DPT flushed to checkpoint
Affected page is in DPT, but has DPT recLSN > LSN. H.D.T.H.?
This page was flushed to DB, removed from DPT before checkpoint
Then this page was referenced again and reinserted in DPT with larger recLSN
pageLSN (in DB) >= LSN. (this last case requires DB I/O). H.D.T.H.?
This page was updated again and flushed to DB
after this log record

Phase 3: The UNDO Phase
A simple solution:
The xacts in the Xact Table are losers.
For each loser, perform simple transaction abort.
Problem?
Lots of random I/O in the log following undoNextLSN chains.
Can we do this in one backwards pass of log?
Next slide!

Phase 3: The UNDO Phase, cont
toUndo = {lastLSNs of all Xacts in the Xact Table}
while !toUndo.empty():
thisLR = toUndo.find_and_remove_largest_LSN()
if thisLR.type == CLR:
if thisLR.undoNextLSN != NULL:
toUndo.insert(thisLR.undonextLSN)
else: // thisLR.undonextLSN == NULL
write an End record for thisLR.xid in the log
else:
if thisLR.type == UPDATE:
write a CLR for the undo in the log
undo the update in the database
if thisLR.prevLSN != NULL:
toUndo.insert(thisLR.prevLSN)
elif thisLR.prevLSN == NULL:
write an END record for thisLR.xid

Example of Recovery

begin_checkpoint
end_checkpoint
update: T1 writes P5
update: T2 writes P3
T1 abort
CLR: Undo T1 LSN 10
T1 End
update: T3 writes P1
update: T2 writes P5
CRASH, RESTART

LSN LOG
00
05
10
20
30
40
45
50
60

prevLSNs

Xact Table
lastLSN
status
Dirty Page Table
recLSN
flushedLSN

ToUndo
RAM

Using pencil and paper, run the ARIES recovery algorithm on this log, assuming you have access to a master record pointing to LSN 05. Maintain all the state on the left as you go!

Example: Crash During Restart!

Xact Table
lastLSN
status
Dirty Page Table
recLSN
flushedLSN

ToUndo
RAM

begin_checkpoint, end_checkpoint
update: T1 writes P5
update T2 writes P3
T1 abort
CLR: Undo T1 LSN 10, T1 End
update: T3 writes P1
update: T2 writes P5
CRASH, RESTART
T2 abort, T3 abort
CLR: Undo T2 LSN 60
CLR: Undo T3 LSN 50, T3 end
CRASH, RESTART
CLR: Undo T2 LSN 20, T2 end

LSN LOG
00,05
10
20
30
40,45
50
60

70,80
90
100,105

110,115

undonextLSN

Using pencil and paper, run the ARIES recovery algorithm on this log, assuming you have access to a master record pointing to LSN 05. Maintain all the state on the left as you go!

Additional Crash FAQs to Understand
Q: What happens if system crashes during Analysis?
A: Nothing serious. RAM state lost, need to start over next time.
Q: What happens if the system crashes during REDO?
A: Nothing bad. Some REDOs done, and we’ll detect that next time.
Q: How do you limit the amount of work in REDO?
A: Flush asynchronously in the background. Even“hot” pages!
Q: How do you limit the amount of work in UNDO?
A: Avoid long-running Xacts.

Summary of Logging/Recovery
Recovery Manager guarantees Atomicity & Durability.
Use WAL to allow STEAL/NO-FORCE w/o sacrificing correctness.
LSNs identify log records; linked into backwards chains per transaction (via prevLSN).
pageLSN allows comparison of data page and log records.

Summary, Cont.
Checkpointing: Quick way to limit the amount of log to scan on recovery.
Recovery works in 3 phases:
Analysis: Forward from checkpoint.
Redo: Forward from oldest recLSN.
Undo: Backward from end to first LSN of oldest Xact alive at crash.
Upon Undo, write CLRs.
Redo “repeats history”: Simplifies the logic!

/docProps/thumbnail.jpeg