PowerPoint Presentation
Transactions &
Concurrency Control 1
R&G 16/17
There are three side effects of acid. Enhanced long term memory, decreased short term memory, and I forget the third.
– Timothy Leary
1
Architecture of a DBMS
You are here
Completed
Database Management
System
Database
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
Completed
Completed
Completed
Completed
Completed
Completed
2
Database
Architecture of a DBMS, Part 2
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
You are here
Transaction Manager
3
Database
Architecture of a DBMS, Part 3
Database Management
System
You are here
Logging & Recovery
Lock Manager
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
Database Management
System
Query Parsing
& Optimization
SQL Client
Relational Operators
Files and Index Management
Buffer Management
Disk Space Management
4
Applications on DBMS
Virtually any compute service that maintains state today is an application on top of some kind of DBMS
Uber
Kayak
Amazon.com
BankofAmerica
Pokemon Go
5
Applications Want Something from the DBMS
Queries and updates of course: what you learned so far!
Real applications are composed of many statements being generated by user behaviors
Many users work with the application at the same time
Database Management
System
SQL
SQL
SQL
SQL
SQL
SQL
SQL
6
Concurrency Control & Recovery
Part 1: Concurrency Control
Correct/fast data access in the presence of concurrent work by many users
Disorderly processing that provides the illusion of order
Part 2: Recovery
Ensure database is fault tolerant
Not corrupted by software, system or media failure
Storage guarantees for mission-critical data
It’s all about the programmer!
Systems provide guarantees
These guarantees lighten the load of app writers
7
Concurrent Execution: Why bother?
Multiple transactions are allowed to run concurrently in the system.
Advantages are twofold:
Throughput (transactions per second):
Increase processor/disk utilization more transactions per second (TPS) completed
Single core: can use the CPU while another xact is reading to/writing from the disk
Multicore: ideally, scale throughput in the number of processors
Latency (response time per transaction):
Multiple transactions can run at the same time
So one transaction’s latency need not be dependent on another unrelated transaction
Or that’s the hope
Both are important!
8
Motivating Example
UPDATE Budget
SET money = money – 500
WHERE pid = 1
UPDATE Budget
SET money = money + 200
WHERE pid = 2
UPDATE Budget
SET money = money + 300
WHERE pid = 3
SELECT sum(money)
FROM Budget
Two Issues:
Order matters!
Users need a way to say what’s OK
9
Different Types of Problems
INSERT INTO DollarProducts(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99
DELETE Product
WHERE price <= 0.99
SELECT count(*)
FROM Product
SELECT count(*)
FROM DollarProducts
User 1
User 2
What could go wrong?
Inconsistent Reads
10
Different Types of Problems, Part 2
UPDATE Product
SET Price = Price – 10.99
WHERE pname = “CoolToy”
Lost Update
UPDATE Product
SET Price = Price*0.6
WHERE pname = “CoolToy”
User 2
User 1
What could go wrong?
11
Different Types of Problems, Part 3
UPDATE Account
SET amount = 1000000
WHERE number = “my-account”
Dirty Reads
Aborted by the system
SELECT amount
FROM Account
WHERE number = “my-account”
User 2
User 1
What could go wrong?
12
Transactions
Transaction: Concept and Implementation
Major component of database systems
Critical for most applications; arguably more so than SQL
14
An Aside: Database Turing Awards
15
What is a Transaction?
A sequence of multiple actions to be executed as an atomic unit
Application View (SQL View):
Begin transaction
Sequence of SQL statements
End transaction
Examples
Transfer money between accounts
Book a flight, a hotel and a car together on Expedia
16
Our Transaction Model
Transaction (“Xact”):
DBMS’s abstract view of an application program (or activity)
A sequence of reads and writes of database objects
Batch of work that must commit or abort as an atomic unit
Xact Manager controls execution of transactions
Program logic is invisible to DBMS!
Arbitrary computation possible on data fetched from the DB
The DBMS only sees data read/written from/to the DB
(Note: modern systems have started rethinking this
assumption, but we’ll stick with it here)
17
Transaction Example
Transaction to transfer $100 from account R to account S
start transaction
read(R)
R = R – 100
write(R)
read(S)
S = S + 100
write(S)
end transaction
Not seen by the DBMS transaction manager!
18
ACID: High-Level Properties of Transactions
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If the DB starts out consistent, it ends up consistent at the end of the Xact
I solation: Execution of each Xact is isolated from that of others
D urability: If a Xact commits, its effects persist.
Note: This is a mnemonic, not a formalism. We’ll do some formalisms shortly.
19
Isolation (Concurrency)
DBMS interleaves actions of many xacts
Actions = reads/writes of DB objects
DBMS ensures 2 xacts do not “interfere”
Each xact executes as if it ran by itself.
Concurrent accesses have no effect on xact’s behavior
Net effect must be identical to executing all transactions in some serial order
Users & programmers think about transactions in isolation
Without considering effects of other concurrent Xacts!
20
Isolation: An Example
Think about avoiding problems due to concurrency
If another transaction T2 accesses R and S between steps 4 and 5 of T1, it will see a lower value for R+S.
Isolation easy to achieve by running one Xact at a time
However, recall that serial execution is not desirable
start transaction
read(R)
R = R – 100
write(R)
read(S)
S = S + 100
write(S)
end transaction
T1
start transaction
read(R)
print(R+S)
end transaction
T2
21
Atomicity and Durability
A transaction ends in one of two ways:
Commit after completing all its actions
“commit” is a contract with the caller of the DB
Abort (or be aborted by the DBMS) after executing some actions
Or system crash while the xact is in progress; treat as abort.
Two key properties for a transaction
Atomicity: Either execute all its actions, or none of them
Durability: The effects of a committed xact must survive failures.
DBMS typically ensures the above by logging all actions:
Undo the actions of aborted/failed transactions.
Redo actions of committed transactions not yet propagated
to disk when system crashes
22
Atomicity and Durability, cont.
Atomicity
If the transaction fails after step 4 and before step 7
Money will be “lost” inconsistent database
DBMS should ensure that updates of a partially executed transaction are not reflected
Durability
Once the user hears that the transaction is complete, can rest easy that the $100M was xferred from R to S.
start transaction
read(R)
R = R – 100
write(R)
read(S)
S = S + 100
write(S)
end transaction
23
Transaction Consistency
Transactions preserve DB consistency
Given a consistent DB state, produce another consistent DB state
DB consistency expressed as a set of declarative integrity constraints
CREATE TABLE/ASSERTION statements
Transactions that violate integrity are aborted
That’s all the DBMS can automatically check!
DBMS
(state1)
Transaction
DBMS
(state2)
Consistent States
(ICs all satisfied)
24
Summary
We have seen an overview
ACID Transactions make guarantees that
Improve performance (via concurrency)
Relieve programmers of correctness concerns
Hide concurrency and failure handling!
Two key issues to consider, and mechanisms
Concurrency control (via two-phase locking)
Recovery (via write-ahead logging WAL)
We’ll do concurrency control first
25
Concurrency Control
Concurrency Control: Providing Isolation
Naïve approach - serial execution
One transaction runs at a time
Safe but slow
Execution must be interleaved for better performance
With concurrent executions, how does one define and ensure correctness?
27
Transaction Schedules
A schedule is a sequence of actions on data from one or more transactions.
Actions: Begin, Read, Write, Commit and Abort.
R1(A) W1(A) R1(B) W1(B) R2(A) W2(A) R2(B) W2(B)
By convention we only include committed transactions, and omit
Begin and Commit.
T1 T2
begin
read(A)
write(A)
read(B)
write(B)
commit
begin
read(A)
write(A)
read(B)
write(B)
commit
Serial Equivalence
We need a “touchstone” concept for correct behavior
Definition: Serial schedule
Each transaction runs from start to finish without any intervening actions from other transactions
Definition: 2 schedules are equivalent if they:
involve the same transactions
each individual transaction’s actions are ordered the same
both schedules leave the DB in the same final state
Serializability
Definition: Schedule S is serializable if:
S is equivalent to some serial schedule
=?
=?
S
Schedule 1
Let T1 transfer $100 from A to B
Let T2 add 10% interest to A & B
Serial schedule in which T1 is followed by T2
Final outcome:
A := 1.1*(A-100)
B := 1.1*(B+100)
T1: Transfer $100 from A to B T2: Add 10% interest to A & B
begin
read(A)
A = A - 100
write(A)
read(B)
B = B + 100
write(B)
commit
begin
read(A)
A = A * 1.1
write(A)
read(B)
B = B * 1.1
write(B)
commit
31
Schedule 2
T1: Transfer $100 from A to B T2: Add 10% interest to A & B
begin
read(A)
A = A * 1.1
write(A)
read(B)
B = B * 1.1
write(B)
commit
begin
read(A)
A = A - 100
write(A)
read(B)
B = B + 100
write(B)
commit
Serial schedule in which T2 is followed by T1
Final outcome:
A := (1.1*A)-100
B := (1.1*B)+100
Different!
But still understandable
32
Schedule 3
Schedule in which actions of T1 and T2 are interleaved.
This is not a serial schedule
But it is equivalent to schedule 1
A := (A-100)*1.1
B := (B+100)*1.1
Hence serializable!
Thanks to @jmpatel
T1: Transfer $100 from A to B T2: Add 10% interest to A & B
begin
read(A)
A = A - 100
write(A)
begin
read(A)
A = A * 1.1
write(A)
read(B)
B = B + 100
write(B)
commit
read(B)
B = B * 1.1
write(B)
commit
33
Conflicting Operations
Tricky to check property “leaves the DB in the same final state”
Need an easier equivalence test!
Settle for a “conservative” test: always true positives, but some false negatives
I.e. sacrifice some concurrency for easier correctness check
Use notion of “conflicting” operations (read/write)
Definition: Two operations conflict if they:
Are by different transactions,
Are on the same object,
At least one of them is a write.
The order of non-conflicting operations has no effect on
the final state of the database!
Focus our attention on the order of conflicting operations
34
Conflict Serializable Schedules
Definition: Two schedules are conflict equivalent if:
They involve the same actions of the same transactions, and
Every pair of conflicting actions is ordered the same way
Definition: Schedule S is conflict serializable if:
S is conflict equivalent to some serial schedule
Implies S is also Serializable
Note: some serializable schedules are NOT conflict serializable
Conflict serializability gives false negatives as a test for
serializability!
The cost of a conservative test
A price we pay to achieve efficient enforcement
35
Conflict Serializability - Intuition
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
36
Conflict Serializability – Intuition, Part 2
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
37
Conflict Serializability – Intuition, Part 3
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
38
Conflict Serializability – Intuition, Part 4
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
39
Conflict Serializability – Intuition, Part 5
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
40
Conflict Serializability – Intuition, cont
A schedule S is conflict serializable if
You are able to transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions
Example
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
R(A)
R(B)
W(A)
W(B)
R(A)
W(A)
R(B)
W(B)
41
Conflict Serializability (Continued)
Here’s another example:
Conflict Serializable or not?
R(A)
W(A)
R(A)
W(A)
NOT!
42
Conflict Dependency Graph
Dependency Graph:
One node per Xact
Edge from Ti to Tj if:
An operation Oi of Ti conflicts with an operation Oj of Tj and
Oi appears earlier in the schedule than Oj
Theorem: Schedule is conflict serializable if and only if its
dependency graph is acyclic.
Proof Sketch: Conflicting operations prevent us
from “swapping” operations into a
serial schedule
Ti
Tj
43
Example
A schedule that is not conflict serializable
T1
T2
Dependency graph
T1: R(A), W(A)
44
Example, pt 2
A schedule that is not conflict serializable
T1: R(A), W(A),
T2: R(A)
T1
T2
Dependency graph
45
Example, pt 3
A schedule that is not conflict serializable
T1: R(A), W(A),
T2: R(A), W(A), R(B), W(B)
T1
T2
Dependency graph
A
46
Example, pt 4
A schedule that is not conflict serializable
T1: R(A), W(A), R(B)
T2: R(A), W(A), R(B), W(B)
B
T1
T2
Dependency graph
A
47
View Serializability
Alternative notion of serializability: fewer false negatives
Schedules S1 and S2 are view equivalent if:
Same initial reads:
If Ti reads initial value of A in S1, then Ti also reads initial value of A in S2
Same dependent reads:
If Ti reads value of A written by Tj in S1, then Ti also reads value of A written by Tj in S2
Same winning writes:
If Ti writes final value of A in S1, then Ti also writes final value of A in S2
Basically, allows all conflict serializable schedules + “blind writes”
view
T1: R(A) W(A)
T2: W(A)
T3: W(A)
T1: R(A) W(A)
T2: W(A)
T3: W(A)
48
Notes on Serializability Definitions
View Serializability allows (a few) more schedules than conflict serializability
But V.S. is difficult to enforce efficiently.
Neither definition allows all schedules that are actually serializable.
Because they don’t understand the meanings of the operations or the data
Conflict Serializability is what gets used, because it can be enforced efficiently
To allow more concurrency, some special cases do get handled separately.
(Search the web for “Escrow Transactions” for example)
49
T1 T2
begin
read(A)
write(A)
read(B)
write(B)
commit
begin
read(A)
write(A)
read(B)
write(B)
commit
T1 T2
begin
read(A)
write(A)
read(B)
write(B)
commit
begin
read(A)
write(A)
read(B)
write(B)
commit
T1 T2
begin
read(A)
write(A)
begin
read(A)
write(A)
read(B)
write(B)
commit
read(B)
write(B)
commit
T1 T2
begin
read(A)
write(A)
read(B)
write(B)
commit
begin
read(A)
write(A)
read(B)
write(B)
commit
T1 T2
begin
begin
read(A)
read(A)
write(A)
write(A)
read(B)
read(B)
write(B)
write(B)
commit
commit