Transactions, Recovery
and Concurrency (II)
Concurrency Control
3/30/20 1
Concurrency Control Methods
• Locking Mechanism
The idea of locking a data item X is to:
• give a transaction exclusive use of the data item X, while
• not restricting access to other data items.
This prevents one transaction from changing a data item
currently being used by another transaction.
• We will discuss a simple locking scheme which locks
individual items, using read and write locks
Locking Rules
• In this scheme, every transaction T must obey the following rules.
• 1) If T has only one operation (read/write) manipulating an item X:
– obtain a read lock on X before reading it,
– obtain a write lock on X before writing it,
– unlock X when done with it.
• 2) If T has several operations manipulating X:
– obtain one proper lock only on X:
a read lock if all operations on X are reads;
a write lock if one of these operations on X is a write.
– unlock X after the last operation on X in T has been executed.
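A minimal sketch of a lock table enforcing these rules (class and method names are illustrative): any number of read locks may coexist on an item, but a write lock excludes everything else.

```python
import threading

class LockTable:
    """Illustrative lock manager: shared read locks, exclusive write locks."""

    def __init__(self):
        self._cv = threading.Condition()
        self._readers = {}   # item -> number of transactions holding a read lock
        self._writer = {}    # item -> id of the transaction holding the write lock

    def read_lock(self, item):
        with self._cv:
            while item in self._writer:                 # wait out any writer
                self._cv.wait()
            self._readers[item] = self._readers.get(item, 0) + 1

    def write_lock(self, item, txn):
        with self._cv:
            # a write lock is incompatible with any other lock on the item
            while item in self._writer or self._readers.get(item, 0) > 0:
                self._cv.wait()
            self._writer[item] = txn

    def unlock_read(self, item):
        with self._cv:
            self._readers[item] -= 1
            if self._readers[item] == 0:
                del self._readers[item]
            self._cv.notify_all()

    def unlock_write(self, item):
        with self._cv:
            del self._writer[item]
            self._cv.notify_all()
```

Note that the lock table itself does not know about transactions' phases; the 2PL discipline discussed next is layered on top of it.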
Locking Rules (cont.)
• In this scheme,
– Several read locks can be issued on the same data item at the same time.
– A read lock and a write lock cannot be held on the same data item at the same
time, nor can two write locks.
• This still does not guarantee serializability.
[Example schedule: two transactions T1 and T2, each obeying the locking rules above, that still produce a non-serializable schedule.]
Two Phase Locking (2PL)
• To guarantee serializability, transactions must also obey the two-
phase locking protocol:
– Growing Phase: all locks for a transaction must be obtained before any
locks are released, and
– Shrinking Phase: gradually release all locks (once a lock is released no
new locks may be requested).
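The two phases can be enforced mechanically; a minimal sketch (class name illustrative) that rejects any lock request made after the first release:

```python
class TwoPhaseTxn:
    """Illustrative 2PL discipline: a growing phase that only acquires
    locks, then a shrinking phase that only releases them."""

    def __init__(self):
        self.held = set()
        self.shrinking = False   # flips to True at the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        self.held.add(item)

    def release(self, item):
        self.shrinking = True    # entering the shrinking phase
        self.held.discard(item)
```

For example, after `acquire("x")`, `acquire("y")`, `release("x")`, any further `acquire` raises, because the transaction has entered its shrinking phase.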
Two Phase Locking (2PL) (Cont.)
• Locking thus provides a solution to the problem of correctness
of schedules.
Two-phase locking ensures conflict serializability.
Deadlock
• A problem that arises with locking is deadlock.
• Deadlock occurs when two transactions are each waiting for a
lock on an item held by the other.
Deadlock Check
• Create the wait-for graph for currently active transactions:
– create a vertex for each transaction, and
– add an arc from Ti to Tj if Ti is waiting for an item locked by Tj.
• If the graph has a cycle, then a deadlock has occurred.
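The cycle check on the wait-for graph can be sketched with a standard depth-first search (the graph representation as a dict of sets is illustrative):

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as
    {Ti: set of Tj that Ti is waiting for}."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited / on current path / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:
                return True          # back edge: a cycle, hence a deadlock
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in wait_for)
```

For instance, `{"T1": {"T2"}, "T2": {"T1"}}` (each waiting on the other) is a deadlock, while `{"T1": {"T2"}, "T2": set()}` is not.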
Several methods to deal with deadlocks
• deadlock detection
– periodically check for deadlocks, abort and rollback some
transactions (restart them later). This is a good choice if
transactions are very short or very independent.
Several methods to deal with deadlocks
(Cont.)
• deadlock prevention – Assign priorities based on timestamps.
Assume Ti wants a lock that Tj holds. Two policies are possible:
– Wait-Die: if Ti has higher priority, Ti waits for Tj; otherwise Ti aborts.
– Wound-Wait: if Ti has higher priority, Tj aborts; otherwise Ti waits.
• If a transaction restarts, make sure it keeps its original timestamp.
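Both policies can be sketched as a single decision function (assuming a smaller timestamp means older and hence higher priority; names are illustrative):

```python
def resolve(policy, ts_i, ts_j):
    """Ti (timestamp ts_i) requests a lock held by Tj (timestamp ts_j).
    Smaller timestamp = older = higher priority.
    Returns the action taken: 'Ti waits', 'Ti aborts', or 'Tj aborts'."""
    ti_older = ts_i < ts_j
    if policy == "wait-die":
        # older transactions wait; younger ones die (abort)
        return "Ti waits" if ti_older else "Ti aborts"
    if policy == "wound-wait":
        # older transactions wound (abort) the holder; younger ones wait
        return "Tj aborts" if ti_older else "Ti waits"
    raise ValueError(policy)
```

In both policies the transaction that aborts is always the younger one, which is why restarted transactions must keep their original timestamps: otherwise they could be aborted forever.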
Timestamp ordering
• The idea here is:
– to assign each transaction a timestamp (e.g. start time of transaction),
and
– to ensure that the schedule used is equivalent to executing the
transactions in timestamp order
• Each data item, X, is assigned
– a read timestamp, read TS(X) – the latest timestamp of a transaction that
read X, and
– a write timestamp, write TS(X) – the latest timestamp of a transaction
that wrote X.
• These are used in read and write operations as follows. Suppose the
transaction timestamp is T.
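The standard basic timestamp-ordering checks can be sketched as follows (the in-memory store and the `Aborted` exception are illustrative): a read by T fails if a younger transaction has already written X; a write by T fails if a younger transaction has already read or written X.

```python
class Aborted(Exception):
    pass

# read_ts[X] and write_ts[X] start at 0 for every item X
read_ts, write_ts, values = {}, {}, {}

def read(T, X):
    """Transaction with timestamp T reads item X."""
    if T < write_ts.get(X, 0):
        # a younger transaction already wrote X: T's read is too late
        raise Aborted(f"read({X}) by {T}")
    read_ts[X] = max(read_ts.get(X, 0), T)
    return values.get(X)

def write(T, X, v):
    """Transaction with timestamp T writes item X."""
    if T < read_ts.get(X, 0) or T < write_ts.get(X, 0):
        # a younger transaction already read or wrote X
        raise Aborted(f"write({X}) by {T}")
    write_ts[X] = T
    values[X] = v
```

The rejected transaction is rolled back and restarted with a new, larger timestamp.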
• Thomas’ write rule: if T < write TS(X), the write is obsolete (a younger
transaction has already written X), so it can simply be ignored rather than
aborting the transaction; otherwise proceed as in basic timestamp ordering.
Example: transactions T1, T2, T3 with timestamps 1, 2, 3; all read and write
timestamps start at 0.

    T1          T2          T3          effect
    read(x)                             r_TS(x): 0 -> 1
                read(y)                 r_TS(y): 0 -> 2
                write(y)                w_TS(y): 0 -> 2
    read(z)                             r_TS(z): 0 -> 1
                            read(z)     r_TS(z): 1 -> 3
                            write(z)    w_TS(z): 0 -> 3
    write(z)                            rejected: 1 < r_TS(z) = 3, so T1 aborts
• Some problems:
– Cyclic restart: There is no deadlock, but a kind of livelock can occur –
some transactions may be constantly aborted and restarted.
– Cascading rollback: when a transaction is rolled back, so are any
transactions that read a value written by it, any transactions that read a
value written by those, and so on. This can be avoided by not
allowing transactions to read values written by uncommitted
transactions (make them wait).
Multiple-Granularity Locks
• Hard to decide what granularity to lock (tuples vs.
pages vs. tables).
• Shouldn’t have to decide!
• Data “containers” are nested: the Database contains Tables, which
contain Pages, which contain Tuples.
Solution: New Lock Modes, Protocol
• Allow Xacts to lock at each level, but with a special
protocol using new “intention” locks:
❖ Before locking an item, Xact
must set “intention locks” on all
its ancestors.
❖ For unlock, go from specific to
general (i.e., bottom-up).
❖ SIX mode: Like S & IX at the
same time.
Lock compatibility matrix (✓ = the two modes are compatible; — = no lock):

         —    IS   IX   S    X
    —    ✓    ✓    ✓    ✓    ✓
    IS   ✓    ✓    ✓    ✓
    IX   ✓    ✓    ✓
    S    ✓    ✓         ✓
    X    ✓
Multiple Granularity Lock Protocol
• Each Xact starts from the root of the hierarchy.
• To get S or X lock on a node, must hold IS or IX on parent
node.
–What if Xact holds SIX on parent? S on parent?
• To get X or IX or SIX on a node, must hold IX or SIX on
parent node.
• Must release locks in bottom-up order.
Protocol is correct in that it is equivalent to directly setting
locks at the leaf levels of the hierarchy.
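The lock-acquisition path of the protocol can be sketched as follows (the hierarchy and helper names are illustrative): an S or X request first sets IS or IX, respectively, on every ancestor from the root down, then sets the requested mode on the node itself.

```python
# Illustrative hierarchy: each node knows its parent (None for the root).
parent = {"db": None, "table": "db", "page": "table", "tuple": "page"}

# Which intention lock a given request mode requires on each ancestor.
INTENTION = {"S": "IS", "IS": "IS", "X": "IX", "IX": "IX", "SIX": "IX"}

def lock(node, mode, held):
    """Lock `node` in `mode`, first setting intention locks on all
    ancestors top-down; `held` maps node -> mode for this transaction."""
    path = []
    n = node
    while n is not None:
        path.append(n)
        n = parent[n]
    for ancestor in reversed(path[1:]):   # root first, node's parent last
        held.setdefault(ancestor, INTENTION[mode])
    held[node] = mode
    return held
```

Releasing then proceeds in the opposite (bottom-up) order, as the protocol requires.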
Examples
• T1 scans R, and updates a few tuples:
– T1 gets an SIX lock on R, then
repeatedly gets an S lock on tuples of
R, and occasionally upgrades to X on
the tuples.
• T2 uses an index to read only part of R:
– T2 gets an IS lock on R, and
repeatedly gets an S lock on tuples of
R.
• T3 reads all of R:
– T3 gets an S lock on R.
– OR, T3 could behave like T2; can
use lock escalation to decide which.
Dynamic Databases
• If we relax the assumption that the DB is a fixed collection of
objects, even Strict 2PL will not assure serializability:
– T1 locks all pages containing sailor records with
rating = 1, and finds oldest sailor (say, age = 71).
– Next, T2 inserts a new sailor; rating = 1, age = 96.
– T2 also deletes oldest sailor with rating = 2 (and,
say, age = 80), and commits.
– T1 now locks all pages containing sailor records
with rating = 2, and finds oldest (say, age = 63).
• There is no consistent DB state in which this could happen; yet T1 “correctly” gets through!
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: dates, rname: string)
The Problem
• T1 implicitly assumes that it has locked the set of all
sailor records with rating = 1.
– Assumption only holds if no sailor records are
added while T1 is executing!
– Need some mechanism to enforce this
assumption. (Index locking and predicate
locking.)
• Example shows that conflict serializability guarantees
serializability only if the set of objects is fixed!
Index Locking
• If there is a dense index on the rating field using
Alternative (2), T1 should lock the index page containing
the data entries with rating = 1.
– If there are no records with rating = 1, T1 must
lock the index page where such a data entry
would be, if it existed!
• If there is no suitable index, T1 must lock all pages, and
lock the file/table to prevent new pages from being added,
to ensure that no new records with rating = 1 are added.
[Figure: dense index on rating; the index page holding the data entries with r = 1 points to the data pages.]
Predicate Locking
• Grant lock on all records that satisfy some
logical predicate, e.g. age > 2*salary.
• Index locking is a special case of predicate
locking for which an index supports efficient
implementation of the predicate lock.
• What is the predicate in the sailor example?
• In general, predicate locking has a lot of
locking overhead.
Locking in B+ Trees
• How can we efficiently lock a particular leaf
node?
– Btw, don’t confuse this with multiple granularity locking!
• One solution: Ignore the tree structure, just lock
pages while traversing the tree, following 2PL.
• This has terrible performance!
– Root node (and many higher level nodes) become
bottlenecks because every tree access begins at the root.
Two Useful Observations
• Higher levels of the tree only direct searches for
leaf pages.
• For inserts, a node on a path from root to modified
leaf must be locked (in X mode, of course) only if
a split can propagate up to it from the modified
leaf. (A similar point holds w.r.t. deletes.)
• We can exploit these observations to design
efficient locking protocols that guarantee
serializability even though they violate 2PL.
A Simple Tree Locking Algorithm
• Search: Start at root and go down; repeatedly, S lock child
then unlock parent.
• Insert/Delete: Start at root and go down, obtaining X locks
as needed. Once child is locked, check if it is safe:
– If child is safe, release all locks on ancestors.
• Safe node: Node such that changes will not propagate up
beyond this node.
– Inserts: Node is not full.
– Deletes: Node is not half-empty.
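The search protocol above (lock-couple down the tree, releasing the parent once the child is locked) can be sketched as follows; the node layout and lock/unlock callbacks are illustrative:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list for a leaf

def crab_search(root, key, lock, unlock):
    """Lock-coupling search: S-lock the child, then unlock the parent."""
    lock(root)
    node = root
    while node.children:
        # pick the child subtree that may contain `key`
        i = sum(k <= key for k in node.keys)
        child = node.children[i]
        lock(child)
        unlock(node)          # parent no longer needed for this search
        node = child
    return node               # the leaf, still locked for the caller
```

Because the parent is released as soon as the child is locked, the root is a bottleneck only momentarily, unlike under plain 2PL.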
[Figure: B+ tree with root and interior nodes A–F, leaves G, H, I; interior keys 20, 23, 35, 38, 44; leaf entries 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.]
Do:
1) Search 38*
2) Delete 38*
3) Insert 45*
4) Insert 25*
A Better Tree Locking Algorithm (See
Bayer-Schkolnick paper)
• Search: As before.
• Insert/Delete:
– Set locks as if for search, get to leaf, and set X
lock on leaf.
– If leaf is not safe, release all locks, and restart
Xact using previous Insert/Delete protocol.
• Gambles that only leaf node will be modified; if not, S locks
set on the first pass to leaf are wasteful. In practice, better
than previous alg.
Example
[Figure: the same B+ tree as before – root and interior nodes A–F, leaves G, H, I; interior keys 20, 23, 35, 38, 44; leaf entries 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.]
Do:
1) Delete 38*
2) Insert 25*
3) Insert 45*
4) Insert 45*,
then 46*
Even Better Algorithm
• Search: As before.
• Insert/Delete:
– Use original Insert/Delete protocol, but set IX
locks instead of X locks at all nodes.
– Once leaf is locked, convert all IX locks to X
locks top-down: i.e., starting from node nearest
to root. (Top-down reduces chances of
deadlock.)
(Contrast use of IX locks here with their use in
multiple-granularity locking.)
Hybrid Algorithm
• The likelihood that we really need an X lock
decreases as we move up the tree.
• Hybrid approach: set S locks near the top of the tree,
SIX locks in the middle levels, and X locks near the
leaves.
Multiversioning
• Similar to the timestamp ordering approach, but transactions are allowed
to access “old” versions of data items.
• A history of the values and timestamps (versions) of each item
is kept.
• When the value of an item is needed, the system chooses a
proper version of the item that maintains serializability.
• This results in fewer aborted transactions at the cost of greater
complexity to maintain more versions of each item.
• We will look at a scheme in which several versions X1,…,Xk of each data item
are kept. For each Xi we also keep
– read TS(Xi) – as for timestamp ordering.
– write TS(Xi) – as for timestamp ordering.
• Read and write are done as follows for a transaction P with
timestamp T.
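A minimal sketch of the standard multiversion timestamp-ordering reads and writes (the list-based version store is illustrative): a read always succeeds against the newest version no younger than T; a write aborts only if a younger transaction has already read the version it would supersede.

```python
class Aborted(Exception):
    pass

# versions[X] is a list of [value, read_ts, write_ts], ordered by write_ts
versions = {"x": [[0, 0, 0]]}

def mv_read(T, X):
    """Return the value of the version with the largest write TS <= T."""
    v = max((v for v in versions[X] if v[2] <= T), key=lambda v: v[2])
    v[1] = max(v[1], T)       # bump that version's read timestamp
    return v[0]

def mv_write(T, X, value):
    """Create a new version of X with read/write timestamps T."""
    v = max((v for v in versions[X] if v[2] <= T), key=lambda v: v[2])
    if v[1] > T:              # a younger txn already read what T would supersede
        raise Aborted(f"write({X}) by {T}")
    versions[X].append([value, T, T])
    versions[X].sort(key=lambda v: v[2])
```

Reads never abort, which is where the improvement over single-version timestamp ordering comes from.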
• Note: Cascading rollback and cyclic restart problems can still occur,
but should be reduced.
• However, there is an increased overhead in maintaining multiple
versions of items.
Optimistic scheduling
• In two-phase locking, timestamp ordering, and multiversioning
concurrency control techniques, a certain degree of checking is done
before a database operation can be executed.
• The idea here is to push on and hope for the best!
• No checking is done while the transaction is executing.
• The protocol has three phases.
– read phase – A transaction can read data items from the database into
local variables. However, updates are applied only to local copies of the
data items kept in the transaction workspace.
– validation phase – checks are made to ensure that serializability is not
violated,
– write phase – if validation succeeds, updates are applied and the
transaction is committed. Otherwise, the updates are discarded and the
transaction is restarted.
• One scheme uses timestamps and keeps, for each transaction, its
– read set – the set of items read by the transaction, and
– write set – the set of items written by the transaction.
• During validation, we check that the transaction does not interfere
with any transaction that is committed or currently validating.
• Each transaction T is assigned 3 timestamps:
Start(T), Validation(T), Finish(T).
• To pass the validation test for T, one of the following must hold for every
other committed or validating transaction S:
– 1. Finish(S) < Start(T); or
– 2. for each S such that Start(T) < Finish(S):
a) the write set of S is disjoint from the read set of T, and
b) Finish(S) < Validation(T).
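The validation test can be sketched as follows (the dict-based transaction records are illustrative):

```python
def validate(T, others):
    """Validate transaction T against every transaction S that has committed
    or is currently validating. Each record has start, validation, finish
    timestamps and read_set/write_set; finish is None if S is still writing."""
    for S in others:
        if S["finish"] is not None and S["finish"] < T["start"]:
            continue                                   # rule 1: S finished first
        rule2 = (
            not (S["write_set"] & T["read_set"])       # rule 2a: disjoint sets
            and S["finish"] is not None
            and S["finish"] < T["validation"]          # rule 2b
        )
        if not rule2:
            return False
    return True
```

If `validate` returns False, T's local updates are discarded and T restarts, as described in the write phase above.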
• Optimistic control is a good option if there is not much
interaction between transactions.
• Note: Our earlier treatment of recovery methods largely
ignored concurrency issues.
2PL vs. TSO vs. MV vs. OP
• A comparison among the two-phase locking (2PL), timestamp ordering
(TSO), multiversioning (MV), and optimistic (OP) concurrency control
techniques.
• MV should provide the highest degree of concurrency (on average).
However, we need to maintain multiple versions of each data item.
• 2PL can offer the second highest degree of concurrency (on average),
but can result in deadlocks. To deal with them, we either
– need additional computation to detect and resolve the
deadlocks, or
– reduce the degree of concurrency by adding restrictions that
prevent deadlocks.
2PL vs. TSO vs. MV vs. OP (cont.)
• If most transactions are very short, we can use 2PL with deadlock
detection and resolution.
• TSO offers a lower degree of concurrency than 2PL with a proper
deadlock resolution scheme. However, TSO does not cause deadlocks;
other problems, such as cyclic restart and cascading rollback, can
appear instead.
• If there is not much interaction between transactions, OP is a very
good choice. Otherwise, OP is a bad choice.