
Transactions, Recovery and Concurrency (II)

Concurrency Control


Concurrency Control Methods

• Locking Mechanism

The idea of locking a data item X is to:
• give a transaction exclusive use of the data item X, while
• not restricting access to other data items.

This prevents one transaction from changing a data item
currently being used in another transaction.

• We will discuss a simple locking scheme which locks
individual items, using read and write locks


Locking Rules

• In this scheme, every transaction T must obey the following rules.

• 1) If T has only one operation (read/write) manipulating an item X:
– obtain a read lock on X before reading it,
– obtain a write lock on X before writing it,
– unlock X when done with it.

• 2) If T has several operations manipulating X:
– obtain a single appropriate lock on X:
a read lock if all operations on X are reads;
a write lock if at least one of the operations on X is a write.
– unlock X after the last operation on X in T has been executed.


Locking Rules (cont.)

• In this scheme,
– Several read locks can be issued on the same data item at the same time.

– A read lock and a write lock cannot be held on the same data item at the same time, nor can two write locks (see the compatibility sketch below).

• This still does not guarantee serializability.
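As an illustration (not from the slides), a minimal Python sketch of the compatibility test a lock manager could apply under these rules; the function and lock names are my own:

# Minimal sketch of read/write lock compatibility under the rules above.
# A data item may hold many read locks, but a write lock excludes everything else.

READ, WRITE = "read", "write"

def can_grant(requested_mode, held_modes):
    """Return True if a lock of requested_mode can coexist with held_modes."""
    if requested_mode == READ:
        # Read locks are compatible with other read locks only.
        return all(m == READ for m in held_modes)
    # A write lock is incompatible with any existing lock.
    return len(held_modes) == 0

# Examples:
print(can_grant(READ,  [READ, READ]))   # True  - several read locks at once
print(can_grant(WRITE, [READ]))         # False - read + write not allowed
print(can_grant(WRITE, []))             # True  - write lock on a free item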


(Example omitted here: an interleaved schedule of two transactions T1 and T2 illustrating that these locking rules alone do not guarantee a serializable schedule.)

Two Phase Locking (2PL)

• To guarantee serializability, transactions must also obey the two-phase locking protocol:

– Growing Phase: all locks for a transaction must be obtained before any locks are released, and

– Shrinking Phase: gradually release all locks (once a lock is released, no new locks may be requested).
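To make the two phases concrete, here is a minimal sketch (my own illustration, not code from the lecture) of a transaction object that enforces 2PL; the class and item names are assumed:

# Sketch of two-phase locking: acquire all locks first (growing phase),
# do the work, then release them all (shrinking phase).

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False      # becomes True once the first lock is released

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: no new locks after releasing one")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True       # entering the shrinking phase
        self.held.discard(item)

# Usage: T1 acquires locks on x and y, works, then releases both.
t = TwoPhaseTransaction("T1")
t.lock("x")
t.lock("y")
# ... read/write x and y here ...
t.unlock("x")
t.unlock("y")
# t.lock("z") would now raise: no new locks once shrinking has begun.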


Two Phase Locking (2PL) (Cont.)

• Locking thus provides a solution to the problem of correctness of schedules.

• Two-phase locking ensures conflict serializability.

Deadlock

• A problem that arises with locking is deadlock.
• Deadlock occurs when two transactions are each waiting for a lock on an item held by the other.


Deadlock Check

• Create the wait-for graph for currently active transactions:
– create a vertex for each transaction; and
– an arc from Ti to Tj if Ti is waiting for an item locked by Tj.

• If the graph has a cycle, then a deadlock has occurred.
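A small illustrative sketch (not from the slides) of building a wait-for graph and detecting a cycle with depth-first search:

# Wait-for graph: an edge Ti -> Tj means Ti is waiting for an item locked by Tj.
# A cycle in this graph means a deadlock has occurred.

def has_deadlock(wait_for):
    """wait_for: dict mapping a transaction to the set of transactions it waits for."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in wait_for}

    def dfs(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:      # back edge: a cycle exists
                return True
            if colour.get(u, WHITE) == WHITE and dfs(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: a deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False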


Several methods to deal with deadlocks

• deadlock detection
– periodically check for deadlocks, abort and roll back some transactions (restart them later). This is a good choice if transactions are very short or very independent.


Several methods to deal with deadlocks (Cont.)

• deadlock prevention – Assign priorities based on timestamps (an older transaction, i.e. one with a smaller timestamp, has higher priority). Assume Ti wants a lock that Tj holds. Two policies are possible:

– Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts ("dies").

– Wound-Wait: If Ti has higher priority, Tj aborts (Ti "wounds" Tj); otherwise Ti waits.

• If a transaction restarts, make sure it keeps its original timestamp, so that it cannot be starved forever.
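For illustration (not from the slides), a sketch of the two policies, assuming a smaller timestamp means an older transaction with higher priority:

# Ti (timestamp ts_i) requests a lock held by Tj (timestamp ts_j).
# Smaller timestamp = older transaction = higher priority.

def wait_die(ts_i, ts_j):
    """What Ti does under Wait-Die."""
    return "Ti waits" if ts_i < ts_j else "Ti aborts (dies)"

def wound_wait(ts_i, ts_j):
    """What happens under Wound-Wait."""
    return "Tj aborts (Ti wounds Tj)" if ts_i < ts_j else "Ti waits"

# An old transaction (ts=1) meets a young lock holder (ts=5):
print(wait_die(1, 5))     # Ti waits
print(wound_wait(1, 5))   # Tj aborts (Ti wounds Tj)
# A young transaction (ts=5) meets an old lock holder (ts=1):
print(wait_die(5, 1))     # Ti aborts (dies)
print(wound_wait(5, 1))   # Ti waits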


Timestamp ordering

• The idea here is:

– to assign each transaction a timestamp (e.g. start time of transaction),
and

– to ensure that the schedule used is equivalent to executing the
transactions in timestamp order


• Each data item, X, is assigned

– a read timestamp, read TS(X) – the latest timestamp of a transaction that
read X, and

– a write timestamp, write TS(X) – the latest timestamp of a transaction that wrote X.


• These are used in read and write operations as follows. Suppose the
transaction timestamp is T.
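The slide's rules are not reproduced in this extract. Below is a sketch of the standard basic timestamp-ordering checks that the text describes; the Item class and function names are my own illustration:

# Basic timestamp ordering for one item X, which carries read_TS(X) and write_TS(X).
# T is the timestamp of the transaction issuing the operation.

class Item:
    def __init__(self):
        self.read_ts = 0    # latest timestamp of a transaction that read X
        self.write_ts = 0   # latest timestamp of a transaction that wrote X

def to_read(x, T):
    if T < x.write_ts:                    # X was already overwritten by a younger transaction
        return "abort"                    # T's read arrives too late
    x.read_ts = max(x.read_ts, T)
    return "read ok"

def to_write(x, T):
    if T < x.read_ts or T < x.write_ts:   # a younger transaction already read or wrote X
        return "abort"
    x.write_ts = T
    return "write ok"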


• Thomas’ write rule: if T < write TS(X) (but T ≥ read TS(X)), the write is obsolete and is simply ignored instead of aborting the transaction.
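A sketch of how the write rule changes under Thomas' write rule, reusing the illustrative Item structure from the sketch above:

def thomas_write(x, T):
    # x is an Item (read_ts / write_ts fields) from the timestamp-ordering sketch above.
    if T < x.read_ts:         # a younger transaction already read X: too late, abort
        return "abort"
    if T < x.write_ts:        # obsolete write: skip it rather than aborting
        return "ignore write"
    x.write_ts = T
    return "write ok"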


(Worked example omitted here: a schedule of three transactions T1, T2, T3 with timestamps 1, 2 and 3, issuing read and write operations on items x, y and z. Starting from r_TS = w_TS = 0 for every item, each operation updates the item's read or write timestamp to the issuing transaction's timestamp.)

• Some problems:
– Cyclic restart: There is no deadlock, but a kind of livelock can occur; some transactions may be constantly aborted and restarted.

– Cascading rollback: When a transaction is rolled back, so are any transactions which read a value written by it, any transactions which read a value written by them, and so on. This can be avoided by not allowing transactions to read values written by uncommitted transactions (make them wait).


Multiple-Granularity Locks


• Hard to decide what granularity to lock (tuples vs.
pages vs. tables).

• Shouldn’t have to decide!
• Data “containers” are nested: the Database contains Tables, which contain Pages, which contain Tuples.

Solution: New Lock Modes, Protocol


• Allow Xacts to lock at each level, but with a special
protocol using new “intention” locks:

❖ Before locking an item, Xact
must set “intention locks” on all
its ancestors.

❖ For unlock, go from specific to
general (i.e., bottom-up).

❖ SIX mode: Like S & IX at the
same time.

Lock compatibility matrix (✓ = the requested mode, left column, is compatible with the mode already held, top row):

          —     IS    IX    S     X
    IS    ✓     ✓     ✓     ✓
    IX    ✓     ✓     ✓
    S     ✓     ✓           ✓
    X     ✓

Multiple Granularity Lock Protocol


• Each Xact starts from the root of the hierarchy.
• To get an S or IS lock on a node, must hold IS or IX on the parent node.
– What if the Xact holds SIX on the parent? S on the parent?

• To get an X, IX, or SIX lock on a node, must hold IX or SIX on the parent node.

• Must release locks in bottom-up order.

Protocol is correct in that it is equivalent to directly setting
locks at the leaf levels of the hierarchy.
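An illustrative sketch (my own, not the lecture's) of the parent-lock requirement; SIX is treated as including IX:

# Before granting a lock on a node, check the required intention lock on its parent.
# S/IS on a node requires IS or IX on the parent; X/IX/SIX requires IX or SIX.

REQUIRED_ON_PARENT = {
    "IS":  {"IS", "IX", "SIX"},      # SIX counts as holding IX (and S)
    "S":   {"IS", "IX", "SIX"},
    "IX":  {"IX", "SIX"},
    "X":   {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
}

def may_request(mode, parent_modes):
    """parent_modes: set of lock modes this Xact already holds on the parent node."""
    return bool(REQUIRED_ON_PARENT[mode] & parent_modes)

# Holding IX on the table allows an X lock on one of its pages:
print(may_request("X", {"IX"}))    # True
# Holding only IS on the table does not allow an X lock on a page:
print(may_request("X", {"IS"}))    # False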

Examples


• T1 scans R, and updates a few tuples:
– T1 gets an SIX lock on R, then repeatedly gets an S lock on tuples of R, and occasionally upgrades to X on the tuples.

• T2 uses an index to read only part of R:
– T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R.

• T3 reads all of R:
– T3 gets an S lock on R.
– OR, T3 could behave like T2; can use lock escalation to decide which.


Dynamic Databases


• If we relax the assumption that the DB is a fixed collection of objects, even Strict 2PL will not assure serializability:
– T1 locks all pages containing sailor records with rating = 1, and finds the oldest sailor (say, age = 71).
– Next, T2 inserts a new sailor: rating = 1, age = 96.
– T2 also deletes the oldest sailor with rating = 2 (say, age = 80), and commits.
– T1 now locks all pages containing sailor records with rating = 2, and finds the oldest (say, age = 63).

• No consistent DB state could have produced both of T1's answers; however, T1 “correctly” gets through!

Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: dates, rname: string)

The Problem


• T1 implicitly assumes that it has locked the set of all sailor records with rating = 1.
– The assumption only holds if no sailor records are added while T1 is executing!
– Need some mechanism to enforce this assumption. (Index locking and predicate locking.)

• The example shows that conflict serializability guarantees serializability only if the set of objects is fixed!

Index Locking


• If there is a dense index on the rating field using Alternative (2), T1 should lock the index page containing the data entries with rating = 1.
– If there are no records with rating = 1, T1 must lock the index page where such a data entry would be, if it existed!

• If there is no suitable index, T1 must lock all pages, and lock the file/table to prevent new pages from being added, to ensure that no new records with rating = 1 are added.

(Diagram omitted: a dense index on rating whose r = 1 entries point to the corresponding data records.)

Predicate Locking


• Grant a lock on all records that satisfy some logical predicate, e.g. age > 2*salary.

• Index locking is a special case of predicate locking, for which an index supports efficient implementation of the predicate lock.

• What is the predicate in the sailor example?

• In general, predicate locking has a lot of locking overhead.
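A toy sketch of the idea (predicates represented as Python functions; the transactions and records here are invented for illustration):

# A predicate lock covers every record, present or future, that satisfies its predicate.
# Before inserting or updating a record, check it against predicates locked by others.

held_predicates = [
    ("T1", lambda r: r["rating"] == 1),           # T1's lock in the sailor example
    ("T4", lambda r: r["age"] > 2 * r["salary"]), # the age > 2*salary example above
]

def conflicts(record):
    """Return the transactions whose predicate locks cover this record."""
    return [xact for xact, pred in held_predicates if pred(record)]

new_sailor = {"rating": 1, "age": 96, "salary": 50}
print(conflicts(new_sailor))    # ['T1'] - the insert would conflict with T1's lock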

Locking in B+ Trees


• How can we efficiently lock a particular leaf node?
– Btw, don’t confuse this with multiple-granularity locking!

• One solution: ignore the tree structure, just lock pages while traversing the tree, following 2PL.

• This has terrible performance!
– The root node (and many higher-level nodes) become bottlenecks because every tree access begins at the root.

Two Useful Observations


• Higher levels of the tree only direct searches for leaf pages.

• For inserts, a node on the path from the root to the modified leaf must be locked (in X mode, of course) only if a split can propagate up to it from the modified leaf. (A similar point holds w.r.t. deletes.)

• We can exploit these observations to design efficient locking protocols that guarantee serializability even though they violate 2PL.

A Simple Tree Locking Algorithm


• Search: Start at the root and go down; repeatedly, S lock the child, then unlock the parent.

• Insert/Delete: Start at the root and go down, obtaining X locks as needed. Once a child is locked, check if it is safe:
– If the child is safe, release all locks on ancestors.

• Safe node: a node such that changes will not propagate up beyond this node.
– Inserts: the node is not full.
– Deletes: the node is not half-empty.
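For the search side of this protocol (often called lock coupling or "crabbing"), a small self-contained sketch; the Node class and the lock-manager callbacks are simplified stand-ins of my own:

# Lock-coupling ("crabbing") search: S-lock the child before releasing the parent,
# so no concurrent restructuring can slip in between the two nodes.

class Node:
    def __init__(self, keys, children=None, leaf=False):
        self.keys, self.children, self.is_leaf = keys, children or [], leaf

    def child_for(self, key):
        # First child whose separator key exceeds the search key (simplified).
        for i, k in enumerate(self.keys):
            if key < k:
                return self.children[i]
        return self.children[-1]

def search(root, key, s_lock, unlock):
    s_lock(root)
    node = root
    while not node.is_leaf:
        child = node.child_for(key)
        s_lock(child)          # lock the child first ...
        unlock(node)           # ... then release the parent
        node = child
    return node                # the leaf is returned still S-locked

# Tiny two-level tree and trivial "lock manager" callbacks for demonstration.
leaf1, leaf2 = Node([22, 23], leaf=True), Node([38, 41], leaf=True)
root = Node([35], [leaf1, leaf2])
locked = []
leaf = search(root, 38, s_lock=locked.append, unlock=locked.remove)
print(leaf.keys, len(locked))  # [38, 41] 1 - only the leaf is still locked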

(Figure omitted: an example B+ tree with root A, internal nodes B to F, and leaves G, H, I holding entries such as 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.)

Do:
1) Search 38*
2) Delete 38*
3) Insert 45*
4) Insert 25*

A Better Tree Locking Algorithm (See
Bayer-Schkolnick paper)


• Search: As before.
• Insert/Delete:
– Set locks as if for search, get to the leaf, and set an X lock on the leaf.
– If the leaf is not safe, release all locks, and restart the Xact using the previous Insert/Delete protocol.

• Gambles that only the leaf node will be modified; if not, the S locks set on the first pass to the leaf are wasteful. In practice, better than the previous algorithm.

Example


(Figure omitted: the same example B+ tree as before.)

Do:
1) Delete 38*
2) Insert 25*
3) Insert 45*
4) Insert 45*, then 46*

Even Better Algorithm


• Search: As before.
• Insert/Delete:
– Use the original Insert/Delete protocol, but set IX locks instead of X locks at all nodes.
– Once the leaf is locked, convert all IX locks to X locks top-down, i.e., starting from the node nearest the root. (Top-down conversion reduces the chance of deadlock.)

(Contrast the use of IX locks here with their use in multiple-granularity locking.)

Hybrid Algorithm


• The likelihood that we really need an X lock
decreases as we move up the tree.

• Hybrid approach: set S locks on nodes near the root, SIX locks on nodes in the middle levels, and X locks on nodes near the leaves.

Multiversioning

• Similar to the timestamp ordering approach, but transactions are allowed to access “old” versions of a data item.

• A history of the values and timestamps (versions) of each item
is kept.

• When the value of an item is needed, the system chooses a
proper version of the item that maintains serializability.

• This results in fewer aborted transactions at the cost of greater
complexity to maintain more versions of each item.


• We will look at a scheme in which several versions X1, …, Xk of each data item are kept. For each Xi we also keep

– read TS(Xi) – as for timestamp ordering, and

– write TS(Xi) – as for timestamp ordering.


• Read and write are done as follows for a transaction P with
timestamp T.
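The slide's rules are not reproduced in this extract. Below is a sketch of the standard multiversion timestamp-ordering behaviour that the text describes; the MVItem class and its representation are my own illustration:

# Multiversion timestamp ordering: each item keeps versions (value, write_ts, read_ts).
# A reader with timestamp T uses the version with the largest write_ts <= T.

class MVItem:
    def __init__(self, initial):
        self.versions = [{"value": initial, "write_ts": 0, "read_ts": 0}]

    def _version_for(self, T):
        # Latest version written at or before T.
        return max((v for v in self.versions if v["write_ts"] <= T),
                   key=lambda v: v["write_ts"])

    def read(self, T):
        v = self._version_for(T)
        v["read_ts"] = max(v["read_ts"], T)
        return v["value"]

    def write(self, T, value):
        v = self._version_for(T)
        if v["read_ts"] > T:          # a younger transaction already read that version
            return "abort"
        if v["write_ts"] == T:        # the same transaction overwrites its own version
            v["value"] = value
        else:
            self.versions.append({"value": value, "write_ts": T, "read_ts": T})
        return "write ok"

# Usage:
x = MVItem(10)
print(x.read(3))        # 10, and the initial version now has read_ts = 3
print(x.write(2, 20))   # abort - a younger transaction already read the version T2 would replace
print(x.write(4, 40))   # write ok - a new version with write_ts = 4 is created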


• Note: Cascading rollback and cyclic restart problems can still occur,
but should be reduced.

• However, there is an increased overhead in maintaining multiple
versions of items.


Optimistic scheduling

• In two-phase locking, timestamp ordering, and multiversioning
concurrency control techniques, a certain degree of checking is done
before a database operation can be executed.

• The idea here is to push on and hope for the best!

• No checking is done while the transaction is executing.


• The protocol has three phases.

– read phase – A transaction can read data items from the database into
local variables. However, updates are applied only to local copies of the
data items kept in the transaction workspace.

– validation phase – checks are made to ensure that serializability is not
violated,

– write phase – if validation succeeds, updates are applied and the transaction is committed. Otherwise, the updates are discarded and the transaction is restarted.


• A scheme uses timestamps and keeps each transaction’s

– read-set – the set of items read by the transaction,

– write-set – the set of items written by the transaction.

• During validation, we check that the transaction does not interfere
with any transaction that is committed or currently validating.
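For illustration, a minimal sketch of such a validation check, assuming each transaction records Start, Validation and Finish timestamps plus its read and write sets (the exact conditions appear on the next slide); all names here are my own:

# Validation for transaction T against every transaction S that committed or is
# validating: S is harmless if it finished before T started, or if it finished
# before T validates and wrote nothing that T read.

def validate(T, others):
    """T and each S in others are dicts with 'start', 'validation', 'finish',
    'read_set', 'write_set' keys (finish may be None if S has not finished)."""
    for S in others:
        if S["finish"] is not None and S["finish"] < T["start"]:
            continue                                    # condition 1
        if (S["finish"] is not None
                and S["finish"] < T["validation"]
                and not (S["write_set"] & T["read_set"])):
            continue                                    # condition 2
        return False                                    # T must be restarted
    return True                                         # T may enter its write phase

T  = {"start": 10, "validation": 15, "finish": None,
      "read_set": {"x"}, "write_set": {"y"}}
S1 = {"start": 1, "validation": 5, "finish": 8,
      "read_set": {"x"}, "write_set": {"x"}}            # overlaps on x, but finished first
print(validate(T, [S1]))   # True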


• Each transaction T is assigned 3 timestamps:
Start(T), Validation(T), Finish(T).

• To pass the validation test for T, one of the following must be true for each transaction S that is committed or currently validating:

– 1. Finish(S) < Start(T); or

– 2. for S such that Start(T) < Finish(S):
a) the write set of S is disjoint from the read set of T, and
b) Finish(S) < Validation(T).

• Optimistic control is a good option if there is not much interaction between transactions.

• Note: Our earlier treatment of recovery methods largely ignored concurrency issues.

2PL vs. TSO vs. MV vs. OP

• A comparison among the two-phase locking (2PL), timestamp ordering (TSO), multiversioning (MV) and optimistic (OP) concurrency control techniques.

• MV should provide the greatest degree of concurrency (on average). However, we need to maintain multiple versions of each data item.

• 2PL can offer the second greatest degree of concurrency (on average), but it can result in deadlocks. To deal with deadlocks, either
– additional computation is needed to detect and resolve them, or
– the degree of concurrency is reduced to prevent deadlocks by adding other restrictions.

2PL vs. TSO vs. MV vs. OP (cont.)

• If most transactions are very short, we can use 2PL plus deadlock detection and resolution.

• TSO offers a lower degree of concurrency than 2PL with a proper deadlock resolution scheme; however, TSO does not cause deadlocks. Other problems, such as cyclic restart and cascading rollback, can appear in TSO.

• If there is not much interaction between transactions, OP is a very good choice; otherwise, OP is a bad choice.
– 1. Finish(S) < Start(T); or – 2. for S s.t. Start(T) < Finish(S), then a) write set of S is disjoint from the read set of T, and b) Finish(S) < Validation(T). 3/30/20 41 • Optimistic control is a good option if there is not much interaction between transactions. • Note: Our earlier treatment of recovery methods largely ignored concurrency issues. 3/30/20 42 2PL vs. TSO vs. MV vs. OP • A Comparison among two-phase locking (2PL), timestamp ordering (TSO), multiversioning (MV), optimistic (OP) concurrency control techniques. • MV should provide the greatest concurrency degree (in average). However, we need to maintain multiversions for each data item. • 2PL can offer the second greatest concurrency degree (in average); but will result in deadlocks. To resolve the deadlocks, either – need additional computation to detect deadlocks and to resolve the deadlocks, or – reduce the concurrency degree to prevent deadlocks by adding other restrictions. 3/30/20 43 2PL vs. TSO vs. MV vs. OP (cont.) • If most transactions are very short, we can use 2PL + deadlock detection and resolution. • TSO has a less concurrency degree than that of 2PL if a proper deadlock resolution is found. However, TSO does not cause deadlocks. Other problems, such as cyclic restart and cascading rollback, will appear in TSO. • If there are not much interaction between transactions, OP is a very good choice. Otherwise, OP is a bad choice. 3/30/20 44