IT代写 Concurrency control (1)

Concurrency control (1)
Permits the concurrent execution different transactions
• Crucial in information systems with high load • banks,financialinstitution,airlinebooking
• Permits to efficiently use the DBMS

Copyright By PowCoder代写 加微信 powcoder

• Maximizing the number of served transactions • Minimizing response times
• Application load measured in transactions per second (tps) • Typical values: from tens to thousands

Concurrency control (2)
Mediates access requests to the data
• Decides to grant or deny requests
• Establishes the order of accesses (scheduler)

Concurrency control (3)
For the sake of lectures, we consider two simplifying assumptions:
• Databases in terms of abstract objects x, y, z with numerical values
• Read/write operations on object x, as read/write of the whole page where x is stored

Concurrency control: architecture
read, write begin, commit, abort Lock table
read, write
(not all and possibly in different order)
Access methods manager
Transaction manager
Concurrency manager
Secondary storage manager

Concurrency control (4)
The concurrent execution of different transactions can cause correctness issues and anomalies
• Update loss
• Dirty read
• Inconsistent reads • Ghost update
• Ghost insert

Update loss
Updates by a transaction lost because overwritten by a concurrent transaction
Transaction t1
x := x + 1
w1(x) commit
Transaction t2
x := x + 1
w2(x) Initial value: 2
commit Final value: 3 instead of 4
(the update by t1 is lost)

Dirty read
A transaction reads the intermediate result of another transaction that then aborts (whose modification is then cancelled)
Transaction t1
x := x + 1 w1(x)
Transaction t2
x := x + 1
w2(x) commit
t2 reads an intemediate state which is then cancelled

Inconsistent reads
A transaction reads objects that another transaction is modifying: some read operations are before, others are after the updates
Transaction t1
bot r1(x) w1(y)
r1(x) w1(z) commit
Transaction t2
x := x + 1 w2(x) commit
t1 reads different values for x

Ghost update
A transaction observes only a subset of the effects of another transaction (observing a status of the data that does not satisfy
integrity constraints)
Transaction t1 bot
Transaction t2 bot
y := y – 100 r2(z)
z := z + 100 w2(y)
w2(z) commit
r1(x) r1(y)
Constraint: x+y+z=1000;
s=1100 but the constraint is satisfied
s := x + y + z commit

Ghost insert
A transaction evaluates twice an aggregated value about a set of elements in a selection condition
select avg(Mark) from Exam
where Subject=‘IM’
If a new tuple is inserted between an evaluation and the subsequent one, the results may be different
The anomaly cannot be prevented referring only to already known data

Concurrency control theory
Formally a transaction
• is a sequence of read and write operations
• has a transaction identifier assigned by the system
• starts with begin transaction and ends with end transaction (omitted)
t1: r1(x) r1(y) w1(x) w1(y)

Sequence of input/output operations presented by concurrent transactions
• all the operations of each transaction that committed must appear in the schedule
• for each transaction, operations must appear in the schedule in the same order as in the transaction
For the study:
• we consider the commit-projection and ignore transactions that
(simplifying assumption, not acceptable in practice, will be removed afterwards)

Schedule: example
Transactions
• t1 : r1(x) w1(x) r1(y) w1(y) • t2 : r2(y) w2(y)
Some possible schedules
• S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y) • S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y) • S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y) • S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y)

Concurrency control
• Avoids schedules causing anomalies
• Managed by a module that accepts or refuses operations requested
by transactions (scheduler)
• Based on the identification of classes of acceptable schedules based on definitions of equivalence
• serial schedule
• serializable schedule

Serial and serializable schedules
Serial schedule
• For each transaction ti in the schedule, all the operations in ti are
executed consecutively
• n transactions Þ n! possible serial schedules
Serializable schedule
• non-serial schedule that produces the same result as a serial schedule of the same transactions
• need definition of equivalence among schedules
• progressive concepts: view-equivalence, conflict-equivalence, two-phase locking, timestamp-based

Serial schedule: example
Transactions
• t1 : r1(x) w1(x) r1(y) w1(y) • t2 : r2(y) w2(y)
Serial schedules:
• S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y) • S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y)
Possible schedules:
• S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y) • S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y)

View-serializability (1)
Preliminary definitions
• ri(x) reads-from wj(x) in a schedule S if:
• wj(x) precedes ri(x) in S
• there is no wk(x) between wj(x) and ri(x) in S
• wi(x) is a final write in a schedule S if: • itisthelastwriteonxinS
S: r1(x) w1(x) w1(y) r2(x) w2(y) • r2(x) read-from w1(x)
• w1(x) final write on x
• w2(y) final write on y

View-serializability (2)
View-equivalence
• two schedules Si and Sj (with i ≠ j) are view-equivalent (Si ov Sj) if they have
• the same read-from relations • the same final writes
View-serializability
• a schedule is view-serializable if it is • view-equivalent to a serial schedule
We denote with VSR the set of view-serializable schedules

View-serializability: example (1)
S: w0(x) r2(x) r1(x) w2(x) w2(z)

View-serializability: example (1)
S: w0(x) r2(x) r1(x) w2(x) w2(z)
• read-from:
• r2(x) ¬ w0(x)
• r1(x) ¬ w0(x)
• final write: • x: w2(x) • z: w2(z)
• view-equivalent to the serial schedule w0(x) r1(x) r2(x) w2(x) w2(z)
Þ view-serializable

View-serializability: example (2)
S: w0(x) r1(x) w1(x) r2(x) w1(z)

View-serializability: example (2)
S: w0(x) r1(x) w1(x) r2(x) w1(z)
• read-from:
• r1(x) ¬ w0(x)
• r2(x) ¬ w1(x)
• final write: • x: w1(x) • z: w1(z)
• view-equivalent to the serial schedule w0(x) r1(x) w1(x) w1(z) r2(x)
Þ view-serializable

View-serializability: example (3)
S: r1(x) r2(x) w2(x) w1(x)

View-serializability: example (3)
S: r1(x) r2(x) w2(x) w1(x)
• read-from: • r1(x) ¬
• final write: • x: w1(x)
Þ not view-serializable (loss of update)

View-serializability: example (4)
S: r1(x) r2(x) w2(x) r1(x)

View-serializability: example (4)
S: r1(x) r2(x) w2(x) r1(x)
• read-from: • r1(x) ¬
• r1(x) ¬ w2(x) • final write:
• x: w2(x)
Þ not view-serializable (inconsistent read)

View-serializability: example (5)
S: r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z)

View-serializability: example (5)
S: r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z)
• read-from: • r1(x) ¬ • r1(y) ¬ • r2(z) ¬
• r1(z) ¬ w2(z)
• final write: • y: w2(y) • z: w2(z)
Þ not view-serializable (ghost update)

VSR: exercise
w1(x) r1(x) w2(y) r3(x) w1(t) r2(y) r3(t) w3(x) w2(x) r2(t)

View-serializability: complexity
• Deciding if two schedules are view-equivalent: polynomial cost
• Deciding view-serializability of an arbitrary schedule: NP-hard problem
• it is necessary to compare the schedule with all the possible serial schedules
Note: view-equivalence not usable in practice due to its complexity • It is necessary to define more restrictive but practicable conditions

Conflict between operations
• Two operations ai and aj (i ≠ j) are in conflict if • they belong to two different transactions
• they access the same object
• at least one is a write
• read-write conflicts (rw o wr) • write-write conflicts (ww)

Conflict: example
t1: r1(x) r1(y) w1(x) w1(y) t2: r2(y) r2(x) w2(y)
Conflicts:
• w1(x), r2(x) • w1(y), r2(y) • w1(y), w2(y) • r1(y), w2(y)

Conflict-serializability
Conflict-equivalence
• two schedules Si and Sj (i ≠ j) are conflict-equivalent (Si oC Sj) if
• they include the same operations
• each pair of conflicting operations appear in the same order in both schedules
Conflict-serializability
• a schedule is conflict-serializable if it is • conflict-equivalent to a serial schedule
We use notation CSR to denote the set of conflict-serializable schedules

Conflict-serializability: example
S: w0(x) r1(x) w0(z) r1(z) r2(x) r3(z) w3(z) w1(x)

Conflict-serializability: example
S: w0(x) r1(x) w0(z) r1(z) r2(x) r3(z) w3(z) w1(x)
• conflicts:
• w0(x), r1(x)
• w0(x), r2(x) • w0(x), w1(x) • w0(z), r1(z) • w0(z), r3(z) • w0(z), w3(z) • r1(z), w3(z) • r2(x), w1(x)
Conflict-equivalent to serial schedule w0(x) w0(z) r2(x) r1(x) r1(z) w1(x) r3(z) w3(z)

Conflict-serializability: conflict graph
Could be evaluated building the conflict graph:
• each transaction is represented by a node
• for each pair ai and aj of operations in conflict such that ai precedes aj in the schedule, there is an edge from ti (transaction of ai) to tj (transaction of aj )
A schedule is conflict-serializable if and only if the graph is acyclic: it is conflict-equivalent to all the topological orders of the graph

Conflict graph: example
S: w0(x) r1(x) w0(z) r1(z) r2(x) r3(z) w3(z) w1(x)
• conflicts:
• w0(x), r1(x)
• w0(x), r2(x) • w0(x), w1(x) • w0(z), r1(z) • w0(z), r3(z) • w0(z), w3(z) • r1(z), w3(z) • r2(x), w1(x)

Conflict graph: example
S: w0(x) r1(x) w0(z) r1(z) r2(x) r3(z) w3(z) w1(x)
• conflicts:
• w0(x), r1(x)
• w0(x), r2(x) • w0(x), w1(x) • w0(z), r1(z) • w0(z), r3(z) • w0(z), w3(z) • r1(z), w3(z) • r2(x), w1(x)
Conflict-equivalent to serial schedule w0(x) w0(z) r2(x) r1(x) r1(z) w1(x) r3(z) w3(z)

CSR vs VSR
CSR Í VSR
• SÎCSRÞSÎVSR • SÏVSRÞSÏCSR
• SÏCSRÞdonotknowifVSR • SÎVSRÞdonotknowifCSR

CSR vs VSR: example
t1 : r1(x) w1(x)
t2 : w2(x) blind write t3 : w3(x) blind write
S : r1(x) w2(x) w1(x) w3(x) • S is view-serializable
• view-equivalent to t1, t2, t3 • S is not conflict-serializable

Conflict-serializability: complexity
• Deciding if a schedule is conflict-serializable has linear cost in the size of the graph
• However, it is practically too expensive, especially in case of distributed data
• The mechanisms adopted in practice are based on more restrictive, but also more efficient, conditions

VSR-CSR: exercise (1)
• r1(x) w2(y) w2(x) w1(y) r3(x) r3(y) w3(x) w3(y)

VSR-CSR: exercise (2)
• w2(x) r2(y) r1(x) r3(y) r2(z) w3(x) w4(y) w1(z) w4(z)

VSR-CSR: exercise (3)
• r1(x) w1(t) r3(y) w3(y) w1(x) w1(y) r2(t) r3(z) w3(t) r2(x) w2(z) w4(t)

VSR-CSR: exercise (4)
• w2(y) r3(z) w2(x) r1(y) w3(z) r4(x) w1(z) w3(x) w1(x)

Concurrency control
Systems used in practice do not evaluate serializability a-posteriori, but operate in such a way to guarantee such a property
Possible techniques:
• two phase locking
• timestamp
• mono-version • multi-version

Variable describing the status of data wrt the operations that can be executed over it
• two states (binary) lock • unlocked (0)
• locked (1)
• three states lock • unlocked
• r_locked (read, shared)
• w_locked (write, exclusive)

Locking: two states
• To access data, the transaction must ask a lock on the data
• The transaction releases the lock (unlock) when it does not need the
data anymore
• A transaction can access data only if it obtained the corresponding lock
A transaction that follows these rules is said to be well formed according to locking

Locking: three states
• To access data for reading it, the transaction must ask a r_lock on the data (shared)
• To access data for writing it, the transaction must ask a w_lock on the data (exclusive)
• The transaction releases the lock (unlock) when it does not need the data anymore
• A transaction can access data only if it obtained the corresponding lock
A transaction that follows these rules is said to be well formed according to locking

Lock management
Module that acts as lock manager
• Receives lock requests
• Grants or denies lock requests based on conflict tables
• If the request is denied, the requesting transaction is in wait status until the
lock is granted
• Properly modifies lock tables (status and possibly counter)

Conflict table: two states lock

Conflict table: three states lock
OK (c=c+1) r_locked
no* r_locked
OK (c=c-1) c=0: unlocked c>0: r_locked
*: unless the case of upgrade of an exclusive r_lock

Two phase locking
A transaction, after releasing a lock, cannot acquire other locks
• Each transaction has two phases • Increasing (acquires locks)
• Decreasing (releases locks) Guarantees serializability
Resource requests
increasing phase
decreasing phase

2PL schedule
The schedules produced by a scheduler using
• well formed transactions according to locking • conflict-based lock management
• two phase locking
belong to the 2PL class

2PL vs CSR vs VSR
2PL Í CSR Í VSR
• SÎ2PLÞSÎCSR • SÏCSRÞSÏ2PL
• SÏ2PLÞdonotknowifCSR • SÎCSRÞdonotknowif2PL

2PL vs CSR: example
S: r1(x) w1(x) r2(x) w2(x) r3(y) w1(y) • conflict-serializable
• conflict-equivalent to t3, t1, t2 • not 2PL

2PL: exercise (1)
• r1(x) w1(z) w2(t) r3(y) r4(y) w3(t) r1(z) w2(x) r4(x) w4(y)

2PL: exercise (2)
• r1(x) w3(t) r2(y) r2(z) w2(y) w1(z) r3(k) w2(t) w1(y)

Strict two phase locking
The scheduler operates without knowing how transactions will end, hence:
• we need to remove the hypothesis of using a commit-projection • we add a restriction to 2PL
Strict 2PL (used by commercial DBMSs)
• a transaction, after releasing a lock, cannot acquire other locks
• the locks of a transaction can be released only after commit/abort operations

Variations over 2PL
• Base 2PL
• a transaction, after releasing a lock, cannot acquire other locks
• Strict 2PL (prevents dirty reads) • base 2PL
• a transaction can release locks only after commit/abort
• Conservative 2PL (guarantees absence of deadlock) • base 2PL
• a transaction must acquire all the locks before starting executing operations

Concurrency control
Systems used in practice do not evaluate serializability a-posteriori, but operate in such a way to guarantee such a property
Possible techniques:
• two phase locking
• timestamp
• mono-version
• multi-version

Identifier that defines a total order among the temporal events of the system
• the timestamp of an event is greater than the timestamps of the events that preceded it
• no two events have the same timestamp
Possible implementations: • counter
• system clock

Transactions and timestamp
Each transaction is associated with a timestamp (ts)
• assigned by the system at the beginning of the transaction • represents the time at which the transaction started
A schedule is accepted only if it it reflects the serial order of transactions based on the timestamp value of the transactions

Timestamp (1)
Each object x has two indicators:
• RTM(x): max timestamp among transactions that have read x
• WTM(x): max timestamp among transactions that have written x

Timestamp (2)
Answer of the scheduler to requests
• read(x, ts)
• ts < WTM(x) à request denied, transaction aborted • Otherwiseàrequest accepted; RTM(x) := max(RTM(x),ts) • write(x, ts) • ts < WTM(x) or ts < RTM(x)àrequest denied, transaction aborted • Otherwiseàrequest accepted; WTM(x) := ts Timestamp: example write(x,8) write(x,11) read(x,10) Timestamp: example write(x,8) no (t8 killed) write(x,11) read(x,10) no (t10 killed) Timestamp: exercise write(x,7) read(x,10) write(x,8) read(x,13) read(x,11) Concurrency control with timestamps • there is no lock request on resources Þ free from deadlock • may kill many transactions • behaves correctly only under the commit-projection assumption • to remove the assumption it is necessary to buffer write operations (write on persistent memory only after commit) • read operations of buffer data wait for commit of the writing transaction Timestamp and multiversion Each write operation of an object generates a new copy (version) of the same • read operations requested by ‘old’ transactions are executed on old versions (instead of being denied and transactions aborted) For each object x: • there are multiple versions xi each with its own RTM and WTM • each write operation generates a new version • each read operation operates on the version created by the most recent write operation that precedes the read operation Transactions and timestamp with multiversion Answer of the scheduler to requests • read(x, ts) • request always accepted • read version xk such that WTM(xk) = max{WTM(xi) | WTM(xi) ≤ ts } • RTM(xk):= max(RTM(xk),ts) • write(x, ts) • find version xj such that WTM(xj) = max{WTM(xi) | WTM(xi) ≤ ts } • ts < RTM(xj)àrequest denied, transaction aborted • Otherwiseàrequest accepted; create new version xk and RTM(xk):=ts WTM(xk):=ts Timestamp and multiversion: example write(x,7) write(x,12) read(x,14) write(x,13) write(x,10) read(x,11) Timestamp and multiversion: example write(x,7) no (t7 killed) write(x,12) read(x,14) write(x,13) no (t13 killed) write(x,10) read(x,11) Timestamp and multiversion: exercise write(x, 7) read(x, 15) write(x, 10) read(x, 8) write(x, 18) read(x, 6) write(x, 16) TSvs2PLvsCSRvsVSR • SÎTSÞSÎCSR • SÏCSRÞSÏTS • SÏTSÞdonotknowifCSR • SÎCSRÞdonotknowifTS transactions with denied actions killed and restarted serialization order implied by conflicts implied by timestamp wait for commit strict 2PL write buffering It is more expensive to re-execute transactions than put them in wait Þ 2PL is better 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com