Information Management
Transaction Management (slides provided by Prof. Samarati)
Università degli Studi di
Transaction (1)
• Elementary unit of work executed by an application, characterized by specific properties of correctness, robustness, and isolation
Transactional system
• A system that provides a mechanism for the definition and execution of transactions by different concurrent applications
Transaction (2)
Syntactically
• A transaction is a procedure enclosed between two commands
• begin transaction (bot)
• end transaction (eot)
• Within a transaction, one of the following commands is executed exactly once
• commit work (commit): to terminate with success
• rollback work (abort): to undo the procedure execution
Transaction (3)
A transaction is said to be well formed if
• it starts with begin transaction
• or equivalent, depending on the language
• it ends with end transaction
• it executes exactly one of the commands commit and rollback
• there is no access operation (read/write) to the database after the execution of command commit or rollback
Transaction: example (1)
begin transaction
x := x - 10
y := y + 10
z := z - y
if z < 50
then commit work
else rollback
end transaction
Transaction: example (2)
begin transaction
update Account
set Total = Total - 100
where AccountNum = '123456'
update Account
set Total = Total + 100
where AccountNum = '123457'
commit work
end transaction
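The transfer above can be tried out with a minimal Python sketch. It assumes SQLite through the standard `sqlite3` module; the helper names `transfer`, `make_demo_db`, and `totals`, the starting balances, and the `CHECK (Total >= 0)` constraint are illustrative assumptions that do not appear in the slides. The point shown is that either both updates become permanent (commit) or neither does (rollback).

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one atomic unit of work."""
    try:
        conn.execute("BEGIN")  # begin transaction
        conn.execute(
            "UPDATE Account SET Total = Total - ? WHERE AccountNum = ?",
            (amount, source),
        )
        conn.execute(
            "UPDATE Account SET Total = Total + ? WHERE AccountNum = ?",
            (amount, target),
        )
        conn.execute("COMMIT")  # commit work
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # rollback work: both updates are undone
        return False


def make_demo_db():
    """Hypothetical demo database; isolation_level=None lets us issue BEGIN ourselves."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE Account ("
        "AccountNum TEXT PRIMARY KEY, "
        "Total INTEGER NOT NULL CHECK (Total >= 0))"
    )
    conn.executemany(
        "INSERT INTO Account VALUES (?, ?)", [("123456", 150), ("123457", 0)]
    )
    return conn


def totals(conn):
    """Return {AccountNum: Total} for all accounts."""
    return dict(
        conn.execute("SELECT AccountNum, Total FROM Account ORDER BY AccountNum")
    )


conn = make_demo_db()
print(transfer(conn, "123456", "123457", 100), totals(conn))
# A second transfer of 100 would make the source balance negative: the
# integrity constraint is violated and the whole transaction is rolled back.
print(transfer(conn, "123456", "123457", 100), totals(conn))
```

In this sketch the rollback is requested by the application after the system reports the constraint violation; a real application would usually also report the error instead of only returning False.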
Transactions: properties
Transactions should satisfy ACID properties
• Atomicity
• Consistency
• Isolation
• Durability
Atomicity
A transaction is an atomic unit of work
• It cannot leave the database in an intermediate state
• a failure or an error before commit causes the UNDO of the work done up to that point
• a failure or an error after commit may require the REDO of the work previously done, if its permanence in the database is not guaranteed
Possible transaction outcomes:
• commit: usual behavior (99.9%)
• rollback requested by the application: suicide
• rollback requested by the system: homicide
Consistency
The execution of a transaction should not violate integrity constraints defined for the database
• Integrity checks can be:
• immediate: during the transaction (the operation causing the violation is refused)
• deferred: at the end of the transaction (if constraints are violated, the whole transaction is refused)
Isolation
The execution of a transaction should be independent from the execution of any other concurrent transaction
• Concurrent execution of a set of transactions should lead to the same result as an arbitrary sequential execution of the same transactions
Durability
The effects of a transaction that executed commit should not be lost
• The system should guarantee persistency of data even in case of failures and malfunctions
Transactions and system modules (1)
ACID properties are checked and guaranteed by DBMS modules
• Atomicity: reliability manager
• Consistency: DDL compiler
• generates the verification activities enforced at transaction execution time
• Isolation: concurrency manager
• Durability: reliability manager
Transactions and system modules (2)
• Transaction manager
• Coordinates activities related to transactions through the execution of begin transaction, commit, and rollback operations
• Reliability manager
• Guarantees atomicity and persistency by interacting with
• Access methods manager (to keep track of requested operations)
• Buffer manager (to request, whenever necessary, write operations on disks)
• Concurrency manager
• Manages isolation by filtering and possibly reordering accesses (requested by the access methods manager)
Transactions and system modules (3)
[Figure: DBMS architecture showing the Query and update manager, Access methods manager, Concurrency manager, Transaction manager, Buffer manager, Reliability manager, Secondary storage manager, and Secondary storage]
Reliability manager
Responsible for:
• Execution of transactional commands
• begin transaction (B)
• commit (C)
• rollback (A, for 'Abort')
• Restart after malfunctions
• warm restart
• cold restart
Guarantees atomicity and persistency
Persistent storage
Failure resistant storage
• It is an abstraction
• No storage device can guarantee 0% failure probability
• Replication and robust write protocols can provide failure probability close to 0%
• Organized in different ways depending on the critical aspects of the application, for instance:
• a tape and a disk
• two mirrored disks
A failure of persistent storage is considered catastrophic and is assumed not to occur
Log
Sequential file that registers, in chronological order, the actions executed by different transactions
• Written on persistent storage (permanent archive)
• Managed by the reliability manager
• Enables undo and redo operations
• metaphors: Ariadne's thread, Hansel and Gretel's breadcrumbs, ...
Log organization (1)
Sequential file
• Records in chronological order
Log organization (2)
Two kinds of records:
• transaction records
• begin, B(T)
• insert, I(T,O,AS)
• delete, D(T,O,BS)
• update, U(T,O,BS,AS)
• commit, C(T)
• abort, A(T)
• system records
• dump (rare)
• checkpoint (more frequent)
Log organization (3)
[Figure: log timeline showing two checkpoint records, the records B, U, U, C of a transaction T, and the top of the log]
Log organization (4)
• Transaction records include, for operations (insert, delete, update)
• before state (BS)
• State of the object before the operation
• after state (AS)
• State of the object after the operation
⇒ it is possible to perform undo and redo of operations
To undo an action:
• update: undo(U(T,O,BS,AS)) copies value BS into object O
• delete: undo(D(T,O,BS)) reinserts object O with value BS
• insert: undo(I(T,O,AS)) deletes object O
To redo an action:
• update: redo(U(T,O,BS,AS)) copies value AS into object O
• delete: redo(D(T,O,BS)) deletes object O
• insert: redo(I(T,O,AS)) inserts object O with value AS
Undo and Redo: idempotency
Undo and redo are idempotent
• an arbitrary number of undo/redo of the same action is equivalent to executing the undo/redo of the action once
• undo(A) ... undo(A) = undo(A)
• redo(A) ... redo(A) = redo(A)
Idempotency guarantees correctness even in case of repeated execution of undo and redo operations
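A minimal Python sketch of undo and redo on log records, under the assumption that the database is a dictionary from object names to values and that records are tuples `(kind, T, O, BS, AS)` with `None` for the state a record does not carry. This representation is illustrative and not part of the slides. Because each operation sets or removes a value without depending on the current one, applying it twice gives the same result as applying it once.

```python
def undo(db, record):
    """Restore the state that held before the logged action."""
    kind, _t, obj, before, _after = record
    if kind == "I":
        db.pop(obj, None)   # undo insert: delete object O
    else:
        db[obj] = before    # undo update/delete: O takes value BS


def redo(db, record):
    """Re-establish the state that held after the logged action."""
    kind, _t, obj, _before, after = record
    if kind == "D":
        db.pop(obj, None)   # redo delete: delete object O
    else:
        db[obj] = after     # redo insert/update: O takes value AS


db = {"O": 10}
update = ("U", "T1", "O", 10, 20)

redo(db, update)
redo(db, update)            # repeating redo changes nothing
print(db)                   # {'O': 20}

undo(db, update)
undo(db, update)            # repeating undo changes nothing
print(db)                   # {'O': 10}
```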
Checkpoint
System operation executed by the reliability manager with the coordination of the buffer manager
• Records the active transactions in the log
• Updates secondary storage according to all the completed transactions
• It is executed periodically
Checkpoint execution
• Suspends the acceptance of write, commit and abort operations from transactions
• Transfers (force) to secondary storage all the dirty pages of the buffer belonging to transactions that have already committed
• Synchronously writes (force) to the log a checkpoint record CK(T1,...,Tn) containing the identifiers of active transactions
• Resumes accepting operations from transactions
Dump
Complete copy of the database stored on persistent storage (backup)
• Created when the system is not operating
• in mutual exclusion with all the other transactions
• Typically on tape
At the end of the dump, a dump record (DUMP) is written in the log to signal that a backup has been executed at a given point in time and to identify the copy
Log record write
Must obey two rules:
• Write Ahead Log
• Commit-Precedence
Write Ahead Log
The before state part of the log record should be written in the log before executing the corresponding operation on the database
• makes it possible to undo write operations executed by transactions that did not commit
• makes recovery possible in case of failure after the execution of the operation on the database
• if the log were not written before the operation, the preceding value would be lost
Commit-precedence
The after state part of the log record should be written in the log before the commit
• makes it possible to redo write operations that have already been decided by transactions that have committed, but whose updated pages have not yet been written to secondary storage by the buffer manager
Write log records
Even though the rules refer to before state and after state, in practice the components of a log record are written at the same time
A simplified version of the rules requires that log records are written:
• before the corresponding records in the database
• before executing commit
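The simplified rules can be illustrated with a toy Python store. The class name `LoggedStore`, the tuple layout of the records, and the choice of writing to the database immediately after logging are assumptions of this sketch; a real DBMS would go through the buffer manager and force the log to persistent storage. The sketch only shows the ordering: each log record is appended before the database is touched, and C(T) is appended after all the records of T.

```python
class LoggedStore:
    """Toy store: `log` stands in for persistent storage, `database` for secondary storage."""

    def __init__(self):
        self.log = []
        self.database = {}

    def begin(self, t):
        self.log.append(("B", t))

    def update(self, t, obj, new_value):
        before = self.database.get(obj)
        # Write-ahead log: the record (with BS and AS) is logged first ...
        self.log.append(("U", t, obj, before, new_value))
        # ... and only then is the database modified.
        self.database[obj] = new_value

    def commit(self, t):
        # Commit-precedence: all U records of t are already in the log,
        # so C(t) always follows them.
        self.log.append(("C", t))


store = LoggedStore()
store.begin("T")
store.update("T", "X", 1)
store.update("T", "Y", 2)
store.commit("T")
for record in store.log:
    print(record)
```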
Commit record
Written synchronously (force) in the log by the transactions that choose to terminate successfully
• Failure before commit
• undo of the executed actions and restore of the initial status of the database
• Failure after commit
• redo of the actions to reconstruct the final status of the transaction
Abort record
Records the decision to abort (produced by the system)
• Without modifying the decisions of the reliability manager, it can be:
• written asynchronously in the buffer containing the current block of the log
• later asynchronously written to the log (flush)
Combined writing: log and database (1)
We distinguish three schemas, depending on when the updates performed by a transaction are written to the database (force of the buffer manager)
• Before commit
• After commit
• Some before and some after commit
Combined writing: log and database (2)
Database modified before commit
• no need of redo operations
[Figure: writes on the log B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T), with the writes on the database placed before C(T)]
Combined writing: log and database (3)
Database modified after commit
• no need of undo operations
[Figure: writes on the log B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T), with the writes on the database placed after C(T)]
Combined writing: log and database (4)
Database modified at any time (before and after) commit
• requires both undo and redo operations
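• It is the most commonly used schema because it allows the buffer manager to optimize flush operations independently of the reliability manager
[Figure: writes on the log B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T), with the writes on the database placed both before and after C(T)]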
Failures
Two classes
• System failures: software bugs (e.g., of the operating system) or interruption of the working of devices (e.g., electric power failure)
• Device failures: failures of mass storage devices (e.g., scratches of the disk)
System failures
Software bugs (e.g., of the operating system) or interruption of the working of devices (e.g., electric power failure)
• Central memory (and all buffers) content is lost
• The content of secondary storage is not lost
⇒ warm restart
Device failures
Failures of mass storage devices (e.g., scratches of the disk)
• Central memory (and all buffers) content is lost
• Secondary memory content is lost
• Persistent storage content is not lost
⇒ cold restart
Failure manager
Fail-stop model: when the system detects a failure, it
• forces the complete stop of transactions
• forces the operating system to restart correctly (boot)
• performs a restart
• warm (for system failures)
• cold (for device failures)
⇒ the system becomes usable again
• with an empty buffer
Fail-stop model
[Figure: fail-stop state diagram; recoverable labels: Normal working, End recovery]
Restart: classification of transactions
With respect to a failure, there are two classes of transactions
• committed
• their actions should be repeated (redo)
• uncommitted
• their actions should be undone (undo)
Warm restart (1)
Phase 1: determine the set of transactions that need to be undone (UNDO) and the set that need to be redone (REDO)
• read the log backward from the end to the most recent checkpoint
• initialize UNDO and REDO
• UNDO := transactions in the checkpoint
• REDO := ∅
• read the log forward:
• for each B(T) ⇒ UNDO := UNDO ∪ {T}
• for each C(T) ⇒ UNDO := UNDO - {T}, REDO := REDO ∪ {T}
Warm restart (2)
Phase 2: recovery
• read the log backward, going back to the first action of the oldest transaction in UNDO ∪ REDO
• for each action A of transactions in UNDO ⇒ undo(A)
• read the log forward
• for each action A of transactions in REDO ⇒ redo(A)
Warm restart (3)
• atomicity: all the active transactions at the time of failure leave the database in the initial or final status
• persistency: all the pages of active transactions are written on secondary storage
Warm restart: example (1)
B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure
• determine UNDO and REDO sets
Warm restart: example (2)
B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure
• UNDO = {T3, T5}
• D(T5,O4,B7): insert O4, O4 := B7
• U(T3,O2,B4,A4): O2 := B4
• I(T3,O2,A2): delete O2
Warm restart: example (3)
B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure
• REDO = {T1, T4}
• D(T1,O3,B3): delete O3
• I(T4,O4,A5): insert O4, O4 := A5
• U(T4,O2,B6,A6): O2 := A6
• U(T1,O2,B8,A8): O2 := A8
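The two phases of the warm restart can be sketched in Python and run on the example log. The tuple encoding of the records, the dictionary used as database, and the function name `warm_restart` are assumptions of this sketch. For simplicity it scans the whole log in phase 2 instead of stopping at the first action of the oldest transaction in UNDO ∪ REDO; the result is the same because only records of transactions in the two sets are applied.

```python
def undo(db, rec):
    kind, _t, obj, *states = rec
    if kind == "I":
        db.pop(obj, None)        # undo insert: delete O
    else:
        db[obj] = states[0]      # BS is the first state of D and U records


def redo(db, rec):
    kind, _t, obj, *states = rec
    if kind == "D":
        db.pop(obj, None)        # redo delete: delete O
    else:
        db[obj] = states[-1]     # AS is the last state of I and U records


def warm_restart(log, db):
    """Apply warm restart to `db` in place and return the sets (UNDO, REDO)."""
    actions = ("I", "D", "U")
    # Phase 1: locate the most recent checkpoint, then read forward.
    ck = max((i for i, r in enumerate(log) if r[0] == "CK"), default=-1)
    undo_set = set(log[ck][1]) if ck >= 0 else set()
    redo_set = set()
    for rec in log[ck + 1:]:
        if rec[0] == "B":
            undo_set.add(rec[1])
        elif rec[0] == "C":
            undo_set.discard(rec[1])
            redo_set.add(rec[1])
    # Phase 2: undo backward, then redo forward.
    for rec in reversed(log):
        if rec[0] in actions and rec[1] in undo_set:
            undo(db, rec)
    for rec in log:
        if rec[0] in actions and rec[1] in redo_set:
            redo(db, rec)
    return undo_set, redo_set


log = [
    ("B", "T1"), ("B", "T2"), ("I", "T2", "O1", "A1"), ("B", "T3"),
    ("I", "T3", "O2", "A2"), ("D", "T1", "O3", "B3"), ("B", "T4"),
    ("U", "T3", "O2", "B4", "A4"), ("I", "T4", "O4", "A5"),
    ("U", "T4", "O2", "B6", "A6"), ("C", "T2"), ("CK", ("T1", "T3", "T4")),
    ("C", "T4"), ("B", "T5"), ("D", "T5", "O4", "B7"),
    ("U", "T1", "O2", "B8", "A8"), ("A", "T3"), ("C", "T1"),
]
# Hypothetical content of secondary storage at the time of the failure.
db = {"O1": "A1", "O2": "A4", "O3": "B3"}
undo_set, redo_set = warm_restart(log, db)
print(sorted(undo_set), sorted(redo_set))   # ['T3', 'T5'] ['T1', 'T4']
print(db)
```

Thanks to idempotency, the final content of `db` does not depend on which pages had already reached secondary storage before the failure, as long as the effects of transactions committed before the checkpoint (here O1) are present.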
Cold restart
Phase 1: restore the database to the state it had at the time of failure
• access the dump and selectively copy the damaged parts of the database
• access the most recent dump record registered in the log
• read the log forward, applying to the damaged parts both the actions on the database and the commit/abort actions
Phase 2: execute warm restart
Concurrency control (1)
Permits the concurrent execution of different transactions
• Crucial in information systems with high load
• banks, financial institutions, airline booking
• Makes efficient use of the DBMS possible
• Maximizing the number of served transactions
• Minimizing response times
• Application load measured in transactions per second (tps)
• Typical values: from tens to thousands
Concurrency control (2)
Mediates access requests to the data
• Decides to grant or deny requests
• Establishes the order of accesses (scheduler)
Concurrency control (3)
For the purposes of these lectures, we make two simplifying assumptions:
• Databases are described in terms of abstract objects x, y, z with numerical values
• Read/write operations on object x are treated as read/write of the whole page where x is stored
Concurrency control: architecture
[Figure: concurrency control architecture. Modules: Access methods manager, Concurrency manager (with lock table), Transaction manager, Secondary storage manager. Edge labels: "read, write"; "begin, commit, abort"; "read, write (not all and possibly in different order)"]
Concurrency control (4)
The concurrent execution of different transactions can cause correctness issues and anomalies
• Update loss
• Dirty read
• Inconsistent reads
• Ghost update
• Ghost insert
Update loss
Updates by a transaction lost because overwritten by a concurrent transaction
t1: bot, r1(x), x := x + 1
t2: bot, r2(x), x := x + 1
t1: w1(x), commit
t2: w2(x), commit
Initial value: 2
Final value: 3 instead of 4 (the update by t1 is lost)
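The update loss can be reproduced with a small Python simulation. It assumes each transaction keeps a private copy of x between its read and its write; the step names and the `run` helper are illustrative only. When both transactions read before either writes, the final value is 3; in a serial execution it is 4.

```python
def run(schedule, x):
    """Execute (transaction, step) pairs against a shared object x."""
    local = {}                      # private copy of x held by each transaction
    for t, step in schedule:
        if step == "read":
            local[t] = x
        elif step == "increment":
            local[t] += 1
        elif step == "write":
            x = local[t]
    return x


interleaved = [
    ("t1", "read"), ("t1", "increment"),
    ("t2", "read"), ("t2", "increment"),
    ("t1", "write"),
    ("t2", "write"),                # overwrites the value written by t1
]
serial = [
    ("t1", "read"), ("t1", "increment"), ("t1", "write"),
    ("t2", "read"), ("t2", "increment"), ("t2", "write"),
]

print(run(interleaved, 2))          # 3
print(run(serial, 2))               # 4
```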
Dirty read
A transaction reads the intermediate result of another transaction that then aborts (whose modification is then cancelled)
t1: bot, r1(x), x := x + 1, w1(x)
t2: bot, r2(x), x := x + 1, w2(x), commit
t1: abort
t2 reads an intermediate state which is then cancelled
Inconsistent reads
A transaction reads objects that another transaction is modifying: some read operations are before, others are after the updates
t1: bot, r1(x), w1(y)
t2: bot, r2(x), x := x + 1, w2(x), commit
t1: r1(x), w1(z), commit
t1 reads different values for x
Ghost update
A transaction observes only a subset of the effects of another transaction (observing a state of the data that does not satisfy integrity constraints)
Constraint: x + y + z = 1000
t1: bot, r1(x), r1(y)
t2: bot, r2(y), y := y - 100, r2(z), z := z + 100, w2(y), w2(z), commit
t1: r1(z), s := x + y + z, commit
t1 computes s = 1100, although the constraint is satisfied both before and after t2
Ghost insert
A transaction evaluates twice an aggregate value over the set of elements that satisfy a selection condition
select avg(Mark)
from Exam
where Subject = 'IM'
If a new tuple is inserted between one evaluation and the next, the results may be different
The anomaly cannot be prevented by referring only to data that already exists
Concurrency control theory
Formally a transaction
• is a sequence of read and write operations
• has a transaction identifier assigned by the system
• starts with begin transaction and ends with end transaction (omitted)
t1: r1(x) r1(y) w1(x) w1(y)
Schedule
Sequence of input/output operations presented by concurrent transactions
• all the operations of each transaction that committed must appear in the schedule
• for each transaction, operations must appear in the schedule in the same order as in the transaction
For the study:
• we consider the commit-projection and ignore transactions that abort (simplifying assumption, not acceptable in practice, will be removed afterwards)
Schedule: example
Transactions
• t1 : r1(x) w1(x) r1(y) w1(y)
• t2 : r2(y) w2(y)
Some possible schedules
• S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y)
• S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y)
• S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y)
• S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y)
Concurrency control
• Avoids schedules causing anomalies
• Managed by a module that accepts or refuses operations requested by transactions (scheduler)
• Relies on the identification of classes of acceptable schedules, based on definitions of equivalence
• serial schedule
• serializable schedule
Serial and serializable schedules
Serial schedule
• For each transaction ti in the schedule, all the operations in ti are executed consecutively
• n transactions ⇒ n! possible serial schedules
Serializable schedule
• non-serial schedule that produces the same result as a serial schedule of the same transactions
• need definition of equivalence among schedules
• progressive concepts: view-equivalence, conflict-equivalence, two-phase locking, timestamp-based
Serial schedule: example
Transactions
• t1 : r1(x) w1(x) r1(y) w1(y)
• t2 : r2(y) w2(y)
Serial schedules:
• S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y)
• S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y)
Possible schedules:
• S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y)
• S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y)
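Whether a schedule is serial can be checked mechanically. The Python sketch below assumes schedules are written as strings in the notation used here, such as "r1(x) w1(x)"; the helpers `parse` and `is_serial` are illustrative names. A schedule is serial when no transaction resumes after another transaction has started.

```python
import re


def parse(schedule):
    """Turn 'r1(x) w2(y)' into [('r', 1, 'x'), ('w', 2, 'y')]."""
    return [
        (kind, int(t), obj)
        for kind, t, obj in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)
    ]


def is_serial(schedule):
    """True if the operations of each transaction are consecutive."""
    finished, current = set(), None
    for _kind, t, _obj in parse(schedule):
        if t != current:
            if t in finished:
                return False        # t resumes after being interleaved
            if current is not None:
                finished.add(current)
            current = t
    return True


schedules = {
    "S1": "r1(x) w1(x) r1(y) w1(y) r2(y) w2(y)",
    "S2": "r2(y) w2(y) r1(x) w1(x) r1(y) w1(y)",
    "S3": "r1(x) r2(y) w1(x) r1(y) w2(y) w1(y)",
    "S4": "r2(y) r1(x) w1(x) r1(y) w1(y) w2(y)",
}
for name, s in schedules.items():
    print(name, is_serial(s))
# S1 True, S2 True, S3 False, S4 False
```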
View-serializability (1)
Preliminary definitions
• ri(x) reads-from wj(x) in a schedule S if:
• wj(x) precedes ri(x) in S
• there is no wk(x) between wj(x) and ri(x) in S
• wi(x) is a final write in a schedule S if:
• it is the last write on x in S
S: r1(x) w1(x) w1(y) r2(x) w2(y)
• r2(x) reads-from w1(x)
• w1(x) final write on x
• w2(y) final write on y
View-serializability (2)
View-equivalence
• two schedules Si and Sj (with i ≠ j) are view-equivalent (Si ≈V Sj) if they have
• the same reads-from relations
• the same final writes
View-serializability
• a schedule is view-serializable if it is view-equivalent to a serial schedule
We denote with VSR the set of view-serializable schedules
View-serializability: example (1)
S: w0(x) r2(x) r1(x) w2(x) w2(z)
• read-from:
• r2(x) ← w0(x)
• r1(x) ← w0(x)
• final write:
• x: w2(x)
• z: w2(z)
• view-equivalent to the serial schedule w0(x) r1(x) r2(x) w2(x) w2(z)
⇒ view-serializable
View-serializability: example (2)
S: w0(x) r1(x) w1(x) r2(x) w1(z)
• read-from:
• r1(x) ← w0(x)
• r2(x) ← w1(x)
• final write:
• x: w1(x)
• z: w1(z)
• view-equivalent to the serial schedule w0(x) r1(x) w1(x) w1(z) r2(x)
⇒ view-serializable
View-serializability: example (3)
S: r1(x) r2(x) w2(x) w1(x)
• read-from:
• r1(x) ← (initial value)
• r2(x) ← (initial value)
• final write:
• x: w1(x)
⇒ not view-serializable (loss of update)
View-serializability: example (4)
S: r1(x) r2(x) w2(x) r1(x)
• read-from:
• first r1(x) ← (initial value)
• r2(x) ← (initial value)
• second r1(x) ← w2(x)
• final write:
• x: w2(x)
⇒ not view-serializable (inconsistent read)
View-serializability: example (5)
S: r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z)
• read-from:
• r1(x) ← (initial value)
• r1(y) ← (initial value)
• r2(z) ← (initial value)
• r2(y) ← (initial value)
• r1(z) ← w2(z)
• final write:
• y: w2(y)
• z: w2(z)
⇒ not view-serializable (ghost update)
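The five examples above can be checked with a brute-force Python sketch that compares the reads-from relation and the final writes of a schedule against every serial schedule of the same transactions. The string encoding of schedules and the function names are assumptions of this sketch, and the enumeration of all n! serial orders is meant only for small examples.

```python
import re
from itertools import permutations


def parse(schedule):
    """Turn 'r1(x) w2(y)' into [('r', 1, 'x'), ('w', 2, 'y')]."""
    return [
        (kind, int(t), obj)
        for kind, t, obj in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)
    ]


def view(ops):
    """Return (reads-from relation, final writes) of a list of operations."""
    last_write = {}    # object -> (transaction, position) of the latest write
    position = {}      # per-transaction counter, tells repeated reads apart
    reads_from = set()
    for kind, t, obj in ops:
        n = position[t] = position.get(t, 0) + 1
        if kind == "r":
            # None as the source means the read sees the initial value
            reads_from.add((t, n, obj, last_write.get(obj)))
        else:
            last_write[obj] = (t, n)
    return reads_from, last_write


def is_view_serializable(schedule):
    ops = parse(schedule)
    by_txn = {}
    for op in ops:
        by_txn.setdefault(op[1], []).append(op)
    target = view(ops)
    return any(
        view([op for t in order for op in by_txn[t]]) == target
        for order in permutations(by_txn)
    )


examples = [
    "w0(x) r2(x) r1(x) w2(x) w2(z)",                 # example (1)
    "w0(x) r1(x) w1(x) r2(x) w1(z)",                 # example (2)
    "r1(x) r2(x) w2(x) w1(x)",                       # example (3)
    "r1(x) r2(x) w2(x) r1(x)",                       # example (4)
    "r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z)",     # example (5)
]
print([is_view_serializable(s) for s in examples])
# [True, True, False, False, False]
```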
View-seria