代写代考 Information Management

Information Management
Transaction Management (slides provided by Prof. Samarati)

Università degli Studi di (1)

Transaction
• Elementary unit of work executed by an application, characterized by
specific properties of correctness, robustness, and isolation Transactional system
• Systems that provide a mechanism for the definition and execution of transactions by different concurrent transactions

Transaction (2)
Syntactically
• A transaction is a procedure included between two commands • begin transaction (bot)
• end transaction (eot)
• Within a transaction, one of the following commands is executed exactly once
• commit work (commit): to terminate with success
• rollback work (abort): to undo the procedure execution

Transaction (3)
A transaction is said to be well formed if
• it starts with begin transaction
• or equivalent, depending on the language
• it ends with end transaction
• it includes only one between commands commit and rollback
• there is access operation (read/write) to the database after the execution of command commit or rollback

Transaction: example (1)
begin transaction x := x – 10 y := y + 10
z := z – y if z < 50 then commit work else rollback end transaction Transaction: example (2) begin transaction update Account set Total = Total - 100 where AccountNum= ‘123456’ update Account set Total = Total + 100 where AccountNum = ‘123457’ commit work end transaction Transactions: properties Transactions should satisfy ACID properties • Atomicity • Consistency • Isolation • Durability A transaction is an atomic unit of work • It cannot leave the database in an intermediate state • a failure or an error before commit causes the UNDO of the work done up to that point • a failure or an error after commit may require to REDO the work previously done, if its permanence on the database is not guaranteed Possible transaction: • commit: usual behavior (99.9%) • rollback requested by the application: suicide • rollback requested by the system: homicide Consistency The execution of a transaction should not violate integrity constraints defined for the database • Integrity checks can be: • immediate: during the transaction (the operation causing the violation is • deferred: at the end of the transaction (if constraints are violated, the whole transaction is refused) The execution of a transaction should be independent from the execution of any other concurrent transaction • Concurrent execution of a set of transactions should lead to the same result as an arbitrary sequential execution of the same transactions Durability The effects of a transaction that executed commit should not be lost • The system should guarantee persistency of data even in case of failures and malfunctioning Transactions and system modules (1) ACID properties are checked and guaranteed by DBMS modules • Atomicity: reliability manager • Consistency: DDL compiler • generates the verification activities enforced at transaction execution time • Isolation: concurrency manager • Durability: reliability manager Transactions and system modules (2) • Transaction manager • Coordinates activities related with transactions through the execution of begin transaction, commit, and rollback operations • Reliability manager • Guarantees atomicity and persistency by interacting with • Access methods manager (to keep track of requested operations) • Buffer manager (to request, whenever necessary, write operations on disks) • Concurrency manager • Manages isolation by filtering and possibly reorganizing accesses (requested by the access manager) Transactions and system modules (3) Query and update manager Access methods manager Concurrency manager Transaction manager Buffer manager Reliability manager Secondary storage manager Secondary storage Reliability manager Responsible for: • Execution of transactional commands • begin transaction (B) • commit (C) • rollback (A, for ‘Abort’) • Restart after malfunctions • warm restart • cold restart Guarantees atomicity and persistency Persistent storage Failure resistant storage • It is an abstraction • No storage device can guarantee 0% failure probability • Replication and robust write protocols can provide failure probability close to 0% • Organized in different ways depending on the critical aspects of the application, for instance: • atapeandadisk • two mirrored disks A failure of persistent storage is considered catastrophic and impossible Sequential file that registers, in chronological order, the actions executed by different transactions • Written on persistent storage (permanent archive) • Managed by the reliability manager • Enables undo and redo operations • metaphors: Arianna’s wire, Hansel and Gretel’s pieces of bread, ... Log organization (1) Sequential file • Records in chronological order Log organization (2) Two kinds of records: • transaction • begin, B(T) • insert, I(T,O,AS) • delete, D(T,O,BS) • update, U(T,O,BS,AS) • commit, C(T) • abort, A(T) • dump, (rare) • checkpoint, (more frequent) Log organization (3) Checkpoint Checkpoint BUUC Records of transaction T Top of the log Log organization (4) • Transaction records include, for operations (insert, delete, update) • before state (BS) • State of the object before the operation • after state (AS) • State of the object afer the operation Þ it is possible to perform undo and redo of operations To undo an action: • update: undo(U(T,O,BS,AS)) • copies value BS into object O • delete: undo(D(T,O,BS)) • recovers object O with value BS • insert: undo(I(T,O,AS)) • deletes object O To redo an action: • update: redo(U(T,O,BS,AS)) • copies value AS into object O • delete: redo(D(T,O,BS)) • deletes object O • insert: redo(I(T,O,AS)) • recovers object O with value AS Undo and Redo: idempotency Undo and redo have • idempotency property: an arbitrary number of undo/redo of the same action is equivalent to executing once the undo/redo of the action • undo(A) ... undo(A) = undo(A) • redo(A) ... redo(A) = redo(A) Idempotency guarantees correctness even in case of repeated execution of undo and redo operations Checkpoint System operation executed by the reliability manager with the coordination of the buffer manager • Logs active transactions • Updates secondary storage according to all the completed transactions • It is executed periodically Checkpoint execution • Suspends accepting write, commit and abort operations from transactions • Transfers (force) to secondary storage all the dirty pages of the buffer relative to transactions that have already committed • Writes on the log in a synchronous manner (force) a checkpoint record CK(T1,...,Tn) containing the identifiers of active transactions • Restarts accepting operations from transactions Compete copy of the database stored on persistent storage (backup) • Created when the system is not operating • inmutualexclusionwithalltheothertransactions • Typically on tape At the end of the dump, a dump record (DUMP) is written in the log to signal that a backup has been executed at a given point in time and that identifies the copy Log record write Must obey two rules: • Write Ahead Log • Commit-Precedence Write Ahead Log The before state part of the log record should be written in the log before executing the corresponding operation on the database • permits to undo write operations executed by transactions that did not commit • permits to recover in case of failure after the execution of the operation on the database • if the log was not written before the operation, the preceding value would be lost Commit-precedence The after state part of the log record should be written in the log before the commit • permits to redo write operations that have already been decided by transactions that have committed, but whose updated pages have not been yet written on secondary storage by the buffer manager Write log records Even if the rules refer to before state and after state, practically the components of log records are written at the same time A simplified version of the rules requires that logs are written: • before the corresponding records in the database • before executing commit Commit record Written, in a synchronous manner (force), in the log record by the transactions that chose to terminate successfully • Failure before commit • undo of the executed actions and restore of the initial status of the database • Failure after commit • redo of the actions to reconstruct the final status of the transaction Abort record Defines the choice of abort (produced by system) • Without modifying the decisions of the reliability manager, it can be: • written asynchronously in the buffer containing the current block of the log • later asynchronously rewritten on the log (flush) Combined writing: log and database (1) We distinguish three schemas, depending on whether the updates on the database performed by a transaction are executed (force of the buffer manager) • Before commit • After commit • Some before and some after commit Combined writing: log and database (2) Database modified before commit • no need of redo operations B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T) Writes on the log Writes on the database Combined writing: log and database (3) Database modified after commit • no need of undo operations B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T) Combined writing: log and database (4) Database modified at any time (before and after) commit • requires both undo and redo operations • It is most commonly used because it allows the buffer manager to optimize flush operations independently from the reliability manager B(T) U(T,X,BS,AS) U(T,Y,BS,AS) C(T) Two classes • System failures: software bugs (e.g., of the operative system) or interruption of the working of devices (e.g., electric power failure) • Device failures: failures of mass storage devices (e.g., scratches of the disk) System failures Software bugs (e.g., of the operative system) or interruption of the working of devices (e.g., electric power failure) • Central memory (and all buffers) content is lost • The content of secondary storage is not lost Þ warm restart Device failures Failures of mass storage devices (e.g., scratches of the disk) • Central memory (and all buffers) content is lost • Secondary memory content is lost • Persistent storage content is not lost Þ cold restart Failure manager Fail-stop model: if the system identifies a failure • forces the complete stop of transactions • forces the correct working of the operative system (boot) • performs a restart • warm (for system failures) • cold (for device failures) Þ becomes usable again • empty buffer Fail-stop model Normal Working End recovery Restart: classification of transactions Considering a failure, there are two classes of transactions • committed • their actions should be repeated (redo) • uncommitted • their actions should be undone (undo) Warm restart (1) Phase 1: determine the set of transactions that need to be undone (UNDO) and repeated (REDO) • read the log backward from the end till the most recent checkpoint • initialize UNDO e REDO • UNDO := transactions in the checkpoint • REDO := Æ • read the log forward: • for each B(T) Þ UNDO := UNDO È {T} • for each C(T) Þ UNDO := UNDO - {T} REDO := REDO È {T} Warm restart (2) Phase 2: recovery • read the log backward • for each action A of transactions in UNDO Þ undo(A) go back till the first action of the oldest transaction in UNDO È REDO • read the log forward • for each action A of transactions in REDO Þ redo(A) Warm restart (3) • atomicity: all the active transactions at the time of failure leave the database in the initial or final status • persistency: all the pages of active transactions are written on secondary storage Warm restart: example (1) B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure • determine UNDO and REDO sets Warm restart: example (2) B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure • UNDO {T3,T5} D(T5,O4,B7) U(T3,O2,B4,A4) I(T3,O2,A2) insert O4, O4:=B7 O2:=B4 Warm restart: example (3) B(T1), B(T2), I(T2,O1,A1), B(T3), I(T3,O2,A2), D(T1,O3,B3), B(T4), U(T3,O2,B4,A4), I(T4,O4,A5), U(T4,O2,B6,A6), C(T2), CK(T1,T3,T4), C(T4), B(T5), D(T5,O4,B7), U(T1,O2,B8,A8), A(T3), C(T1), failure • REDO {T1,T4} D(T1,O3,B3) I(T4,O4,A5) U(T4,O2,B6,A6) U(T1,O2,B8,A8) insert O4, O4:=A5 O2:=A6 O2:=A8 Cold restart Phase 1: recover the database in the same status as at the time of failure • access the dump and selectively copy all the damaged parts of the database • access the most recent dump record registered in the log • read the log forward applying, for the damaged part, both the actions of the database and commit/abort actions • Execute warm restart Concurrency control (1) Permits the concurrent execution different transactions • Crucial in information systems with high load • banks,financialinstitution,airlinebooking • Permits to efficiently use the DBMS • Maximizing the number of served transactions • Minimizing response times • Application load measured in transactions per second (tps) • Typical values: from tens to thousands Concurrency control (2) Mediates access requests to the data • Decides to grant or deny requests • Establishes the order of accesses (scheduler) Concurrency control (3) For the sake of lectures, we consider two simplifying assumptions: • Databases in terms of abstract objects x, y, z with numerical values • Read/write operations on object x, as read/write of the whole page where x is stored Concurrency control: architecture read, write begin, commit, abort Lock table read, write (not all and possibly in different order) Access methods manager Concurrency manager Transaction manager Secondary storage manager Concurrency control (4) The concurrent execution of different transactions can cause correctness issues and anomalies • Update loss • Dirty read • Inconsistent reads • Ghost update • Ghost insert Update loss Updates by a transaction lost because overwritten by a concurrent transaction Transaction t1 x := x + 1 w1(x) commit Transaction t2 x := x + 1 w2(x) Initial value: 2 commit Final value: 3 instead of 4 (the update by t1 is lost) Dirty read A transaction reads the intermediate result of another transaction that then aborts (whose modification is then cancelled) Transaction t1 x := x + 1 w1(x) Transaction t2 x := x + 1 w2(x) commit t2 reads an intemediate state which is then cancelled Inconsistent reads A transaction reads objects that another transaction is modifying: some read operations are before, others are after the updates Transaction t1 bot r1(x) w1(y) r1(x) w1(z) commit Transaction t2 x := x + 1 w2(x) commit t1 reads different values for x Ghost update A transaction observes only a subset of the effects of another transaction (observing a status of the data that does not satisfy integrity constraints) Transaction t1 bot Transaction t2 bot y := y - 100 r2(z) z := z + 100 w2(y) w2(z) commit Constraint: x+y+z=1000; s=1100 but the constraint is satisfied r1(x) r1(y) s := x + y + z commit Ghost insert A transaction evaluates twice an aggregated value about a set of elements in a selection condition select avg(Mark) from Exam where Subject=‘IM’ If a new tuple is inserted between an evaluation and the subsequent one, the results may be different The anomaly cannot be prevented referring only to already known data Concurrency control theory Formally a transaction • is a sequence of read and write operations • has a transaction identifier assigned by the system • starts with begin transaction and ends with end transaction (omitted) t1: r1(x) r1(y) w1(x) w1(y) Sequence of input/output operations presented by concurrent transactions • all the operations of each transaction that committed must appear in the schedule • for each transaction, operations must appear in the schedule in the same order as in the transaction For the study: • we consider the commit-projection and ignore transactions that (simplifying assumption, not acceptable in practice, will be removed afterwards) Schedule: example Transactions • t1 : r1(x) w1(x) r1(y) w1(y) • t2 : r2(y) w2(y) Some possible schedules • S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y) • S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y) • S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y) • S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y) Concurrency control • Avoids schedules causing anomalies • Managed by a module that accepts or refuses operations requested by transactions (scheduler) • Based on the identification of classes of acceptable schedules based on definitions of equivalence • serial schedule • serializable schedule Serial and serializable schedules Serial schedule • For each transaction ti in the schedule, all the operations in ti are executed consecutively • n transactions Þ n! possible serial schedules Serializable schedule • non-serial schedule that produces the same result as a serial schedule of the same transactions • need definition of equivalence among schedules • progressive concepts: view-equivalence, conflict-equivalence, two-phase locking, timestamp-based Serial schedule: example Transactions • t1 : r1(x) w1(x) r1(y) w1(y) • t2 : r2(y) w2(y) Serial schedules: • S1 : r1(x) w1(x) r1(y) w1(y) r2(y) w2(y) • S2 : r2(y) w2(y) r1(x) w1(x) r1(y) w1(y) Possible schedules: • S3 : r1(x) r2(y) w1(x) r1(y) w2(y) w1(y) • S4 : r2(y) r1(x) w1(x) r1(y) w1(y) w2(y) View-serializability (1) Preliminary definitions • ri(x) reads-from wj(x) in a schedule S if: • wj(x) precedes ri(x) in S • there is no wk(x) between wj(x) and ri(x) in S • wi(x) is a final write in a schedule S if: • itisthelastwriteonxinS S: r1(x) w1(x) w1(y) r2(x) w2(y) • r2(x) read-from w1(x) • w1(x) final write on x • w2(y) final write on y View-serializability (2) View-equivalence • two schedules Si and Sj (with i ≠ j) are view-equivalent (Si ov Sj) if they have • the same read-from relations • the same final writes View-serializability • a schedule is view-serializable if it is • view-equivalent to a serial schedule We denote with VSR the set of view-serializable schedules View-serializability: example (1) S: w0(x) r2(x) r1(x) w2(x) w2(z) • read-from: • r2(x) ¬ w0(x) • r1(x) ¬ w0(x) • final write: • x: w2(x) • z: w2(z) • view-equivalent to the serial schedule w0(x) r1(x) r2(x) w2(x) w2(z) Þ view-serializable View-serializability: example (2) S: w0(x) r1(x) w1(x) r2(x) w1(z) • read-from: • r1(x) ¬ w0(x) • r2(x) ¬ w1(x) • final write: • x: w1(x) • z: w1(z) • view-equivalent to the serial schedule w0(x) r1(x) w1(x) w1(z) r2(x) Þ view-serializable View-serializability: example (3) S: r1(x) r2(x) w2(x) w1(x) • read-from: • r1(x) ¬ • final write: • x: w1(x) Þ not view-serializable (loss of update) View-serializability: example (4) S: r1(x) r2(x) w2(x) r1(x) • read-from: • r1(x) ¬ • r1(x) ¬ w2(x) • final write: • x: w2(x) Þ not view-serializable (inconsistent read) View-serializability: example (5) S: r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z) • read-from: • r1(x) ¬ • r1(y) ¬ • r2(z) ¬ • r1(z) ¬ w2(z) • final write: • y: w2(y) • z: w2(z) Þ not view-serializable (ghost update) View-seria 程序代写 CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts