CS计算机代考程序代写 database concurrency SQL Transactions

Transactions
Reading:
http://en.wikipedia.org/wiki/Database_transact http://en.wikipedia.org/wiki/Serializability http://en.wikipedia.org/wiki/Schedule_(comput
– http://www.postgresql.org/docs/9.4/static/tutorial-t discusses savepoints
• • •

Definition, ACID Serializability Recoverability
i
e r

Definition
set of operations that access the database

– SQL includes START TRANSACTION,
ROLLBACK and COMMIT
– Statements following START TRANSACTION until either ROLLBACK or COMMIT are part of the transaction
– transactions must satisfy ACID properties


ACID
– entire transaction behaves like one indivisible unit
– either all operations are done, or none are
Atomicity

Atomicity
– if transaction is executed, database must look as if all operations were done in order
• (can reorder non-conflicting operations if necessary)
– if transaction is not executed, database must look as if none of the operations was done
• must undo effects of any operations that were done

ACID
– execution must not violate consistency of database


Consistency
Isolation
– execution must not be affected by concurrent execution of other transactions
– database must look as if transactions were performed in some order
• does allow concurrent execution as long as no conflicts

ACID
• ex: Kim is transferring money from her checking to her savings account
• OK if the system removes money from checking, then let’s Jim remove money from his money market account, and finally completes Kim’s transfer

Durability
– above properties maintained even if system fails

Sample Error scenarios

Interleaved execution
• balance = $100
• you withdraw $50 through ATM, bank autopays a $70 utility bill
• system gets balance = $100 for withdrawal
• because of context switch or concurrency, system next does autopay
• system still gets balance = $100, updates balance to be $100-$70 = $30
• system returns to withdrawal, updates balance to be $100- $50 = $50
• bank is unhappy 🙁

Sample Error scenarios

System/network failure during execution
• balance = $50 in checking, $10 in savings
• you transfer $30 from checking to savings
• system withdraws from checking: balance = $20 in checking, $10 in savings
• system crashes
• you are unhappy 🙁


execution
Database solutions
Concurrency control handles interleaved
– 2-phase locking is standard solution

system/network failures
Recovery management handles – log operations to support undo/redo

Possible transaction states
• •

Active : executing
Committed : successfully completed
– SQL: COMMIT
• all modifications of current transaction are saved to
permanent storage
Aborted : finished before success
– SQL: ROLLBACK
• all modifications of database are undone

Possible transaction states


Partially committed
– operations executed, but not all necessary info saved
Failed: cannot complete
– still need to confirm all modifications undone
– If you interrupt a transaction before it is done (ex: a script of INSERT’s to load data), need to ABORT to really cancel the transaction

Savepoints

– Only works inside a transaction
Savepoints let you to do a partial rollback
– SAVEPOINT label; • marks the point
– ROLLBACK TO label;
• Rolls back to that label
• Continues execution until transaction committed or completely rolled back


next
Oversimplified solutions
Complete each transaction before starting
– schedule : an actual sequence of operations
– the above would produce a serial schedule
– Which scenario would this fix? Why don’t you like it?

Oversimplified solutions

maintain pointer to data; edit copy of data
– on abort, delete copy
– to commit, force copy to disk, update pointer to point to copy
– (this is the approach of many text editors/word processors)
– Why not use this for a database?


serial schedule
Serializability
serializable schedule : equivalent to a
– let r(i,X) mean transaction i reads value of X,
– w(i, X) to update X
– ex: r(1,A), r(2,A), w(2,A), w(1,B) is equivalent to r(1,A), w(1,B), r(2,A), w(2,A)
• so as if transaction 1 is executed before transaction 2

Tables to show serializability
Transaction 1
Transaction 2
read(A)
read(A)
write(A)
write(B)
Transaction 1
Transaction 2
read(A)
write(B)
read(A)
write(A)

Serializability
– ex: r(1,A), r(2,A), w(2,A), w(1,A) is not serializable
Transaction 1
Transaction 2
read(A)
read(A)
write(A)
write(A)

Serializability
– ex: r(2,B), r(1,A), w(2,A), w(1,B) — serializable?
Transaction 1
Transaction 2
read(B)
read(A)
write(A)
write(B)

Serializability
– reads never conflict with other reads
– reads and writes on the same object may conflict

Recoverability

maintain consistency even with failure
schedule is recoverable if system can
– schedule notation needs: • c(i) — transaction i commits • a(i) — transaction i aborts

Recoverability examples
– ex: r(1,A), r(2,A), w(2,A), w(1,B), c(1) is recoverable
Transaction 1
Transaction 2
read(A)
read(A)
write(A)
write(B)
commit

Recoverability examples
– ex: w(1,A), r(2,A), w(2,A), w(1,B), c(2) is not • why? (what could go wrong?)
Transaction 1
Transaction 2
write(A)
read(A)
write(A)
write(B)
commit

Cascading rollback

transaction may cause other aborts
schedule is recoverable, but aborting a – ex: r(1,A), r(2,A), w(2,A), w(1,B), c(2) is
cascadeless
Transaction 1
Transaction 2
read(A)
read(A)
write(A)
write(B)

Cascading rollback
– ex: r(1,A), w(1,A), r(2,A) is not
• suppose we abort transaction 1 next — why does transaction 2 need to abort?
Transaction 1
Transaction 2
read(A)
write(A)
read(A)
write(A)

Cascading rollback

ensure a cascadeless schedule?

Can set isolation level of transaction
– Default will avoid cascading rollback, but can set lower level of isolation, too
How can we restrict read operations to

Transactions and performance


Serializability affects performance
– Major companies have been dropping use of transactions for non-critical data
Transactions have overhead
– Ex: for bulk insert, should avoid having a separate transaction for each INSERT
• Start transaction before first INSERT • Commit transaction after last INSERT