CS 4320/Homework 4
Question 1 [30 Points] Decide for each of the following schedules which of the
following properties it has: – Serial
– Conflict serializable
Copyright By PowCoder代写 加微信 powcoder
– View serializable
– Recoverable
– Avoids cascading aborts
Analyze the following schedules with regards to those properties:
1. W1(A) W2(B) R3(A) C1 W3(D)
2. R1(A) R3(C) R2(B) C2 A1 W3(A)
3. W1(A) W2(B) R3(A) W1(B) C3 W1(A)
Justify the decision for each combination of property and schedule in one to two sentences.
Schedule 1 is serial (with transaction order T1, T2, then T3). Since serial schedules are conflict-serializable, Schedule 1 is conflict-serializable. Since serial schedules are view- serializable, Schedule 1 is also view-serializable.
For Schedule 1, only T1 commits. T1 does not read any data, hence the schedule is recoverable. On the other side, T3 reads a value (A) written by T1 before T1 commits. Hence, the schedule does not avoid cascading aborts. As strict schedules are a subset of schedules avoiding cascading aborts, the schedule is not strict either.
Schedule 2 is not serial (operation of T2 between operations of T3). The schedule is conflict-serializable. The corresponding conflict graph only contains one edge from T1 to T3, therefore no cycle. Hence, the schedule is also view-serializable.
For Schedule 2, only T2 commits. T2 reads only B which has not been written by any transaction. Hence, the schedule is recoverable. No transaction reads data written by another transaction. Hence, the schedule avoids cascading aborts. Also, no schedule overrides values written by another transaction. Hence, the schedule is strict.
Schedule 3 is not serial (operations of T2 and T3 between operations from T1). The conflict graph contains three nodes for the three transactions. It contains edges from T1 to T3 (conflict on A), from T2 to T1 (conflict on B), and from T3 to T1 (second conflict on A). Hence, the conflict graph contains a cycle (covering T3 and T1). This means that the schedule is not conflict-serializable. Finally, we test whether the
schedule is view-serializable. Putting the operations from T1 before the ones from T3 does not lead to a view-equivalent schedule (since R3(A) sees now the second value written by T1 for A, not the first value). Putting the operations from T3 before the ones from T1 does not lead to a view-equivalent schedule either (since now, T3 reads the initial value of A before the first write by T1). Hence, the schedule is not view- serializable. Note that, alternatively, we could have proven first that the schedule is not view-serializable (which implies the other negative results).
Question 2 [30 Points] Consider the following log entries (the initial number is the LSN), produced by the ARIES algorithm:
10 T1 Updates P1
20 T2 Updates P1
30 T3 Updates P1
40 T1 Updates P5
50 T1 Updates P2
Page Steal – Page P5 gets written to disk 60 T2 Commits
70 T1 Commits
In the following questions, we analyze the run time behavior of the ARIES algorithm (before the system crash).
A) Assume the ARIES algorithm only writes log entries to hard disk if required by the rules of write-ahead logging. The initial value of flushedLSN is 0. For each log entry above, describe the value of flushedLSN after generating the log entry at run time.
We need to write log entries to hard disk whenever a transaction commits or pages get written to hard disk. Hence, we must update flushedLSN to 40 before writing page P5 to hard disk. Also, we must update flushedLSN to 60 at LSN 60 and to 70 at LSN 70.
B) What is the value of pageLSN for each of the pages referenced in the log above (assuming that all pageLSN values are initially set to zero)? For each page, outline the value of pageLSN for the hard disk version and for the in-memory version of the page after processing all log entries.
P1 is updated last at LSN 30 and not written to hard disk. Hence, the in-memory version has pageLSN 30 while the hard disk version has LSN 0. P5 is updated at LSN 40 and later written to hard disk. Hence, disk and in-memory version have finally a pageLSN of 40. P2 is changed at LSN 50 and not written to disk. Hence, the in- memory version has pageLSN 50 (while the disk version has pageLSN 0).
C) What is the content of the transaction table after processing all log entries?
The transaction table contains transactions T1, T2, and T3. T1 and T2 have status committed while T3 has status running. The lastLSN counter of T1 is 70, the one of T2 is 60, and the one of T3 is 30.
D) What is the content of the dirty page table after processing all log entries?
The dirty page table contains pages P1 with recLSN 10 and P2 with recLSN 50. P5 is written to disk and therefore not in the dirty page table.
We assume the system crashes after generating the log entries above. In the following questions, we analyze the behavior of ARIES at recovery (after the system comes back online after the crash).
E) What is the state of the transaction table and dirty page table after the analysis phase of recovery?
The state of both tables is the same as before since all log entries were written to hard disk. [Also OK: P5 is in DP table assuming that the write to disk was not in the log]
F) Which operations from the log need to be redone during the redo phase?
LSN 10, 20, and 30 are redone since P1 is in the dirty page table with recLSN 10 and pageLSN 0 on disk. We can avoid redoing LSN 40 as it relates to P5 which was written to hard disk. We need to redo LSN 50.
G) Which compensation log entries are written during the undo phase?
We only need to undo T3 which was running at the system crash. Hence, the associated CLR entry refers to the update of page P1 by T3 (LSN 30).
Submit answers to all questions as a single .pdf document on CMS. This homework can be submitted by groups of up to three students.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com