Sample Solutions for Quiz 4, Question 2, Sort-Merge Join
3 MARKS
Marking Scheme:
Students don’t have to show the number of times that we break down the sorted-
run calculations (e.g., 3 passes because …)
Just as an example, they can provide answers in the format:
60,000 * 2 * 2 or 60,000(2)(2) or 240,000 page I/Os
Don’t double-deduct for propagated errors unless it trivializes the solution—
context dependent.
Errors:
-1 once only if they used the wrong buffer size B, and it turned out to be only 1 pass for each table
-0.5 once only if they used the wrong buffer size B, and it turned out to be 2 or more passes for at least one of the tables.
Don’t worry about the B-1 in the denominator unless it’s for a boundary case, like where 100 pages need to fit into B=100 buffers because then, the B-1 is very important.
-0.5 for an arithmetic error that results in an incorrect solution (don’t deduct more than two such penalties)
Version 1: Voter has 60,000 pages. Vote has 80,000 pages. B = 400.
a) [1 mark] Sort the Voter table:
60,000 * 2 (for read/write) * 2 (passes) = 240,000 page I/Os.
i. 2 passes come from ceiling(60,000 / 400) = 150 sorted runs, and then ceiling(150 / (400-1)) = 1 … so, we stop at 2 passes.
b) [1 mark] Sort the Vote table:
80,000 * 2 (for read/write) * 2 (passes) = 320,000 page I/Os.
i. 2 passes come from ceiling(80,000 / 400) = 200 sorted runs, and then ceiling(200 / (400-1)) = 1 … so, we stop at 2 passes.
c) [0.5 marks] Merge the 2 tables: 60,000 + 80,000 = 140,000 page I/Os
d) [0.5 marks] Grand Total = 240,000 + 320,000 + 140,000 = 700,000 page I/Os
1
Version 2: Voter has 50,000 pages. Vote has 40,000 pages. B = 200.
a) [1 mark] Sort the Voter table:
50,000 * 2 (for read/write) * 3 (passes) = 300,000 page I/Os.
i. 2 passes come from ceiling(50,000 / 200) = 250 sorted runs, and then ceiling(250 / (200-1)) = 2, and finally ceiling(2 / (200-1)) = 1 … so, we stop at 3 passes.
b) [1 mark] Sort the Vote table:
40,000 * 2 (for read/write) * 3 (passes) = 240,000 page I/Os.
i. 3 passes come from ceiling(40,000 / 200) = 200 sorted runs, and then ceiling(200 / (200-1)) = 2, and finally ceiling(2 / (200-1)) = 1 … so, we stop at 3 passes.
c) [0.5 marks] Merge the 2 tables: 50,000 + 40,000 = 90,000 page I/Os
d) [0.5 marks] Grand Total = 300,000 + 240,000 + 90,000 = 630,000 page I/Os
Version 3: Voter has 30,000 pages. Vote has 210,000 pages. B = 300.
a) [1 mark] Sort the Voter table:
30,000 * 2 (for read/write) * 2 (passes) = 120,000 page I/Os.
i. 2 passes come from ceiling(30,000 / 300) = 100 sorted runs, and then ceiling(100 / (300-1)) = 1 … so, we stop at 2 passes.
b) [1 mark] Sort the Vote table:
210,000 * 2 (for read/write) * 3 (passes) = 1,260,000 page I/Os.
i. 3 passes come from ceiling(210,000 / 300) = 700 sorted runs, and then ceiling(700 / (300-1)) = 3, and finally ceiling(3 / (300-1)) = 1 … so, we stop at 3 passes.
c) [0.5 marks] Merge the 2 tables: 30,000 + 210,000 = 240,000 page I/Os
d) [0.5 marks] Grand Total = 120,000 + 1,260,000 + 240,000 = 1,620,000 page
I/Os
2