CS5481 Data Engineering Tutorial 7
1. Suppose you need to sort a relation r using sort-merge and merge-join the result with an already sorted relation s. Is it possible to pipeline the output of sort-merge of r to the merge join? Explain.
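As a starting point for question 1, the sketch below models the final merge pass of sort-merge as a lazy iterator that a merge join consumes one tuple at a time. It is an illustrative assumption, not a reference solution: the names `external_sort` and `merge_join` are hypothetical, runs are kept in memory rather than on disk, and the join key is assumed to be unique in s. Note that in a real system the merge buffers and the join buffers would have to share the available memory.

```python
import heapq
from itertools import islice


def external_sort(tuples, run_size, key):
    """Sort-merge sketch: build sorted runs, then merge them lazily."""
    it = iter(tuples)
    runs = []
    while True:
        run = sorted(islice(it, run_size), key=key)  # run generation
        if not run:
            break
        runs.append(run)
    # heapq.merge returns an iterator: tuples are produced on demand,
    # so the consumer can start before the merge has finished.
    return heapq.merge(*runs, key=key)


def merge_join(left, right, key_left, key_right):
    """Merge join of two sorted inputs (join key assumed unique in right)."""
    right = iter(right)
    s = next(right, None)
    for r in left:  # pulls one sorted tuple at a time from the merge
        while s is not None and key_right(s) < key_left(r):
            s = next(right, None)
        if s is None:
            return
        if key_right(s) == key_left(r):
            yield r + s


r = [(3, "c"), (1, "a"), (2, "b"), (2, "bb")]  # unsorted relation r
s = [(1, "x"), (2, "y"), (4, "z")]             # already sorted relation s
first = lambda t: t[0]
joined = list(merge_join(external_sort(r, 2, first), s, first, first))
print(joined)
```

Only the last merge pass can feed the join in this way; run generation and any earlier merge passes must complete first.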
2. Consider the query π_{A,B,C,D}(R ⋈_{A=C} S). Given the following information:
R is 10 blocks long, and R tuples are 300 bytes long.
S is 100 blocks long, and S tuples are 500 bytes long.
R and S have no common attribute.
The block size is 1024 bytes.
Tuples do not span across blocks.
Each S tuple joins with exactly one R tuple.
The combined size of attributes A, B, C, and D is 450 bytes. A and B are in R and have a combined size of 200 bytes; C and D are in S.
a) What is the size of the final result, in terms of number of tuples and number of blocks, assuming that the number of duplicates is negligible?
b) Transform the query so that projection is done before the join.
c) Suppose that three memory blocks are available and block nested-loop join is used. Suppose that projection (and duplicate elimination) is based on sorting. Compute the cost in terms of number of block transfers for each of the following orders.
(i) Join followed by projection
(ii) Projection followed by the join.
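The arithmetic behind question 2 can be checked with a few helper functions. This is a minimal sketch under stated assumptions, not a model answer: all function names are hypothetical, every block is assumed to be full, the block nested-loop cost uses the common textbook worst-case formula (outer blocks × inner blocks + outer blocks, with one buffer block for the outer relation), and the external sort cost uses the textbook formula that excludes writing the final sorted output. How these building blocks are combined for (i) and (ii), for example whether intermediate results are written to disk, depends on the conventions used in the course.

```python
BLOCK_SIZE = 1024  # bytes, from the question


def tuples_per_block(tuple_bytes, block_bytes=BLOCK_SIZE):
    """Whole tuples that fit in one block (tuples do not span blocks)."""
    return block_bytes // tuple_bytes


def blocks_needed(num_tuples, tuple_bytes, block_bytes=BLOCK_SIZE):
    """Blocks required to store num_tuples tuples (ceiling division)."""
    per_block = tuples_per_block(tuple_bytes, block_bytes)
    return -(-num_tuples // per_block)


def bnl_join_transfers(b_outer, b_inner):
    """Assumed textbook worst case: one buffer block for the outer relation."""
    return b_outer * b_inner + b_outer


def external_sort_transfers(b, m):
    """Assumed textbook formula: b * (2 * merge_passes + 1), final write excluded."""
    runs = -(-b // m)               # initial runs of m blocks each
    passes = 0
    while runs > 1:
        runs = -(-runs // (m - 1))  # (m - 1)-way merge per pass
        passes += 1
    return b * (2 * passes + 1)


r_tuples = 10 * tuples_per_block(300)      # 3 tuples per block -> 30
s_tuples = 100 * tuples_per_block(500)     # 2 tuples per block -> 200
result_tuples = s_tuples                   # each S tuple joins exactly one R tuple
result_blocks = blocks_needed(result_tuples, 450)
unprojected_join_blocks = blocks_needed(result_tuples, 300 + 500)
proj_r_blocks = blocks_needed(r_tuples, 200)        # A, B only
proj_s_blocks = blocks_needed(s_tuples, 450 - 200)  # C, D only

print(result_tuples, result_blocks)                   # 200 100
print(bnl_join_transfers(10, 100))                    # 1010
print(bnl_join_transfers(proj_r_blocks, proj_s_blocks))  # 306
```

Using the smaller relation as the outer relation keeps the block nested-loop cost lower, which is why R (or its projection) is passed as `b_outer` here.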