CS计算机代考程序代写 algorithm COMP9315 21T1

COMP9315 21T1
Exercises 07
Implementing Join: Nested-loop, Sort-merge, Hash
DBMS Implementation
[Show with no answers]   [Show with all answers]
1.
2. Does the buffer replacement policy matter for sort-merge join? Explain your answer. 
[show answer]


3.
4. Suppose that the relation R (with 150 pages) consists of one attribute a and S (with 90 pages) also consists of one attribute a. Determine the optimal join method for processing the following query: 
select * from R, S where R.a > S.a
5. 
Assume there are 10 buffer pages available to process the query and there are no indexes available. Assume also that the DBMS only has available the following join methods: nested-loop, block nested loop and sort-merge. Determine the number of page I/Os required by each method to work out which is the cheapest. 
[show answer]


6.
7. [Ramakrishnan, exercise 12.4] Consider the join Join[R.a=S.b](R,S) and the following information about the relations to be joined:
◦ Relation R contains 10,000 tuples and has 10 tuples per page
◦ Relation S contains 2,000 tuples and also has 10 tuples per page
◦ Attribute b of relation S is the primary key for S
◦ Both of the relations are stored as simple heap files
◦ Neither relation has any indexes built on it
◦ There are 52 buffer pages available
8. Unless otherwise noted, compute all costs as number of page I/Os, except that the cost of writing out the result should be uniformly ignored (it’s the same for all methods). 

a. What is the cost of joining R and S using page-oriented simple nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged? 
[show answer]


b. What is the cost of joining R and S using block nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged? 
[show answer]


c. What is the cost of joining R and S using sort-merge join? What is the minimum number of buffer pages required for this cost to remain unchanged? 
[show answer]


d. What is the cost of joining R and S using grace hash join? What is the minimum number of buffer pages required for this cost to remain unchanged? 
[show answer]


e. What would be the lowest possible I/O cost for joining R and S using any join algorithm? How much buffer space would be needed to achieve this cost? 
[show answer]


f. What is the maximum number of tuples that the join of R and S could produce, and how many pages would be required to store this result? 
[show answer]


g. How would your answers to the above questions change if you are told that R.a is a foreign key that refers to S.b? 
[show answer]


9.
10. [Ramakrishnan, exercise 12.5] Consider the join of R and S described in the previous question:
a.
b. With 52 buffer pages, if unclustered B+ tree indexes existed on R.a and S.b, would either provide a cheaper alternative for performing the join (e.g. using index nested loop join) than a block nested loops join? Explain. 

i. Would your answer change if only 5 buffer pages were available?
ii. Would your answer change if S contained only 10 tuples instead of 2,000 tuples?
c. [show answer]



d. With 52 buffer pages, if clustered B+ tree indexes existed on R.a and S.b, would either provide a cheaper alternative for performing the join (e.g. using index nested loop join) than a block nested loops join? Explain. 

i. Would your answer change if only 5 buffer pages were available?
ii. Would your answer change if S contained only 10 tuples instead of 2,000 tuples?
e. [show answer]



f. If only 15 buffers were available, what would be the cost of sort-merge join? What would be the cost of hash join? 
[show answer]



g. If the size of S were increased to also be 10,000 tuples, but only 15 buffer pages were available, what would be the cost for sort-merge join? What would be the cost of hash join? 
[show answer]



h. If the size of S were increased to also be 10,000 tuples, and 52 buffer pages were available, what would be the cost for sort-merge join? What would be the cost of hash join? 
[show answer]


11.
12. Consider performing the join:
select * from R, S where R.x = S.y
where we have bR = 1000, bS = 5000 and where R is used as the outer relation. 
Compute the page I/O cost of performing this join using hybrid hash join:
a. if we have N=100 memory buffers available
[show answer]


b. if we have N=512 memory buffers available
[show answer]


c. if we have N=1024 memory buffers available
[show answer]