Discussion Session on
Indexing and Query Processing
Amirhossein Aleyasen
Outline
● Q1. B+ Tree
● Q2. Hash Tables
● Q4. Query Processing
Q1. B+ Tree
Consider the following B+ tree, where the index blocks are identified as B1, B2,B3,
and the data entry blocks are B4,…, B9. (d = 2)
● For each of the following lookup operations, describe which blocks would be read and in which
order.
● For each of the insert and delete operations, apply them to the given B+ tree and describe (or draw)
the resulting tree.
1. Lookup record with key 15.
2. Lookup record with key 2.
3. Lookup records with keys in the range 20 to 40.
4. Insert records with keys 5 and 7.
5. Delete record with key 22.
Q1.1 B+ Tree
Q1.1 B+ Tree (Solution)
Lookup record with key 15.
B1, B3, B7
Q1.2 B+ Tree (Solution)
Lookup record with key 2.
B1, B2, B4
Q1.3 B+ Tree (Solution)
Lookup records with keys in the range 20 to 40.
B1, B3, B8, B9
Q1.4 B+ Tree
Insert records with keys 5 and 7.
Review: B+ Tree Insertion
● Perform a search to determine what bucket the new record should go into.
● If the bucket is not full (at most b-1 entries after the insertion), add the
record.
● Otherwise, before inserting the new record
● split the bucket.
● original node has ⎡(L+1)/2⎤ items
● new node has ⎣(L+1)/2⎦ items
● Move ⎡(L+1)/2⎤-th key to the parent, and insert the new node to the
parent.
● Repeat until a parent is found that need not split.
● If the root splits, treat it as if it has an empty parent and split as outline
above.
Q1.4 B+ Tree (Solution)
Insert records with keys 5 and 7.
Q1.4 B+ Tree (Solution)
Insert records with keys 5 and 7.
Insert 5
Q1.4 B+ Tree (Solution)
Insert records with keys 5 and 7.
Insert 7
Q1.5 B+ Tree
Delete record with key 22.
Review: B+ Tree Deletion
● Start at root, find leaf L where entry belongs.
● Remove the entry.
● If L is at least half-full, done
● If L has fewer entries than it should,
● If sibling (adjacent node with same parent as L) is more than
half-full, re-distribute, borrowing an entry from it.
● Otherwise, sibling is exactly half-full, so we can merge L and
sibling.
● If merge occurred, must delete entry (pointing to L or sibling) from parent of
L.
● Merge could propagate to root, decreasing height.
Q1.5 B+ Tree (Solution)
Delete record with key 22.
Q2: Hash Table
Consider the following Extensible
Hashing index and answer the questions
(note that keys are least significant bits)
1. Show the index after inserting an
entry with hash value 68.
2. Show the index after inserting
entries with hash values 17 and 69
into the original tree.
3. Show the index after deleting the
entry with hash value 21 into the
original tree.
Q2: Hash Table (Solution) – inserting 68
Q2: Hash Table – inserting 17 & 69
Q2: Hash Table (Solution) – inserting 17 & 69
Q2: Hash Table – deleting 21
Q2: Hash Table (Solution) – deleting 21
Q3. Query Processing
Consider the join X⨝X.a=Y.bY given the following information.
● The cost metric is the number of page I/Os.
● X contains 5000 tuples and has 10 tuples/page.
● Y contains 1000 tuples and has 10 tuples/page.
● Neither relation has any index.
● 52 buffer pages are available.
Q3.1 Query Processing
What is the cost of X⨝X.a=Y.bY using a block nested loops join (with X as the outer
relation)?
Q3.1 Query Processing (Solution)
What is the cost of X⨝X.a=Y.bY using a block nested loops join (with X as the outer
relation)?
Q3.1 Query Processing (Solution)
B(X) = 5000/10 = 500
B(Y) = 1000/10 = 100
B(X) + B(Y)B(X)/(M-2) = 500 + 500 * 100 / (52 – 2) = 1500 IOs
Q3.2 Query Processing
What is the cost of X⨝X.a=Y.bY using a block nested loops join (with Y as the outer
relation)?
Q3.2 Query Processing (Solution)
What is the cost of X⨝X.a=Y.bY using a block nested loops join (with Y as the outer
relation)?
Q3.2 Query Processing (Solution)
B(X) = 5000/10 = 500
B(Y) = 1000/10 = 100
M = 52
B(Y) + B(X)B(Y)/(M-2) = 100 + 500 * 100 / (52 – 2) = 1100 IOs
So it is better to iterate over the smaller relation first.
Q3.3 Query Processing
What is the cost of X⨝X.a=Y.bY using a sort merge join?
Q3.3 Query Processing (Solution)
What is the cost of X⨝X.a=Y.bY using a sort merge join?
Q3.3 Query Processing (Solution)
What is the cost of X⨝X.a=Y.bY using a sort merge join?
Q3.3 Query Processing (Solution)
B(X) = 5000/10 = 500
B(Y) = 1000/10 = 100
M = 52
We can do both solutions.
Take 1: 5 (500 + 100) = 3000 IOs
Take 2: 3 (500 + 100) = 1800 IOs
Q3.4 Query Processing
What is the cost of X⨝X.a=Y.bY using a hash join?
Q3.4 Query Processing (Solution)
What is the cost of
X⨝X.a=Y.bY using a hash
join?
Q3.4 Query Processing (Solution)
B(X) = 5000/10 = 500
B(Y) = 1000/10 = 100
M = 52
3 (500 + 100) = 1800 IOs