CS143: B+Tree
Professor Junghoo “John” Cho
• Most popular index structure in RDBMS
Copyright By PowCoder代写 加微信 powcoder
• Advantage
• Suitable for dynamic updates
• Balanced
• Minimum space usage guarantee
• Disadvantage
• Non-sequential index blocks
B+Tree (n=3)
• n:#ofpointer spaces in a node
• Balanced:Allleaf nodes are at the same level
Leaf Node (n=3)
Last pointer: to the next leaf node
points to tuple
• All pointers (except the last one) point to tuples • At least half of the pointer spaces are used.
(moreprecisely, (𝑛+1)/2 pointers)
Non-leaf Node (n=3)
Tuples with key <23
• Points to the nodes one-level below - No direct pointers to tuples
• At least half of the pointer spaces used (precisely, - except root, where at least 2 pointer spaces used
Tuples with 23£ key <56
Tuples with 56£ key
Space Usage Guarantee
• B+Tree nodes have at least
• Leaf (non-root): (𝑛 + 1)/2 pointers, (𝑛 + 1)/2 − 1 keys • Non-leaf (non-root): 𝑛/2 pointers, 𝑛/2 − 1 keys
• Root: 2 pointers, 1 key
Search on B+tree
• Find 30, 60, 70?
• Find a greater key and follow the link on the left (Algorithm: Figure 14.11 on textbook)
B+Tree Insertion
1. no overflow
2. leaf overflow
3. non-leaf overflow
4. new root
1. No Overflow
• Insert 60
2. Leaf Overflow
• Insert 55
Overflow! 55
• No space to store 55
2. Leaf Overflow
• Insert 55
Overflow! 55
• Split the leaf into two. Put the keys half and half
2. Leaf Overflow
• Insert 55
• Split the leaf into two. Put the keys half and half
2. Leaf Overflow
• Insert 55
• Copy the first key of the new node to parent
2. Leaf Overflow
• Insert 55
No overflow. Stop
• Q: After split, leaf nodes always half full?
3. Non-leaf Overflow
• Insert 52
Leaf overflow. Split and copy the first key of the new node
3. Non-leaf Overflow
• Insert 52
3. Non-leaf Overflow
• Insert 52
3. Non-leaf Overflow
• Insert 52
3. Non-leaf Overflow
• Insert 52
Split the node into two. Move up the key in the middle.
3. Non-leaf Overflow
• Insert 52
Middle key
3. Non-leaf Overflow
• Insert 52
No overflow. Stop
Q: After split, non-leaf at least half full?
• Insert 25
• Insert 25
• Insert 25
Split and move up the mid-key. Create new root
• Insert 25
• Q: At least 2 pointers at root?
B+Tree Insertion
• Leaf node overflow
• The first key of the new node is copied to the parent
• Non-leaf node overflow
• The middle key is moved to the parent
• Detailed algorithm: Figure 14.17
B+Tree Deletion
1. No underflow
2. Leaf underflow (coalesce with neighbor)
3. Leaf underflow (redistribute with neighbor)
4. Non-leaf underflow (coalesce with neighbor)
5. Non-leaf underflow (redistribute with neighbor)
6. Tree depth reduction
In the examples, n = 4
• Underflow for non-leaf when fewer than 𝑛/2 = 2 pointers
• Underflowforleafwhenfewerthan (𝑛+1)/2 =3pointers • Nodes are labeled as a, b, c, d, ...
1. No Underflow
• Delete 25
1. No Underflow
Underflow?
• Delete 25
• Underflow? Min 3 ptrs. Currently 3 ptrs
2. Coalesce Leaf with Neighbor
• Delete 50
2. Coalesce Leaf with Neighbor
Underflow?
• Delete 50
• Underflow? Min 3 ptrs, currently 2.
2. Coalesce Leaf with Neighbor
underflow!
• Delete 50 Can be merged? • Try to merge with a sibling
2. Coalesce Leaf with Neighbor
• Delete 50
• Merge c and d. Move everything on the right to the left.
2. Coalesce Leaf with Neighbor
• Delete 50
• Once everything is moved, delete d
2. Coalesce Leaf with Neighbor
• Delete 50
• After leaf node merge,
• From its parent, delete the pointer and key to the deleted node
2. Coalesce Leaf with Neighbor
Underflow? e
• Delete 50
• Check underflow at a. Min 2 ptrs, currently 3
3. Redistribute Leaf with Neighbor
• Delete 50
3. Redistribute Leaf with Neighbor
Underflow? Can be merged?
• Delete 50
• Underflow? Min 3 ptrs, currently 2
• Check if d can be merged with its sibling c or e
• If not, redistribute the keys in d with a sibling • Say,withc
3. Redistribute Leaf with Neighbor
Redistribute
• Delete 50
• Redistribute c and d, so that nodes c and d are roughly
“half full”
• Move the key 30 and its tuple pointer to the d
3. Redistribute Leaf with Neighbor
• Delete 50
• Update the key in the parent
3. Redistribute Leaf with Neighbor
Underflow?
• Delete 50
• No underflow at a. Done.
4. Coalesce Non-Leaf with Neighbor
• Delete 20
• Underflow!Mergedwithe.
• Move everything in the right to the left
4. Coalesce Non-Leaf with Neighbor
• Delete 20
• From the parent node, delete pointer and key to the deleted node
4. Coalesce Non-Leaf with Neighbor
b underflow!
d Can be merged?
• Delete 20
• Underflow at b? Min 2 ptrs, currently 1.
• Try to merge with its sibling.
• Nodesbandc:3ptrsintotal.Max4ptrs.
• Mergebandc.
4. Coalesce Non-Leaf with Neighbor
• Delete 20
• Merge b and c
• Pull down the mid-key 50 in the parent node • Move everything in the right node to the left.
• Very important: when we merge non-leaf nodes, we always pull down the mid-key in the parent and place it in the merged node.
4. Coalesce Non-Leaf with Neighbor
• Delete 20
• Merge b and c
• Pull down the mid-key 50 in the parent node • Move everything in the right node to the left.
• Very important: when we merge non-leaf nodes, we always pull down the mid-key in the parent and place it in the merged node.
4. Coalesce Non-Leaf with Neighbor
• Delete 20
– Delete pointer to the merged node.
4. Coalesce Non-Leaf with Neighbor
70 – Underflow at a? Min 2 ptrs. Currently 2. Done.
• Delete 20
5. Redistribute Non-Leaf with Neighbor
• Delete 20
• Underflow!Mergedwithe.
5. Redistribute Non-Leaf with Neighbor
• Delete 20
• After merge, remove the key and ptr to the deleted node from the parent
5. Redistribute Non-Leaf with Neighbor
b underflow!
d Can be merged?
• Delete 20
• Underflow at b? Min 2 ptrs, currently 1.
• Merge b with c? Max 4 ptrs, 5 ptrs in total.
• If cannot be merged, redistribute the keys with a sibling. • Redistributebandc
5. Redistribute Non-Leaf with Neighbor
redistribute
Redistribution at a non-leaf node is done in two steps.
Step 1: Temporarily, make the left node b “overflow” by pulling down the mid-key and moving everything to the left.
5. Redistribute Non-Leaf with Neighbor
redistribute
b temporary overflow dfg
• Delete 20
Step 2: Apply the “overflow handling algorithm” (the same
algorithm used for B+tree insertion) to the overflowed node • Detailed algorithm in the next slide
5. Redistribute Non-Leaf with Neighbor
redistribute
• Delete 20
Step 2: “overflow handling algorithm”
• Pick the mid-key (say 90) in the node and move it to parent.
• Move everything to the right of 90 to the empty node c.
5. Redistribute Non-Leaf with Neighbor
• Delete 20
• Underflow at a? Min 2 ptrs, currently 3. Done
6. Reduce Tree Depth
• Delete 20
• Underflow! Merge d with e.
• Move everything in the right node to the left
6. Reduce Tree Depth
• Delete 20
• From the parent node, delete pointer and key to the deleted node
6. Reduce Tree Depth
• Delete 20
• Merge b and c
• Pull down the mid-key 50 in the parent node • Move everything in the right node to the left.
6. Reduce Tree Depth
• Delete 20
• After merging b and c, remove empty root node • Tree depth is decreased by one
6. Reduce Tree Depth
• Delete 20
Important Points
• Remember:
• For leaf node merging, we delete the mid-key from the parent
• For non-leaf node merging/redistribution, we pull down the mid-key from their parent.
• Exact algorithm: Figure 14.21
Where does n come from?
• n determined by
• Size of a node
• Size of search key
• Size of an index pointer
• Q: 1024B node, 10B key, 8B ptràn?
Range Search on B+tree
• SELECT *
FROM Student WHERE sid > 60?