Removal
Find the entry in the leaf node and remove it
▪ This may result in the node being less than half full
▪ If so, select an adjacent sibling of the node and either
▪ Redistribute values between the two nodes or
▪ Coalesce the two nodes
If the node has two adjacent siblings it may be necessary
to choose between redistributing or coalescing
▪ Redistributing values is less work but does not reduce the
height of the tree
▪ Coalescing nodes is more work and may reduce tree height
Implementations typically choose to redistribute
Reductions in height are infrequent
Siblings have the same parent
A value and pointer are taken from a sibling ▪ Either the left or the right adjacent sibling
Leaf Node
redistribute from R sibling
Leaf nodes
▪ Redistribute value and pointer in nodes
remove
▪ One of the nodes’ first search key values changes
▪ And is assigned to the corresponding value in the parent Interior nodes
new value in parent
▪ The value taken from the sibling writes over the corresponding value in the parent
▪ The original value in the parent is inserted in the node with too few entries
▪ Along with the pointer taken from the sibling
redistribute from L sibling
remove
value in parent between nodes
Interior Node
When redistribution is not possible, two nodes are combined (coalesced)
▪ Insert all of the values (and pointers) from one node in the other
▪ Re-connect links between leaves
▪ For interior nodes this results in too few values
▪ Insert the value in the parent between the coalesced nodes into the coalesced node
Make recursive call to remove
▪ Remove the value between the coalesced nodes and the pointer to the deleted node in the parent
coalescing leaves (after removal of entry)
becomes
remove value and pointer to deleted node from parent
coalescing interior nodes (after removal of entry)
becomes
▪ This, in turn, may propagate further changes up the tree
remove value and pointer to deleted node from parent
set to parent value between nodes
remove 19 n = 3
11
60
6
8
23
45
77
77
93
2
5
6
7
8
9
11
1291
21
23
31
39
45
51
60
64
use search to find the value remove the record in the data file first
remove 45 n = 3
11
60
6
8
23
45
77
77
93
2
5
6
7
8
9
11
19
21
23
31
39
4551
51
60
64
the node is less than half full: (n + 1)/2 pointers to records
remove 45 n = 3
11
60
6
8
23
45
77
77
93
2
5
6
7
8
9
11
19
21
23
31
39
51
60
64
take a value from the left sibling node
remove 45 n = 3
11
60
change the value in the parent node
6
8
23
4395
77
77
93
2
5
6
7
8
9
11
19
21
23
31
39
3591
51
60
64
take a value from the left sibling node
remove 9 n = 3
11
60
6
8
23
45
77
77
93
2
5
6
7
8
9
11
19
21
23
31
39
45
51
60
64
leaf nodes must have (n + 1)/2 values
not a sibling
so must coalesce the node with its L sibling
remove 9 n = 3
11
60
6
8
23
45
77
now remove entry and pointer from parent
77
93
2
5
6
7
8
8
9
11
19
21
23
31
39
45
51
60
64
leaf nodes must have (n + 1)/2 values
not a sibling
so must coalesce the node with its L sibling
remove 9 n = 3
11
60
these nodes have enough (2) pointers
6
8
23
45
77
now remove entry and pointer from parent
77
93
2
5
6
7
8
11
19
21
23
31
39
45
51
60
64
remove 6 n = 3
11
60
6
8
23
45
77
parent entry does not need to change
77
93
2
5
67
87
8
11
19
21
23
31
39
45
51
60
64
remove 8 n = 3
11
60
6
8
23
45
77
now remove entry and pointer from parent
77
93
2
5
7
7
8
11
19
21
23
31
39
45
51
60
64
… remove 8 …
which value? n = 3
11
60
so take a pointer from sibling
just 1 pointer
6
23
45
77
now remove entry and pointer from parent
77
93
2
5
7
11
19
21
23
31
39
45
51
60
64
… remove 8 … which value? n = 3
2113
60
11
4235
45
77
77
93
2
5
7
11
19
21
23
31
39
45
51
60
64
… remove 8 finished n = 3
23
60
11
45
77
2
5
7
11
19
21
23
31
39
45
51
60
64
77
93
remove 23 and 31? n = 3
23
60
11
45
77
2
5
7
11
19
21
23
31
39
45
51
60
64
77
93
Splitting and merging of index blocks is rare
▪ Typically the value of n will be much greater than 3!
▪ Most splits or merges are limited to two leaves and one parent
The number of disk I/Os is based on the tree height
▪ It is reasonable to assume that most B+ trees have a height of 3 to 5
▪ And one level is the root (i.e. one block) which may reside in main memory
Tree Height Example block size = 4,096
key = 8 bytes pointer = 8 bytes
n = 340 Size if tree height = 3? 2553 = 16,600,000 Size if tree height = 4?
2554 = 4,228,250,625
Some B+ tree implementations do not coalesce nodes
▪ If a leaf has too few keys and pointers it is allowed to remain
less than half full
▪ It is assumed that most DB files tend to grow not shrink
▪ It also allows efficient access to records replaced by tombstones in the data file
Insertions and removals may result in changes to parent nodes which must therefore be accessed
▪ Insertion and removal can be implemented recursively
▪ Or node structure can be changed to include a pointer to the
parent node parent
ppp0 K0 p1 K1 p2 K2 p3
The search algorithm assumes that all search key entries with a given key are in the same node
▪ If duplicate values are allowed this may not be the case
Three methods for dealing with duplicate values are
▪ Maintain overflow pages for duplicates where necessary
▪ Include the rid as part of the search key
▪ Which ensures that there will not be duplicates!
▪ Modify the tree to change the meaning of interior nodes
▪ Keys in interior nodes represent new keys, that is keys where the
same value does not appear to the left ▪ In some cases this requires null keys
But increases search key size
find 11 and 19 and 23
i.e. no new keys in 2nd child
never follow this pointer!
19
6
–
31
57
2
5
6
11
11
19
23
23
23
23
31
41
57
64