COMP20007 Design of Algorithms
B-trees
Daniel Beck
Lecture 15
Semester 1, 2021
Primary vs. Secondary Memories
• Primary memory (RAM)
  • Key comparisons are the most significant operation.
  • The paradigm we've seen so far.
• Secondary memory (hard drive)
  • Disk accesses are the most significant operation.
  • Motivates the use of data structures that minimise disk accesses.
BSTs and Disk accesses
2–3 trees
• Traversing between nodes requires a new disk access: expensive.
• Traversing within a node requires searching a contiguous array: cheap.
• 2–3 trees are a good fit because their height is lower than that of a standard BST.
• Can we extend them to reduce their height even more?
• Just allow even more keys per node!
B-trees
• A generalisation of 2–3 trees.
• Defined in terms of an order m, which dictates the minimum and maximum number of children (and therefore keys) per node.
• There is no single accepted definition of order; we adopt the convention used in Levitin.
• A 2–3 tree is a B-tree of order 3.
B-tree – Example
A B-tree of order 6 (and also orders 5 and 4).
(Figure taken from the Cormen book “Introduction to Algorithms”)
B-tree – Definitions
Given an order m:
• The root is either a leaf or has between 2 and m children.
• Each node that is not the root or a leaf has between ⌈m/2⌉ and m children (see the node sketch below).
  • And therefore between ⌈m/2⌉ − 1 and m − 1 keys.
• The tree is perfectly balanced: all leaves are at the same level.
Remember: all operations are still O(log n).
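To make the constraints concrete, here is a minimal sketch of a node layout in C. It is for illustration only: the names (btree_node, ORDER, MIN_KEYS, MAX_KEYS) are hypothetical, keys are plain ints, and the order is fixed at compile time.

```c
#include <stdbool.h>

#define ORDER 6                          /* order m: maximum number of children per node */
#define MIN_KEYS ((ORDER + 1) / 2 - 1)   /* ceil(m/2) - 1: minimum keys in a non-root, non-leaf node */
#define MAX_KEYS (ORDER - 1)             /* maximum keys in any node */

typedef struct btree_node {
    int nkeys;                           /* number of keys currently stored */
    int keys[MAX_KEYS];                  /* keys, kept in sorted order */
    struct btree_node *child[ORDER];     /* children; keys in child[i] lie between keys[i-1] and keys[i] */
    bool is_leaf;                        /* true if the node has no children */
} btree_node;
```

On disk, one such node would typically be sized to fill a whole block, which is exactly what makes large orders attractive.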
B-tree – Search
• Search is done in a similar way to a standard BST.
  • With an additional search within each node.
  • Since keys are kept sorted, binary search can be used (a sketch follows below).
• Practical property: B-trees are very efficient when searching for a range of values.
  • Disk reads typically fetch full sectors in a single operation, storing all the data in the disk cache.
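A sketch of this search, assuming the hypothetical btree_node layout from above. For brevity it scans the keys within a node linearly; a binary search over keys[0..nkeys-1] would work the same way.

```c
#include <stddef.h>   /* NULL */

/* Return the node containing key k (searching from node x), or NULL if absent.
 * Each step down to a child is what would cost one disk access on secondary
 * storage; the scan inside the node happens in memory. */
btree_node *btree_search(btree_node *x, int k) {
    while (x != NULL) {
        int i = 0;
        while (i < x->nkeys && k > x->keys[i])
            i++;                          /* position of the first key >= k */
        if (i < x->nkeys && k == x->keys[i])
            return x;                     /* key found in this node */
        if (x->is_leaf)
            return NULL;                  /* no children left to descend into */
        x = x->child[i];                  /* descend: one more node, one more disk access */
    }
    return NULL;
}
```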
B-tree – Insertion
• The main risk when inserting a key is ending up with a node that has more keys than allowed.
• As in 2–3 trees, we split a node when this happens (a split sketch follows below).
  • The median key is promoted to the parent node.
  • This in turn can trigger further splits all the way up.
    • Requires more disk accesses → expensive.
  • This can be mitigated by preemptively splitting full nodes during an insertion.
• The height of the tree only increases when the root is split.
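A sketch of the split step under the same assumptions (hypothetical btree_node from above, even ORDER as defined there, error handling omitted). The full child x->child[i] is split in two and its median key is promoted into the parent x; the caller is assumed to have ensured x itself is not full, for example by splitting preemptively on the way down.

```c
#include <stdlib.h>

/* Split the full child x->child[i] (call it y): its median key moves up into x
 * and the upper half of y moves into a new right sibling z.
 * Assumes ORDER is even, so MAX_KEYS is odd and both halves keep MIN_KEYS keys. */
void btree_split_child(btree_node *x, int i) {
    btree_node *y = x->child[i];
    btree_node *z = calloc(1, sizeof *z);
    int mid = MAX_KEYS / 2;                     /* index of the median key in y */

    z->is_leaf = y->is_leaf;
    z->nkeys = MAX_KEYS - mid - 1;
    for (int j = 0; j < z->nkeys; j++)          /* copy the upper half of y's keys to z */
        z->keys[j] = y->keys[mid + 1 + j];
    if (!y->is_leaf)
        for (int j = 0; j <= z->nkeys; j++)     /* and the matching children */
            z->child[j] = y->child[mid + 1 + j];
    y->nkeys = mid;                             /* y keeps the lower half */

    for (int j = x->nkeys; j > i; j--)          /* make room for z in the parent */
        x->child[j + 1] = x->child[j];
    x->child[i + 1] = z;
    for (int j = x->nkeys - 1; j >= i; j--)     /* make room for the promoted key */
        x->keys[j + 1] = x->keys[j];
    x->keys[i] = y->keys[mid];                  /* promote the median key */
    x->nkeys++;
}
```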
B-tree – Insertion
Insert B
B-tree – Insertion
Insert Q (this will trigger a split).
B-tree – Insertion
Insert L
B-tree – Insertion
Insert F
Here the root is split preemptively, allowing the insertion to be completed in a single pass down the tree and reducing disk accesses.
B-tree – Deletion
• In deletion, the main risk is ending up with a node that has fewer than the minimum number of keys.
• Two operations solve this:
  • If a sibling has "spare" keys, perform a rotation (a sketch follows below).
  • If all siblings have the minimum number of keys, perform a merge.
• Merges can also be done preemptively to reduce disk accesses.
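A sketch of the rotation repair, again assuming the hypothetical btree_node layout from above: x->child[i] has dropped below the minimum and its left sibling x->child[i-1] has a spare key. The symmetric case (borrowing from the right sibling) and the merge are analogous and omitted.

```c
/* Rotation: the left sibling's last key moves up into the parent, and the
 * parent's separator key moves down into the deficient child. */
void btree_borrow_from_left(btree_node *x, int i) {
    btree_node *c = x->child[i];        /* deficient child */
    btree_node *s = x->child[i - 1];    /* left sibling with a spare key */

    for (int j = c->nkeys; j > 0; j--)            /* shift c's keys right by one */
        c->keys[j] = c->keys[j - 1];
    c->keys[0] = x->keys[i - 1];                  /* separator moves down into c */
    if (!c->is_leaf) {
        for (int j = c->nkeys + 1; j > 0; j--)    /* shift c's children right by one */
            c->child[j] = c->child[j - 1];
        c->child[0] = s->child[s->nkeys];         /* adopt the sibling's last subtree */
    }
    c->nkeys++;

    x->keys[i - 1] = s->keys[s->nkeys - 1];       /* sibling's last key moves up */
    s->nkeys--;
}
```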
B-tree – Deletion
Delete F
B-tree – Deletion
Delete M
This will trigger a rotation (the left child has spare keys).
B-tree – Deletion
Delete G
This will trigger a merge (both children have the minimum number of keys).
B-tree – Deletion
Delete D
This will also trigger a merge, but because an internal node has the minimum number of keys (a preemptive merge).
B-tree – Deletion
Tree shrinks
B-tree – Deletion
Delete B
This will trigger a rotation.
B-tree – Analysis
• The bounds for operations on a B-tree are all O(log n).
• However, it is the number of disk accesses that is important.
  • This is proportional to the height of the B-tree.
• How many nodes can you have in a B-tree of order m and height h?
B-tree – Analysis
For a B-tree of order m and height h:

n \ge 1 + \sum_{i=1}^{h-1} 2\lceil m/2 \rceil^{i-1} \left( \lceil m/2 \rceil - 1 \right) + 2\lceil m/2 \rceil^{h-1}

n \ge 1 + 2 \sum_{i=1}^{h-1} \left( \lceil m/2 \rceil^{i} - \lceil m/2 \rceil^{i-1} \right) + 2\lceil m/2 \rceil^{h-1}

n \ge 1 + 2 \left( \lceil m/2 \rceil^{h-1} - \lceil m/2 \rceil^{0} \right) + 2\lceil m/2 \rceil^{h-1}

n \ge 4 \lceil m/2 \rceil^{h-1} - 1
B-tree – Analysis

4 \lceil m/2 \rceil^{h-1} - 1 \le n

\lceil m/2 \rceil^{h-1} \le \frac{n+1}{4}

h - 1 \le \left\lfloor \log_{\lceil m/2 \rceil} \frac{n+1}{4} \right\rfloor

h \le \left\lfloor \log_{\lceil m/2 \rceil} \frac{n+1}{4} \right\rfloor + 1
B-tree – Analysis
Assume 100 million records:

order m             50   100   250
upper bound on h     6     5     4

In practice: very high orders, with a usual height upper bound of 3. (A quick check of these numbers follows below.)
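The table can be reproduced from the bound h ≤ ⌊log_{⌈m/2⌉}((n + 1) / 4)⌋ + 1 derived above. Here is a small C sketch (function names are illustrative only; compile with -lm).

```c
#include <math.h>
#include <stdio.h>

/* Upper bound on the height of a B-tree of order m holding n keys:
 * h <= floor(log_{ceil(m/2)}((n + 1) / 4)) + 1. */
int btree_height_bound(double n, int m) {
    double half = ceil(m / 2.0);
    return (int)floor(log((n + 1.0) / 4.0) / log(half)) + 1;
}

int main(void) {
    /* Reproduces the table: 100 million records, orders 50, 100 and 250. */
    int orders[] = {50, 100, 250};
    for (int i = 0; i < 3; i++)
        printf("m = %3d -> h <= %d\n", orders[i], btree_height_bound(1e8, orders[i]));
    return 0;   /* prints 6, 5 and 4 */
}
```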
B-tree – Extensions
• B*-tree: reduces splits at insertion time by using "spills".
  • When a node has too many keys, try a rotation with the left or right sibling instead of splitting (as long as there is a sibling that is not yet full, i.e. with fewer than m − 1 keys).
• B+-tree: internal nodes contain only keys; records are stored at the leaves.
  • Keys appear a second time at the leaves.
  • This is the version explained in the Levitin book.
  • Sometimes the leaves are connected as a linked list: useful when reading large blocks of data (see the range-scan sketch below).
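To illustrate why linked leaves help, here is a small sketch assuming a hypothetical B+-tree leaf type with a next pointer (reusing MAX_KEYS from the earlier node sketch). Once a normal top-down search has found the leaf containing the lower bound of the range, the rest of the range is read by following next pointers only, which maps to largely sequential disk reads.

```c
#include <stddef.h>   /* NULL */

/* Hypothetical B+-tree leaf: keys (or records) plus a pointer to the
 * right neighbour at the leaf level. */
typedef struct bplus_leaf {
    int nkeys;
    int keys[MAX_KEYS];
    struct bplus_leaf *next;            /* right neighbour leaf, or NULL */
} bplus_leaf;

/* Visit every key in [lo, hi], starting from the leaf that contains lo
 * (located by a normal top-down search, not shown). */
void bplus_range(bplus_leaf *leaf, int lo, int hi, void (*visit)(int)) {
    for (; leaf != NULL; leaf = leaf->next)
        for (int i = 0; i < leaf->nkeys; i++) {
            if (leaf->keys[i] > hi)
                return;                 /* past the end of the range */
            if (leaf->keys[i] >= lo)
                visit(leaf->keys[i]);
        }
}
```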
B+-tree
(from Levitin)
BSTs – Summary
• Dictionaries store (key, value) pairs.
• BSTs provide Θ(log n) search, insert and delete operations.
• Standard BSTs can degrade to linked lists, with Θ(n) worst-case performance.
• Self-balancing: AVL trees, red-black trees.
• Change of representation: 2–3 trees, B-trees.
BSTs – Summary
• Self-balancing trees tend to be better when the dictionary is fully in memory.
  • C++ maps: implemented via red-black trees.
• Multiple elements per node are a better choice when secondary memory is involved.
  • B-trees (and their variants) are widely used in SQL databases (PostgreSQL, SQLite) and filesystems (Ext4, Reiser4).
Next week: C++ maps use BSTs. What about Python dicts: do they also use BSTs? (Spoiler: no.)