CS计算机代考程序代写 data structure algorithm COMP2521

COMP2521
Data Structures & Algorithms

Week 4.3
2-3-4 Trees

(B-Tree of order 4)

1

In this lecture

Why?

Storing one item per node results in a lot of traversals
happening, which can lead to slower lookups in trees

What?

2-3-4 Trees
Data Structure
Insertion
Pseudocode

 

2

Search Cost

When analysing search costs in BSTs:

Worst case: Length of longest path
Average case: Roughly average path length

 
In both cases, path length (tree depth) is a critical factor.

Perfectly balanced tree has maximum path length =
The 2 in the path length is what we call “branching factor”. Branching factor is

the same as the number of children.
 

Branching factor = 2: Binary Search Tree
Branching factor = 4: 2-3-4 tree
Branching factor > 2: B-tree

log (n)2

3 . 1

Branching Factor

While we usually generalise logs with different bases as being the same time
complexity, in reality it will have an impact on the real time (this is true for most

coefficients).
 

If we compare branching factors for a balanced tree with 4096 nodes:
 

log (4096) =2 12 steps
log (4096) =4 6 steps
log (4096) =8 4 steps

3 . 2

2-3-4 Tree Structure

2-3-4 trees have one of 3 kinds of nodes:

2-nodes, which have 1 item and 2 children (BST)
3-nodes, which have 2 items and 3 children
4-nodes, which have 3 items and 4 children

4 . 1

2-3-4 Tree Structure

2-3-4 tree nodes are ordered similarly to a BST. Smaller on the left, bigger on
the right. The items in the nodes are also in sorted order.

 
A balanced 2-3-4 tree has all nodes the same distance from the root.

4 . 2

2-3-4 Insertion

General approach:
Firstly, find the leaf node where item belongs (normal leaf-insert), then either:

 
Case 1: If the node of insert is not full (i.e. less than 3 items):

Insert item at this node in the correct location                    
 
Case 2: If the node is full (i.e. has 3 items):

Node split
Apply checks to parents recursively until no more splitting

 

5

2-3-4 Insertion – Case 1

Insertion into a 2-node or 3-node

6

2-3-4 Insertion – Case 2 (Node Split)

Insertion into a 4-node (requires a node split).

Middle value (or one just to the left)  is propagated to parent node
Values in original node split across original node and new node
Recursively apply to parent until no more splitting is required

7 . 1

2-3-4 Insertion – Case 2 (Node Split)

Simple View

Showing
Intermediate Step

7 . 2

2-3-4 Insertion – Case 2 (Node Split)

This example shows the recursive node splitting, include the splitting of the
root node.

7 . 3

2-3-4 Insertion – Full Example

8

2-3-4 Insertion – Pseudocode

insert(tree,item):
if tree is empty:
return new node containing item
node = Search(tree,item)
parent = parent of node
if node.order < 4: insert item into node increment node.order else: promote = node.data[1] // middle value nodeL = new node containing data[0] nodeR = new node containing data[2] delete node if item < promote: insert(nodeL, item) else: insert(nodeR, item) insert(parent, promote) while parent.order = 4: continue promote/split upwards if parent is root ∧ parent.order = 4: split root, making new root 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 search(tree,item): if tree is empty: return NULL scan tree.data to find i such: tree.data[i-1] < item ≤ tree.data[i] if item=tree.data[i]: // item found return address of tree.data[i] else: // keep looking in relevant subtree return search(tree.child[i],item) 1 2 3 4 5 6 7 8 9 9 2-3-4 Tree Data Structure typedef struct node { int order; // 2, 3 or 4 int data[3]; // items in node struct node *child[4]; // links to subtrees } node; 1 2 3 4 5 10 . 1 Finding which branch to follow // n is a pointer to a (struct node) int i; for (i = 0; i < n->order-1; i++) {
if (item <= n->data[i]) break;
}
// go to the ith subtree, unless item == n->data[i]

1
2
3
4
5
6

10 . 2

Performance analysis

Searching a 2-3-4 tree still has a complexity of O(log(n)). However, the base on
the logarithm is bigger (i.e. higher coefficient) which results in an effectively

faster search.
 

For Insertion, we need to break it down a bit more:

Search for leaf node to insert = O(log(n))
If not not full, insert into node = O(1)
If not full, node split, create two new nodes = O(1)
For worst case propagation on node split (root) = O(log(n))

Overall insertion cost = O(log(n))

11

Variations on 2-3-4 trees

2-3-4 trees are actually a special case of a B-tree. A B-tree has a particular
“order” that denotes the number of max children each node can have.

B-tree, order 2 = Binary Search Tree
B-tree, order 3 = ….
B-tree, order 4 = 2-3-4 tree
B-tree, order 5 = ….
etc

 

12