Lectures 17-18
Trees, Binary Search Trees
EECS 281: Data Structures & Algorithms
Data Structures & Algorithms
Tree Fundamentals
Context for Trees
Tree: a mathematical abstraction that
• Captures common properties of data
• Critical to the design and analysis of algorithms
[Figure: expression tree for (x + 3) * (y – 2)]
A graph consists of nodes (sometimes called vertices)
connected together by edges.
Each node can contain some data.
A tree is:
(1) a connected graph (nodes + edges) w/o cycles.
(2) a graph in which any 2 nodes are connected by
exactly one path.
(the two definitions are equivalent)
In a directed tree, we can identify child
and parent relationships between nodes.
In a binary tree, a node has at most
two children.
Trees
[Figure: example tree with nodes A–I]
Types of Trees
1. (Simple) tree
– Acyclic connected graph
– Considered undirected
2. Rooted tree
– A simple tree with a selected node (root)
– All edges are directed away from root
• Any node could be selected as root
Tree Terminology
Root: the “topmost” node in the tree
Parent: Immediate predecessor of a node
Child: Immediate successor of a node
Siblings: Children of the same parent
Ancestor: a node's parent, grandparent, etc. (closer to root)
Descendant: a node's child, grandchild, etc. (farther from root)
Internal node: a node with children
External (Leaf) node: a node without
children
[Figure: example tree with nodes A–I]
Trees are Recursive Structures
Any subtree is just
as much a “tree”
as the original!
[Figure: a tree of numbered nodes; each of its subtrees is itself a tree]
Tree Properties
Height:
height(empty) = 0
height(node) = max(height(left_child), height(right_child)) + 1
Size:
size(empty) = 0
size(node) = size(left_child) + size(right_child) + 1
Depth:
depth(empty) = 0
depth(node) = depth(parent) + 1
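These formulas translate directly into C++. A minimal sketch, assuming the pointer-based Node struct (with left and right child pointers) shown later in these slides; note the convention that an empty tree has height 0 and a leaf has height 1:

#include <algorithm>   // std::max

int height(const Node *n) {
    if (!n) return 0;                                        // height(empty) = 0
    return 1 + std::max(height(n->left), height(n->right));  // height(node)
}

int size(const Node *n) {
    if (!n) return 0;                            // size(empty) = 0
    return 1 + size(n->left) + size(n->right);   // size(node)
}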
[Figure: example tree with nodes A–I]
Tree Quiz
• Root:
• Internal nodes:
• Leaf nodes:
• Max depth:
• Label one subtree
• Is this a binary tree?
[Figure: quiz tree with root A, second-level nodes B, C, D, third-level nodes E–H, and leaves I, J, K]
Data Structures & Algorithms
Tree Fundamentals
Data Structures & Algorithms
Binary Trees
Binary Trees
• Ordered Tree: linear ordering for the
children of each node
• Binary Tree: ordered tree in which every
node has at most two children
• Multiple binary tree implementation styles
Complete Binary Tree Property
Definition: complete binary tree
• A binary tree with depth d where
– Tree depths 1, 2, …, d – 1 have the max
number of nodes possible
– All internal nodes are to the left of the external
nodes at depth d – 1
– That is, all leaves are at depth d – 1 or are in the
leftmost positions at depth d
Binary Tree Implementation
Binary tree array implementation
• Root at index 1
• Left child of node i at 2 * i
• Right child of node i at (2 * i) + 1
• Some indices may be skipped
• Can be space prohibitive for sparse trees
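A minimal sketch of the index arithmetic (1-based, index 0 unused); the helper names are illustrative, not from the lecture:

// With the root stored at index 1 of a vector:
std::size_t parentOf(std::size_t i) { return i / 2; }
std::size_t leftOf(std::size_t i)   { return 2 * i; }
std::size_t rightOf(std::size_t i)  { return 2 * i + 1; }
// A node "exists" at index i only if i is within the array
// and that slot is marked as occupied.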
Binary Tree Implementation
Complexity of array implementation
• Insert key (best-case) O(1)
• Insert key (worst-case) O(n)
• Remove key (worst-case) O(n)
• Parent O(1)
• Child O(1)
• Space (best-case) O(n)
• Space (worst-case) O(2^n)
Binary Tree Implementation
• Pointer-based binary tree implementation
template <typename KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
    Node(const KEY &k) : key{k} {}
}; // Node{}
• A node contains some information, and points to
its left child node and right child node
• Efficient for moving down a tree from parent to
child
Binary Tree Implementation
Complexity of pointer implementation
• Insert key (best-case) O(1)
• Insert key (worst-case) O(n)
• Remove key (worst-case) O(n)
• Parent O(n)
• Child O(1)
• Space (best-case) O(n)
• Space (worst-case) O(n)
Binary Tree++
Another way to do it (not common)
template <typename KEY>
struct Node {
    KEY key;
    Node *parent, *left, *right;
}; // Node{}
• If the node is the root, then parent is nullptr
• If the node is external, then left and right are
nullptr
Translating General Trees
into Binary Trees
T: General tree
T’: Binary tree
Intuition:
– Take set of siblings {v1,v2,…,vk} in T, that are
children of v
– v1 becomes left child of v in T’
– v2 … vk become chain of right children of v1, in
T’
– Recurse from v2
Left: new “generation”; Right: sibling
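A hedged sketch of this translation (often called the left-child/right-sibling representation); GenNode and BinNode are hypothetical types introduced only for illustration:

#include <vector>

struct GenNode {                      // node of the general tree T
    int key;
    std::vector<GenNode *> children;
};

struct BinNode {                      // node of the binary tree T'
    int key;
    BinNode *left = nullptr;          // first child in T
    BinNode *right = nullptr;         // next sibling in T
    explicit BinNode(int k) : key{k} {}
};

BinNode *to_binary(const GenNode *v) {
    if (!v) return nullptr;
    BinNode *b = new BinNode(v->key);
    BinNode **link = &b->left;        // v1 hangs off v's left pointer
    for (const GenNode *c : v->children) {
        *link = to_binary(c);         // recurse into each child
        link = &(*link)->right;       // v2 ... vk chain off the right pointers
    }
    return b;
}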
Tree Quiz
[Figure: a general tree with nodes A–K and its left-child/right-sibling binary-tree translation]
Data Structures & Algorithms
Binary Trees
Data Structures & Algorithms
Tree Traversals
Tree Traversal
Systematic method of processing every node in a tree
• Preorder
1. Visit node
2. Recursively visit left subtree
3. Recursively visit right subtree
• Inorder
1. Recursively visit left subtree
2. Visit node
3. Recursively visit right subtree
Tree Traversal
Systematic method of processing every node in a tree
• Postorder
1. Recursively visit left subtree
2. Recursively visit right subtree
3. Visit node
• Level order
– Visit nodes in order of increasing depth in tree
Recursive Implementations
template <typename KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
}; // Node{}

void preorder(Node *p) {
    if (!p) return;
    visit(p->key);
    preorder(p->left);
    preorder(p->right);
} // preorder()

void inorder(Node *p) {
    if (!p) return;
    inorder(p->left);
    visit(p->key);
    inorder(p->right);
} // inorder()

void postorder(Node *p) {
    if (!p) return;
    postorder(p->left);
    postorder(p->right);
    visit(p->key);
} // postorder()
Summary of Tree Traversals
Tree traversal: Systematically process all
nodes in a tree
• Preorder
• Inorder
• Postorder
• Level order (breadth-first traversal)
Preorder, inorder, and postorder are depth-first traversals; level order is breadth-first
Exercise
In what order are nodes visited?
– Preorder
– Inorder
– Postorder
– Level order
[Figure: a small binary tree with keys 0–4]
Applied Tree Traversals
Draw the binary tree from traversals
• Preorder: 7 3 6 9 8 13 27
• Inorder: 3 6 7 8 9 13 27
An Iterative Traversal
Write a function to do a level order traversal,
printing the key at each node
void levelorder(Node *p) {
} // levelorder()
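One possible sketch (not the official solution), using std::queue and assuming the Node struct from the earlier slides with a streamable key type:

#include <iostream>
#include <queue>

void levelorder(Node *p) {
    std::queue<Node *> q;
    if (p) q.push(p);
    while (!q.empty()) {
        Node *cur = q.front();
        q.pop();
        std::cout << cur->key << ' ';      // visit in order of increasing depth
        if (cur->left)  q.push(cur->left);
        if (cur->right) q.push(cur->right);
    }
    std::cout << '\n';
} // levelorder()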
Data Structures & Algorithms
Tree Traversals
Search
• Retrieval of a particular piece of information from
large volumes of previously stored data
• Purpose is typically to access information within
the item (not just the key)
• Recall that arrays and linked lists are worst-case
O(n) for either searching or inserting
• Even a hash table has worst-case O(n)
Need a data structure with optimal efficiency
for searching and inserting.
What if order is important?
Symbol Table: ADT
• insert a new item
• search for an item (items) with a given key
• remove an item with a specified key
• sort the symbol table
• select the item with the kth largest key
• join two symbol tables
Also may want construct, test if empty, destroy,
copy…
Data Structures & Algorithms
Binary Search Trees
Binary Search Tree
• The keys in a binary search tree satisfy
the Binary Search Tree Property
– The key of any node is:
> keys of all nodes in its left subtree and
≤ keys of all nodes in its right subtree
• Essential property of BST is that insert()
is as easy to implement as search()
Binary Search Tree Property
The key of any node is:
> keys of all nodes in its left subtree and
≤ keys of all nodes in its right subtree
Exercise
Write the output for inorder,
preorder and post-order
traversals of this BST
void inorder(Node *x) {
    if (!x) return;
    inorder(x->left);
    print(x->key);
    inorder(x->right);
} // inorder()

void preorder(Node *x) {
    if (!x) return;
    print(x->key);
    preorder(x->left);
    preorder(x->right);
} // preorder()

void postorder(Node *x) {
    if (!x) return;
    postorder(x->left);
    postorder(x->right);
    print(x->key);
} // postorder()
inorder: 2, 5, 5, 6, 7, 8
preorder: 6, 5, 2, 5, 7, 8
postorder: 2, 5, 5, 8, 7, 6
Sort: Binary Search Tree
Can you think of an easy method of
sorting using a binary search tree?
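One easy answer ("treesort"), sketched here with the tree_insert() and inorder() routines that appear elsewhere in these slides (the wrapper name tree_sort is illustrative, not from the lecture): insert every key, then do an inorder traversal.

#include <vector>

void tree_sort(const std::vector<KEY> &keys) {
    Node *root = nullptr;
    for (const KEY &k : keys)
        tree_insert(root, k);   // O(h) per insert
    inorder(root);              // visits the keys in nondecreasing order
    // (a real implementation would also free the nodes afterward)
}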
Search
• How can we find a key in a binary search tree?
// return a pointer to node with key k if
// one exists; otherwise, return nullptr
Node *tree_search(Node *x, KEY k);
• BST Property – the key of any node is:
– > keys of all nodes in its left subtree
– ≤ keys of all nodes in its right subtree
• What are the average- and worst-case complexities?
Search
// return a pointer to node with key k if
// one exists; otherwise, return nullptr
Node *tree_search(Node *x, KEY k) {
    while (x != nullptr && k != x->key) {
        if (k < x->key)
            x = x->left;
        else
            x = x->right;
    } // while
    return x;
} // tree_search()
Search
// return a pointer to node with key k if
// one exists; otherwise, return nullptr
Node *tree_search(Node *x, KEY k) {
    if (x == nullptr || x->key == k)
        return x;
    if (k < x->key)
        return tree_search(x->left, k);
    return tree_search(x->right, k);
} // tree_search()
• To search for a key k, we trace a downward path starting
at the root
• The next node visited depends on the outcome of the
comparison of k with the key of the current node
Search Example
tree_search(root, 9);
Search
• Complexity is O(h), where h is the
(maximum) height of the tree
• Average-case complexity: O(log n)
– Balanced tree
• Worst-case complexity: O(n)
– “Stick” tree
Insert
• How do we insert a new key into the tree?
• Similar to search
• Start at the root, and trace a path
downwards, looking for a null pointer to
append the node
Insert Example
tree_insert(root, 15);
Insert with Duplicates
• For sets with no duplicates, use (<, >)
• For duplicates, we need a deterministic policy
– Choose (<=, >) or (<, >=); these slides and the STL use (<, >=)
Insert
void tree_insert(Node *&x, KEY k) {
    if (x == nullptr)
        x = new Node(k);
    else if (k < x->key)
        tree_insert(x->left, k);
    else
        tree_insert(x->right, k);
} // tree_insert()
• New node inserted at leaf
• Note the use of reference-to-pointer-to-Node
• Exercise: modify this code to set the parent pointer
Exercise
• Start with an empty tree
• Insert these keys, in this order:
12, 5, 18, 2, 9, 15, 19, 17, 13
• Draw the tree
• Write a new order to insert the same keys
which generates a worst-case tree
• How many worst-case trees are possible
for n unique values?
Complexity
• The complexity of insert (and many other
tree functions) depends on the height of
the tree
• Average-case (balanced): O(log n)
• Worst-case (unbalanced “stick”): O(n)
• Average-case:
– Random data
– Likely to be well-balanced
Exercise
• Write a function to find the Node with the
smallest key
• What are the average- and worst-case
complexities?
// Return a pointer to the Node with the min key
Node *tree_min(Node *x);
[Figure: the BST built by inserting 12, 5, 18, 2, 9, 15, 19, 17, 13]
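A possible sketch: the minimum key lives in the leftmost node, so follow left pointers until there are none. Average case O(log n) for a balanced tree, worst case O(n) for a "stick" tree.

// Return a pointer to the Node with the min key (nullptr for an empty tree)
Node *tree_min(Node *x) {
    if (x == nullptr) return nullptr;
    while (x->left != nullptr)
        x = x->left;
    return x;
} // tree_min()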
Remove
• What if we want to remove a node?
• To remove node z:
1. z has no children (trivial)
2. z has no left child
3. z has no right child
4. z has two children
• Complete algorithm is in CLRS 12.3
Remove: Easy Case #1
z has no left child: replace z by right child
Remove: Easy Case #2
z has no right child: replace z by left child
Remove: Hard Case
• z has left and right children
• Replace with a “combined” tree of both
• Key observation
– All in LHS subtree < all in RHS subtree
– Transplant smallest RHS node to root
• Called the inorder successor
• Must be some such node, since RHS is not empty
• New root might have a right child, but no left child
– Make new root’s left child the LHS subtree
Remove: Hard Case
To remove V:
1. Find the smallest node in the RHS subtree (the inorder successor)
2. Replace V with that smallest node's value
3. Remove the smallest value from the RHS subtree
Single-Function Remove 1/3
template <typename T>
void BinaryTree<T>::remove(Node *&tree, const T &val) {
    // (signature reconstructed from the recursive calls below; the original
    //  declaration was lost in extraction)
    Node *nodeToDelete = tree;
    Node *inorderSuccessor;

    // Recursively find the node containing the value to remove
    if (tree == nullptr)
        return;
    else if (val < tree->value)
        remove(tree->left, val);
    else if (tree->value < val)
        remove(tree->right, val);
    else {
Single-Function Remove 2/3
        // Check for simple cases where at least one subtree is empty
        if (tree->left == nullptr) {
            tree = tree->right;
            delete nodeToDelete;
        } // if
        else if (tree->right == nullptr) {
            tree = tree->left;
            delete nodeToDelete;
        } // else if
Single-Function Remove 3/3
        else {
            // Node to delete has both left and right subtrees
            inorderSuccessor = tree->right;

            while (inorderSuccessor->left != nullptr)
                inorderSuccessor = inorderSuccessor->left;

            // Replace value with the inorder successor's value
            nodeToDelete->value = inorderSuccessor->value;
            // Remove the inorder successor from right subtree
            remove(tree->right, inorderSuccessor->value);
        } // else
    } // else
} // BinaryTree::remove()
Summary: Binary Search Trees
• Each node points to two children (left,
right), and possibly a parent
• All nodes are ordered: left < root ≤ right
• Modification of nodes
– External is easy
– Internal is more complicated
• In general, operations on BSTs are:
– O(log n) average-case
– O(n) worst-case
Data Structures & Algorithms
Binary Search Trees
Data Structures & Algorithms
AVL Trees
AVL Tree
• Self-balancing Binary Search Tree
• Named for Adelson-Velsky and Landis
• Start with a BST
• Add the Height Balance Property
– For every internal node v of T, the heights of
the children of v differ by at most 1
– Use rotations to correct imbalance
• Worst-case search/insert is now O(log n)
Tree Height
• Measured upward from the leaf nodes, all of
which have height 1
• Independent from depth
• Recursive formula for a recursive data
structure
– height(empty) = 0;
– height(node) = max(height(children)) + 1;
Is this an AVL tree?
Height Balance Property: For every internal node
v of T, the heights of the children of v differ by at
most 1.
[Figure: four small trees labeled Tree 0–Tree 3; one violates the height balance property (✘) and three satisfy it (✔)]
Is this an AVL tree?
Height Balance Property: For every internal node
v of T, the heights of the children of v differ by at
most 1.
[Figure: three larger trees labeled Tree 4–Tree 6; one violates the property (✘) and two satisfy it (✔)]
AVL Tree: Proof Setup
• h: height of tree
• n(h): minimum number of nodes in AVL tree of height h
• n(0) = 0 (Empty tree)
• n(1) = 1 (Root-only tree)
• For h > 1, an AVL tree of height h contains:
– Root Node
– AVL Subtree of height h – 1
– AVL Subtree of height h – 2
• Thus for h > 1, n(h) = 1 + n(h – 1) + n(h – 2)
Proof: Height Balance Property
• h: height of tree
• n(h): minimum number of nodes in a tree of height h
• n(h) = 1 + n(h – 1) + n(h – 2)
• Knowing n(h – 1) > n(h – 2) and n(h) > 2 n(h – 2),
then by induction, n(h) > 2^i n(h – 2i)
• Closed-form solution: n(h) > 2^(h/2 – 1)
• Taking logarithms: h < 2 log n(h) + 2
• Thus the height of the tree is O(log n)

AVL Tree Algorithms
• Search is the same as a BST
• Sort (same as BST)
– Insert nodes one at a time
• What is worst-case complexity now?
– Perform an inorder traversal
• Still O(n) for this step

AVL Tree Insert
• The basic idea:
1. Insert like a BST
2. Rearrange tree to balance height
• Each node records its height
• Can compute a node's balance factor
– balance(n) = height(n->left) – height(n->right)
• A node that is AVL-balanced:
– balance(n) = 0: both subtrees equal
– balance(n) = +1: left taller by one
– balance(n) = –1: right taller by one
• |balance(n)| > 1: node is out of balance
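A minimal sketch of how these values might be computed, assuming each Node also carries an int height field that is kept up to date (the height field is an assumption, not part of the Node struct shown earlier); unlike the recursive sketch from the Tree Properties slide, height() here just reads the cached field:

int height(const Node *n) {
    return n ? n->height : 0;                      // empty subtree has height 0
}

int balance(const Node *n) {
    return height(n->left) - height(n->right);     // +: left taller, -: right taller
}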
Balance Factor Example
• What is the height for each node?
height(node) = max(height(children)) + 1;
• What is the balance factor?
balance(n) = height(n->left) – height(n->right)
[Figure: two example trees annotated with each node's height and balance factor]
Balance Factor Exercise
Label the balance factor on each node
bal(n) = height(n->left) – height(n->right)
Insert (begins like BST)
tree_insert(root, 14);
Unbalanced!
Rotations
• We use rotations to rebalance the binary tree
– Swap a parent and one of its children
– BUT preserve the BST ordering property
• Rotation is a local change involving only
three pointers and two nodes
[Figure: an unbalanced tree before, and the rebalanced tree after, a rotation]
Rotations
• We use rotations to rebalance the binary
tree
– Interchange the role of a parent and one of its
children in a tree…
– Preserve the BST ordering among the keys in
the nodes
• The second part is tricky
– Right rotation: copy the right pointer of the left
child to be the left pointer of the old parent
– Left rotation: copy the left pointer of the right
child to be the right pointer of the old parent
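A hedged C++ sketch of the two rotations on the pointer-based Node, assuming the subtree root is passed by reference-to-pointer so the parent's link gets updated (helper names are illustrative; in an AVL tree the stored heights of the two affected nodes would also need to be recomputed afterward):

void rotateR(Node *&p) {          // right rotation: left child becomes the root
    Node *lc = p->left;
    p->left = lc->right;          // left child's right subtree becomes p's left
    lc->right = p;
    p = lc;
} // rotateR()

void rotateL(Node *&p) {          // left rotation: right child becomes the root
    Node *rc = p->right;
    p->right = rc->left;          // right child's left subtree becomes p's right
    rc->left = p;
    p = rc;
} // rotateL()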
Rotations
Rotate Right: RR(P) Rotate Left: RL(P)
[Figure: before/after diagrams for rotate right RR(P) and rotate left RL(P), showing the parent P, the promoted left/right child (LC/RC), and the subtrees LST, C, and RST]
Rotation Exercise
Use rotations to balance these trees
[Figure: two unbalanced three-node trees containing A, B, C]
Solution
• First tree is easy:
rotate-left(A)
• single rotation
• Second is harder:
rotate-right(C)
rotate-left(A)
• double rotation
[Figure: the before/after trees for the single rotation RL(A) and for the double rotation RR(C) followed by RL(A)]
Insert
Four Cases
1. Single left rotation
• RL(a)
2. Single right rotation
• RR(a)
3. Double rotation right-left
a. RR(c)
b. RL(a)
4. Double rotation left-right
a. RL(a)
b. RR(c)
Single Rotations
[Figure: the two single-rotation cases over nodes a, b, c and subtrees T0–T3: RL(a) for a right-leaning chain, RR(c) for a left-leaning chain]
Double Rotations
[Figure: a right-left imbalance over nodes a, b, c and subtrees T0–T3; a double rotation is needed: RR(c) followed by RL(a)]
Double Rotations
[Figure: the mirrored case over nodes a, b, c and subtrees T0–T3; again a double rotation is needed]
http://upload.wikimedia.org/wikipedia/commons/c/c4/Tree_Rebalancing.gif
Checking and Balancing
• Outermost if:
Q: Is node out of balance?
A: > +1: left too big
   < –1: right too big
• Inner ifs:
Q: Do we need a double rotation?
A: Only if signs disagree

Algorithm checkAndBalance(Node *n)
  if balance(n) > +1
    if balance(n->left) < 0
      rotateL(n->left)
    rotateR(n)
  else if balance(n) < -1
    if balance(n->right) > 0
      rotateR(n->right)
    rotateL(n)
Checking and Balancing
• As insert finishes, after every recursive
call, update height of current node, then
call checkAndBalance() on every node
along the insertion path
• What is the time complexity of this?
• How many “fixes” are needed after insert?
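A hedged sketch of how an insert might drive this, assuming a height field in Node plus the height()/balance()/rotateL()/rotateR() helpers sketched earlier; checkAndBalance here takes a reference-to-pointer so rotations can relink the parent, and none of these names are guaranteed to match the project code:

#include <algorithm>   // std::max

void updateHeight(Node *n) {
    if (!n) return;
    n->height = 1 + std::max(height(n->left), height(n->right));
}

void checkAndBalance(Node *&n) {
    if (balance(n) > +1) {                 // left side too tall
        if (balance(n->left) < 0)          // signs disagree: double rotation
            rotateL(n->left);
        rotateR(n);
        updateHeight(n->left);             // rotations leave stale heights,
        updateHeight(n->right);            // so recompute bottom-up
        updateHeight(n);
    } else if (balance(n) < -1) {          // right side too tall
        if (balance(n->right) > 0)
            rotateR(n->right);
        rotateL(n);
        updateHeight(n->left);
        updateHeight(n->right);
        updateHeight(n);
    }
} // checkAndBalance()

void avl_insert(Node *&n, const KEY &k) {
    if (n == nullptr) {
        n = new Node(k);
        n->height = 1;                     // new leaf
        return;
    }
    if (k < n->key) avl_insert(n->left, k);
    else            avl_insert(n->right, k);
    updateHeight(n);
    checkAndBalance(n);                    // at most one fix per insert
} // avl_insert()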
Rotation Exercise
• Insert these keys into an AVL tree,
rebalancing when necessary
• 3, 2, 1, 4, 5, 6, 7, 16, 15, 14
Rotation Exercise
Animation screen capture from visualgo.net
• Insert 3… 2… 1…
• Unbalanced: RR(3)
• Insert 4… 5…
• Unbalanced: RL(3)
• Insert 6…
• Unbalanced: RL(2)
• Insert 7…
• Unbalanced: RL(5)
• Insert 16… 15…
• Unbalanced: RR(16), RL(7)
• Insert 14…
• Unbalanced: RR(15), RL(6)
AVL Tree Remove
1. Remove like a BST
– Key observation: All keys in LHS ≤ all keys in RHS
– Rearrange RHS so that its smallest node is its root
• Must be some such node, since RHS is not empty
• New RHS root has a right child, but no left child
– Make the RHS root’s left child the LHS root
2. Rearrange tree to balance height
– Travel up the tree from the parent of removed node
– At every unbalanced node encountered, rotate as
needed
– This restructuring may unbalance multiple ancestors,
so continue check and rebalance up to the root
– Note that we wanted < on the left, but rebalancing may 95 • Remove as in a binary search tree • Rebalance if an imbalance has been created 44 17 7832 50 8848 62 54 Before: balanced After: unbalanced 96 Rebalance Required! 44 17 7850 8848 62 54 Remove 32 AVL Tree Remove Rebalancing after a Remove • Travel up the tree from w, the parent of the removed node • At every unbalanced node encountered, rotate as needed • This restructuring may unbalance multiple ancestors, so continue checking and rebalancing up to the root 97 44 17 7850 8848 62 54 62 17 78 50 88 48 44 54 w 32 RL(44) How many “fixes” are needed after remove? Rebalancing after a Remove 98 62 17 78 50 88 48 44 54 78 17 78 50 88 48 44 54 78 17 88 50 48 44 54 Copy 78 over 62 Remove 78 from RHS Want to remove 62 Find inorder successor Rebalancing after a Remove 99 78 17 88 50 48 44 54 78 44 88 54 17 50 48 First: RL(44) 78: +2 Bal 44: -1 Bal Need double rotation: RL(44), RR(78) 78 17 88 50 48 44 54 Second: RR(78) Rebalancing: 1 or More? • A “fix” (single or double rotation) shortens a subtree that is too tall • When inserting, the new node is the source of any imbalance, and a single fix will counteract it and repair the entire tree • When removing, a deleted node can only create an imbalance by making subtrees shorter • If a shortened subtree was already shorter than its sibling, a fix is needed at a higher level, so multiple fixes may be required 100 Rebalancing: 1 or More? • Values 1-33 inserted into an AVL in a particular order • Resulting tree is balanced • Multiple fixes required when 4 is removed 101Animation screen capture from visualgo.net Useful Website • https://visualgo.net/en/bst • Close the help • Click on “AVL TREE” near the top • You can insert/remove several nodes at once, or one at a time, slow down or speed up the demo 102 Data Structures & Algorithms AVL Trees Summary • Binary Search Tree – Worst-case insert or search is O(n) • AVL Tree – Worst-case insert or search is O(log n) – Must guarantee height balance property • Operations – Search: O(log n) (same algorithm as BST, but faster) – Insert: O(log n) (Starts like BST, then may rebalance) – Remove: O(log n) (Starts like BST, then may rebalance) – Sort: O(n log n) to build the tree, O(n) to do inorder traversal • Rotation: – O(1) cost for single rotation, O(1) cost for double rotation 104