COMP3600/6466 – Algorithms
Abstract Data Structures:
Binary Search Tree + Heaps [CLRS 12.1-12.3, ch. 6]
Hanna Kurniawati
https://cs.anu.edu.au/courses/comp3600/ Comp_3600_6466@anu.edu.au
Abstract Data Structures
• Abstract Data Structures can be thought of as a mathematical model for data representation.
• An Abstract Data Structure consists of two components:
• A container that holds the data
• A set of operations on the data. These operations are defined based on their behavior (i.e., input-output and time (or sometimes space) complexity), rather than their exact implementation
Why bother with so many of them?
• Recall that algorithms transform input to output. This usually means the algorithm will perform certain operations on the input data, so as to transform them into the desired output
• The way these data are represented in the algorithm could often influence the efficiency of the algorithm
• Often, transforming the way the data are being represented makes solving becomes more efficient • This technique is called Representation Change
• This is particularly useful when the data is dynamic, i.e., the input might change (the input size might increase, decrease, or the content changes) during run-time
Topics
• Binary Search Tree • Heaps
• AVL Tree
• Red-black Tree
Topics
• Binary Search Tree • What is it?
• Tree walk & Querying • Insertion & Deletion
• Heaps
• AVL Tree
• Red-black Tree
What is Binary Search Tree (BST)?
• An abstract data structure that represents the data as
a binary tree with the following property:
• Each node in the tree contains the data itself, as key + satellite data, and pointers to its left child, right child, and parent node, usually denoted as left, right, p
• Key: Data used for BST operations. In BST, all keys are distinct
• Satellite data: Data carried around but not used in the implementation
• The root node is the only node with p = NIL
• Suppose𝑥isanodeoftheBST.If𝑦istheleftsub-treeof𝑥 then, 𝑦. 𝑘𝑒𝑦 ≤ 𝑥. 𝑘𝑒𝑦. If 𝑦 is the right sub-tree of 𝑥 then, 𝑦.𝑘𝑒𝑦 ≥𝑥.𝑘𝑒𝑦
What is Binary Search Tree (BST)?
• Operations: Search, Min, Max, Successor, Predecessor, Insert, Delete in 𝑂 h time, where h is the height of the tree
Topics
• Binary Search Tree üWhat is it?
• Tree walk & Querying • Insertion & Deletion
• Heaps
• AVL Tree
• Red-black Tree
Binary Search Tree (BST)
• Its property allows us to easily output all the keys in a sorted order in Θ(𝑛) via in-order tree walk
Example
In-order tree walk: 2 3 4 6 7 9 13 15 17 18 20
Proof for Θ(𝑛) time complexity •Suppose𝑇𝑛 isthetotaltimetakenbyinorder-tree-
walk
• The lower bound is straight-forward, each node is
visitedatleastonce,hence𝑇𝑛 =Ω(𝑛) •Now,weshow𝑇 𝑛 =𝑂(𝑛)
Proof for Θ(𝑛) time complexity •Suppose𝑇𝑛 isthetimetakenbyinorder-tree-walk
• The lower bound is straight-forward, each node is visitedatleastonce,hence𝑇𝑛 =Ω(𝑛)
•Now,weshow𝑇 𝑛 =𝑂(𝑛)
• Noticethat𝑇 𝑛 =𝑇 𝑘 +𝑇(𝑛−𝑘−1),where𝑘is#nodesin
the left sub-tree. We can assume 𝑇 1 = 1
• Now, let’s use substitution method, with base 𝑇 1
• Inductive hypothesis: 𝑇 𝑛 ≤ 𝑐𝑛
• Inductionstep: 𝑇 𝑛+1 =𝑇 𝑘 +𝑇 𝑛+1−𝑘−1 ≤𝑐𝑘+𝑐 𝑛+1−𝑘−1
=𝑐𝑘+𝑐 𝑛+1 −𝑐𝑘−𝑐
=𝑐 𝑛+1 −𝑐 ≤ 𝑐(𝑛 + 1)
= 1
Binary Search Tree (BST)
• Its property allows us to easily output all the keys in a sorted order in Θ(𝑛) via in-order tree walk
• What will the output be if we swap lines 2 & 3? • Pre-order tree walk
• What will the output be if we swap lines 3 & 4? • Post-order tree walk
Binary Search Tree: Search Operation
• To search for a key whose value is 𝑘:
Binary Search Tree : Min & Max
• Where’s the Min in a BST? • Where’s the Max in a BST?
Topics
• Binary Search Tree üWhat is it?
üTree walk & Querying • Insertion & Deletion
• Heaps
• AVL Tree
• Red-black Tree
Binary Search Tree: Insertion
• Suppose we want to insert a new node 𝑧, where 𝑧.𝑘𝑒𝑦 = 𝑣,𝑧.𝑙𝑒𝑓𝑡 = 𝑁𝐼𝐿,𝑧.𝑟𝑖𝑔h𝑡 = 𝑁𝐼𝐿
• Traverse the tree until the right position of 𝑧 is found • Add 𝑧 to the tree
Binary Search Tree: Insertion
Binary Search Tree: Successor &
Predecessor
• Successor of a node 𝑥 is the node with the smallest key greater than 𝑥. 𝑘𝑒𝑦 –that is, the node visited right after 𝑥 in in-order tree walk
• If 𝑥 has a non-empty right sub-tree, its successor is the minimum of its right sub-tree
• Otherwise, 𝑥’s successor is the lowest ancestor of 𝑥 whose left child is also an ancestor of 𝑥
• Predecessor of a node 𝑥 is the node with the largest key smaller than 𝑥. 𝑘𝑒𝑦 –that is, the node visited right before 𝑥 in inorder tree walk
• If 𝑥 has a non-empty left sub-tree, its predecessor is the maximum of its left sub-tree
• Otherwise, 𝑥’s predecessor is the lowest ancestor of 𝑥 whose right child is also an ancestor of 𝑥
Binary Search Tree: Deletion
• Suppose we want to delete 𝑧. There’s 3 cases:
• When 𝑧 has no children: Remove 𝑧 and modify its parent to
replace 𝑧 with NIL
• When 𝑧 has only 1 child: Elevate the child to take 𝑧’s position in the tree by modify its parent to replace 𝑧 with 𝑧’s child
Binary Search Tree: Deletion
• Suppose we want to delete 𝑧. There’s 3 cases:
• When 𝑧 has no children: Remove 𝑧 and modify its parent to
replace 𝑧 with NIL
• When 𝑧 has only 1 child: Elevate the child to take 𝑧’s position
in the tree by modify its parent to replace 𝑧 with 𝑧’s child
• When 𝑧 has 2 children, consider its successor 𝑦 (since 𝑧 has 2 children, 𝑦 must be in its right subtree and 𝑦 does not have a left child) :
• If 𝑦 is 𝑧’s right child, replace 𝑧 with 𝑦
• 𝑧’s left child becomes 𝑦’s left child, and 𝑦 replaces 𝑧’s position
• If 𝑦 is not 𝑧’s right child, first replace 𝑦 by its right child and then replace 𝑧 with 𝑦
• The requirement to find a successor cause deletion to take 𝑂(h), rather than constant
Is Deletion Commutative?
• Suppose T is a binary search tree and we want to delete two elements, a1 and a2 from T. Would the resulting tree always be the same if we delete a1 first and then a2, compared to if we delete a2 first and then a1?
• No, it’s not commutative. Example: Deleting 2 and then 1 in the tree below would result in a different tree compared
to if we delete 1 and then 2
2
14 3
Is Deletion Commutative?
3
14
2
4 3
Delete 1
3
4
2
14 3
Delete 2
4 3
Delete 2
Delete 1
Topics
• Binary Search Tree üWhat is it?
üTree walk & Querying üInsertion & Deletion
• Heaps
• AVL Tree
• Red-black Tree
Topics
üBinary Search Tree • Heaps
• AVL Tree
• Red-black Tree
Topics
üBinary Search Tree
• Heaps
• What is it?
• Heapify
• Insertion & Building • Extract/Deletion
• AVL Tree
• Red-black Tree
What is A Heap?
• A heap is a binary tree that satisfies heap property:
• A heap is a complete binary tree
• The nodes contain data similar to binary search tree:
• Each node has a key, in which node order are based on
• Each node may contain additional information
• The parent node has key greater than the keys in its children
• Complete binary tree: A perfect binary tree where at
the last level, some rightmost leaves may be missing
• Perfect binary tree: A tree where all interior nodes have 2 children and all leaves are at the same level
• Since a heap is a complete tree, it can be implemented easily with an array
• Left & right children becomes index of the array that holds the left & right child nodes
What is A Heap?
• There’s 2 types of heap: Max-heap and Min-heap
• The one we discussed in the previous slide is
Max-heap
• Throughout this class, we’ll assume the heap is Max-heap unless stated otherwise
• Min-heap is similar to Max-heap but, the parent node has key less than the keys in its children
Example [CLRS sec. 6.1]
What is A Heap?
• Main Operations:
• Heapify to ensure heap properties are satisfied
• Insert a node to an existing heap
• ExtractMax (for Min-heap, this operation is
ExtractMin) to retrieve an element from the heap
• All of the above operations are 𝑂(log 𝑛), where 𝑛 is the number of nodes in the tree
Topics
üBinary Search Tree
• Heaps üWhat is it?
• Heapify
• Insertion & Building • Extract/Deletion
• AVL Tree
• Red-black Tree
Heapify: Maintaining Heap Property
• Intuitively, heapify a node means traversing the tree from root until a suitable place (to ensure heap order) is found for the node
• Pseudo-code [CLRS sec. 6.2]
Example
Does max-heapify takes 𝑂(log 𝑛) time?
•Thetotaltime𝑇𝑛 isarecurrence
• The question is what’s the size of the sub-problem
• Since a heap is a complete binary tree, the largest imbalance, in which one sub-problem is maximised (in proportion to the total size) happens when the last level is half full. When this happens, we have:
𝑛 = 1 + SizeLeftSubTree + SizeRightSubTree SizeLeftSubTree = ∑$ 2! = 2$%& − 1
SizeRightSubTree = ∑$’& 2! = 2$ − 1 !”#
Where h is the height of the tree
!”#
Does max-heapify takes 𝑂(log 𝑛) time?
𝑛 = 1 + SizeLeftSubTree + SizeRightSubTree 𝑛 = 1 + 2$%& − 1 + 2$ − 1
𝑛 = 2$(2 + 1) − 1
2$ = 𝑛 + 1 3
SizeLeftSubTree = ∑$ 2! = 2$%& − 1 !”#
= 2 (%& − 1 ≤ *( ))
Proof that max-heapify takes 𝑂(log 𝑛) • Therefore the total time for max-heapify is
𝑇𝑛 ≤𝑇 2𝑛 +𝐶 3
•Solvingtherecurrencegivesus𝑇𝑛 =𝑂(log𝑛)
• We can also specify the time complexity in terms of thetreeheight,h,inwhichcase𝑇𝑛 =𝑂(h)
Topics
üBinary Search Tree
• Heaps üWhat is it?
üHeapify
• Insertion & Building • Extract/Deletion
• Applications
• AVL Tree
• Red-black Tree
Inserting a node to a heap
• Can also be used to build a heap when data is received dynamically
• Intuitively, add a new node at the bottom right most heap and then find a suitable position for the new node
• Pseudo-code [CLRS pp.164]
Example
Time Complexity?
• In the worst case, line 4—6 of heap- increase-key is called as many as the height of the tree. Therefore, the time complexity is 𝑂 h = 𝑂 log 𝑛
Building a heap
• We can insert elements to the heap one by one using the insertion method we’ve discussed
• Time complexity: 𝑂(𝑛 log 𝑛)
• If we have all the elements a priori, can we do better?
• Yes:
Example
Time complexity for build-max-heap
• We’ll compute the total time as the number of times max- heapify is called multiply by the cost of each max-heapify: