08_Heaps_and_Heapsort
Lecture 8
Heaps, Priority Queues, and Heapsort
EECS 281: Data Structures & Algorithms
Data Structures & Algorithms
Introduction to Trees
A graph consists of nodes (sometimes called vertices)
connected together by edges.
Each node can contain some data.
A tree is:
(1) a connected graph (nodes + edges) w/o cycles.
(2) a graph where any 2 nodes are connected by
a unique shortest path.
(the two definitions are equivalent)
In a directed tree, we can identify child
and parent relationships between nodes.
In a binary tree, a node has at most
two children.
Trees
3
A
I FD E
H G
B C
Tree Terminology
Root: the “topmost” node in the tree
Parent: Immediate predecessor of a node
Child: Node where current node is parent
Ancestor: parent of a parent (closer to root)
Descendent: child of a child (further from root)
Internal node: a node with children
Leaf node: a node without children
4
A
I FD E
H G
B C
Data Representation: Nodes and Pointers
• A node contains some information, and
points to its left child node and right child node
– Can also include a pointer for parent node
– Can include pointers to 3 children, or a vector of pointers
• Efficient for moving down a tree from parent to child
1 template
2 struct Node { // binary tree node
3 Node *left; // pointer to left child
4 Node *right; // pointer to right child
5 Item item; // data or KEY
6 }; // Node{}
5
Trees are Recursive Structures
Any subtree is just
as much a “tree”
as the original!
6
2
3 07 1
4 3
6 3
Tree Properties
Height:
height(empty) = 0
height(node) = max(height(left_child), height(right_child)) + 1
Size:
size(empty) = 0
size(node) = size(left_child) + size(right_child) + 1
Depth:
depth(empty) = 0
depth(node) = depth(parent) + 1
7
A
I FD E
H G
B C
Data Structures & Algorithms
Introduction to Trees
Data Structures & Algorithms
Heaps Fundamentals
Complete (Binary) Trees
Binary tree: every node has 2 or fewer children
Complete binary tree: every level, except possibly the last,
is completely filled, and all nodes are as far left as possible
A
B C
H G
I FD E
complete NOT complete NOT complete
10
A
B C
H G
FD E
A
B C
H G
I FD E
Data Representation: Complete Binary Trees
• A complete binary tree can be stored efficiently in a growable array
(i.e. vector) by indexing nodes according to level-ordering.
– The completeness ensures no gaps in the array.
– We index starting at 1 because it makes some math work out better…
– To gracefully achieve 1-based indexing with a 0-indexed vector, you can
just add a dummy element at position 0 and ignore it.
11
Notice: root is stored at position [1], not [0]
A
I FD E
H G
B C
1
2 3
4 5 6 7
8 9
B C D E I F GA
[5] [6] [9][1] [2] [3] [4] [7]
H
[8]
Inductive Learning: Parent/Child Math
• Given the index of a node in the tree, find formulas for:
1. The index of its parent
2. The index of its left child
3. The index of its right child
4. Whether that index represents a leaf node?
12
Notice: root is stored at position [1], not [0]
A
I FD E
H G
B C
1
2 3
4 5 6 7
8 9
B C D E I F GA
[5] [6] [9][1] [2] [3] [4] [7]
H
[8]
Heap-Ordered Trees, Heaps
Definition: A tree is (max) heap-ordered if each node’s
priority is not greater than the priority of the node’s parent
Definition: A heap is a set of nodes with keys arranged
in a complete heap-ordered tree, represented as an array
Property: No node in a heap has a key larger than the
root’s key
13
81
1 59 7
2 4
18 14
1
2 3
4 5 6 7
8 9
Heaps
Notice: root is stored at position [1], not [0]
A heap has two crucial properties (representation invariants):
1. Completeness
2. Heap-ordering
We’ll leverage these two properties to create an efficient
priority queue and an efficient sorting algorithm using a heap!
heapsize = 9
14
81
1 59 7
2 4
18 14
1
2 3
4 5 6 7
8 9
18 14 9 7 1 5 481
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
Heaps – Executive Summary
“Max Heap”: heap with largest element being the “most extreme”
“Min Heap”: heap with smallest element being the “most extreme”
*Do not confuse with binary SEARCH trees
Loose defini)on: data structure that gives easy access
to the most extreme element, e.g., maximum or minimum
Heaps use complete (binary) trees* as the underlying
structure, but are o
6 swap(heap[k], heap[j]);
7 k = j; // move down
8 } // while
9 } // fixDown()
1) Pass index k of array element with decreased priority
2) Exchange the key in the given node with the highest
priority key among the node’s children, moving down until:
a. the node has no children (leaf node), or
b. the node has no children with a higher key 22
Data Structures & Algorithms
Modifying Heaps
Data Structures & Algorithms
Priority Queue Implementations
Priority Queue (PQ)
Definition: a priority queue is a data structure that
supports three basic operations:
• insertion of a new item push()
• inspection of the highest priority item top()
• removal of the item with the highest priority pop()
Priority queues are often implemented using heaps because
insertion/removal operations have the same time complexity
PQ essential for upcoming algorithms,
e.g., shortest-path, Dijkstra’s algorithm
PQ useful for past and current (this lecture) algorithms,
e.g., heapsort, sorting in reverse order
25
PQ – Insertion
26
pq.push(100)
1 59
2 4
14
1
2 3
4 5 6 7
8 9
14 9 1 5 4
[5] [6] [9][1] [2] [3] [4] [7]
2
[8] [10]
10
100
7
18
81
7 1001881
PQ – Insertion
1 void push(Item newItem) {
2 heap[++heapsize] = newItem;
3 fixUp(heap, heapsize);
4 } // push()
1) Insert newItem into bottom of tree/heap, i.e., last
element
2) newItem “bubbles up” tree swapping with parent while
parent’s key is less (use greater for min-heap)
Insertion operation must maintain heap invariants,
– “max (or min) element must be root”
27
PQ – Deletion
28
pq.pop()
1 5
2
14
1
2 3
4 5 6 7
8 9
7
18
81
14 1 5
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
71881
4
9
49
PQ – Deletion
1 void pop() {
2 heap[1] = heap[heapsize–];
3 fixDown(heap, heapsize, 1);
4 } // pop()
Deletion operation can only remove root and must
maintain heap invariants
– “max (or min) element must be root”
1) Remove root element – results in disjoint heap
2) Move the last element into the root position
3) New root “sinks” down the tree swapping with highest
priority child whose key is greater (less for min-heap)
29
http://xkcd.com/835/
“Not only is that terrible in general, but you just KNOW
Billy’s going to open the root present first, and then
everyone will have to wait while the heap is rebuilt.” 30
PQ – Summary
• Priority queue is an ADT
– Supports insertion, inspection of top item, and removal
• Unordered array PQ
– O(1) insertion of an item
– O(n) inspection of top item
– O(n) removal of top item, or O(1)
• Sorted array PQ
– O(n) insertion of an item
– O(1) inspection of top item
– O(1) removal of top item
• Heap
– Efficient O(log n) insertion of an item using fixUp()
– O(1) inspection of top item
– Efficient O(log n) removal of top item using fixDown()
– Must maintain heap property
31
Data Structures & Algorithms
Priority Queue Implementations
Data Structures & Algorithms
Heapify & Heapsort
Building a Heap: Heapify
• You could start with an empty vector, and add
elements one at a time, keeping the heap
property valid after each push()
– Insert n elements, O(log n) for each push() produces
O(n log n) time
– Requires an extra vector, or O(n) extra memory
• Sort in reverse order: O(n log n); O(1) memory
• Instead, modify the given array: proceed from
bottom to top or top to bottom, using fixDown()
or fixUp()
– 4 possibilities; two work and two don’t
– Those that work have different complexities
34
81
9 54 7
2 1
18 14
1
2 3
4 5 6 7
8 9
1
9 54 7
2 81
18 14
1
2 3
4 5 6 7
8 9
18
9 54 1
2 81
7 14
1
2 3
4 5 6 7
8 9
2
9 54 1
18 81
7 14
1
2 3
4 5 6 7
8 9
14
2 54 1
18 81
7 9
1
2 3
4 5 6 7
8 9
Sorting With a Heap
1) Swap: 81 ↔ 1
2) Fix-down [1]…[8]
3) Swap: 18 ↔ 2
4) Fix-down [1]…[7]
5) Swap: 14 ↔ 5
6) Fix-down [1]…[6]
7) Swap: 9 ↔ 2
8) Fix-down [1]…[5]
38
18 14 4 7 9 5 181
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
18 14 4 7 9 5 811
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
7 14 4 1 9 5 8118
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
7 14 4 1 9 5 812
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
7 9 4 1 2 5 8114
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
7 9 4 1 2 14 815
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
7 5 4 1 2 14 819
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
7 5 4 1 9 14 812
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
4 5 2 1 9 14 817
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
5
2 144 1
18 81
7 9
1
2 3
4 5 6 7
8 9
9
2 144 1
18 81
7 5
1
2 3
4 5 6 7
8 9
2
9 144 1
18 81
7 5
1
2 3
4 5 6 7
8 9
7
9 142 1
18 81
4 5
1
2 3
4 5 6 7
8 9
7
9 142 1
18 81
4 5
1
2 3
4 5 6 7
8 9
1
9 142 7
18 81
4 5
1
2 3
4 5 6 7
8 9
5
9 142 7
18 81
4 1
1
2 3
4 5 6 7
8 9
2
9 145 7
18 81
4 1
1
2 3
4 5 6 7
8 9
4
9 145 7
18 81
2 1
1
2 3
4 5 6 7
8 9
1
9 145 7
18 81
2 4
1
2 3
4 5 6 7
8 9
1
9 145 7
18 81
2 4
1
2 3
4 5 6 7
8 9
4 5 2 1 9 14 817
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
4 5 2 7 9 14 811
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
4 1 2 7 9 14 815
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
4 1 5 7 9 14 812
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
2 1 5 7 9 14 814
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
2 4 5 7 9 14 811
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
2 4 5 7 9 14 811
[5] [6] [9][1] [2] [3] [4] [7]
18
[8]
9) Swap: 7 ↔ 1
10) Fix-down [1]…[4]
11) Swap: 18 ↔ 2
12) Fix-down [1]…[3]
13) Swap: 14 ↔ 5
14) Fix-down [1]…[2]
39
Sorting With a Heap
Heapsort
Intuition: repeatedly relocate the highest-priority element
from PQ to the back
Easily implemented as an array; entire sort done in place
1 void heapsort(Item heap[], int n) {
2 heapify(heap, n);
3 for (int i = n; i >= 2; –i) {
4 swap(heap[i], heap[1]);
5 fixDown(heap, i – 1, 1);
6 } // heapsort()
7 } // heapsort()
Part 1: Transform unsorted array into heap (heapify)
Part 2: Remove the highest priority item from heap,
add it to sorted sequence, and fix the heap, repeat…
Root is
index 1
40
18 14 4 7 9 5 181
[5] [6] [9][1] [2] [3] [4] [7]
2
[8]
Heapsort Summary
• Take the given n elements, convert into a heap
– Bottom-up fixDown(), heapify takes O(n) time
• Remove elements one at a time, filling original
array from back to front
– Swap element at top of heap with last unsorted: O(1)
– fixDown() to bottom: each takes O(log n) time, n of
them
• Total runtime: O(n log n)
– requires no additional space
41
Data Structures & Algorithms
Heapify & Heapsort