COMP20007 Design of Algorithms
Sorting – Part 3
Daniel Beck
Lecture 13
Semester 1, 2020
Priority Queues
• A set of elements, each one with a priority key.
• Inject: put a new element in the queue.
• Eject: find the element with the highest priority and remove it from the queue.
• Used as part of an algorithm (e.g. Dijkstra’s algorithm)…
• …or on its own (e.g. OS job scheduling).
Sorting using a Priority Queue
• Different implementations result in different sorting algorithms.
• If using an unsorted array/list, we obtain Selection Sort.
• If using a heap, we obtain Heapsort.
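This connection can be sketched in Python using the standard-library heapq module, which provides a binary min-heap (so the “highest priority” element is the smallest; pq_sort is a hypothetical name for illustration):

```python
import heapq

def pq_sort(items):
    """Sort by injecting every element into a priority queue,
    then repeatedly ejecting the highest-priority element.
    heapq is a min-heap, so ejection produces ascending order."""
    pq = []
    for x in items:                 # Inject
        heapq.heappush(pq, x)
    return [heapq.heappop(pq) for _ in range(len(pq))]   # Eject until empty

print(pq_sort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```

Replacing the heap with an unsorted list (scanning for the extreme element on every ejection) turns the same loop into Selection Sort.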
The Heap
It’s a tree with a set of properties:
• Binary (at most two children allowed per node)
• Balanced (the difference in depth between any two leaf nodes is never more than 1)
• Complete (all levels are full except for the last, where only rightmost leaves can be missing) (implies Balanced)
• Parental dominance (the key of a parent node is never smaller than the keys of its children)
Heaps and Non-Heaps
Which of these are heaps?
[Figure: six example trees with numeric keys; check each against the heap properties above.]
Heaps as Arrays
We can utilise the completeness of the tree and place its elements in level-order in an array A, using 1-based indexing (index 0 is reserved for a sentinel).
Note that the children of node i will be nodes 2i and 2i + 1.
[Figure: a heap whose level-order traversal is X, T, O, G, S, M, N, A, E, R, A, I, stored in A[1..12]:]

A:     –  X  T  O  G  S  M  N  A  E  R  A  I
index: 0  1  2  3  4  5  6  7  8  9  10 11 12
Heaps as Arrays
This way, the heap condition is simple:
∀i ∈ {2, …, n}, we must have
A[i] ≤ A[⌊i/2⌋].
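The array form of the heap condition is easy to check directly. A minimal sketch, assuming 1-based storage with A[0] unused (is_heap is an illustrative name):

```python
def is_heap(A, n):
    """Check A[i] <= A[i // 2] for every i in 2..n,
    where A[1..n] stores the heap and A[0] is unused."""
    return all(A[i] <= A[i // 2] for i in range(2, n + 1))

# The example heap from the slide, stored from index 1:
A = [None, "X", "T", "O", "G", "S", "M", "N", "A", "E", "R", "A", "I"]
print(is_heap(A, 12))  # True
```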
Heapsort
function Heapsort(A[1..n])          ▷ Assume A[0] as a sentinel
    Heapify(A[1..n])
    for i ← n downto 2 do
        Eject(A[1..i])
Heapify
function BottomUpHeapify(A[1..n])
    for i ← ⌊n/2⌋ downto 1 do
        k ← i
        v ← A[k]
        heap ← False
        while not heap and 2 × k ≤ n do
            j ← 2 × k
            if j < n then                        ▷ two children
                if A[j] < A[j + 1] then j ← j + 1
            if v ≥ A[j] then heap ← True
            else A[k] ← A[j]; k ← j
        A[k] ← v
Analysis of Bottom-Up Heapify
Assume the heap is a full binary tree: n = 2h+1 − 1.
Here is an upper bound on the number of “down-sifts” needed
∑_{i=0}^{h−1} ∑_{nodes at level i} 2(h − i) = ∑_{i=0}^{h−1} 2(h − i) · 2^i = 2(n − log₂(n + 1)),
where 2^i is the number of nodes at level i and 2(h − i) bounds the comparisons needed to sift one such node down.
The last equation can be proved by mathematical induction.
Note that 2(n − log2(n + 1)) ∈ Θ(n), hence we have a linear-time algorithm for heap creation.
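Besides induction, the closed form can be checked directly; a sketch, substituting k = h − i and using the truncated series ∑_{k=1}^{h} k/2^k = 2 − (h+2)/2^h:

```latex
\begin{align*}
\sum_{i=0}^{h-1} 2(h-i)\,2^{i}
  &= 2\sum_{k=1}^{h} k\,2^{\,h-k}                    && (\text{substitute } k = h-i)\\
  &= 2^{\,h+1}\sum_{k=1}^{h} \frac{k}{2^{k}}
   = 2^{\,h+1}\!\left(2 - \frac{h+2}{2^{h}}\right)\\
  &= 2^{\,h+2} - 2(h+2)
   = 2\bigl(2^{\,h+1} - 1\bigr) - 2(h+1)\\
  &= 2\bigl(n - \log_2(n+1)\bigr)
   && (n = 2^{\,h+1}-1,\ \log_2(n+1) = h+1)
\end{align*}
```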
Eject
function Eject(A[1..i])
    Swap(A[i], A[1])
    k ← 1
    v ← A[k]
    heap ← False
    while not heap and 2 × k ≤ i − 1 do
        j ← 2 × k
        if j < i − 1 then                       ▷ two children
            if A[j] < A[j + 1] then j ← j + 1
        if v ≥ A[j] then heap ← True
        else A[k] ← A[j]; k ← j
    A[k] ← v
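The pseudocode above translates almost line for line into Python; a sketch, simulating 1-based indexing with an unused slot at index 0 (sift_down is an illustrative name factoring out the loop shared by BottomUpHeapify and Eject):

```python
def sift_down(A, k, last):
    """Sift A[k] down within A[1..last] until parental dominance holds."""
    v = A[k]
    while 2 * k <= last:
        j = 2 * k
        if j < last and A[j] < A[j + 1]:   # two children: take the larger
            j += 1
        if v >= A[j]:                      # heap condition restored
            break
        A[k] = A[j]
        k = j
    A[k] = v

def heapsort(items):
    A = [None] + list(items)               # A[0] unused: 1-based indexing
    n = len(items)
    for i in range(n // 2, 0, -1):         # BottomUpHeapify
        sift_down(A, i, n)
    for i in range(n, 1, -1):              # Eject the maximum n - 1 times
        A[1], A[i] = A[i], A[1]
        sift_down(A, 1, i - 1)
    return A[1:]

print(heapsort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```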
Heapsort - Properties
Transform-and-Conquer paradigm
• Exactly like Selection Sort, but with a preprocessing step.
Questions!
• In-place?
• Stable?
Heapsort - Properties
• In-place? Yes! Only additional O(1) memory for auxiliary variables.
• Stable? No. Non-local swaps break stability.
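The instability is easy to observe with duplicate keys. A sketch (heapsort_pairs is a hypothetical helper that heapsorts (key, label) records, comparing keys only; the labels record arrival order):

```python
def heapsort_pairs(items, key):
    """Heapsort over records, comparing them by key(record) only."""
    A = [None] + list(items)               # 1-based storage, A[0] unused
    n = len(items)

    def sift_down(k, last):
        v = A[k]
        while 2 * k <= last:
            j = 2 * k
            if j < last and key(A[j]) < key(A[j + 1]):
                j += 1
            if key(v) >= key(A[j]):
                break
            A[k] = A[j]
            k = j
        A[k] = v

    for i in range(n // 2, 0, -1):         # heapify
        sift_down(i, n)
    for i in range(n, 1, -1):              # eject repeatedly
        A[1], A[i] = A[i], A[1]
        sift_down(1, i - 1)
    return A[1:]

# Three records share key 1 and arrive in order 'a', 'b', 'c':
records = [(1, 'a'), (1, 'b'), (1, 'c'), (0, 'd')]
print(heapsort_pairs(records, key=lambda r: r[0]))
# [(0, 'd'), (1, 'c'), (1, 'b'), (1, 'a')]: the equal keys come out reversed
```

The non-local swap that moves the root to the end of the array jumps over equal-keyed elements, which is exactly what breaks stability.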
Heapsort - Complexity
• We already saw in the videos that the complexity of Heapsort in the worst case is Θ(n log n).
• What’s the complexity in the best case?
Heapsort - Complexity
• Heapify is Θ(n) in the worst case.
• Eject is Θ(log n) in the worst case.
• In the worst case, Heapsort is Θ(n) + n × Θ(log n) ∈ Θ(n log n).
Heapsort - Complexity (best case)
Heapsort - In practice
• Vs. Mergesort: fully in-place but not stable
• Vs. Quicksort: guaranteed Θ(n log n) performance but empirically slower.
• Used in the Linux kernel.
Take-home message: Heapsort is the best choice when a low memory footprint and guaranteed Θ(n log n) performance are both required (for example, for security reasons).
Sorting - Practical Summary
• Selection Sort: slow, but O(n) swaps.
• Insertion Sort: low-memory, stable, excellent for small and almost sorted data.
• Mergesort: good for mid-size data and when stability is required. Also good with secondary memory devices. Extra O(n) memory cost can be a barrier.
• Quicksort: best for more general cases and large amounts of data. Not stable.
• Heapsort: slower in practice but low-memory and guaranteed Θ(n log n) performance.
Sorting - Paradigm Summary
• Selection Sort: brute force
• Insertion Sort: decrease-and-conquer
• Mergesort, Quicksort: divide-and-conquer
• Heapsort: transform-and-conquer
These are all comparison-based sorting algorithms.
Next lecture: Ok, my data is sorted. Now how do I keep it sorted?