
FIT2004
Algorithms and Data Structures
Course notes by Daniel Anderson
Last updated September 22, 2021

Contents

Analysis of Algorithms

1 Analysis of Algorithms: Part I
  1.1 Program Verification
    1.1.1 Arguing correctness
    1.1.2 Arguing termination
  1.2 Complexity Analysis
    1.2.1 Solutions to common recurrence relations
  1.3 Asymptotic Notation
  1.4 Measures of Complexity

2 Analysis of Algorithms: Part II
  2.1 Quadratic Sorting Algorithms
    2.1.1 Selection sort
    2.1.2 Insertion sort
  2.2 Properties of Algorithms

Sorting and Selecting

3 Fast Sorting Algorithms
  3.1 Heapsort (Revision)
  3.2 Merge Sort (Revision)
  3.3 Quicksort
  3.4 Complexity Lower Bounds for Sorting
  3.5 Sorting Integers Fast
    3.5.1 Counting sort
    3.5.2 Radix sort

4 Order Statistics and Selection
  4.1 Order Statistics and the Selection Problem
  4.2 The Quickselect Algorithm
  4.3 Randomised Pivot Selection
  4.4 Median of Medians (Not examinable in Semester One, 2020)

Dynamic Programming

5 Dynamic Programming: Part I
  5.1 The Key Elements of Dynamic Programming
    5.1.1 Memoisation - Computing Fibonacci numbers
    5.1.2 Optimal substructure - The coin change problem
  5.2 Top-down vs. Bottom-up Dynamic Programming
  5.3 Reconstructing Optimal Solutions to Optimisation Problems

6 Dynamic Programming: Part II
  6.1 The Unbounded Knapsack Problem
  6.2 The 0-1 Knapsack Problem
  6.3 The Edit Distance Problem
  6.4 Matrix Multiplication (Not examinable in Semester One, 2020)
  6.5 The Space-Saving Trick

Data Structures

7 Hashing and Hashtables
  7.1 Hashtables (Revision)
  7.2 What is a Good Hash Function?
  7.3 Hashing Integers
  7.4 Hashing Strings
  7.5 Universal Hashing
  7.6 Cuckoo Hashing

8 Balanced Binary Search Trees
  8.1 AVL Trees
    8.1.1 Definitions
    8.1.2 Rebalancing

Strings

9 Prefix Tries and Suffix Trees
  9.1 The Prefix Trie / Retrieval Tree Data Structure
    9.1.1 Applications of tries
  9.2 Suffix Trees
    9.2.1 Building a suffix tree
    9.2.2 Applications of suffix trees

10 Suffix Arrays
  10.1 Building a Suffix Array
    10.1.1 The naive approach
    10.1.2 The prefix doubling algorithm
    10.1.3 Faster suffix array algorithms (Not examinable in Semester One, 2020)
  10.2 Applications of Suffix Arrays
    10.2.1 Pattern matching with suffix arrays

11 The Burrows-Wheeler Transform
  11.1 The Burrows-Wheeler Transform
    11.1.1 Useful properties of the BWT
    11.1.2 Computation of the BWT
  11.2 The Inverse Burrows-Wheeler Transform
    11.2.1 The naive inversion method
    11.2.2 Efficient inversion of the BWT
  11.3 Pattern Matching with the Burrows-Wheeler Transform

Graph and Network Algorithms

12 Graph Basics
  12.1 Modelling with Graphs
  12.2 Representation and Storage of Graphs
  12.3 Graph Traversal and Applications
    12.3.1 Depth-first search
    12.3.2 Finding connected components
    12.3.3 Cycle finding
    12.3.4 Breadth-first search
    12.3.5 Unweighted shortest paths

13 Shortest Paths
  13.1 Properties of Shortest Paths
  13.2 Distance estimates and the relaxation technique
  13.3 The Bellman-Ford Algorithm
  13.4 Dijkstra's Algorithm
  13.5 The All-Pairs Shortest Path Problem
    13.5.1 The Floyd-Warshall algorithm
    13.5.2 Transitive closure

14 Incremental Graph Connectivity
  14.1 Connectivity in Graphs
  14.2 The Incremental Connectivity Problem
  14.3 The Union-Find Disjoint-Set Data Structure

15 Minimum Spanning Trees
  15.1 Prim's Algorithm
  15.2 Kruskal's Algorithm

16 Directed Acyclic Graphs
  16.1 The Topological Sorting Problem
  16.2 The Critical Path Problem

17 Network Flow
  17.1 The Ford-Fulkerson Algorithm
    17.1.1 The residual network
    17.1.2 Augmenting paths
    17.1.3 Implementation using depth-first search
  17.2 The Minimum Cut Problem
    17.2.1 The min-cut max-flow theorem
  17.3 Bipartite Matching

List of Algorithms

1 Binary Search
2 Selection sort
3 Insertion sort
4 Heapsort
5 Heapify
6 Heap: Insert
7 Heap: Delete
8 Heap: Rise
9 Heap: Fall
10 Merge
11 Merge sort
12 Naive partitioning
13 Hoare partitioning
14 Dutch national flag partitioning
15 Quicksort
16 Counting sort
17 Radix sort
18 Select minimum
19 Select minimum and maximum
20 Quickselect
21 Median of medians
22 Recursive Fibonacci numbers
23 Memoised Fibonacci numbers
24 Bottom-up Fibonacci numbers
25 Top-down coin change
26 Bottom-up coin change
27 Coin change solution reconstruction using backtracking
28 Bottom-up coin change with solution reconstruction using decision table
29 Bottom-up unbounded knapsack
30 Bottom-up 0-1 knapsack
31 Bottom-up edit distance
32 Optimal sequence alignment
33 Optimal matrix multiplication
34 Cuckoo hashing: Lookup
35 Cuckoo hashing: Deletion
36 Cuckoo hashing: Insertion
37 AVL tree: Right rotation
38 AVL tree: Left rotation
39 AVL tree: Double-right rotation
40 AVL tree: Double-left rotation
41 AVL tree: Rebalance
42 Prefix trie: Insertion
43 Prefix trie: Lookup
44 Prefix trie: String sorting
45 Naive suffix array construction
46 Prefix-doubling suffix array construction
47 Suffix array: Pattern matching
48 Burrows-Wheeler transform
49 Inverse Burrows-Wheeler transform
50 Computation of the BWT statistics
51 Pattern matching using the BWT
52 Generic depth-first search
53 Finding connected components using depth-first search
54 Cycle detection in an undirected graph using depth-first search
55 Generic breadth-first search
56 Single-source shortest paths in an unweighted graph
57 Reconstruct shortest path
58 Edge relaxation
59 Bellman-Ford
60 Bellman-Ford: Handling negative cycles
61 Dijkstra's algorithm
62 Improved Dijkstra's algorithm
63 Floyd-Warshall
64 Warshall's transitive closure algorithm
65 Connectivity check
66 Union-find using disjoint-set forests (without optimisations)
67 The FIND operation using path compression
68 The UNION operation using union by rank
69 Prim's algorithm
70 Kruskal's algorithm
71 Topological sorting using Kahn's algorithm
72 Topological sorting using DFS
73 Bottom-up longest path in a DAG
74 Recursive longest path in a DAG
75 The Ford-Fulkerson method
76 Ford-Fulkerson implemented using depth-first search

Chapter 1
Analysis of Algorithms: Part I
When we write programs, we usually like to test them to give us confidence that they are correct. But testing a program can only ever demonstrate that it is wrong (if we find a case that breaks it). To prove that a program always works, we need to be more mathematical. The two major focuses of algorithm analysis are proving that an algorithm is correct, and determining the amount of resources (time and space) used by the algorithm.
Summary: Analysis of Algorithms : Part I
In this chapter, we cover:
• What it means to analyse an algorithm
• Proving correctness of an algorithm
• Analysing the time complexity of an algorithm
• An analysis of binary search as an example
• Revision of recurrence relations, asymptotic notation, and complexity
Recommended Resources: Analysis of Algorithms : Part I
• http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Math/
• CLRS, Introduction to Algorithms, Pages 17-19, Section 2.1
1.1 Program Verification
Program verification or correctness hinges on two main principles. We must on one hand prove that our algorithm, if it terminates, always produces the correct result. On the other, we must prove that it does indeed always terminate. We will explore these issues by analysing one of the most well-known fundamental search algorithms, binary search. An implementation of binary search is shown in Algorithm 1.
Binary search is an easy algorithm to get wrong. The most common mistakes are “off-by-one” errors in which the final index reached by the search is one away from where it should have been. For example, what if line 5 had read key > array[mid] instead? This would be a bug, but given how similar it is to the correct code, it would be easily missed. So how then do we reason that this algorithm is in fact the correct one? We will reason the correctness by establishing an invariant in the algorithm. The useful invariant for binary search is shown below.

Algorithm 1 Binary Search
1: function BINARY_SEARCH(array[1..n], key)
2:   lo = 1 and hi = n + 1
3:   while lo < hi − 1 do
4:     mid = ⌊(lo + hi)/2⌋
5:     if key ≥ array[mid] then lo = mid
6:     else hi = mid
7:   if array[lo] = key then return lo    // key is located at array[lo]
8:   else return null                     // key not found

Invariant: Binary Search
If key ∈ array, then at each iteration:
1. array[lo] ≤ key,
2. if hi ≠ n + 1, then array[hi] > key.
Combined with the sortedness of array, this implies that key (if it exists) lies within the range [lo..hi).
We can now re-interpret the binary search algorithm such that every action it takes is simply aiming to maintain these invariants. This observation can lead us to prove that the algorithm is indeed correct and that it terminates.
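As a concrete companion to Algorithm 1, here is a minimal runnable sketch in Python. It is an adaptation to 0-indexed lists (so lo starts at 0 and hi at n rather than n + 1), but it maintains exactly the same invariant as the pseudocode above.

    def binary_search(array, key):
        lo, hi = 0, len(array)        # invariant: key (if present) lies in array[lo..hi)
        while lo < hi - 1:
            mid = (lo + hi) // 2
            if key >= array[mid]:
                lo = mid              # maintain array[lo] <= key
            else:
                hi = mid              # maintain array[hi] > key (when hi != n)
        if array and array[lo] == key:
            return lo                 # key is located at array[lo]
        return None                   # key not found

    # Example usage:
    # binary_search([1, 3, 5, 7, 9], 7) == 3
    # binary_search([1, 3, 5, 7, 9], 4) is None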
1.1.1 Arguing correctness
When we wish to prove an algorithm’s correctness via invariants, we are required to prove three things:
1. Initialisation: We must prove that the invariants hold at the beginning
2. Maintenance: We must prove that the invariants remain true throughout the algorithm
3. Termination: We must prove that the invariants at termination imply correctness of the algorithm
We will argue the proof in two parts. First we will show that binary search is correct if the key that we are searching for is in the array. Then we will prove that it is correct when the key is not in the array.
Case 1: key ∈ array
We must first argue that the invariants established above are correct when the algorithm begins. Once established, we subsequently argue that the invariants are maintained throughout the algorithm. When we first begin, we initialise lo = 1, hi = n + 1, so:
1. If key ∈ array, since the array is sorted, lo = 1 implies that array[lo] is the minimum element, and hence array[lo] ≤ key.
2. Initially, hi = n + 1, so part 2 is satisfied vacuously.

Therefore the invariant is true at the beginning of the binary search algorithm. As we perform iterations of the main loop of the binary search, what happens to this invariant? At each step, we compute mid = ⌊(lo + hi)/2⌋ and compare array[mid] to key. The conditional statement in the loop then enforces the invariant.
1. If key ≥ array[mid], then after setting lo = mid, we will still have array[lo] ≤ key.
2. If key < array[mid], then after setting hi = mid, we will still have array[hi] > key.
Hence the invariant holds throughout the iterations of the loop. Therefore, if the loop terminates, it is true that array[lo] ≤ key < array[hi] (or hi = n + 1). Since the condition for the loop to terminate is that lo ≥ hi − 1, combined with the previous inequality, it must be true that lo = hi − 1. Therefore if key exists in array, it must exist at position lo, and hence the binary search algorithm correctly identifies whether or not key is an element of array.

Case 2: key ∉ array

Since key is not in the array, provided that the algorithm terminates, regardless of the value of lo, array[lo] ≠ key, and hence the algorithm correctly identifies that key is not an element of array.

1.1.2 Arguing termination

We have argued that binary search is correct if it terminates, but we still need to prove that the algorithm does not simply loop forever, otherwise it is not correct. Let us examine the loop condition closely. The loop condition for binary search reads lo < hi − 1, hence while the algorithm is still looping, this inequality holds. We now recall that

mid = ⌊(lo + hi)/2⌋.

Theorem: Finiteness of Binary Search
If lo < hi − 1, then lo < mid < hi.

Proof
Since lo < hi − 1, we have lo + 1 ≤ hi − 1, and hence

lo < lo + 1 = ⌊(2lo + 2)/2⌋ ≤ ⌊(lo + hi)/2⌋ ≤ ⌊(2hi − 1)/2⌋ = hi − 1 < hi,

so lo < mid < hi.

Since mid therefore lies strictly between lo and hi, each iteration of the loop either increases lo or decreases hi, so the search range [lo..hi) strictly shrinks. The range cannot shrink forever, so the loop condition lo < hi − 1 must eventually fail and the algorithm terminates. Combined with the correctness argument above, this establishes that binary search is correct.

1.2 Complexity Analysis

Let us now analyse the running time of binary search. Each iteration of the loop performs a constant number of operations and roughly halves the size of the search range, so the running time is described by the recurrence relation

T(n) = T(n/2) + a   if n > 1,
       b            if n = 1,                    (1.1)

where a is some constant number of operations that we perform each step, and b is some constant number of operations required at the end of the algorithm.
Theorem: Time Complexity of Binary Search
The recurrence relation (1.1) has the explicit solution T (n ) = a log2 (n ) + b
Proof
We will prove that T(n) = a log2(n) + b by induction on n.

Base case:
Suppose n = 1. Then a log2(n) + b = b, which agrees with our definition of T(n).

Inductive step:

Suppose that for some n, it is the case that T(n/2) = a log2(n/2) + b. It is then true that

T(n) = T(n/2) + a
     = a log2(n/2) + b + a
     = a (log2(n) − log2(2)) + b + a
     = a log2(n) − a + b + a
     = a log2(n) + b
as desired. Hence by induction on n , it is true that
T (n ) = a log2 (n ) + b .
Given this, we can conclude that the time complexity of binary search is O(log(n)).

1.2.1 Solutions to common recurrence relations
When analysing the time and space complexity of our algorithms, we will frequently encounter some very similar recurrence relationships whose solutions we should be able to recall.
Logarithmic Complexity

The solution to the recurrence relation

T(n) = T(n/2) + a   if n > 1,
       b            if n = 1,

is given by

T(n) = a log2(n) + b.

This is the complexity of binary search that we proved above.

Linear Complexity

The solution to the recurrence relation

T(n) = T(n − 1) + a   if n > 0,
       b              if n = 0,

is given by

T(n) = an + b.

Superlinear Complexity

The solution to the recurrence relation

T(n) = 2T(n/2) + an   if n > 1,
       b              if n = 1,

is given by

T(n) = an log(n) + bn.

This is the complexity of many divide and conquer sorting algorithms, such as Mergesort.

Quadratic Complexity

The solution to the recurrence relation

T(n) = T(n − 1) + cn   if n > 0,
       b               if n = 0,

is given by

T(n) = c (n(n + 1)/2) + b.

This recurrence shows up, for example, as the worst-case complexity of Quicksort.

Exponential Complexity

The solution to the recurrence relation

T(n) = 2T(n − 1) + a   if n > 0,
       b               if n = 0,

is given by

T(n) = (a + b) · 2^n − a.
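These closed forms can be sanity-checked numerically. The following short Python sketch (a verification aid only; it assumes n is a power of two wherever the recurrence halves n, so that n/2 is exact) evaluates each recurrence directly and compares it with the stated solution.

    import math

    def T_log(n, a, b):        # T(n) = T(n/2) + a,   T(1) = b
        return b if n == 1 else T_log(n // 2, a, b) + a

    def T_lin(n, a, b):        # T(n) = T(n-1) + a,   T(0) = b
        return b if n == 0 else T_lin(n - 1, a, b) + a

    def T_super(n, a, b):      # T(n) = 2T(n/2) + an, T(1) = b
        return b if n == 1 else 2 * T_super(n // 2, a, b) + a * n

    def T_quad(n, c, b):       # T(n) = T(n-1) + cn,  T(0) = b
        return b if n == 0 else T_quad(n - 1, c, b) + c * n

    def T_exp(n, a, b):        # T(n) = 2T(n-1) + a,  T(0) = b
        return b if n == 0 else 2 * T_exp(n - 1, a, b) + a

    a, b, c = 3, 5, 2
    for n in [1, 2, 8, 64]:    # powers of two for the halving recurrences
        assert T_log(n, a, b) == a * math.log2(n) + b
        assert T_super(n, a, b) == a * n * math.log2(n) + b * n
    for n in [0, 1, 10, 50]:
        assert T_lin(n, a, b) == a * n + b
        assert T_quad(n, c, b) == c * n * (n + 1) // 2 + b
        assert T_exp(n, a, b) == (a + b) * 2 ** n - a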
1.3 Asymptotic Notation
When analysing the complexity of algorithms, we typically concern ourselves only with the order of magnitude of the running time for large values of n. We express time and space complexities in terms of asymptotic notation, with which you should be familiar. We use the following notation throughout the course.
Big-O notation
Big-O is the notation that we will use most frequently. Informally, big-O notation denotes an upper bound on the size of a function. That is, f(n) = O(g(n)) means f(n) is no bigger than the order of magnitude of g(n). Formally, we say that f(n) = O(g(n)) as n → ∞ if
f (n ) ≤ c · g (n ) for all n ≥ n0
for some constants c and n0. This says that for values of n above some threshold, f is always no bigger than some constant multiple of g . Note that this means the behaviour of f for small values of n is irrelevant. Also, remember that big-O bounds can be overestimates. For example, it is correct to say that 2n + 1 = O (n 3 ), and also that 2n + 1 = O (n ).
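To make the definition concrete, consider the claim 2n + 1 = O(n) mentioned above. Taking c = 3 and n0 = 1, we have

2n + 1 ≤ 2n + n = 3n = c · n   for all n ≥ n0 = 1,

so the pair of constants (c, n0) = (3, 1) witnesses that 2n + 1 = O(n).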

Big-Ω notation
Big-Ω (Omega) notation will be used once or twice throughout the course. It is a counterpart to big-O notation that denotes a lower bound instead of an upper bound. That is, f(n) = Ω(g(n)) means that f(n) is at least as big as the order of magnitude of g(n). Formally, we say that f(n) = Ω(g(n)) as n → ∞ if
f(n) ≥ c · g(n) for all n ≥ n0,
for some constants c and n0. Note that this is equivalent to saying that g(n) = O(f(n)). Just like big-O bounds can be overestimates, big-Ω bounds can be underestimates. For example, n⁵ = Ω(n²), and also n⁵ = Ω(n⁵).
Big-Θ notation
We will make use of big-Θ occasionally. Big-Θ means that the bound given is both big-O and big-Ω at the same time, i.e. both an upper bound and a lower bound. Informally, f (n ) = Θ(g (n )) means that f (n) is the same order of magnitude as g (n). Formally, we say that f (n) = Θ(g (n)) as n → ∞ if
f(n) = O(g(n)) and f(n) = Ω(g(n)) as n → ∞,

or equivalently if

f(n) = O(g(n)) and g(n) = O(f(n)) as n → ∞.
Big-Θ notation implies that the bound given is precise, i.e. it is neither an overestimate nor an
underestimate. So n² + 3n + 5 = Θ(n²), but n² + 3n + 5 ≠ Θ(n³), and n² + 3n + 5 ≠ Θ(n).

1.4 Measures of Complexity
Best-case, worst-case, and average-case complexity
When measuring the complexity of a deterministic algorithm, we commonly distinguish between three kinds: best case, average case, and worst case. We can define these formally as follows. Let T_A(I) denote the running time of some algorithm A on some input I, and let ℐ(n) denote the set of all possible valid inputs to the algorithm of size n.
Best-case complexity. The best-case complexity of an algorithm is the fewest possible instructions that the algorithm will execute over any possible input, as a function of the input size. Formally, the best-case complexity is

t_best(n) = min_{I ∈ ℐ(n)} T_A(I).
Do not fall into the common trap of thinking that the best-case complexity of an algorithm is determined by its behaviour for small inputs (the best case is not “when the input is empty”).
Worst-case complexity. The worst-case complexity is, likewise, the largest number of instructions that the algorithm could possibly execute on any possible input, as a function of the input size. Formally,

t_worst(n) = max_{I ∈ ℐ(n)} T_A(I).

Average-case complexity. Average-case complexity is slightly more tricky to define, and in fact there is not a universally agreed upon definition. The simplest definition, and the one that we will use, is that the average-case complexity is the average number of operations that the algorithm will execute, averaged over all possible inputs of a given size (i.e., a uniformly random probability distribution over all possible inputs of a given size), as a function of the input size. Formally,

t_average(n) = E_{I ∈ ℐ(n)}[T_A(I)].
This is typically much more difficult to compute than the other two.
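As an informal illustration of these three measures (a rough numerical sketch, not a proof; it counts comparisons made by insertion sort, a quadratic sorting algorithm analysed properly in the next chapter), we can enumerate every input of a small size and record the minimum, average and maximum cost:

    from itertools import permutations

    def comparisons(values):
        # Count the comparisons made by insertion sort on a copy of the input.
        array, count = list(values), 0
        for i in range(1, len(array)):
            key, j = array[i], i - 1
            while j >= 0:
                count += 1                 # one comparison of array[j] with key
                if array[j] <= key:
                    break
                array[j + 1] = array[j]
                j -= 1
            array[j + 1] = key
        return count

    n = 6
    counts = [comparisons(p) for p in permutations(range(n))]
    # min(counts) plays the role of t_best(n), max(counts) of t_worst(n),
    # and sum(counts) / len(counts) of t_average(n) for this operation count.
    print(min(counts), sum(counts) / len(counts), max(counts))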
Complexity of randomised algorithms

The use of randomisation is a powerful tool for algorithm designers. When the time taken by an algorithm is influenced by random decisions, we typically analyse its time complexity in terms of its expected complexity, or we give analyses that hold with high probability.
Expected complexity. Expected complexity tells us the average time that a randomised algorithm will take over all possible random decisions that it could make. Note that this is distinct from average-case complexity, which averages over all possible inputs. Unless stated otherwise, we will always analyse expected worst-case performance, i.e. we will analyse the expected performance over all possible random choices for the worst possible input. We could also define best-case, and even average-case expected time complexity, but these are seldom used.
As before, let T_A(I) denote the running time of some algorithm A on some input I. Note that T_A(I) is no longer a single value, but a probability distribution, since the algorithm may take different times based on the outcomes of random choices. The expected (worst-case) time complexity of A can then be defined as

max_{I ∈ ℐ(n)} E[T_A(I)].
Notice how we still take the maximum value since we care about the worst-case behaviour. It is important to not conflate worst-case behaviour (which depends on the input to the algorithm) and unlucky random decisions.
High probability analyses. We will not cover high probability analysis in this course, but we'll define it here for completeness, since it is frequently used in the analysis of randomised algorithms. We say that an algorithm takes O(f(n)) time with high probability if it takes O(c · f(n)) time with probability at least 1 − 1/n^c for all c ≥ 1. Note that this is much stronger than having expected time complexity O(f(n)). That is, if an algorithm runs in O(f(n)) time with high probability, then its expected time complexity is at most O(f(n)) as well (and may even be lower).

Chapter 2
Analysis of Algorithms: Part II
Invariants are a powerful tool for establishing the correctness of algorithms. Arising frequently and being of particular importance are loop invariants, which are invariants that are maintained by an iterative procedure (or loop) of a program. Loop invariants help us to analyse and establish the correctness of many simple sorting algorithms, which we will explore as an example.
Summary: Analysis of Algorithms : Part II
In this chapter, we cover:
• Using invariants to analyse algorithms
• Strong and weak loop invariants
• Analysing sorting algorithms using these invariants
• Best, average and worst-case performance of simple sorting algorithms
Recommended Resources: Analysis of Algorithms : Part II
• CLRS, Introduction to Algorithms, Chapter 2
• Weiss, Data Structures and Algorithm Analysis, Chapter 7
• https://youtu.be/Kg4bqzAqRBM - MIT 6.006: insertion sort & merge sort
• https://visualgo.net/en/sorting - Visualisation of sorting algorithms
• http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Math/
• http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Sort/

2.1 Quadratic Sorting Algorithms
Sorting algorithms are some of the most important and fundamental routines that we will study. They are also very good case studies for the analysis of algorithms. You may be familiar with some of the quadratic (O(n²)) sorting algorithms such as:
• Bubble sort
• Selection sort
• Insertion sort

You probably also know some much faster O (n log(n )) sorting algorithms that we will deal with later. In this chapter, we will analyse selection sort and insertion sort, compare the two, and explore their fundamental underlying properties.
2.1.1 Selection sort
The selection sort algorithm is based on the following ideas:
Key Ideas: Selection sort
• Consider the list to be sorted as being divided into two parts
  1. The first part consists of the part of the list that is currently sorted
  2. The remaining part consists of the elements that are yet to be sorted
• Initially, the first (sorted) part is empty, while the remaining part is the entire (to be sorted) list
• We then sort the list like so:
  1. Search the unsorted part for the smallest element
  2. Swap this smallest element with the first element of the unsorted part
  3. The sorted part is now one element longer
Expressed more concretely, an implementation of selection sort is depicted in Algorithm 2.

Algorithm 2 Selection sort
1: function SELECTION_SORT(array[1..n])
2:   for i = 1 to n do
3:     min = i
4:     for j = i + 1 to n do
5:       if array[j] < array[min] then
6:         min = j
7:     swap(array[i], array[min])

Let's try to understand what is going on here more logically. At some given value of i in the main loop of the algorithm, we have the two sublists

• array[1..i − 1], the sorted part
• array[i..n], the yet to be sorted part

The key strategy behind selection sort simply says to progressively make i larger while maintaining the following invariants.

Invariant: Selection sort
For any given value of i in the main loop of selection sort, at the beginning of the iteration, the following invariants hold:
1. array[1..i − 1] is sorted
2. For any x in array[1..i − 1] and y in array[i..n], x ≤ y

Using these invariants, arguing the correctness of selection sort is easy.

Initial state of the invariants

Initially, when i = 1, the sorted part of the array array[1..0] is empty, so the invariants trivially hold. Note that we will often use the notation array[1..0] (or similarly array[n + 1..n]) to mean an empty array for convenience.

Maintenance of the invariants

At each iteration of the main loop, we seek the minimum array[j] for i ≤ j ≤ n. Since at the beginning of iteration i, it is true that array[1..i − 1] ≤ array[i..n], the value of array[min] is no smaller than any x ∈ array[1..i − 1], and hence the extended sorted part array[1..i] remains sorted after swapping array[i] and array[min], so invariant 1 is maintained. Since we are selecting array[min] as a minimum of array[i..n], it must be true that after swapping array[i] and array[min], array[min] ≤ array[i + 1..n], so invariant 2 is also maintained.

Termination

When the i loop terminates, the invariant must still hold, and since it was maintained in iteration i = n, this implies that array[1..n] is sorted, i.e. the entire list is sorted, and we are done.

Time and space complexity of selection sort

Selection sort always does a complete scan of the remainder of the array on each iteration. It is therefore not hard to see that the number of operations performed by selection sort is independent of the contents of the array. Therefore the best, average and worst-case time complexities are all exactly the same.

Theorem: Time complexity of selection sort
The time complexity of selection sort is O(n²).

Proof
The number of iterations performed is n − i for each i from 1 to n, therefore we perform

Σ_{i=1}^{n} (n − i) = n² − Σ_{i=1}^{n} i
                    = n² − n(n + 1)/2
                    = (n² − n)/2
                    = O(n²)

total operations.

We also only require one extra variable and some loop counters, so the amount of extra space required for selection sort is constant.

                   Best case    Average Case    Worst Case
Time               O(n²)        O(n²)           O(n²)
Auxiliary Space    O(1)         O(1)            O(1)

2.1.2 Insertion sort

Insertion sort is a very similar algorithm to selection sort, based on a similar but weaker invariant. The idea behind insertion sort, much like selection sort, is to maintain two sublists, one that is sorted and one that is yet to be sorted. At each iteration of the algorithm, we move one new element from the yet-to-be sorted sublist into its correct position in the sorted sublist. An implementation of insertion sort might look something like the code shown in Algorithm 3.

Algorithm 3 Insertion sort
1: function INSERTION_SORT(array[1..n])
2:   for i = 2 to n do
3:     key = array[i]
4:     j = i − 1
5:     while j > 0 and array[j] > key do
6:       array[j + 1] = array[j]
7:       j = j − 1
8:     array[j + 1] = key
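For reference, the two pseudocode listings above translate directly into Python. The following is a minimal 0-indexed sketch of Algorithms 2 and 3 (an illustrative translation, not the notes' reference implementation).

    def selection_sort(array):
        n = len(array)
        for i in range(n):
            min_index = i
            for j in range(i + 1, n):
                if array[j] < array[min_index]:
                    min_index = j
            # Extend the sorted prefix by one element.
            array[i], array[min_index] = array[min_index], array[i]
        return array

    def insertion_sort(array):
        for i in range(1, len(array)):
            key = array[i]
            j = i - 1
            while j >= 0 and array[j] > key:   # shift larger elements one place right
                array[j + 1] = array[j]
                j -= 1
            array[j + 1] = key                 # insert key into its correct position
        return array

    # Example: insertion_sort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]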
Selection sort was based on the idea that we maintain two invariants with respect to the sorted and yet-to-be-sorted sublists. We required that the first sublist be sorted, and that the second sublist contain elements greater or equal to those in the sorted sublist. Insertion sort maintains a weaker invariant shown below. We call the invariant weaker since it maintains a subset of the invariants that selection sort did.

Invariant: Insertion sort
For any given value of i in the main loop of insertion sort, at the beginning of the iteration, the following invariant holds:
1. array[1..i − 1] is sorted
Let’s prove that this invariant holds and is sufficient to prove that insertion sort is correct.
Initial state of the invariant
Again, the sorted sublist is empty at the beginning of the main loop, so the invariant holds.
Maintenance of the invariant
At the beginning of iteration i, the sublist array[1..i − 1] is sorted. The inner while loop of insertion sort swaps the item array[j] one place to the left until array[j − 1] ≤ array[j]. Since array[1..i − 1] was originally sorted, it is therefore true that after the termination of the while loop, array[1..j] is sorted. Since the sublist array[j + 1..i] was originally sorted, and for each x ∈ array[j + 1..i], we had array[j] < x, it is also true that array[j..i] is sorted. Combining these observations, we can conclude that array[1..i] is now sorted, and hence the invariant has been maintained.

Termination

At each iteration of the inner while loop of insertion sort, j is decremented, therefore either the loop condition j > 0 must eventually be false, or it will be the case that array[j − 1] ≤ array[j]. Since the loop invariant was maintained when i = n, it is true that array[1..n] is sorted, and we are done.

Time and space complexity of insertion sort

Unlike selection sort, insertion sort behaves differently depending on the contents of the array. Suppose we provide insertion sort with an already-sorted list. The inner loop will never iterate since it will always be true that array[j − 1] ≤ array[j]. Therefore the only operations required by insertion sort are to perform the outer loop from 1 to n, so in the best case, insertion sort only takes O(n) time. In the worst case, suppose we provide an array that is sorted in reverse order. We can see that in this case, the inner while loop of insertion sort will have to loop the entire way from i to 1 since it will always be true that array[j] < array[j − 1]. We observe that in this case, the amount of work done is the same as selection sort, since we perform i − 1 operations for each i from 2 to n. Therefore in the worst case, insertion sort takes O(n²) time. For a random list, on average we will have to swap array[j] with half of the preceding elements, meaning that we need to perform roughly half of the amount of operations as the worst case. This means that in the average case, insertion sort still takes O(n²) time1. Finally, just like selection sort, the amount of extra space required is constant, since we keep only one extra variable and a loop counter.

1This is not a rigorous argument and hence not a formal proof. Formally proving average-case complexities is actually quite tricky, but we don't usually care about average-case complexity, so we won't delve into the details for now.

                   Best case    Average Case    Worst Case
Time               O(n)         O(n²)           O(n²)
Auxiliary Space    O(1)         O(1)            O(1)

2.2 Properties of Algorithms

Stable Sorting

A sorting algorithm is said to be stable if the relative ordering of elements that compare equal is maintained by the algorithm. What does this mean? Say for example that we are going to sort a list of people's names by their length. If multiple names have the same length, a stable sorting algorithm will guarantee that for each class of names with the same length, their ordering is preserved. For example, if we want to sort the names Bill, Bob, Jane, Peter, Kate by their length, the names Bill, Jane, Kate all share the same length. A stable sorting algorithm will produce the following sorted list

Bob, Bill, Jane, Kate, Peter

noticing that Bill, Jane, Kate are relatively in the same order as the initial list. An unstable sorting algorithm may have shuffled the order of Bill, Jane, Kate while still producing a list of names that is sorted by length.
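To see this behaviour concretely, here is a tiny Python check that mirrors the example above; it relies on Python's built-in sort, which is guaranteed to be stable.

    # Sorting names by length with a stable sort preserves the order of Bill, Jane, Kate.
    names = ["Bill", "Bob", "Jane", "Peter", "Kate"]
    print(sorted(names, key=len))  # ['Bob', 'Bill', 'Jane', 'Kate', 'Peter']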
Insertion sort is an example of a stable sorting algorithm. Selection sort on the other hand is not a stable sorting algorithm.

Online Algorithms

An algorithm (sorting or otherwise) is called online if it can process its input sequentially without knowing it all in advance, as opposed to an offline algorithm which must know the entire input in advance. Insertion sort is an example of an online sorting algorithm. If you only know some of the input, you can begin to run insertion sort, and continue to read in the rest of the input while continuing to sort, and the algorithm will still work. On the other hand, selection sort is not an online algorithm, since if any new elements are added to the input, the algorithm would have to start all over.

In-place Algorithms

There are multiple definitions of an in-place algorithm used by different authors. One such definition is that an algorithm is in place if it uses O(1) auxiliary space. This is quite hard to satisfy, so some authors use the following definition instead. An algorithm (sorting or otherwise) is called in place if it does not store or copy the input into any additional data structures, but modifies the input directly to produce the output. More concretely, an algorithm is in place if it never stores more than O(1) elements of the input outside of the input. Insertion sort and selection sort are both examples of in-place algorithms for both definitions.

Chapter 3

Fast Sorting Algorithms

Sorting is a fundamental and interesting problem in computer science. Many sorting algorithms exist, each having their own advantages and disadvantages. Sorting algorithms like selection sort and insertion sort take O(n²) time but are very easy to implement and run very fast for very small input sizes. Faster sorting algorithms exist which only take O(n log(n)) time, but which may be more difficult to implement and may run slower for small lists. We will analyse some of these algorithms and also explore some interesting theoretical results regarding sorting. We will also take a look at how in many cases, integers can be sorted in just O(n) time.

Summary: Fast Sorting Algorithms
In this chapter, we cover:
• Complexity lower bounds for sorting
• The Quicksort algorithm
• Sorting integers in linear time: Counting sort and radix sort

Recommended Resources: Fast Sorting Algorithms
• CLRS, Introduction to Algorithms, Chapters 6, 7, 8
• Weiss, Data Structures and Algorithm Analysis, Chapter 7
• https://youtu.be/vK_q-C-kXhs - MIT 18.410J lecture on Quicksort
• https://youtu.be/Nz1KZXbghj8 - MIT 6.006 lecture on fast sorting
• https://visualgo.net/en/sorting - Visualisation of sorting algorithms

Revision: Fast Sorting Algorithms
You should be familiar from your previous studies with:
• The heapsort algorithm (revised below)
• The merge sort algorithm (revised below)
• Due to its structure, a binary heap can be represented as a flat array array[1..n] where the root node is array[1] and for each node array[i ], its children (if they exist) are elements array[2i ] and array[2i + 1]. • AnexistingarraycanbeconvertedintoaheapinplaceinO(n)time. • AnewitemcanbeinsertedintoabinaryheapinO(log(n))time. • ThemaximumelementcanberemovedfromabinaryheapinO(log(n))time. Making use of the heap data structure, one can easily piece together the heapsort algorithm, which simply involves converting the given sequence into a max heap (heapifying it) and then successively removing the maximum element and placing it at the back of the array. Algorithm 4 Heapsort 1: functionHEAPSORT(array[1..n]) 2: 3: 4: HEAPIFY(array[1..n]) fori=n to1do array[i ] = EXTRACT_MAX(array[1..i ]) That’s it! Of course you’ll need to recall how to perform the HEAPIFY and EXTRACT_MAX opera- tions. They are provided below for convenience. Time and space complexity of heapsort SinceheapsortperformsoneHEAPIFY whichtakesO(n)time,andn invocationsofEXTRACT_MAX taking up to O (log(n )) time each, the total time spent by heapsort is O (n log(n )). Note that the behaviour of heapsort is independent of the structure of the input array, so its average and worst-case performances are asymptotically the same. Notably however, since HEAPIFY is done Chapter3. FastSortingAlgorithms 17 in place and each element extracted is placed at the end of the array, heapsort requires only constant extra memory and hence its auxillary space complexity is O (1). There is one rare situation in which heapsort can run in just O (n ), when the entire input array consists of identical elements. In this case, each call to EXTRACT_MAX will take constant time since nothing has to be swapped to maintain the heap property. Heapsort is not an online al- gorithm since we begin by performing heapify which requires knowing the entire array. It is also not stable since the swaps made when heapifying and removing elements may move equal elements out of their relative order. It is however an in-place algorithm since heapify is in place and each call to EXTRACT_MAX places the element back into the array. Best case Time O(n) Auxiliary Space O (1) Binary Heap Operations (Revision) Average Case O(n log(n)) O (1) Worst Case O(n log(n)) O (1) The standard operations on a min/max heap, implementing the priority queue abstract data type are: 1. HEAPIFY:Convertagiven(notnecessarilysorted)arrayintoamin/maxheap 2. INSERT:Insertanelementintotheheap 3. REMOVE:Removethemin/maxvaluefromtheheap These operations are supported by two internal procedures: 1. RISE:Moveanelementhigheruptheheapuntilitsatisfiestheheapproperty 2. FALL:Moveanelementdowntheheapuntilitsatisfiestheheapproperty The following pseudocode implements a max heap, i.e. a heap that pops the maximum value. Algorithm 5 Heapify 1: functionHEAPIFY(array[1..n]) 2: for i = n/2 to 1 do 3: FALL(array[1..n], i) Algorithm 6 Heap: Insert 1: 2: 3: 4: function INSERT(array[1..n], x) array.append(x ) n += 1 RISE(array[1..n], n) 18 3.1. 
Heapsort(Revision) Algorithm 7 Heap: Delete 1: functionEXTRACT_MAX(array[1..n]) 2: swap(array[1], array[n ]) 3: n=n-1 4: FALL(array[1..n], 1) 5: return array.pop_back() Algorithm 8 Heap: Rise 1: 2: 3: 4: 5: 6: 7: 8: 9: functionRISE(array[1..n],i) parent = ⌊i /2⌋ while parent ≥ 1 do if array[parent] < array[i ] then swap(array[parent], array[i ]) i = parent parent = ⌊i/2⌋ else break Algorithm 9 Heap: Fall 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: functionFALL(array[1..n],i) child = 2i while child ≤ n do if child < n and array[child+1] > array[child] then
child += 1
if array[i ] < array[child] then swap(array[i ], array[child]) i = child child = 2i else break Chapter3. FastSortingAlgorithms 19 3.2 Merge Sort (Revision) Merge sort is a divide-and-conquer sorting algorithm that sorts a given sequence by dividing it into two halves, sorting those halves and then merging the sorted halves back together. How does merge sort sort the two halves? Using merge sort of course! Merge sort is an example of a recursive sorting algorithm, where each half of the sequence is sorted recursively until we reach a base case (a sequence of only one element). The key part of merge sort is the merge rou- tine, which takes two sorted sequences and interleaves them together to obtain a single sorted sequence. The merge algorithm is shown in Algorithm 10. Algorithm 10 Merge 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: functionMERGE(A[i..n1],B[j..n2]) result = empty array whilei≤n1 orj≤n2 do if j >n2 ori ≤n1 andA[i]≤B[j]then result.append(A[i ])
i += 1
else
result.append(B[ j ]) j += 1
return result
Observe that this merge routine is stable (since we write A[i ] ≤ B[ j ], rather than <), and hence our merge sort implementation will also be stable. Using this merge routine, an implementa- tion of merge sort can be expressed like this. The recursive merge sort pseudocode is shown in Algorithm 11. Algorithm 11 Merge sort 1: 2: 3: 4: 5: 6: functionMERGE_SORT(array[lo..hi]) ifhi>lothen
mid = ⌊(lo + hi)/2⌋
MERGE_SORT(array[lo..mid])
MERGE_SORT(array[mid + 1..hi])
array[lo..hi] = MERGE(array[lo..mid], array[mid + 1..hi])
Time and space complexity of merge sort
Merge sort repeatedly splits the input sequence in half until it can no longer do so any more. The number of times that this can occur is log2(n), so the depth of the recurrence is O(log(n)). Merging n elements takes O (n ) time, so doing this at each level of recursion results in a total of O (n log(n )) time. This happens regardless of the input, hence the best, average and worst-case performances for merge sort are equal. Although the MERGE_SORT routine itself uses no extra space, the MERGE routine uses O(n) extra space, hence the auxiliary space complexity of merge sortisO(n).

20
3.3. Quicksort
Time Auxiliary Space
Best case
O(n log(n)) O(n)
Average Case
O(n log(n)) O(n)
Worst Case
O(n log(n)) O(n)
Enhancements of merge sort
Merge sort can also be implemented bottom-up, in which we eliminate the recursion all-together and simply iteratively merge together the sub-arrays of size 1, 2, 4, 8, … n . This still requires O (n ) auxiliary memory, but eliminates the use of the program stack (for recursion) and hence should be slightly faster in practice. There are also in-place implementations of merge sort that require only O(1) space, but they are much more complicated1.
3.3 is perhaps the most well-known divide and conquer algorithms for sorting. Quicksort follows these key ideas to sort an array.
Key Ideas: Quicksort
• Selectsomeelementofthearray,whichwewillcallthepivot
• Partition the array so that all items less then the pivot are to its left, all elements equal to the pivot are in the middle, and all elements greater than the pivot are to its right
• Quicksorttheleftpartofthearray(elementslessthanthepivot)
• Quicksorttherightpartofthearray(elementsgreaterthanthepivot)
The key ideas are very simple. The two major components to correctly (and efficiently) imple- menting Quicksort are:
• the partitioning algorithm (how does the algorithm move elements lesser/greater than the pivot to the left/right efficiently?)
• howdoesthealgorithmselectthepivotelement? The partitioning problem
The partitioning problem is central to Quicksort. The choice of partitioning algorithm affects whether the algorithm is in place and whether it is stable, and can also influence the worst-case time complexity.
Naive partition algorithm
The naive partitioning algorithm simply partitions the array into three temporary arrays, one for elements that are less than the pivot, one for elements that are greater than the pivot, and
1See Nicholas Pippenger, Sorting and Selecting in Rounds, SIAM Journal on Computing 1987.

Chapter3. FastSortingAlgorithms 21
one for elements equal to the pivot. Naive partitioning is shown as Algorithm 12 Algorithm 12 Naive partitioning
1: 2: 3: 4: 5: 6: 7: 8:
9: 10:
functionPARTITION(array[lo..hi],pivot) left = empty array
pivots = empty array
right = empty array fori=lotohido
// Store elements less than the pivot // Store elements equal to the pivot // Store elements greater than the pivot
if array[i ] < pivot then left.append(array[i ]) else if array[i ] = pivot then pivots.append(array[i ]) else right.append(array[i ]) array[lo..hi] = left + pivots + right // Concatenate the arrays return length(left) + ⌈length(pivots)/2⌉ // Return the position of the middle pivot Naive partitioning takes O (n ) time and is stable, which are both nice properties. However, naive partitioning is not in place since it stores all n elements in new arrays, using O (n ) space. Creat- ing extra arrays means th