Algorithms and Data Structures
Course notes by NDERSON
Last updated September 22, 2021
Copyright By PowCoder代写 加微信 powcoder
Analysis of Algorithms
1 Analysis of Algorithms: Part I 1
1.1 ProgramVerification……………………………………… 1 1.1.1 Arguingcorrectness …………………………………. 2 1.1.2 Arguingtermination…………………………………. 3
1.2 ComplexityAnalysis ……………………………………… 4 1.2.1 Solutionstocommonrecurrencerelations ………………….. 5
1.3 AsymptoticNotation……………………………………… 6
1.4 MeasuresofComplexity …………………………………… 7
2 Analysis of Algorithms: Part II 9
2.1 QuadraticSortingAlgorithms ……………………………….. 9 2.1.1 Selectionsort……………………………………… 10 2.1.2 Insertionsort……………………………………… 12
2.2 PropertiesofAlgorithms …………………………………… 14
Sorting and Selecting
3 Fast Sorting Algorithms 15
3.1 Heapsort(Revision) ……………………………………… 16
3.2 MergeSort(Revision) …………………………………….. 19
3.3 Quicksort…………………………………………….. 20
3.4 ComplexityLowerBoundsforSorting ………………………….. 27
3.5 SortingIntegersFast……………………………………… 28
3.5.1 Countingsort……………………………………… 29 3.5.2 Radixsort ……………………………………….. 30
4 Order Statistics and Selection 35
4.1 OrderStatisticsandtheSelectionProblem……………………….. 35
4.2 TheQuickselectAlgorithm …………………………………. 37
4.3 RandomisedPivotSelection ………………………………… 37
4.4 MedianofMedians(NotexaminableinSemesterOne,2020). . . . . . . . . . . . . . . 39
Dynamic Programming
5 Dynamic Programming: Part I 43
5.1 TheKeyElementsofDynamicProgramming ……………………… 44 5.1.1 Memoisation-ComputingFibonaccinumbers ……………….. 44
5.1.2 Optimalsubstructure-Thecoinchangeproblem . . . . . . . . . . . . . . . . . . 46
5.2 Top-downvs.Bottom-upDynamicProgramming ………………….. 48 5.3 Reconstructing Optimal Solutions to Optimisation Problems . . . . . . . . . . . . . . . 49
6 Dynamic Programming: Part II 51
6.1 TheUnboundedKnapsackProblem …………………………… 51
6.2 The0-1KnapsackProblem …………………………………. 53
6.3 TheEditDistanceProblem …………………………………. 55
6.4 Matrix Multiplication (Not examinable in Semester One, 2020) . . . . . . . . . . . . . 58
6.5 TheSpace-SavingTrick……………………………………. 61
Data Structures
7 Hashing and Hashtables 63
7.1 Hashtables(Revision) …………………………………….. 64 7.2 WhatisaGoodHashFunction?………………………………. 66 7.3 HashingIntegers………………………………………… 67 7.4 HashingStrings ………………………………………… 68 7.5 UniversalHashing ………………………………………. 70 7.6 CuckooHashing………………………………………… 71
8 Balanced Binary Search Trees 73
8.1 AVLTrees…………………………………………….. 73 8.1.1 Definitions……………………………………….. 74 8.1.2 Rebalancing………………………………………. 75
9 Prefix Tries and Suffix Trees 79
9.1 ThePrefixTrie/RetrievalTreeDataStructure …………………….. 80 9.1.1 Applicationsoftries …………………………………. 81 9.2 SuffixTrees……………………………………………. 83 9.2.1 Buildingasuffixtree…………………………………. 85 9.2.2 Applicationsofsuffixtrees …………………………….. 85
10 Suffix Arrays 87
10.1 BuildingaSuffixArray…………………………………….. 89
10.1.1 Thenaiveapproach …………………………………. 89
10.1.2 Theprefixdoublingalgorithm ………………………….. 89
10.1.3 Faster suffix array algorithms (Not examinable in Semester One, 2020) . . 92
10.2 ApplicationsofSuffixArrays ………………………………… 93 10.2.1 Patternmatchingwithsuffixarrays……………………….. 93
11 The Burrows- 95
11.1 TheBurrows-WheelerTransform……………………………… 95 11.1.1 UsefulpropertiesoftheBWT …………………………… 96 11.1.2 ComputationoftheBWT ……………………………… 97
11.2TheInverseBurrows-WheelerTransform………………………… 98 11.2.1 Thenaiveinversionmethod ……………………………. 98 11.2.2 EfficientinversionoftheBWT…………………………… 100
11.3 PatternMatchingwiththeBurrows-WheelerTransform . . . . . . . . . . . . . . . . . . . 104
Graph and Network Algorithms
12 Graph Basics 109
12.1 ModellingwithGraphs ……………………………………. 109
12.2 RepresentationandStorageofGraphs ………………………….. 113
12.3 GraphTraversalandApplications …………………………….. 116
12.3.1 Depth-firstsearch ………………………………….. 116 12.3.2 Findingconnectedcomponents …………………………. 118 12.3.3 Cyclefinding ……………………………………… 119 12.3.4 Breadth-firstsearch …………………………………. 119 12.3.5 Unweightedshortestpaths…………………………….. 121
13 Shortest Paths 123
13.1 PropertiesofShortestPaths…………………………………. 124
13.2 Distanceestimatesandtherelaxationtechnique …………………… 127
13.3 TheBellman-FordAlgorithm………………………………… 127
13.4 Dijkstra’sAlgorithm………………………………………. 129
13.5 TheAll-PairsShortestPathProblem……………………………. 133
13.5.1 TheFloyd-Warshallalgorithm…………………………… 134 13.5.2 Transitiveclosure…………………………………… 136
14 Incremental Graph Connectivity 137
14.1 ConnectivityinGraphs ……………………………………. 137 14.2 TheIncrementalConnectivityProblem …………………………. 138 14.3 TheUnion-FindDisjoint-SetDataStructure………………………. 139
15 Minimum Spanning Trees 143
15.1 Prim’sAlgorithm………………………………………… 144 15.2 Kruskal’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
16 Directed Acyclic Graphs 149
16.1 TheTopologicalSortingProblem……………………………… 150 16.2 TheCriticalPathProblem ………………………………….. 152
17 Network Flow 155
17.1 TheFord-FulkersonAlgorithm ………………………………. 158 17.1.1 Theresidualnetwork ………………………………… 158 17.1.2 Augmentingpaths ………………………………….. 159 17.1.3 Implementationusingdepth-firstsearch……………………. 160
17.2 TheMinimumCutProblem…………………………………. 162 17.2.1 Themin-cutmax-flowtheorem …………………………. 163
17.3 BipartiteMatching ………………………………………. 165
List of Algorithms
1 BinarySearch………………………………………….. 2
2 Selectionsort………………………………………….. 10
3 Insertionsort ………………………………………….. 12
4 Heapsort …………………………………………….. 16
5 Heapify ……………………………………………… 17
6 Heap:Insert…………………………………………… 17
7 Heap:Delete ………………………………………….. 18
8 Heap:Rise ……………………………………………. 18
9 Heap:Fall…………………………………………….. 18
10 Merge ………………………………………………. 19
11 Mergesort ……………………………………………. 19
12 Naivepartitioning……………………………………….. 21
13 Hoarepartitioning ………………………………………. 21
14 Dutchnationalflagpartitioning ……………………………… 23
15 Quicksort…………………………………………….. 23
16 Countingsort………………………………………….. 29
17 Radixsort…………………………………………….. 31
18 Selectminimum………………………………………… 36
19 Selectminimumandmaximum ……………………………… 36
20 Quickselect……………………………………………. 37
21 Medianofmedians………………………………………. 40
22 RecursiveFibonaccinumbers ……………………………….. 44
23 MemoisedFibonaccinumbers ………………………………. 45
24 Bottom-upFibonaccinumbers ………………………………. 46
25 Top-downcoinchange ……………………………………. 47
26 Bottom-upcoinchange …………………………………… 48
27 Coinchangesolutionreconstructionusingbacktracking . . . . . . . . . . . . . . . . . . 49
28 Bottom-up coin change with solution reconstruction using decision table . . . . . 50
29 Bottom-upunboundedknapsack …………………………….. 53
30 Bottom-up0-1knapsack…………………………………… 55
31 Bottom-upeditdistance …………………………………… 57
32 Optimalsequencealignment………………………………… 58
33 Optimalmatrixmultiplication……………………………….. 61
34 Cuckoohashing:Lookup ………………………………….. 71
35 Cuckoohashing:Deletion………………………………….. 71
36 Cuckoohashing:Insertion …………………………………. 72
37 AVLtree:Rightrotation……………………………………. 75
38 AVLtree:Leftrotation…………………………………….. 76
39 AVLtree:Double-rightrotation………………………………. 76
40 AVLtree:Double-leftrotation ……………………………….. 76
41 AVLtree:Rebalance ……………………………………… 77
42 Prefixtrie:Insertion ……………………………………… 81
43 Prefixtrie:Lookup ………………………………………. 82
44 Prefixtrie:Stringsorting …………………………………… 82
45 Naivesuffixarrayconstruction ………………………………. 89
46 Prefix-doublingsuffixarrayconstruction ………………………… 92
47 Suffixarray:Patternmatching……………………………….. 94
48 Burrows-Wheelertransform ………………………………… 98
49 InverseBurrows-Wheelertransform …………………………… 104
50 ComputationoftheBWTstatistics…………………………….. 107
51 PatternmatchingusingtheBWT ……………………………… 108
52 Genericdepth-firstsearch………………………………….. 117
53 Findingconnectedcomponentsusingdepth-firstsearch . . . . . . . . . . . . . . . . . . 119
54 Cycle detection in an undirected graph using depth-first search . . . . . . . . . . . . . 120
55 Genericbreadth-firstsearch ………………………………… 121
56 Single-sourceshortestpathsinanunweightedgraph . . . . . . . . . . . . . . . . . . . . . 121
57 Reconstructshortestpath ………………………………….. 122
58 Edgerelaxation…………………………………………. 127
59 Bellman-Ford………………………………………….. 128
60 Bellman-Ford:Handlingnegativecycles ………………………… 129
61 Dijkstra’salgorithm………………………………………. 131
62 ImprovedDijkstra’salgorithm ……………………………….. 134
63 Floyd-Warshall………………………………………….135
64 Warshall’stransitiveclosurealgorithm………………………….. 136
65 Connectivitycheck……………………………………….138
66 Union-find using disjoint-set forests (without optimisations) . . . . . . . . . . . . . . . 140
67 TheFINDoperationusingpathcompression ……………………… 140
68 TheUNIONoperationusingunionbyrank……………………….. 141
69 Prim’salgorithm…………………………………………145
70 Kruskal’salgorithm………………………………………. 146
71 TopologicalsortingusingKahn’salgorithm ………………………. 151
72 TopologicalsortingusingDFS……………………………….. 151
73 Bottom-uplongestpathinaDAG …………………………….. 152
74 RecursivelongestpathinaDAG ……………………………… 153
75 TheFord-Fulkersonmethod ………………………………… 160
76 Ford-Fulkersonimplementedusingdepth-firstsearch . . . . . . . . . . . . . . . . . . . . 162
Analysis of Algorithms: Part I
When we write programs, we usually like to test them to give us confidence that they are cor- rect. But testing a program can only ever demonstrate that it is wrong (if we find a case that breaks it). To prove that a program always works, we need to be more mathematical. The two major focuses of algorithm analysis are proving that an algorithm is correct, and determining the amount of resources (time and space) used by the algorithm.
Summary: Analysis of Algorithms : Part I
In this chapter, we cover:
• Whatitmeanstoanalyseanalgorithm
• Provingcorrectnessofanalgorithm
• Analysingthetimecomplexityofanalgorithm
• Ananalysisofbinarysearchasanexample
• Revisionofrecurrencerelations,asymptoticnotation,andcomplexity
Recommended Resources: Analysis of Algorithms : Part I
• http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Math/ • CLRS,IntroductiontoAlgorithms,Pages17-19,Section2.1
1.1 Program Verification
Program verification or correctness hinges on two main principles. We must on one hand prove that our algorithm, if it terminates always produces the correct result. On the other, we must prove that it does indeed always terminate. We will explore these issues by analysing one of the most well-known fundamental search algorithms, binary search. An implementation of binary search is shown in Algorithm 1.
Binary search is an easy algorithm to get wrong. The most common mistakes are “off-by-one” errors in which the final index reached by the search is one away from where it should have been. For example, what if line 5 had read key > array[mid] instead? This would be a bug, but given how similar it is to the correct code, it would be easily missed. So how then do we reason that this algorithm is in fact the correct one? We will reason the correctness by establishing an invariant in the algorithm. The useful invariant for binary search is shown below.
Algorithm 1 Binary Search
1.1. ProgramVerification
// key is located at array[lo] // key not found
1: 2: 3: 4: 5: 6: 7: 8:
functionBINARY_SEARCH(array[1..n],key) lo = 1 and hi = n + 1
while lo < hi −1 do
mid = ⌊(lo + hi)/2)⌋
if key ≥ array[mid] then lo = mid else hi = mid
if array[lo] = key then return lo else return null
Invariant: Binary Search
If key ∈ array, then at each iteration: 1. array[lo]≤key,
2. ifhi̸=n+1,thenarray[hi]>key.
Combined with the sortedness of array, this implies that key (if it exists) lies within the range [lo..hi).
We can now re-interpret the binary search algorithm such that every action it takes is simply aiming to maintain these invariants. This observation can lead us to prove that the algorithm is indeed correct and that it terminates.
1.1.1 Arguing correctness
When we wish to prove an algorithm’s correctness via invariants, we are required to prove three things:
1. Initialisation:Wemustprovethattheinvariantsholdatthebeginning
2. Maintenance:Wemustprovethattheinvariantsremaintruethroughoutthealgorithm
3. Termination: We must prove that the invariants at termination imply correctness of the algorithm
We will argue the proof in two parts. First we will show show binary search is correct if the key that we are searching for is in the array. Then we will prove that it is correct when the key is not in the array.
Case 1: key ∈ array
We must first argue that the invariants established above are correct when the algorithm begins. Once established, we subsequently argue that the invariants are maintained throughout the algorithm. When we first begin, we initialise lo = 1, hi = n + 1, so:
1. If key ∈ array, since the array is sorted, lo = 1 implies that array[lo] is the minimum ele- ment, and hence array[lo] ≤ key.
2. Initially,hi=n+1,sopart2issatisfiedvacuously.
Chapter 1. Analysis of Algorithms: Part I 3
Therefore the invariant is true at the beginning of the binary search algorithm. As we perform iterations of the main loop of the binary search, what happens to this invariant? At each step, we compute mid = ⌊(lo + hi)⌋/2 and compare array[mid] to key. The conditional statement in the loop then enforces the invariant.
1. Ifkey≥array[mid],thenaftersettinglo=mid,wewillstillhavearray[lo]≤key
2. Ifkey
Hence the invariant holds throughout the iterations of the loop. Therefore, if the loop termi- nates, it is true that array[lo] ≤ key < array[hi] (or hi = n + 1). Since the condition for the loop to terminate is that lo ≥ hi −1, combined with the previous inequality, it must be true that lo = hi - 1. Therefore if key exists in array, it must exist at position lo, and hence the binary search algorithm correctly identifies whether or not key is an element of array.
Case 2: key ∈/ array
Since key is not in the array, provided that the algorithm terminates, regardless of the value of lo,
array[lo] ̸= key and hence the algorithm correctly identifies that key is not an element of array. 1.1.2 Arguing termination
We have argued that binary search is correct if it terminates, but we still need to prove that the algorithm does not simply loop forever, otherwise it is not correct. Let us examine the loop con- dition closely. The loop condition for binary search reads lo < hi - 1, hence while the algorithm is still looping, this inequality holds. We now recall that
mid=lo+hi. 2
Theorem: Finiteness of Binary Search
If lo < hi - 1, then
Since mid is the floor of (lo + hi)/2, it is true that
lo + hi − 1 < mid ≤ lo + hi .
Multiplying by two, we find
lo+hi−2<2×mid≤lo+hi.
Since lo < hi-1, we have lo ≤ hi-2. We replace hi-2 with lo on the left most term in the inequality.
2×lo<2×mid≤lo+hi.
1.2. ComplexityAnalysis
Next, we add 1 to the right most term. 2×lo<2×mid
T(n)= 2 (1.1)
b if n = 1,
where a is some constant number of operations that we perform each step, and b is some con-
stant number of operations required at the end of the algorithm.
Theorem: Time Complexity of Binary Search
The recurrence relation (1.1) has the explicit solution T (n ) = a log2 (n ) + b
WewillprovethatT(n)=alog2(n)+b byinductiononn.
Base case:
Suppose n = 1, then a log2 (n ) + b = b which agrees with our definition of T
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com