COMP 3711 Design and Analysis of Algorithms
Lecture 2: Introduction
Definition:
Copyright By PowCoder代写 加微信 powcoder
What is an Algorithm?
An algorithm is a recipe for doing something.
An algorithm is an explicit, precise, unambiguous, mechanically-executable
sequence of elementary instructions.
Comes from al-Khwārizmi, the name of a 9th century Persian mathematician, astronomer, geographer, and scholar
Input: Two numbers
and (potentially very long), each consisting of
such that .
Output: A number
Algorithm for Adding Two Numbers
for𝑖←1to𝑛 //iisanindexonthe decimal digits, starting from the right
𝑧 ← 𝑥 + 𝑦 + 𝑐
if 𝑧 ≥10 then 𝑐←1, 𝑧 ←𝑧 −10
You’ve been running algorithms all your life!
+612345678
1141846911
Algorithm for finding the minimum in an array
Input: An array of integer numbers
[4 8 2 7 5 6 9 3]
Output: The minimum element min-el and its position pos in the array min-el=2, pos=3
Find-Min(𝐴): min-el= 𝐴 1 for 𝑖←2 to 𝑛
if 𝐴 𝑖 < min−el then min-el= 𝐴 𝑖
pos=i Return(min-el,pos)
Q1: Which position is returned if there are multiple elements with the same minimum value? Q2: How many comparisons (𝐴 𝑖 < min−el) does the algorithm perform for input of size n? Q3: Do you think it is possible to have an algorithm with fewer comparisons (is Find-Min optimal or not)?
Exercise 1a Find-Max- Find-Min to Find-Max-Min that returns (max-el, max- pos, min-el, min-pos)
Q1: How many comparisons does Find- Max-Min perform for input of size n? A1: 2n-2 (n-1 for min, and n-1 for max) Q2: Do you think it is possible to have an algorithm with fewer comparisons?
Find-Min(𝐴): min-el= 𝐴 1 for 𝑖←2 to 𝑛
if 𝐴 𝑖 < min−el then min-el= 𝐴 𝑖
pos=i Return(min-el,pos)
Find-Max-Min (𝐴): max-el=A[1];min-el= 𝐴 1 , for 𝑖←2 to 𝑛
if 𝐴 𝑖 > max−el then max-el= 𝐴 𝑖 ; max-pos=i
if 𝐴 𝑖 < min−el then min-el= 𝐴 𝑖 ; pos-min=i
Return (max-el, max-pos min-el, min-pos)
Exercise 1b Find-Max-Min2
We can reduce the number of comparisons based on the observation that if an
element is larger than the maximum, it cannot be smaller than the minimum.
Q1: How many comparisons does Find- Max-Min2 perform in the best case? When does it happen?
A1: n-1, when the array elements are in increasing order
Q2: How many comparisons does Find- Max-Min2 perform in the worst case? When does it happen?
A2: 2n-2, when the array elements are in decreasing order
Q3: Can you find an algorithm with fewer than 2n-2 comparisons in the worst case?
Find-Max-Min2 (𝐴): max-el=A[1];min-el= 𝐴 1 , for 𝑖←2 to 𝑛
if 𝐴 𝑖 > max−el then max-el= 𝐴 𝑖 ; max-pos=i
else if 𝐴 𝑖 < min−el then min-el= 𝐴 𝑖 ; pos-min=i
Return (max-el, max-pos min-el, min-pos)
Exercise 1c Find-Max-Min3
Assume for simplicity that n is odd. We set A[1] to be both the max-el and the min- el. Then, we compare pairs of elements A[i] and A[i+1]. The largest of the pair is compared with max-el and the smallest with min-el.
Q1: How many comparisons does Find- Max-Min3 perform per pair A[i] and A[i+1]?
Q2: How many comparisons does Find- Max-Min2 perform in total?
A2: 3(n-1)/2
Find-Max-Min3 (𝐴): max-el=A[1];min-el= 𝐴 1 , for 𝑖←2 to 𝑛−1 step 2
if 𝐴𝑖 >𝐴𝑖+1 then
if 𝐴 𝑖 > max−el then
max-el= 𝐴 𝑖 ; max-pos=i if 𝐴 𝑖+1
max-el= 𝐴 𝑖 + 1 ; max-pos=i if 𝐴 𝑖 < min−el then
min-el= 𝐴 𝑖 ; min-pos=i Return (max-el, max-pos min-el, min-pos)
The Sorting Problem
Input: An array of elements
[4 8 2 7 5 6 9 3]
Output: Array of elements in sorted order (ascending)
[2 3 4 5 6 7 8 9]
We always use ascending order, but all algorithms work for descending order with minor modifications
Sorting is a very important (sub)routine that comes up all the time.
• One of the early Killer-Apps of Computing
• In the 1960s’, computer manufacturers estimated that more than 25% of
their machines’ running times were spent on sorting
• Even now, sorting is a common application AND a standard subroutine used in solutions to many other problems
• The Sorting Problem is also a good testbed to test out algorithmic techniques
How to Extend Find-Min for Sorting (in ascending order)
Step 1: Find the minimum in array
Cost of Step 1: n-1 comparisons
Step 2: Find the minimum in array
Cost of Step 2: n-2 comparisons
Step i: Find the minimum in array
Cost of Step i: n-i comparisons
Observation: after step i, Step n-1: Find the minimum in array
and swap and and swap and
and swap and
and swap and
Cost of Step n-1: 1 comparison
Total cost of the algorithm=Total number of comparisons=(n-1)+(n-2)+...+2+1
This is the arithmetic series.
Q: Is the cost (number of comparisons) always the same for any array of size n?
Closed Formula for Arithmetic Series
It is easy to obtain a closed formula for the arithmetic series (n-1)+(n- 2)+...+2+1 using the following trick:
• I create a copy of the arithmetic series in the reverse order, and I add the corresponding terms of the two copies
• Each column adds to n. The number of columns is n-1.
• Thus, 2((n-1)+(n-2)+...+2+1))=n(n-1) (n-1)+(n-2)+...+2+1 = n(n-1)/2=(n2)
Our next sorting algorithm is based on a similar idea: at step i the i-th minimum element will be put at position A[i], so that the elements at positions A[1..i] are sorted. The difference is that swapping is continuous: whenever we find an element that is smaller than A[i], we swap positions even though eventually it is not the minimum.
Selection Sort
Input: An array of elements
Output: Array of elements in sorted order (ascending)
(*) n-1 passes
In pass i, find smallest item in and swap it with
Implementation
1st Pass: (528671)→(258671)→(258671)→(258671) →(258671) →(158672)
2ndPass: (158672)→(158672)→(158672)→(158672) →(128675) 3rdPass: (128675)→(126875)→(126875)→(125876)
4th Pass: (125876)→(125786)→(125687)
5th Pass: (125687)→(125678)→(125678)
Selection-Sort(𝐴): for 𝑖←1 to 𝑛−1
for j←𝑖+1 to 𝑛
if 𝐴 𝑖 >𝐴[𝑗] then
swap 𝐴[𝑖] and 𝐴[𝑗]
Correctness of Selection Sort
Claim: When Selection sort terminates, the array is sorted. Proof: By induction on n.
When n = 1 the algorithm is obviously correct
Assume that the algorithm sorts every array of size n-1 correctly Now consider what the algorithm does on
1. It firsts puts the smallest item in 2. It then runs Selection sort on
– By induction, this sorts the items in
3. Since is smaller than every item in all the items in are now sorted.
Running Time of Selection Sort
What is meant by running “time”?
Total Operations Performed? Memory Accesses?
We will count number of comparisons, i.e., # of times This “dominates” all of the other operations.
Selection-Sort(𝐴): 1. for 𝑖←1 to 𝑛−1
for j←𝑖+1 to 𝑛
if 𝐴𝑖 >𝐴[𝑗] then
swap 𝐴[𝑖] and 𝐴[𝑗]
For any fixed line 3 calls
Total # comparisons is
Alternatively, note that line 3 is run exactly once for every possible (i,j) pair with This is exactly # of ways to choose a pair out of n items which is
Insertion Sort
Input: An array of elements
Output: Array of elements in sorted order (ascending)
Insertion is performed by successively swapping 𝐴[𝑖] with items to its left until an item smaller than 𝐴[𝑖] is found.
i=2: (518632)→(158632)→(158632)
i=3: (158632)→(158632)
i=4: (158632)→(156832)→(156832)
i=5: (156832)→(156382)→(153682)→(135682)→(135682)
i=6: (135682)→(135628)→(135268)→(132568)→(123568) →(123568)
Insertion-Sort(𝐴): for 𝑖←2to 𝑛 do
while 𝑗≥1 and 𝐴[𝑗]>𝐴[𝑗+1] do
swap 𝐴[𝑗] and 𝐴[𝑗 + 1] 𝑗←𝑗−1
Insertion Sort
Input: An array of elements
Output: Array of elements in sorted order (ascending)
Insertion is performed by successively swapping 𝐴[𝑖] with items to its left until an item smaller than 𝐴[𝑖] is found.
Insertion-Sort(𝐴): for 𝑖←2to 𝑛 do
while 𝑗≥1 and 𝐴[𝑗]>𝐴[𝑗+1] do
swap 𝐴[𝑗] and 𝐴[𝑗 + 1] 𝑗←𝑗−1
A[1..i-1] – sorted
A[i+1..n] – unsorted
Correctness: After step i, items in A[1..i] are in proper order. i ‘th iteration puts key=A[i] in proper place.
Insertion-Sort(𝐴):
1.for 𝑖←2to 𝑛 do
3. while 𝑗≥1 and 𝐴[𝑗]>𝐴[𝑗+1] do 4. swap 𝐴[𝑗] and 𝐴[𝑗 + 1]
Running Time of Insertion Sort
Number of comparisons at Line 3 depends on the array. It is at most
Q: When does the worst case (largest number of comparisons) happen? If original array is sorted, e.g., ( 1 2 3 4 5 6 ), then no item is ever moved.
Each item is only compared to its left neighbor, so line 3 is only run times. Unlike Selection Sort which always uses 𝒏 𝒏𝟏 comparisons for each array of size
input array, and ranges between and 𝒏 𝒏𝟏 . 𝟐
n, the number of comparisons (running time) of Insertion Sort depends on the
Wild-Guess Sort
Input: An array of elements
Output: Array of elements in sorted order (ascending)
Wild-Guess-Sort(𝐴):
𝜋←[4,7,1,3,8,11,5,…] //create random permutation check if 𝐴𝜋𝑖 ≤𝐴𝜋𝑖+1 for all 𝑖=1,2,…,𝑛−1 if yes, output 𝐴 according to 𝜋 and terminate else Insertion-Sort(𝐴)
Q: Can Wild-Guess Sort be faster than insertion sort?
A: Yes, if the guess happens to be correct (highly unlikely). A: Otherwise, it is slower.
How to evaluate an algorithm / compare two algorithms?
What to measure?
Memory (space complexity)
– total space
– working space (excluding the space for holding inputs)
Running time (time complexity)
How to measure?
Empirical – depends on actual implementation, hardware, etc.
Analytical – depends only on the algorithms, focus of this course
Comparing two algorithms is not (usually) a simple matter!
Even for same input size , different inputs can lead to different running times
Calculating exact # of instructions executed is very difficult/tedious
Best-Case Analysis
Best case: An instance for a given size that results in the fastest possible running time.
Example (insertion sort): Input already sorted
Insertion-Sort(𝐴): for 𝑖←2to 𝑛 do
while 𝑗≥1 and 𝐴[𝑗]>𝐴[𝑗+1] do
swap 𝐴[𝑗] and 𝐴[𝑗 + 1] 𝑗←𝑗−1
A[1..j-1] – sorted
A[j+1..n] – (actually sorted)
“key” is only compared to element to its immediate left, so
Q: What is the best case running time of Wild-Guess Sort?
Worst-Case Analysis
Worst case: An instance for a given size that results in the slowest possible running time.
Example (insertion sort): Input inversely sorted
Insertion-Sort(𝐴): for 𝑖←2to 𝑛 do
while 𝑗≥1 and 𝐴[𝑗]>𝐴[𝑗+1] do
swap 𝐴[𝑗] and 𝐴[𝑗 + 1] 𝑗←𝑗−1
A[1..j-1] – sorted
A[j+1..n] – (actually sorted)
“key” is compared to every element preceding it, so
() .
Average-Case Analysis
Average case: Running time averaged over all possible instances for the given size, assuming some probability distribution on the instances.
Example (insertion sort): assuming that each of the permutations of the numbers is equally likely
A[1..j-1] – sorted
A[j+1.. N] – unsorted
Rigorous analysis is complicated, but intuitively, “key” is compared, on average to half the items preceding it, so
Three Kinds of Analyses
Best case: Clearly useless
Worst case: Commonly used
Gives running time guarantee independent of actual input
Fair comparison among different algorithms
Not perfect: For some problems, worst-case input never occurs in real life;
some algorithms with bad worst-case running time actually work well in
practice (e.g. the simplex algorithm for linear programming)
Worst-case analysis is the default and the one used in this class. From now on when we refer to the running time of an algorithm, we imply the worst case running time.
Average case: Sometimes used
Needs to assume some input distribution:
real-world inputs are seldom uniformly random!
Analysis is complicated
Will see examples later
More on Worst-Case Analysis (Expanded)
What does each of these statements mean? (n is the input/problem size)
The algorithm’s worst case running time is
On all inputs of (large) size n, the running time of the algorithm is ≤
Further expanded:
There exist constants 𝑐 > 0, 𝑛 ≥ 0, such that for any 𝑛 ≥ 𝑛 and all inputs of size 𝑛, the algorithm’s running time is ≤ 𝑐 ⋅ 𝑓(𝑛).
Implication 1: No need to really find the worst input.
Implication 2: No need to consider input of size smaller than a constant 𝑛.
The algorithm’s worst case running time is
There exists at least one input of (large) size n
for which the running time of the algorithm is ≥
Further expanded:
There exist const 𝑐 > 0, 𝑛 ≥ 0, s.t. for any 𝑛 ≥ 𝑛, there exists some input of size 𝑛
on which the algorithm’s running time is ≥ 𝑐 ⋅ 𝑓(𝑛). Mainly used to show that the big-Oh analysis is tight
(i.e., the best possible upper bound); often not required.
More on Worst-Case Analysis
Example: Insertion Sort (n is the instance size)
We saw that, on all inputs of size n, Insertion Sort runs in () time.
On some inputs, e.g., if the items are in reverse order, it requires () time,
while on others, e.g., already sorted data, it only takes time
Insertion sort runs in time (upper bound)
Insertion sort runs in time (lower bound). (and any function
asymptotically smaller than
Insertion sort runs in time
) is also a lower bound, but it is not tight
Assume that you have computed O(n2) as the worst case upper bound for an algorithm A. At this point, you do not know whether the bound is tight (e.g., the worst case running time of A maybe O(nlogn)). Once you establish that the lower bound is Ω(n2) then you know that O(n2) is tight and the running time of A is .
Exercise 2 (from previous exam)
We have two algorithms, A and B. Let TA(n) and TB(n) denote the time complexities of algorithm A and B respectively, with respect to the input size n. Complete the last column of the following table with A, B, or U, where:
A means that for large enough n; algorithm A is always faster; B means that for large enough n; algorithm B is always faster; U means that the information provided is not enough to justify
A (A is polynomial, B is exponential)
U (m=logn. A=O(m), B=Θ(2logm)= Θ(m)) A (A is logarithmic, B is polynomial)
U (both are upper bounds) B
B (Θ(4log5n)=Θ(nlog54) =O(n))
Θ(n2/(logn)3)
Θ(2loglogn)
Simple theoretical analysis is not the end of the story
Example: Selection sort, insertion sort, and wild-guess sort all have worst- case running time . How to distinguish between them?
Theoretical analysis loses information because
Asymptotic notation keeps only the largest terms and suppresses multiplicative constants
Worst-case analysis ignores non worst-case inputs, which could be the majority.
When algorithms have the same theoretical running time differentiate via a
Closer examination of hidden constants
Careful analysis of typical expected inputs
Other factors such as cache efficiency, parallelization are important
Empirical comparison
But, theoretical analysis provides first guidelines
Useful when you don’t know what inputs to expect
An Θ(𝑛 log 𝑛) algorithm is (almost) always better than an Θ(𝑛) algorithm, for large enough inputs
– Will see several Θ 𝑛 log 𝑛 sorting algorithms later
Writing down algorithms in pseudocode
How should we describe algorithms
– Implementable code is too hard to read
• Also, “most’’ lines of code, while important in practice,
are not important algorithmically and would just hide the main ideas
– Natural language is usually not descriptive enough to concisely describe algorithmic ideas
Will often use pseudocode
– Tries to get across main ideas clearly. Models programming syntax
but also uses natural language.
– Not formally defined
– Ad Hoc “rules’’ on next page
Wild-Guess-Sort(𝐴):
𝜋 ← [4,7,1,3,8,11,5,…]
check if 𝐴𝜋𝑖 ≤𝐴𝜋𝑖+1 for all 𝑖=1,2,…,𝑛−1 if yes, output 𝐴 according to 𝜋 and terminate else Insertion-Sort(𝐴)
Insertion-Sort(𝐴): for 𝑗←2to 𝑛 do
while 𝑖≥1 and 𝐴[𝑖]>𝐴[𝑖+1] do
swap 𝐴[𝑖] and 𝐴[𝑖 + 1] 𝑖←𝑖−1
Writing down algorithms in pseudocode
Use standard keywords (if/then/else, while, for, repeat/until, return) and notation: variable ← value, Array[index], function(arguments), etc.
Indent everything carefully and consistently; may also use { } for clarity. Use standard mathematical notation and NOT programming language. E.g.
Write i = i+1 instead of i++
Write and instead of and Write and instead of and
Use data structures as black boxes. If data structure is new, define its functionality first; then describe how to implement each operation.
Use standard/learned algorithms (e.g. sorting) as black boxes Use functions to decompose complex algorithms.
Use natural/plain language when it’s clearer or simpler
(e.g., if is an array, you may write “ the maximum element in ”).
Exercise 3 Stirling’s formula First, I will prove that
Prove that
Then, I will prove that
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com