CSC 240 LECTURE 8
ANALYSIS OF ALGORITHMS
Why might we want to know how fast an algorithm runs? -to estimate how long the algorithm will run on a given input
or the amount of resources it needs
-to estimate how large an input it is reasonable to give the algorithm -to compare the efficiency of different algorithms that solve the same
problem
The actual running time of an algorithm depends on a number of factors, such as:
the quality of the implementation,
the compiler that is used,
what machine it is implemented on, and
how heavily loaded the machine is.
Thus, when we analyze an algorithm, we can only estimate the running time to within a constant factor.
The running time of an algorithm also depends on the particular input on which it is run.
For an algorithm A, let tA(I) denote the number of steps the algorithm performs on input I.
What is a step?
examples: assignment, array index, arithmetic operation, comparison,
return from a function
Choose one or two operations such that the total number of operations performed by the algorithm you are analyzing is the same as the number of these operations performed, to within a constant factor.
Example LinearSearch(L,x)
i←1
while i ≤ length(L) do
if L[i] = x then return i
i ← i+1 end while
return 0
What is a good choice for step? comparison with x
On input L = [2,4,6,8] and x = 2, 1 step is performed. On input L = [2,4,6,8] and x = 4, 2 steps are performed. On input L = [2,4,6,8] and x = 8, 4 steps are performed. On input L = [2,4,6,8] and x = 1, 4 steps are performed.
The running time of an algorithm A usually increases as the size of its input gets larger. Thus, we use a function of the input size n to represent the number of steps it performs on inputs of size n.
Problem: on different inputs of the same size, the algorithm can perform different numbers of steps.
worst case time complexity
TA :N→N, where TA(n) = max {tA(I) | I is an input to A of size n}.
Example:
When A is LinearSearch and the size of input (L,x) is the length of L, TA(n) = n.
average case time complexity
T’A : N → R+ U {0}, where T’A(n) = E[tA] and
the expectation is taken over a probability space of all inputs of size n.
If all inputs are equally likely, then
T’A(n) = ∑{tP (I) | size(I) = n}
————————– .
#{I | size(I) = n}
Like the choice of which steps to count,
input size can depend on the algorithm.
• Foranalgorithmwithalistasinput,theinputsizeisusuallythelengthof
the list.
• Foranalgorithmwithanintegerasinput,theinputsizecanbethe
absolute value of the integer or the number of bits in its binary
representation.
• Foranalgorithmwithanmxnmatrixasinput,
the input size can be m x n, the number of elements in the matrix
or we can use both m and n to describe the input size.
• Foranalgorithmwithagraphasinput,theinputsizeiseitherthe
number of vertices, n, the the number of edges, m, or both.
The worst time complexity TA(n,m) can be a function of two parameters.
Let u:N→N.
Algorithm A has worst-case time complexity at most u means TA ≤ u.
Let 𝒮 denote the set of all inputs to algorithm A.
∀ n ∈ N.(max {tA(I) | (I ∈ 𝒮) AND (size(I) = n)} ≤ u(n)). Express this without using max:
∀ n ∈N. ∀ I ∈ 𝒮. [(size(I) = n) IMPLIES (tA(I) ≤ u(n))]
∀ I ∈ 𝒮. (tA(I) ≤ u(size(I)))
To prove that TA ≤ u, you must show that for all n ∈Nand for all inputs I to A
of size n, algorithm A takes at most u(n) steps on input I.
When comparing two algorithms with different rates of growth, constant factors and low order terms may matter when
the problem size is small,
when many instances will be solved, or
when fast response time is essential (for example, in real-time systems). They do not usually matter when the problem size is large.
Since we only determine the number of operations performed to within a constant factor, we often use O notation to express an upper bound on the time complexity of an algorithm.
It is often much easier mathematically to do an analysis to within a constant factor than to do an exact analysis.
Recall that
O(f)={g∈F |∃c∈R+.∃b∈N.∀n∈N. [(n≥b)IMPLIES (g(n)≤c・f(n))]}.
Express TA ∈ O(u) using predicate logic: ∃c∈R+.∃b∈N.∀n∈N. [(n≥b)IMPLIES (TA(n)≤c・u(n))]
Alternatively, without using TA :
∃ c ∈ R+. ∃ b ∈ N. ∀ n ∈ N. [(n ≥ b) IMPLIES ∀ I ∈ 𝒮. [(size(I) = n) IMPLIES
(tA(I) ≤ c・u(n))]] ∃ c ∈ R+. ∃ b ∈ N. ∀ I ∈ 𝒮. [(size(I) ≥ b) IMPLIES (tA(I) ≤ c・u(size(I)))]
If someone proves TA ≤ u, it does not mean that the algorithm ever takes that long. The actual running time might be much less.
To know what the worst case time complexity of algorithm A is, we also need to prove a lower bound on TA.
Let 𝓁:N→N.
Algorithm A has worst-case time complexity at least 𝓁 means TA ≥ 𝓁. Express this using predicate logic:
Let 𝒮 denote the set of all inputs to algorithm A.
∀ n ∈ N.(max {tA(I) | (I ∈ 𝒮) AND (size(I) = n)} ≥ 𝓁(n)). Express this without using max:
∀ n ∈ N. ∃ I ∈ 𝒮. [(size(I) = n) AND (tA(I) ≥ 𝓁(n))]
∃ I ∈ 𝒮. (tA(I) ≥ 𝓁(size(I))) IS INCORRECT
ToprovethatTA ≥𝓁, foralln∈N,youmustfindaninputItoAofsizen,and
show that algorithm A takes at least 𝓁(n) steps on input I.
try
III
Bigger lower bounds are better! Recall that
Upper bound onTA
on Ta
I
l I i
Tt
bounds
Ω(f)={g∈F |∃c∈R+.∃b∈N.∀n∈N. [(n≥b)IMPLIES (g(n)≥cf(n))]} Express TA ∈ Ω(𝓁) using predicate logic:
∃c∈R+.∃b∈N.∀n∈N. [(n≥b)IMPLIES (TA(n)≥c・𝓁(n))]
∃ c ∈ R+. ∃ b ∈ N. ∀ n ∈ N. [(n ≥ b) IMPLIES ∃ I ∈ 𝒮. [(size(I) = n) AND
lower
not lover
or
an
upper bowdan TA
(tA(I) ≥ c・𝓁(n))]]
ANALYZING THE WORST CASE TIME COMPLEXITY OF ITERATIVE ALGORITHMS
Code without loops or procedure calls takes O(1) time.
Loops
If a loop is executed O(f(n)) times and each iteration takes O(g(n)) steps, then the entire loop takes O(f (n) · g(n)) steps.
If Statements
Suppose A, B, and C are blocks of statements that take O(f(n)), O(g(n)), and O(h(n)) steps, respectively. Then
if C
then A
else B
takes O(h(n)+ max{f(n),g(n)}) steps.
Procedure and Function Calls
Suppose the worst case time complexity of a procedure (or function) P with input size r is T(r) and suppose T(r) ∈ O(f(r)).
Then a call to P with an input of size g(n) takes O(f(g(n))) steps.
Worst Case Time Complexity of Merging Two Sorted Lists
1 2
3 4 5
6 7 8 9
10 11 12
13 14
15 16
17 18 19
i←1 j←1
k←1
while i ≤ length(A) and j ≤ length(B) do
if A[i] ≤ B[j] then C[k] ←A[i]
i←i+1 else C[k] ←B[j]
j←j+1
k←k+1 if i > length(A)
then while j ≤ length(B) do C[k] ←B[j]
j←j+1 k←k+1
else while i ≤ length(A) do C[k] ←A[i]
i←i+1 k←k+1
What is a good choice for input size? m = length(A) and n = length(B)
What is a good choice for a step? • numberofassignmentstoC
What is the worst case time complexity?m+n
• numberofcomparisonsbetweenelementsofAandelementsofB
What is the worst case time complexity? min{m,n} X
m+n X
m+n-1 √
This is the number of times the while loop on line 4 is performed.
It terminates as soon as i > m or j > n.
So either i has been incremented m times or j has been incremented n times. Note that exactly one of i and j is incremented each iteration,
so it is not possible for both these conditions to be true.
Thus, in the first case, j has been incremented at most n-1 times and,
in the second case, i has been incremented at most m-1 times.
In both cases, at most m+n-1 comparisons between elements of A and elements of B are performed.
TA(m,n) ≤ m+n-1
Why is m+n-1 a lower bound?
For each m and n, what is an input in which m+n-1 comparisons are performed?
A is a list of length m, B is a list of length n and the last element of each list is larger than any other elements in the two lists.
Worst case time complexity of Insertion Sort
1 2
3 4
5 6 7 8
i←1
while i ≤ length(A) do
j←i
whilej>1and A[i]1and A[i] 1.
The solution to this recurrence is TSQ(n) = 4(n-1) for n ≥ 1.
MERGESORT(A)
if length(A) = 1 then return
divide A into two subarrays A’ and A” whose sizes differ by at most 1 MERGESORT(A’)
MERGESORT(A”)
A ← MERGE(A’, A”)
What is a good choice for input size? n = length(A)
What is a good choice for a step?
• numberofcomparisonsbetweenelementsofA • allcomparisonsandassignments
Let M denote the worst case time complexity of MERGESORT.
soM:Z+ →Nand
M(n) is the maximum number of steps taken by MERGESORT(A) for arrays A of size n.
Then
⎧c if n = 1 M(n) ≤ ⎨
⎩M(⌈n/2⌉) + M(⌊n/2⌋) + dn where c, d ∈ N are constants.
complexity measure 1, c = 0. complexity measure 2, c = 1.
complexity measure 1, d = 1, since ⌈n/2⌉ + ⌊n/2⌋ – 1 = n-1 complexity measure 2, d ∈ Z+
The arrays A’ and A” have sizes ⌈n/2⌉ and ⌊n/2⌋.
Then first two terms are for the steps taken by MERGESORT(A’) and MERGESORT(A”).
The third term is for dividing A into two subarrays
merging the two halves of the array.
The solution to this recurrence is M(n) ∈ O(n log n), by the Master Method.
Analysis of Recursive Binary Search
RecBinSearch(A, F, L, x) 1 if F = L
2 then if A[F] = x
3 then return F
4 else return 0
5 else m ← (F + L) div 2
6 ifA[m]≥x
7 then RecBinSearch(A, F, m, x)
8 else RecBinSearch(A, m + 1, L, x)
What is a good choice for input size? n = length(A) X
n = L− F+1√
What is a good choice for a step?
• numberofcomparisonsbetweenxandelementsofA
LetB:Z+ →N,
where B(n) denotes the worst case number of comparisons with x performed by RecBinSearch(A, F, L, x) when L − F + 1 = n,
over all choices of A, F, L, and x such that L − F + 1 = n.
Then, from the code,
B(1) = 1 (line 2)
B(n) ≤ 1 + max{B(⌊n/2⌋), B(⌈n/2⌉)}, for n > 1 (line 5 and one of lines 7,8) Note that, if B is a nondecreasing function,
then max{B(⌊n/2⌋), B(⌈n/2⌉)} = B(⌈n/2⌉), so B(n) ≤ 1 + B(⌈n/2⌉) for n > 1.
The solution to this recurrence is B(n) ∈ O(log n), by the Master Method.
When n is a power of 2, B(n) = 1 + B(n/2), so the solution gives a lower bound as well.