Asymptotic Analysis of Algorithms
EECS2030 B: Advanced Object Oriented Programming Fall 2018
CHEN-WEI WANG
Algorithm and Data Structure
● A data structure is:
○ A systematic way to store and organize data in order to facilitate access and modifications
○ Never suitable for all purposes: it is important to know its strengths and limitations
● A well-specified computational problem precisely describes the desired input/output relationship.
○ Input: A sequence of n numbers ⟨a1, a2, …, an⟩
○ Output: A permutation (reordering) ⟨a′1, a′2, …, a′n⟩ of the input sequence such that a′1 ≤ a′2 ≤ ⋯ ≤ a′n
○ An instance of the problem: ⟨3, 1, 2, 5, 4⟩
● An algorithm is:
○ A solution to a well-specified computational problem
○ A sequence of computational steps that takes value(s) as input and produces value(s) as output
● Steps in an algorithm manipulate well-chosen data structure(s).
Measuring “Goodness” of an Algorithm
1. Correctness :
○ Does the algorithm produce the expected output?
○ Use JUnit to ensure this.
2. Efficiency:
○ Time Complexity: processor time required to complete
○ Space Complexity: memory space required to store data
Correctness is always the priority.
How about efficiency? Is time or space more of a concern?
Measuring Efficiency of an Algorithm
● Time is more of a concern than is storage.
● Solutions that are meant to be run on a computer should run as fast as possible.
● Particularly, we are interested in how running time depends on two input factors:
1. size
e.g., sorting an array of 10 elements vs. 1M elements
2. structure
e.g., sorting an already-sorted array vs. a hardly-sorted array
● How do you determine the running time of an algorithm?
1. Measure time via experiments
2. Characterize time as a mathematical function of the input size
Measure Running Time via Experiments
● Once the algorithm is implemented in Java:
○ Execute the program on test inputs of various sizes and structures.
○ For each test, record the elapsed time of the execution.

long startTime = System.currentTimeMillis();
/* run the algorithm */
long endTime = System.currentTimeMillis();
long elapsed = endTime - startTime;

○ Visualize the result of each test.
● To make sound statistical claims about the algorithm’s running
time, the set of input tests must be “reasonably” complete.
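The pattern above can be put into a complete program. The sketch below (not from the slides; the class name and the timed algorithm, a simple array sum, are stand-ins) runs the same algorithm on inputs of doubling sizes and records the elapsed time of each run:

```java
public class TimingExperiment {
    // Stand-in algorithm to be timed: sum the elements of an array.
    static long sum(int[] a) {
        long total = 0;
        for (int x : a) { total += x; }
        return total;
    }

    public static void main(String[] args) {
        // Test inputs of various sizes: 100,000 up to 1,600,000 elements.
        for (int n = 100_000; n <= 1_600_000; n *= 2) {
            int[] input = new int[n];
            long startTime = System.currentTimeMillis();
            sum(input);                                   // run the algorithm
            long elapsed = System.currentTimeMillis() - startTime;
            System.out.println(n + " elements: " + elapsed + " ms");
        }
    }
}
```

Recording one run per size is the minimum; to make statistically sound claims, each size would be run many times and over inputs of different structures.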
Example Experiment
● Computational Problem:
○ Input: A character c and an integer n
○ Output: A string consisting of n repetitions of character c
e.g., Given input ‘*’ and 15, output ***************.
● Algorithm 1 using String Concatenations:

public static String repeat1(char c, int n) {
  String answer = "";
  for (int i = 0; i < n; i++) { answer += c; }
  return answer;
}
○ Accessing an attribute of an object [e.g., acc.balance]
○ Returning from a method [e.g., return result;]
Q: Why is a method call in general not a primitive operation?
A: It may be a call to:
● a “cheap” method (e.g., printing Hello World), or
● an “expensive” method (e.g., sorting an array of integers)
Example: Counting Primitive Operations

1  findMax (int[] a, int n) {
2    currentMax = a[0];
3    for (int i = 1; i < n; ) {
4      if (a[i] > currentMax) {
5        currentMax = a[i]; }
6      i ++; }
7    return currentMax; }

# of times i < n in Line 3 is executed? [ n ]
# of times the loop body (Line 4 to Line 6) is executed? [ n − 1 ]
● Line 2: 2 [1 indexing + 1 assignment]
● Line 3: n + 1 [1 assignment + n comparisons]
● Line 4: (n − 1) ⋅ 2 [1 indexing + 1 comparison]
● Line 5: (n − 1) ⋅ 2 [1 indexing + 1 assignment]
● Line 6: (n − 1) ⋅ 2 [1 addition + 1 assignment]
● Line 7: 1 [1 return]
● Total # of Primitive Operations: 7n − 2
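The 7n − 2 total can be checked mechanically. Below is a hypothetical instrumented Java version of findMax (the class and the pos counter are not from the slides) that tallies primitive operations as it runs; on a strictly increasing array, where Line 5 executes in every iteration as in the count above, the tally is exactly 7n − 2:

```java
public class PoCount {
    static long pos = 0; // running tally of primitive operations

    static int findMax(int[] a, int n) {
        int currentMax = a[0];
        pos += 2;                       // Line 2: 1 indexing + 1 assignment
        pos += 1;                       // Line 3: the assignment i = 1
        for (int i = 1; ; ) {
            pos += 1;                   // Line 3: one comparison i < n
            if (!(i < n)) { break; }
            pos += 2;                   // Line 4: 1 indexing + 1 comparison
            if (a[i] > currentMax) {
                currentMax = a[i];
                pos += 2;               // Line 5: 1 indexing + 1 assignment
            }
            i++;
            pos += 2;                   // Line 6: 1 addition + 1 assignment
        }
        pos += 1;                       // Line 7: 1 return
        return currentMax;
    }

    public static void main(String[] args) {
        int n = 10;
        int[] a = new int[n];
        for (int i = 0; i < n; i++) { a[i] = i; }   // strictly increasing input
        pos = 0;
        int max = findMax(a, n);
        System.out.println("max = " + max + ", POs = " + pos
                + ", 7n - 2 = " + (7 * n - 2));      // POs = 68 = 7*10 - 2
    }
}
```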
From Absolute RT to Relative RT
● Each primitive operation (PO) takes approximately the same, constant amount of time to execute. [ say t ]
● The number of primitive operations required by an algorithm should be proportional to its actual running time on a specific environment.
e.g., findMax (int[] a, int n) has 7n − 2 POs
RT = (7n − 2) ⋅ t
Say two algorithms with RT (7n − 2) ⋅ t and RT (10n + 3) ⋅ t.
⇒ It suffices to compare their relative running times:
7n − 2 vs. 10n + 3
● To determine the time efficiency of an algorithm, we only focus on its number of POs.
Example: Approx. # of Primitive Operations
● Given # of primitive operations counted precisely as 7 ⋅ n¹ − 2, we view it as
7 ⋅ n¹ − 2 ⋅ n⁰
● We say
○ n is the highest power
○ 7 and 2 are the multiplicative constants
○ 2 is the lower term
● When approximating a function (considering that input size may be very large):
○ Only the highest power matters.
○ Multiplicative constants and lower terms can be dropped.
⇒ 7n − 2 is approximately n
Exercise: Consider 7n + 2n ⋅ log n + 3n²:
○ highest power? [ n² ]
○ multiplicative constants? [ 7, 2, 3 ]
○ lower terms? [ 7n + 2n ⋅ log n ]
Approximating Running Time as a Function of Input Size
Given the high-level description of an algorithm, we associate it with a function f, such that f(n) returns the number of primitive operations that are performed on an input of size n.
○ f(n) = 5 [constant]
○ f(n) = log₂ n [logarithmic]
○ f(n) = 4 ⋅ n [linear]
○ f(n) = n² [quadratic]
○ f(n) = n³ [cubic]
○ f(n) = 2ⁿ [exponential]
Focusing on the Worst-Case Input
[Figure: bar chart of running time (1–5 ms) over input instances A–G, marking the best-case, average-case, and worst-case times.]
● Average-case analysis calculates the expected running times based on the probability distribution of input values.
● worst-case analysis or best-case analysis?
What is Asymptotic Analysis?
Asymptotic analysis
● Is a method of describing behaviour in the limit:
○ How the running time of the algorithm under analysis changes as
the input size changes without bound
○ e.g., contrast RT1(n) = n with RT2(n) = n²
● Allows us to compare the relative performance of alternative algorithms:
○ For large enough inputs, the multiplicative constants and lower-order terms of an exact running time can be disregarded.
○ e.g., RT1(n) = 3n² + 7n + 18 and RT2(n) = 100n² + 3n − 100 are considered equally efficient, asymptotically.
○ e.g., RT1(n) = n³ + 7n + 18 is considered less efficient than RT2(n) = 100n² + 100n + 2000, asymptotically.
Three Notions of Asymptotic Bounds
We may consider three kinds of asymptotic bounds for the running time of an algorithm:
● Asymptotic upper bound [ O ]
● Asymptotic lower bound [ Ω ]
● Asymptotic tight bound [ Θ ]
Asymptotic Upper Bound: Definition
● Let f(n) and g(n) be functions mapping positive integers (input size) to positive real numbers (running time).
○ f(n) characterizes the running time of some algorithm.
○ O(g(n)) denotes a collection of functions.
● O(g(n)) consists of all functions that can be upper bounded by g(n), starting at some point, using some constant factor.
● f(n) ∈ O(g(n)) if there are:
○ A real constant c > 0
○ An integer constant n0 ≥ 1
such that:
f(n) ≤ c ⋅ g(n) for n ≥ n0
● For each member function f(n) in O(g(n)), we say that:
○ f(n) ∈ O(g(n)) [ f(n) is a member of “big-Oh of g(n)” ]
○ f(n) is O(g(n)) [ f(n) is “big-Oh of g(n)” ]
○ f(n) is order of g(n)
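The definition can be probed empirically. The hypothetical helper below (not from the slides) checks f(n) ≤ c ⋅ g(n) for every n from n0 up to some limit; passing for a large limit is evidence for, though of course never a proof of, membership in O(g(n)):

```java
import java.util.function.LongUnaryOperator;

public class BigOhCheck {
    // Check f(n) <= c * g(n) for every n in [n0, limit].
    static boolean upperBounded(LongUnaryOperator f, LongUnaryOperator g,
                                long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f.applyAsLong(n) > c * g.applyAsLong(n)) {
                return false;   // bound violated at this n
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 8n + 5 is O(n): the bound holds with c = 9 starting at n0 = 5 ...
        System.out.println(
            upperBounded(n -> 8 * n + 5, n -> n, 9, 5, 1_000_000)); // true
        // ... but fails if we start at n0 = 1 (13 > 9 at n = 1).
        System.out.println(
            upperBounded(n -> 8 * n + 5, n -> n, 9, 1, 1_000_000)); // false
    }
}
```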
Asymptotic Upper Bound: Visualization
[Figure: plot of running time vs. input size, showing c ⋅ g(n) lying above f(n) for all input sizes n ≥ n0.]
From n0, f(n) is upper bounded by c ⋅ g(n), so f(n) is O(g(n)).
Asymptotic Upper Bound: Example (1)
Prove: The function 8n + 5 is O(n).
Strategy: Choose a real constant c > 0 and an integer constant n0 ≥ 1, such that for every integer n ≥ n0:
8n + 5 ≤ c ⋅ n
Can we choose c = 9? What should the corresponding n0 be?

n   8n + 5   9n
1   13       9
2   21       18
3   29       27
4   37       36
5   45       45
6   53       54
…

Therefore, we prove it by choosing c = 9 and n0 = 5.
We may also prove it by choosing c = 13 and n0 = 1. Why?
Asymptotic Upper Bound: Example (2)
Prove: The function f (n) = 5n4 + 3n3 + 2n2 + 4n + 1 is O(n4).
Strategy: Choose a real constant c > 0 and an integer constant n0 ≥ 1, such that for every integer n ≥ n0:
5n⁴ + 3n³ + 2n² + 4n + 1 ≤ c ⋅ n⁴
f(1) = 5 + 3 + 2 + 4 + 1 = 15
Choose c = 15 and n0 = 1!
Asymptotic Upper Bound: Proposition (1)
If f(n) is a polynomial of degree d, i.e.,
f(n) = a0 ⋅ n⁰ + a1 ⋅ n¹ + ⋯ + ad ⋅ nᵈ
and a0, a1, …, ad are integers (i.e., negative, zero, or positive), then f(n) is O(nᵈ).
○ We prove by choosing
c = |a0| + |a1| + ⋯ + |ad|
n0 = 1
○ We know that for n ≥ 1: n⁰ ≤ n¹ ≤ n² ≤ ⋯ ≤ nᵈ
○ Upper-bound effect starts when n0 = 1? [ f(1) ≤ c ⋅ 1ᵈ ]
a0 ⋅ 1⁰ + a1 ⋅ 1¹ + ⋯ + ad ⋅ 1ᵈ ≤ |a0| ⋅ 1ᵈ + |a1| ⋅ 1ᵈ + ⋯ + |ad| ⋅ 1ᵈ
○ Upper-bound effect holds? [ f(n) ≤ c ⋅ nᵈ ]
a0 ⋅ n⁰ + a1 ⋅ n¹ + ⋯ + ad ⋅ nᵈ ≤ |a0| ⋅ nᵈ + |a1| ⋅ nᵈ + ⋯ + |ad| ⋅ nᵈ
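The choice c = |a0| + |a1| + ⋯ + |ad| with n0 = 1 can be spot-checked numerically. The sketch below (class and helper names are hypothetical) evaluates the Example (2) polynomial 5n⁴ + 3n³ + 2n² + 4n + 1 from its coefficients and confirms f(n) ≤ c ⋅ n⁴ over a range of n:

```java
public class PolyBound {
    // Evaluate a0*n^0 + a1*n^1 + ... + ad*n^d, where coeffs[i] is ai.
    static long eval(long[] coeffs, long n) {
        long result = 0, power = 1;
        for (long a : coeffs) {
            result += a * power;
            power *= n;
        }
        return result;
    }

    public static void main(String[] args) {
        long[] coeffs = {1, 4, 2, 3, 5};   // f(n) = 5n^4 + 3n^3 + 2n^2 + 4n + 1
        long c = 0;
        for (long a : coeffs) { c += Math.abs(a); }   // c = sum of |ai| = 15
        int d = coeffs.length - 1;                    // degree d = 4
        for (long n = 1; n <= 1000; n++) {
            long bound = c * (long) Math.pow(n, d);
            if (eval(coeffs, n) > bound) {
                throw new AssertionError("bound fails at n = " + n);
            }
        }
        System.out.println("f(n) <= 15 * n^4 holds for n = 1..1000");
    }
}
```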
Asymptotic Upper Bound: Proposition (2)
O(n⁰) ⊂ O(n¹) ⊂ O(n²) ⊂ …
If a function f(n) is upper bounded by another function g(n) of degree d, d ≥ 0, then f(n) is also upper bounded by all other functions of a strictly higher degree (i.e., d + 1, d + 2, etc.).
e.g., Family of O(n) contains:
n⁰, 2n⁰, 3n⁰, … [functions with degree 0]
n, 2n, 3n, … [functions with degree 1]
e.g., Family of O(n²) contains:
n⁰, 2n⁰, 3n⁰, … [functions with degree 0]
n, 2n, 3n, … [functions with degree 1]
n², 2n², 3n², … [functions with degree 2]
Asymptotic Upper Bound: More Examples
● 5n² + 3n ⋅ log n + 2n + 5 is O(n²) [ c = 15, n0 = 1 ]
● 20n³ + 10n ⋅ log n + 5 is O(n³) [ c = 35, n0 = 1 ]
● 3 ⋅ log n + 2 is O(log n) [ c = 5, n0 = 2 ]
○ Why can’t n0 be 1?
○ Choosing n0 = 1 means f(1) is upper-bounded by c ⋅ log 1:
● We have f(1) = 3 ⋅ log 1 + 2, which is 2.
● We have c ⋅ log 1, which is 0.
⇒ f(1) is not upper-bounded by c ⋅ log 1 [ Contradiction! ]
● 2ⁿ⁺² is O(2ⁿ) [ c = 4, n0 = 1 ]
● 2n + 100 ⋅ log n is O(n) [ c = 102, n0 = 1 ]
Using Asymptotic Upper Bound Accurately
● Use the big-Oh notation to characterize a function (of an algorithm’s running time) as closely as possible.
For example, say f(n) = 4n³ + 3n² + 5:
○ Recall: O(n³) ⊂ O(n⁴) ⊂ O(n⁵) ⊂ …
○ It is the most accurate to say that f(n) is O(n³).
○ It is true, but not very useful, to say that f(n) is O(n⁴) and that f(n) is O(n⁵).
○ It is false to say that f(n) is O(n²), O(n), or O(1).
● Do not include constant factors and lower-order terms in the big-Oh notation.
For example, say f(n) = 2n² is O(n²); do not say f(n) is O(4n² + 6n + 9).
Classes of Functions

upper bound      class          cost
O(1)             constant       cheapest
O(log n)         logarithmic
O(n)             linear
O(n ⋅ log n)     “n-log-n”
O(n²)            quadratic
O(n³)            cubic
O(nᵏ), k ≥ 1     polynomial
O(aⁿ), a > 1     exponential    most expensive
Rates of Growth: Comparison
[Figure: log-log plot of f(n) versus n (n from 10⁰ to 10¹⁵, f(n) up to 10⁴⁴), comparing the growth rates of constant, logarithmic, linear, n-log-n, quadratic, cubic, and exponential functions.]
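The ordering in the figure can be reproduced numerically. The small sketch below (not part of the slides) tabulates a few of the classes at sample sizes spanning several orders of magnitude:

```java
public class GrowthRates {
    // Base-2 logarithm, since Java's Math has only log (base e) and log10.
    static double log2(long n) { return Math.log(n) / Math.log(2); }

    public static void main(String[] args) {
        System.out.printf("%10s %10s %14s %16s %22s%n",
                "n", "log2 n", "n log2 n", "n^2", "n^3");
        for (long n = 16; n <= 1_048_576; n *= 16) {
            System.out.printf("%10d %10.0f %14.0f %16d %22d%n",
                    n, log2(n), n * log2(n), n * n, n * n * n);
        }
    }
}
```

Even at n around one million, n² and n³ dwarf n ⋅ log₂ n, which is why only the fastest-growing term of a running time matters asymptotically.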
Upper Bound of Algorithm: Example (1)
maxOf (int x, int y) {
  int max = x;
  if (y > x) {
    max = y; }
  return max; }

● # of primitive operations: 4
2 assignments + 1 comparison + 1 return = 4
● Therefore, the running time is O(1).
● That is, this is a constant-time algorithm.
Upper Bound of Algorithm: Example (2)
findMax (int[] a, int n) {
  currentMax = a[0];
  for (int i = 1; i < n; ) {
    if (a[i] > currentMax) {
      currentMax = a[i]; }
    i ++; }
  return currentMax; }

● From last lecture, we calculated that the # of primitive operations is 7n − 2.
● Therefore, the running time is O(n).
● That is, this is a linear-time algorithm.
Upper Bound of Algorithm: Example (3)
1  containsDuplicate (int[] a, int n) {
2    for (int i = 0; i < n; ) {
3      for (int j = 0; j < n; ) {
4        if (i != j && a[i] == a[j]) {
5          return true; }
6        j ++; }
7      i ++; }
8    return false; }

● Worst case is when we reach Line 8.
● # of primitive operations ≈ c1 + n ⋅ n ⋅ c2, where c1 and c2 are some constants.
● Therefore, the running time is O(n²).
● That is, this is a quadratic algorithm.
Upper Bound of Algorithm: Example (4)
sumMaxAndCrossProducts (int[] a, int n) {
  int max = a[0];
  for (int i = 1; i < n; i ++) {
    if (a[i] > max) { max = a[i]; } }
  int sum = max;
  for (int j = 0; j < n; j ++) {
    for (int k = 0; k < n; k ++) {
      sum += a[j] * a[k]; } }
  return sum; }

● # of primitive operations ≈ (c1 ⋅ n + c2) + (c3 ⋅ n ⋅ n + c4), where c1, c2, c3, and c4 are some constants.
● Therefore, the running time is O(n + n²) = O(n²).
● That is, this is a quadratic algorithm.
Upper Bound of Algorithm: Example (5)
triangularSum (int[] a, int n) {
  int sum = 0;
  for (int i = 0; i < n; i ++) {
    for (int j = i; j < n; j ++) {
      sum += a[j]; } }
  return sum; }

● # of primitive operations ≈ n + (n − 1) + ⋯ + 2 + 1 = n ⋅ (n + 1) / 2
● Therefore, the running time is O(n²).
● That is, this is a quadratic algorithm.

Sorting: Strategy 2 – Insertion Sort

a[j] = a[j - 1];
j --;
a[j] = current;

● Running time?
O( 1 + 2 + ⋯ + (n − 1) ) [ O(n²) ]
insert into {a[0]}
insert into {a[0], a[1]}
…
insert into {a[0], …, a[n-2]}
● So insertion sort is a quadratic-time algorithm.
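Only a fragment of the insertion-sort code appears above (the shifting step a[j] = a[j - 1]). A standard Java insertion sort consistent with that fragment, offered here as a sketch rather than the slides' exact code, is:

```java
public class InsertionSort {
    static void insertionSort(int[] a, int n) {
        // Grow the sorted portion a[0 .. i-1] one element at a time.
        for (int i = 1; i < n; i++) {
            int current = a[i];                  // element to insert
            int j = i;
            while (j > 0 && a[j - 1] > current) {
                a[j] = a[j - 1];                 // shift larger elements right
                j--;
            }
            a[j] = current;                      // insert at the "right spot"
        }
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 2, 5, 4};
        insertionSort(a, a.length);
        System.out.println(java.util.Arrays.toString(a)); // [1, 2, 3, 4, 5]
    }
}
```

Inserting into a sorted portion of size k shifts at most k elements, which is where the 1 + 2 + ⋯ + (n − 1) total comes from.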
Sorting: Alternative Implementations?
● In the Java implementations for selection sort and insertion sort, we maintain the “sorted portion” from the left end.
○ For selection sort, we select the minimum element from the “unsorted portion” and insert it to the end in the “sorted portion”.
○ For insertion sort, we choose the left-most element from the “unsorted portion” and insert it at the “right spot” in the “sorted portion”.
● Question: Can we modify the Java implementations, so that the “sorted portion” is maintained and grown from the right end instead?
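One possible direction for this exercise, sketched as an illustration rather than a model answer: selection sort can grow the sorted portion from the right end by repeatedly selecting the maximum of the unsorted portion:

```java
public class SelectFromRight {
    // Selection sort maintaining the sorted portion at the RIGHT end:
    // after each pass, a[end .. n-1] holds the largest elements, in order.
    static void selectionSortRight(int[] a, int n) {
        for (int end = n - 1; end > 0; end--) {
            int maxIndex = 0;
            for (int i = 1; i <= end; i++) {   // find max of unsorted a[0..end]
                if (a[i] > a[maxIndex]) { maxIndex = i; }
            }
            int tmp = a[end];                  // swap the max into position end
            a[end] = a[maxIndex];
            a[maxIndex] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 2, 5, 4};
        selectionSortRight(a, a.length);
        System.out.println(java.util.Arrays.toString(a)); // [1, 2, 3, 4, 5]
    }
}
```

The analysis is unchanged: each pass scans the unsorted portion once, so the running time stays quadratic.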
Comparing Insertion & Selection Sorts
● Asymptotically, running times of selection sort and insertion sort are both O(n²).
● We will later see that there exist sorting algorithms that perform better than quadratic: O(n ⋅ log n).
Index (1)
Algorithm and Data Structure
Measuring “Goodness” of an Algorithm
Measuring Efficiency of an Algorithm
Measure Running Time via Experiments
Example Experiment
Example Experiment: Detailed Statistics
Example Experiment: Visualization
Experimental Analysis: Challenges
Moving Beyond Experimental Analysis
Counting Primitive Operations
Example: Counting Primitive Operations
From Absolute RT to Relative RT
Example: Approx. # of Primitive Operations

Index (2)
Approximating Running Time as a Function of Input Size
Focusing on the Worst-Case Input
What is Asymptotic Analysis?
Three Notions of Asymptotic Bounds
Asymptotic Upper Bound: Definition
Asymptotic Upper Bound: Visualization
Asymptotic Upper Bound: Example (1)
Asymptotic Upper Bound: Example (2)
Asymptotic Upper Bound: Proposition (1)
Asymptotic Upper Bound: Proposition (2)
Asymptotic Upper Bound: More Examples
Using Asymptotic Upper Bound Accurately
Classes of Functions

Index (3)
Rates of Growth: Comparison
Upper Bound of Algorithm: Example (1)
Upper Bound of Algorithm: Example (2)
Upper Bound of Algorithm: Example (3)
Upper Bound of Algorithm: Example (4)
Upper Bound of Algorithm: Example (5)
Basic Data Structure: Arrays
Array Case Study: Comparing Two Sorting Strategies
Sorting: Strategy 1 – Selection Sort
Sorting: Strategy 2 – Insertion Sort
Sorting: Alternative Implementations?
Comparing Insertion & Selection Sorts