Asymptotic Analysis of Algorithms
EECS2030 B: Advanced Object Oriented Programming Fall 2019
CHEN-WEI WANG
Algorithm and Data Structure
● A data structure is:
  ○ A systematic way to store and organize data in order to facilitate access and modifications
  ○ Never suitable for all purposes: it is important to know its strengths and limitations
● A well-specified computational problem precisely describes the desired input/output relationship.
  ○ Input: A sequence of n numbers a1, a2, . . . , an
  ○ Output: A permutation (reordering) a1′, a2′, . . . , an′ of the input sequence such that a1′ ≤ a2′ ≤ . . . ≤ an′
  ○ An instance of the problem: 3, 1, 2, 5, 4
● An algorithm is:
  ○ A solution to a well-specified computational problem
  ○ A sequence of computational steps that takes value(s) as input and produces value(s) as output
● Steps in an algorithm manipulate well-chosen data structure(s).
Measuring “Goodness” of an Algorithm
1. Correctness:
   ○ Does the algorithm produce the expected output?
   ○ Use JUnit to ensure this.
2. Efficiency:
   ○ Time Complexity: processor time required to complete
   ○ Space Complexity: memory space required to store data
Correctness is always the priority.
How about efficiency? Is time or space more of a concern?
Measuring Efficiency of an Algorithm
● Time is more of a concern than is storage.
● Solutions that are meant to be run on a computer should run as fast as possible.
● Particularly, we are interested in how running time depends on two input factors:
  1. size
     e.g., sorting an array of 10 elements vs. 1m elements
  2. structure
     e.g., sorting an already-sorted array vs. a hardly-sorted array
● How do you determine the running time of an algorithm?
  1. Measure time via experiments
  2. Characterize time as a mathematical function of the input size
Measure Running Time via Experiments
● Once the algorithm is implemented in Java:
  ○ Execute the program on test inputs of various sizes and structures.
  ○ For each test, record the elapsed time of the execution:

      long startTime = System.currentTimeMillis();
      /* run the algorithm */
      long endTime = System.currentTimeMillis();
      long elapsed = endTime - startTime;

  ○ Visualize the result of each test.
● To make sound statistical claims about the algorithm’s running time, the set of input tests must be “reasonably” complete.
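To illustrate, here is a minimal end-to-end timing harness in this style; the method runAlgorithm and the chosen input sizes are placeholders (assumptions, not from the original slides):

      import java.util.Arrays;
      import java.util.Random;

      public class TimingExperiment {
        /* Placeholder for the algorithm under test. */
        static void runAlgorithm(int[] input) {
          Arrays.sort(input);
        }

        public static void main(String[] args) {
          Random rand = new Random();
          /* Test inputs of various sizes: each size doubles the previous one. */
          for (int n = 50_000; n <= 1_600_000; n *= 2) {
            int[] input = rand.ints(n).toArray();
            long startTime = System.currentTimeMillis();
            runAlgorithm(input);
            long endTime = System.currentTimeMillis();
            System.out.println("n = " + n + ": " + (endTime - startTime) + " ms");
          }
        }
      }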
Example Experiment
● Computational Problem:
  ○ Input: A character c and an integer n
  ○ Output: A string consisting of n repetitions of character c
    e.g., Given input ‘*’ and 15, output ***************
● Algorithm 1 using String Concatenations:

      public static String repeat1(char c, int n) {
        String answer = "";
        for (int i = 0; i < n; i++) {
          answer += c; /* each concatenation creates a new String */
        }
        return answer;
      }

● Algorithm 2 using StringBuilder append’s (see the sketch below):
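A minimal sketch of repeat2, assuming the standard StringBuilder idiom named above:

      public static String repeat2(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
          sb.append(c); /* amortized constant-time append */
        }
        return sb.toString();
      }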
Example Experiment: Detailed Statistics

n            repeat1 (in ms)           repeat2 (in ms)
50,000       2,884                     1
100,000      7,437                     1
200,000      39,158                    2
400,000      170,173                   3
800,000      690,836                   7
1,600,000    2,847,968                 13
3,200,000    12,809,631                28
6,400,000    59,594,275                58
12,800,000   265,696,421 (≈ 3 days)    135

● As the input size is doubled, the two rates of increase are very different:
  ○ Running time of repeat1 increases by ≈ 5 times.
  ○ Running time of repeat2 increases by ≈ 2 times.
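To see these rates concretely, one can compute the ratios between successive rows; a small sketch (the arrays simply hard-code the measurements from the table above):

      public class GrowthRatios {
        public static void main(String[] args) {
          /* Measurements from the table above, in ms. */
          long[] repeat1 = {2_884L, 7_437L, 39_158L, 170_173L, 690_836L,
                            2_847_968L, 12_809_631L, 59_594_275L, 265_696_421L};
          long[] repeat2 = {1, 1, 2, 3, 7, 13, 28, 58, 135};
          for (int i = 1; i < repeat1.length; i++) {
            System.out.printf("doubling n: repeat1 grows %.1fx, repeat2 grows %.1fx%n",
                (double) repeat1[i] / repeat1[i - 1],
                (double) repeat2[i] / repeat2[i - 1]);
          }
        }
      }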
Moving Beyond Experimental Analysis
● A better approach to analyzing the efficiency (e.g., running time) of algorithms should be one that:
  ○ Allows us to calculate the relative efficiency (rather than absolute elapsed time) of algorithms in a way that is independent of the hardware and software environment.
  ○ Can be applied using a high-level description of the algorithm (without fully implementing it).
  ○ Considers all possible inputs.
● We will learn a better approach that contains 3 ingredients:
  1. Counting primitive operations
  2. Approximating running time as a function of input size
  3. Focusing on the worst-case input (requiring the most running time)
Counting Primitive Operations
Q: Why is a method call in general not a primitive operation?
A: It may be a call to:
● a “cheap” method (e.g., printing Hello World), or
● an “expensive” method (e.g., sorting an array of integers)
Example: Counting Primitive Operations
1  findMax (int[] a, int n) {
2    currentMax = a[0];
3    for (int i = 1; i < n; ) {
4      if (a[i] > currentMax) {
5        currentMax = a[i]; }
6      i ++; }
7    return currentMax; }

# of times i < n in Line 3 is executed?                  [n]
# of times the loop body (Line 4 to Line 6) is executed? [n − 1]
● Line 2: 2            [1 indexing + 1 assignment]
● Line 3: n + 1        [1 assignment + n comparisons]
● Line 4: (n − 1) ⋅ 2  [1 indexing + 1 comparison]
● Line 5: (n − 1) ⋅ 2  [1 indexing + 1 assignment]
● Line 6: (n − 1) ⋅ 2  [1 addition + 1 assignment]
● Line 7: 1            [1 return]
● Total # of Primitive Operations: 7n − 2
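Summing the per-line counts (Lines 4 to 6 each contribute 2 ⋅ (n − 1)) confirms the total:

      2 + (n + 1) + 3 ⋅ [2 ⋅ (n − 1)] + 1  =  (n + 4) + (6n − 6)  =  7n − 2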
From Absolute RT to Relative RT
● Each primitive operation (PO) takes approximately the same, constant amount of time to execute.   [say t]
● The number of primitive operations required by an algorithm should be proportional to its actual running time in a specific environment.
  e.g., findMax (int[] a, int n) has 7n − 2 POs, so RT = (7n − 2) ⋅ t
  Say two algorithms with RT (7n − 2) ⋅ t and RT (10n + 3) ⋅ t.
  ⇒ It suffices to compare their relative running times: 7n − 2 vs. 10n + 3.
● To determine the time efficiency of an algorithm, we only focus on its number of POs.
Example: Approx. # of Primitive Operations
● Given the # of primitive operations counted precisely as 7n − 2, we view it as
      7 ⋅ n^1 − 2 ⋅ n^0
● We say that:
  ○ n^1 is the highest power
  ○ 7 and 2 are the multiplicative constants
  ○ 2 is the lower term
● When approximating a function (considering that the input size may be very large):
  ○ Only the highest power matters.
  ○ Multiplicative constants and lower terms can be dropped.
  ⇒ 7n − 2 is approximately n
● Exercise: Consider 7n + 2n ⋅ log n + 3n^2:
  ○ highest power?              [n^2]
  ○ multiplicative constants?   [7, 2, 3]
  ○ lower terms?                [7n + 2n ⋅ log n]
Approximating Running Time as a Function of Input Size
Given the high-level description of an algorithm, we associate it with a function f, such that f(n) returns the number of primitive operations that are performed on an input of size n.
○ f(n) = 5          [constant]
○ f(n) = log_2 n    [logarithmic]
○ f(n) = 4 ⋅ n      [linear]
○ f(n) = n^2        [quadratic]
○ f(n) = n^3        [cubic]
○ f(n) = 2^n        [exponential]
Focusing on the Worst-Case Input
[Figure: running times (1 ms to 5 ms) of input instances A through G, annotated with the worst-case, average-case, and best-case times]
● Average-case analysis calculates the expected running times based on the probability distribution of input values.
● worst-case analysis or best-case analysis?
What is Asymptotic Analysis?
Asymptotic analysis:
● Is a method of describing behaviour in the limit:
  ○ How the running time of the algorithm under analysis changes as the input size changes without bound
  ○ e.g., contrast RT1(n) = n with RT2(n) = n^2
● Allows us to compare the relative performance of alternative algorithms:
  ○ For large enough inputs, the multiplicative constants and lower-order terms of an exact running time can be disregarded.
  ○ e.g., RT1(n) = 3n^2 + 7n + 18 and RT2(n) = 100n^2 + 3n − 100 are considered equally efficient, asymptotically.
  ○ e.g., RT1(n) = n^3 + 7n + 18 is considered less efficient than RT2(n) = 100n^2 + 100n + 2000, asymptotically.
Three Notions of Asymptotic Bounds
We may consider three kinds of asymptotic bounds for the running time of an algorithm:
● Asymptotic upper bound   [O]
● Asymptotic lower bound   [Ω]
● Asymptotic tight bound   [Θ]
Asymptotic Upper Bound: Definition
● Let f(n) and g(n) be functions mapping positive integers (input size) to positive real numbers (running time).
  ○ f(n) characterizes the running time of some algorithm.
  ○ O(g(n)) denotes a collection of functions.
● O(g(n)) consists of all functions that can be upper bounded by g(n), starting at some point, using some constant factor.
● f(n) ∈ O(g(n)) if there are:
  ○ A real constant c > 0
  ○ An integer constant n0 ≥ 1
  such that:
      f(n) ≤ c ⋅ g(n)   for n ≥ n0
● For each member function f(n) in O(g(n)), we say that:
  ○ f(n) ∈ O(g(n))    [f(n) is a member of “big-Oh of g(n)”]
  ○ f(n) is O(g(n))   [f(n) is “big-Oh of g(n)”]
  ○ f(n) is order of g(n)
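Equivalently, written as a single formula:

      f(n) ∈ O(g(n))  ⟺  ∃ c > 0, ∃ n0 ≥ 1 such that ∀ n ≥ n0 : f(n) ≤ c ⋅ g(n)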
Asymptotic Upper Bound: Visualization
[Figure: plot of running time versus input size, with the curve c ⋅ g(n) lying above the curve f(n) for all input sizes beyond n0]
From n0 on, f(n) is upper bounded by c ⋅ g(n), so f(n) is O(g(n)).
Asymptotic Upper Bound: Example (1)
Prove: The function 8n + 5 is O(n).
Strategy: Choose a real constant c > 0 and an integer constant n0 ≥ 1, such that for every integer n ≥ n0:
      8n + 5 ≤ c ⋅ n
Can we choose c = 9? What should the corresponding n0 be?

n    8n + 5    9n
1    13        9
2    21        18
3    29        27
4    37        36
5    45        45
6    53        54
...

Therefore, we prove it by choosing c = 9 and n0 = 5.
We may also prove it by choosing c = 13 and n0 = 1. Why?
[For every n ≥ 1: 8n + 5 ≤ 8n + 5n = 13n.]
Asymptotic Upper Bound: Example (2)
Prove: The function f(n) = 5n^4 + 3n^3 + 2n^2 + 4n + 1 is O(n^4).
Strategy: Choose a real constant c > 0 and an integer constant n0 ≥ 1, such that for every integer n ≥ n0:
      5n^4 + 3n^3 + 2n^2 + 4n + 1 ≤ c ⋅ n^4
f(1) = 5 + 3 + 2 + 4 + 1 = 15
Choose c = 15 and n0 = 1!
Asymptotic Upper Bound: Proposition (1)
If f(n) is a polynomial of degree d, i.e.,
      f(n) = a0 ⋅ n^0 + a1 ⋅ n^1 + ⋅⋅⋅ + ad ⋅ n^d
and a0, a1, . . . , ad are non-negative integers, then f(n) is O(n^d).
○ We prove it by choosing
      c = a0 + a1 + ⋅⋅⋅ + ad        n0 = 1
○ We know that for n ≥ 1:  n^0 ≤ n^1 ≤ n^2 ≤ ⋅⋅⋅ ≤ n^d
○ Upper-bound effect at n0 = 1?   [f(1) ≤ (a0 + a1 + ⋅⋅⋅ + ad) ⋅ 1^d]
      a0 ⋅ 1^0 + a1 ⋅ 1^1 + ⋅⋅⋅ + ad ⋅ 1^d ≤ a0 ⋅ 1^d + a1 ⋅ 1^d + ⋅⋅⋅ + ad ⋅ 1^d
○ Upper-bound effect holds for all n ≥ n0?   [f(n) ≤ (a0 + a1 + ⋅⋅⋅ + ad) ⋅ n^d]
      a0 ⋅ n^0 + a1 ⋅ n^1 + ⋅⋅⋅ + ad ⋅ n^d ≤ a0 ⋅ n^d + a1 ⋅ n^d + ⋅⋅⋅ + ad ⋅ n^d
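For instance (an illustrative polynomial, not from the original slides): take f(n) = 2n^2 + 3n + 1, of degree d = 2. Choosing c = 2 + 3 + 1 = 6 and n0 = 1 gives, for every n ≥ 1:

      2n^2 + 3n + 1 ≤ 2n^2 + 3n^2 + n^2 = 6n^2

so f(n) is O(n^2).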
Asymptotic Upper Bound: Proposition (2)
O(n^0) ⊂ O(n^1) ⊂ O(n^2) ⊂ . . .
If a function f(n) is upper bounded by another function g(n) of degree d, d ≥ 0, then f(n) is also upper bounded by all other functions of a strictly higher degree (i.e., d + 1, d + 2, etc.).
e.g., Family of O(n) contains:
      n^0, 2n^0, 3n^0, . . .     [functions with degree 0]
      n, 2n, 3n, . . .           [functions with degree 1]
e.g., Family of O(n^2) contains:
      n^0, 2n^0, 3n^0, . . .     [functions with degree 0]
      n, 2n, 3n, . . .           [functions with degree 1]
      n^2, 2n^2, 3n^2, . . .     [functions with degree 2]
Asymptotic Upper Bound: More Examples
● 5n^2 + 3n ⋅ log n + 2n + 5 is O(n^2)   [c = 15, n0 = 1]
● 20n^3 + 10n ⋅ log n + 5 is O(n^3)      [c = 35, n0 = 1]
● 3 ⋅ log n + 2 is O(log n)              [c = 5, n0 = 2]
  ○ Why can’t n0 be 1?
  ○ Choosing n0 = 1 means f(1) must be upper bounded by c ⋅ log 1:
    ● We have f(1) = 3 ⋅ log 1 + 2, which is 2.
    ● We have c ⋅ log 1, which is 0.
    ⇒ f(1) is not upper bounded by c ⋅ log 1   [Contradiction!]
● 2^(n+2) is O(2^n)                      [c = 4, n0 = 1]
● 2n + 100 ⋅ log n is O(n)               [c = 102, n0 = 1]
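These claimed (c, n0) pairs can be sanity-checked numerically; a small sketch (the helper names, the base-2 log, and the sampled range are assumptions, and a numeric check is of course no substitute for a proof):

      import java.util.function.DoubleUnaryOperator;

      public class BigOhCheck {
        /* log base 2, matching the slides' log */
        static double lg(double n) { return Math.log(n) / Math.log(2); }

        /* Check f(n) <= c * g(n) for every sampled integer n in [n0, maxN]. */
        static boolean upperBounded(DoubleUnaryOperator f, DoubleUnaryOperator g,
                                    double c, int n0, int maxN) {
          for (int n = n0; n <= maxN; n++) {
            if (f.applyAsDouble(n) > c * g.applyAsDouble(n)) { return false; }
          }
          return true;
        }

        public static void main(String[] args) {
          /* 5n^2 + 3n log n + 2n + 5 <= 15 n^2 from n0 = 1? */
          System.out.println(upperBounded(
              n -> 5*n*n + 3*n*lg(n) + 2*n + 5, n -> n*n, 15, 1, 100_000)); /* true */
          /* 3 log n + 2 <= 5 log n fails at n0 = 1 (log 1 = 0)... */
          System.out.println(upperBounded(
              n -> 3*lg(n) + 2, BigOhCheck::lg, 5, 1, 100_000)); /* false */
          /* ...but holds from n0 = 2. */
          System.out.println(upperBounded(
              n -> 3*lg(n) + 2, BigOhCheck::lg, 5, 2, 100_000)); /* true */
        }
      }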
Using Asymptotic Upper Bound Accurately
● Use the big-Oh notation to characterize a function (of an algorithm’s running time) as closely as possible.
  For example, say f(n) = 4n^3 + 3n^2 + 5:
  ○ Recall: O(n^3) ⊂ O(n^4) ⊂ O(n^5) ⊂ . . .
  ○ It is most accurate to say that f(n) is O(n^3).
  ○ It is true, but not very useful, to say that f(n) is O(n^4) and that f(n) is O(n^5).
  ○ It is false to say that f(n) is O(n^2), O(n), or O(1).
● Do not include constant factors and lower-order terms in the big-Oh notation.
  For example, say f(n) = 2n^2 is O(n^2); do not say f(n) is O(4n^2 + 6n + 9).
Classes of Functions

upper bound       class          cost
O(1)              constant       cheapest
O(log(n))         logarithmic
O(n)              linear
O(n ⋅ log(n))     “n-log-n”
O(n^2)            quadratic
O(n^3)            cubic
O(n^k), k ≥ 1     polynomial
O(a^n), a > 1     exponential    most expensive
Rates of Growth: Comparison
[Figure: log-log plot of f(n) (from 10^0 up to 10^44) versus n (from 10^0 up to 10^15), comparing the constant, logarithmic, linear, n-log-n, quadratic, cubic, and exponential growth rates]
Upper Bound of Algorithm: Example (1)
maxOf (int x, int y) {
  int max = x;
  if (y > x) {
    max = y; }
  return max; }

● # of primitive operations: 4
  2 assignments + 1 comparison + 1 return = 4
● Therefore, the running time is O(1).
● That is, this is a constant-time algorithm.
Upper Bound of Algorithm: Example (2)

findMax (int[] a, int n) {
  currentMax = a[0];
  for (int i = 1; i < n; ) {
    if (a[i] > currentMax) {
      currentMax = a[i]; }
    i ++; }
  return currentMax; }

● From the earlier counting exercise, the # of primitive operations is 7n − 2.
● Therefore, the running time is O(n).
● That is, this is a linear-time algorithm.
Upper Bound of Algorithm: Example (3)

1 containsDuplicate (int[] a, int n) {
2   for (int i = 0; i < n; ) {
3     for (int j = 0; j < n; ) {
4       if (i != j && a[i] == a[j]) {
5         return true; }
6       j ++; }
7     i ++; }
8   return false; }

● Worst case is when we reach Line 8.
● # of primitive operations ≈ c1 + n ⋅ n ⋅ c2, where c1 and c2 are some constants.
● Therefore, the running time is O(n^2).
● That is, this is a quadratic algorithm.
Upper Bound of Algorithm: Example (4)
sumMaxAndCrossProducts (int[] a, int n) {
  int max = a[0];
  for (int i = 1; i < n; i ++) {
    if (a[i] > max) { max = a[i]; } }
  int sum = max;
  for (int j = 0; j < n; j ++) {
    for (int k = 0; k < n; k ++) {
      sum += a[j] * a[k]; } }
  return sum; }

● # of primitive operations ≈ (c1 ⋅ n + c2) + (c3 ⋅ n ⋅ n + c4), where c1, c2, c3, and c4 are some constants.
● Therefore, the running time is O(n + n^2) = O(n^2).
● That is, this is a quadratic algorithm.
Upper Bound of Algorithm: Example (5)
triangularSum (int[] a, int n) {
  int sum = 0;
  for (int i = 0; i < n; i ++) {
    for (int j = i; j < n; j ++) {   /* inner loop assumed to accumulate a[j] */
      sum += a[j]; } }
  return sum; }

● # of inner-loop iterations: n + (n − 1) + ⋅⋅⋅ + 1 = n ⋅ (n + 1) / 2
● Therefore, the running time is O(n^2); this is a quadratic algorithm.

Sorting: Strategy 2 – Insertion Sort
● Maintain the “sorted portion” at the left end of the array; at each step, insert the left-most element of the “unsorted portion” at the “right spot” in the “sorted portion”:
      insert into {a[0]}
      insert into {a[0], a[1]}
      . . .
      insert into {a[0], . . . , a[n-2]}
● Running time: O(1 + 2 + ⋅⋅⋅ + (n − 1)) = O(n^2)
● So insertion sort is a quadratic-time algorithm.
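The following sketch realizes this strategy in Java; it is one natural implementation, not necessarily the slides’ exact code:

      void insertionSort(int[] a, int n) {
        for (int i = 1; i < n; i ++) {
          int current = a[i];   /* left-most element of the unsorted portion */
          int j = i;
          /* Shift sorted elements right until the right spot for current. */
          while (j > 0 && a[j - 1] > current) {
            a[j] = a[j - 1];
            j --;
          }
          a[j] = current;
        }
      }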
Sorting: Alternative Implementations?
● In the Java implementations for selection sort and insertion sort, we maintain the “sorted portion” from the left end.
  ○ For selection sort, we select the minimum element from the “unsorted portion” and append it to the end of the “sorted portion”.
  ○ For insertion sort, we choose the left-most element from the “unsorted portion” and insert it at the “right spot” in the “sorted portion”.
● Question: Can we modify the Java implementations, so that the “sorted portion” is maintained and grown from the right end instead? (One possible variant is sketched below.)
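For instance, selection sort can grow the “sorted portion” from the right end by repeatedly selecting the maximum of the “unsorted portion” (a sketch, not from the original slides):

      void selectionSortFromRight(int[] a, int n) {
        /* a[end + 1 .. n - 1] is the sorted portion, grown from the right end. */
        for (int end = n - 1; end > 0; end --) {
          int maxIndex = 0;
          /* Select the maximum element of the unsorted portion a[0 .. end]. */
          for (int i = 1; i <= end; i ++) {
            if (a[i] > a[maxIndex]) { maxIndex = i; }
          }
          /* Swap it into place just before the sorted portion. */
          int temp = a[end];
          a[end] = a[maxIndex];
          a[maxIndex] = temp;
        }
      }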
Comparing Insertion & Selection Sorts
● Asymptotically, the running times of selection sort and insertion sort are both O(n^2).
● We will later see that there exist sorting algorithms that perform better than quadratic: O(n ⋅ log n).
Index (1)
Algorithm and Data Structure
Measuring “Goodness” of an Algorithm
Measuring Efficiency of an Algorithm
Measure Running Time via Experiments
Example Experiment
Example Experiment: Detailed Statistics
Example Experiment: Visualization
Experimental Analysis: Challenges
Moving Beyond Experimental Analysis
Counting Primitive Operations
Example: Counting Primitive Operations
From Absolute RT to Relative RT
Example: Approx. # of Primitive Operations
Index (2)
Approximating Running Time as a Function of Input Size
Focusing on the Worst-Case Input
What is Asymptotic Analysis?
Three Notions of Asymptotic Bounds
Asymptotic Upper Bound: Definition
Asymptotic Upper Bound: Visualization
Asymptotic Upper Bound: Example (1)
Asymptotic Upper Bound: Example (2)
Asymptotic Upper Bound: Proposition (1)
Asymptotic Upper Bound: Proposition (2)
Asymptotic Upper Bound: More Examples
Using Asymptotic Upper Bound Accurately
Classes of Functions
Index (3)
Rates of Growth: Comparison
Upper Bound of Algorithm: Example (1)
Upper Bound of Algorithm: Example (2)
Upper Bound of Algorithm: Example (3)
Upper Bound of Algorithm: Example (4)
Upper Bound of Algorithm: Example (5)
Basic Data Structure: Arrays
Array Case Study: Comparing Two Sorting Strategies
Sorting: Strategy 1 – Selection Sort
Sorting: Strategy 2 – Insertion Sort
Sorting: Alternative Implementations?
Comparing Insertion & Selection Sorts