Lecture 3
Complexity Analysis
EECS 281: Data Structures & Algorithms
Complexity Analysis:
Overview
Data Structures & Algorithms
Complexity Analysis
• What is it?
– Given an algorithm and input size n,
how many steps are needed?
– Each step should take O(1) time
– As input size grows, how does number of steps change?
• Focus is on TREND
• How do we measure complexity?
– Express the rate of growth as a function f(n)
– Use the big-O notation
• Why is this important?
– Tells how well an algorithm scales to larger inputs
– Given two algorithms, we can compare performance
before implementation
Metrics of Algorithm Complexity
• Best-Case
– Smallest number of comparisons required, given ideal input
– Analysis performed over inputs of a given size
– Example: Data is found in the first place you look
• Worst-Case
– Largest number of comparisons required, given hard input
– Analysis performed over inputs of a given size
– Example: Data is found in the last place you could possibly look
• Average-Case
– Average number of comparisons required, given any input
– Average performed over all possible inputs of a given size
Example: an array of n items
Using a linear search over n items,
how many comparisons will it take to find item x?
• Best-case: 1 comparison
• Worst-case: n comparisons
• Average-case: n/2 comparisons
What Affects Runtime?
• The algorithm
• Implementation details
– Skills of the programmer
• CPU Speed / Memory Speed
• Compiler (Options used)
– g++ -g3 (for debugging, highest level of information)
– g++ -O3 (optimization level 3, for speed)
• Other programs running in parallel
• Amount of data processed (Input size)
Input Size versus Runtime
• Rate of growth independent of most factors
– CPU speed, compiler, etc.
• Does doubling input size mean doubling runtime?
• Will a “fast” algorithm still be “fast” on large inputs?
(Figure: runtime t plotted against input size n)
How do we measure input size?
Measuring & Using Input Size
• Number of bits
– In an int, a double? (32? 64?)
• Number of items: what counts as an item?
– Array of integers? One integer? One digit? …
– One string? Several strings? A char?
• Notation and terminology
– n Input size
– f(n) Maximum number of steps taken by an
algorithm when input has size n (“f of n”)
– O(f(n)) Complexity class of f(n) (“Big-O of f of n”)
Common Orders of Functions
Notation          Name
O(1)              Constant
O(log n)          Logarithmic
O(n)              Linear
O(n log n)        Loglinear, Linearithmic
O(n^2)            Quadratic
O(n^3), O(n^4), … Polynomial
O(c^n)            Exponential
O(n!)             Factorial
O(2^(2^n))        Doubly Exponential
Graphing f(n) Runtimes
(Figure: f(n) versus input size n for f(n) = 1, log n, n, n log n, n^2,
and 2^n; the curves separate quickly as n grows, with 2^n far above
the rest even for small n)
Input Size Example
Graph G = (V, E): V = 5 vertices, E = 6 edges
What should we measure?
• Vertices?
• Edges?
• Vertices and Edges?
Use V and E to determine which contributes more to the
total number of steps
• Big-O examples: E log V, EV, V^2 log E
When in doubt, measure input size in bits
From Analysis to Application
• Algorithm comparisons are independent of hardware,
compilers and implementation tweaks
• Predict which algorithms will eventually be faster
– For large enough inputs
– O(n^2) time algorithms will take longer than O(n) algorithms
• Constants can often be ignored because
they do not affect asymptotic comparisons
– An algorithm with 20n steps can run faster than an
algorithm with 10n steps. Why? (Both are O(n), and a
“step” of one algorithm may be cheaper than a “step”
of the other.)
Complexity Analysis:
Overview
Data Structures & Algorithms
Complexity Analysis:
Counting Steps
Data Structures & Algorithms
Q: What counts as one step in a program?
A: Primitive operations
• a) Variable assignment
• b) Arithmetic operation
• c) Comparison
• d) Array indexing or pointer reference
• e) Function call (not counting the data)
• f) Function return (not counting the data)
The runtime of one step is independent of the input
Counting Steps: for Loop
• The basic form of a for-loop:
for (initialization; test; update)
• The initialization is performed once (1)
• The test is performed every time the body
of the loop runs, plus once for when the
loop ends (n + 1)
• The update is performed every time the
body of the loop runs (n)
Counting Steps: Polynomial

1  int func1(int n) {
2    int sum = 0;                   // 1 step
3    for (int i = 0; i < n; ++i) {  // 1 + 1 + 2n steps
4      sum += i;                    // 1 step, runs n times
5    } // for
6    return sum;                    // 1 step
7  } // func1()
Total steps: 4 + 3n

8  int func2(int n) {
9    int sum = 0;                   // 1 step
10   for (int i = 0; i < n; ++i) {  // 1 + 1 + 2n steps
11     for (int j = 0; j < n; ++j)  // 1 + 1 + 2n steps, runs n times
12       ++sum;                     // 1 step, runs n^2 times
13   } // for i
14   for (int k = 0; k < n; ++k) {  // 1 + 1 + 2n steps
15     --sum;                       // 1 step, runs n times
16   } // for
17   return sum;                    // 1 step
18 } // func2()
Total steps: 3n^2 + 7n + 6
Counting Steps: Logarithmic

1  int func3(int n) {
2    int sum = 0;                   // 1 step
3    for (int i = n; i > 1; i /= 2) // 1 + 1 + ~log n * (2 steps)
4      sum += i;                    // 1 step, runs ~log n times
5
6    return sum;                    // 1 step
7  } // func3()
Total: 4 + 3 log n = O(log n)

Why ~log n iterations? Each pass halves i:
n items → n/2 items → n/4 items → … → 1 item
Reading the halvings in reverse: 1·2·2·…·2 (k times) = 2^k = n.
Remember that k = log2 n if and only if 2^k = n.
Examples of O(log n) Time

uint32_t logB(uint32_t n) {
  // find binary log, rounded down
  uint32_t r = 0;
  while (n > 1) {
    n /= 2;
    ++r;
  } // while
  return r;
} // logB()

int *bsearch(int *lo, int *hi, int val) {
  // find position of val between lo, hi (inclusive)
  while (hi >= lo) {
    int *mid = lo + (hi - lo) / 2;
    if (*mid == val)
      return mid;
    else if (*mid > val)
      hi = mid - 1;
    else
      lo = mid + 1;
  } // while
  return nullptr;
} // bsearch()
Algorithm Exercise
How many multiplications, if size = n?
// REQUIRES: in and out are arrays with size elements
// MODIFIES: out
// EFFECTS: out[i] = in[0] *…* in[i-1] * in[i+1] *…* in[size-1]
void f(int *out, const int *in, int size) {
  for (int i = 0; i < size; ++i) {
    out[i] = 1;
    for (int j = 0; j < size; ++j) {
      if (i != j)
        out[i] *= in[j];
    } // for j
  } // for i
} // f()
Algorithm Exercise
How many multiplications and divisions, if size = n?
void f(int *out, const int *in, int size) {
  int product = 1;
  for (int i = 0; i < size; ++i)
    product *= in[i];

  // note: assumes no element of in is 0
  for (int i = 0; i < size; ++i)
    out[i] = product / in[i];
} // f()
Complexity Analysis:
Counting Steps
Data Structures & Algorithms
Complexity Analysis:
Big-O Notation
Data Structures & Algorithms
Big-O Definition
f(n) = O(g(n)) if and only if there are constants c > 0 and
n0 ≥ 0 such that f(n) ≤ c · g(n) whenever n ≥ n0

Is n = O(n^2)?
f(n) = n
g(n) = n^2
Take c = 1 and n0 = 1: n ≤ n^2 for all n ≥ 1
(Figure: f(n) = n stays at or below g(n) = n^2 once n ≥ n0 = 1)
Big-O: Sufficient (but not necessary) Condition
If lim_{n→∞} f(n)/g(n) = d < ∞, then f(n) = O(g(n))

Is log2 n = O(2n)?
f(n) = log2 n
g(n) = 2n
lim_{n→∞} (log2 n)/(2n) has the form ∞/∞, so use L’Hopital’s Rule:
lim_{n→∞} (1/(n ln 2))/2 = 0 < ∞
Therefore log2 n = O(2n)

Is sin(n/100) = O(100)?
f(n) = sin(n/100)
g(n) = 100
lim_{n→∞} sin(n/100)/100 does not exist, so the condition does
not hold, but it is still true that f(n) = O(g(n)):
|sin(n/100)| ≤ 1 ≤ 1 · 100 for all n

Big-O: Can We Drop Constants?
Is 3n^2 + 7n + 42 = O(n^2)?
f(n) = 3n^2 + 7n + 42
g(n) = n^2
Definition: need c > 0, n0 ≥ 0 such that f(n) ≤ c · g(n)
whenever n ≥ n0. Take c = 5, so c · g(n) = 5n^2;
then 3n^2 + 7n + 42 ≤ 5n^2 for all n ≥ n0 = 7
Sufficient Condition: lim_{n→∞} (3n^2 + 7n + 42)/n^2 = 3 < ∞
Rules of Thumb
1. Lower-order terms can be ignored
– n^2 + n + 1 = O(n^2)
– n^2 + log(n) + 1 = O(n^2)
2. Coefficient of the highest-order term can be ignored
– 3n^2 + 7n + 42 = O(n^2)

Log Identities
Identity                              Example
log_a(xy) = log_a x + log_a y         log2(12) = log2 4 + log2 3
log_a(x/y) = log_a x - log_a y        log2(4/3) = log2 4 - log2 3
log_a(x^r) = r · log_a x              log2 8 = 3 · log2 2 = 3
log_a(1/x) = -log_a x                 log2(1/3) = -log2 3
log_a x = log x / log a = ln x / ln a log7 9 = log 9 / log 7
log_a a = ?                           log_a 1 = ?

Power Identities
Identity                              Example
a^(n+m) = a^n · a^m                   2^5 = 2^3 · 2^2
a^(n-m) = a^n / a^m                   2^(3-2) = 2^3 / 2^2
(a^n)^m = a^(nm)                      (2^2)^3 = 2^6
a^(-n) = 1/a^n                        2^(-4) = 1/2^4
a^(-1) = ?    a^0 = ?    a^1 = ?

Exercise
True or false?
10^100 = O(1)
3n^4 + 45n^3 = O(n^4)
3^n = O(2^n)
2^n = O(3^n)
45 log(n) + 45n = O(log(n))
log(n^2) = O(log(n))
[log(n)]^2 = O(log(n))
Find f(n) and g(n) such that f(n) ≠ O(g(n)) and g(n) ≠ O(f(n))

Big-O, Big-Theta, and Big-Omega
Big-O (O): asymptotic upper bound
– f(n) = O(g(n)) if and only if there exists an integer n0 and a
real number c such that for all n ≥ n0, f(n) ≤ c·g(n)
– ∃ n0 ∈ Z, ∃ c ∈ R : ∀ n ≥ n0, f(n) ≤ c·g(n)
Big-Theta (Θ): asymptotic tight bound
– f(n) = Θ(g(n)) if and only if there exists an integer n0 and real
constants c1 and c2 such that for all n ≥ n0:
c1·g(n) ≤ f(n) ≤ c2·g(n)
Big-Omega (Ω): asymptotic lower bound
– f(n) = Ω(g(n)) if and only if there exists an integer n0 and a
real number c such that for all n ≥ n0, f(n) ≥ c·g(n)
– ∃ n0 ∈ Z, ∃ c ∈ R : ∀ n ≥ n0, f(n) ≥ c·g(n)
Examples:
– f1(n) = 2n + 1: O(n) or O(n^2) or O(n^3)…; Ω(n) or Ω(1); Θ(n)
– f2(n) = n^2 + n + 5: O(n^2) or O(n^3)…; Ω(n^2) or Ω(n) or Ω(1); Θ(n^2)
Note: Θ(f(n)) = O(f(n)) ∩ Ω(f(n))
Complexity Analysis: Big-O Notation
Data Structures & Algorithms

Complexity Analysis: Amortized Complexity
Data Structures & Algorithms

Amortized Complexity
• A type of worst-case complexity
• Used when the work/time profile is “spiky” (sometimes an
operation is very expensive, but most times it is a small expense)
• Analysis performed over a sequence of operations covering a given range
– The sequence selected includes expensive and cheap operations
• Considers the average cost of one operation over a sequence of operations
– Best/Worst/Average-case only consider operations independently
– Different from average-case complexity!
• Key to understanding expandable arrays and STL vectors,
priority queues, and hash tables

Cell Phone Bill* Example
• Pay $100 once per month; each call and text has no (added) cost
• If you make 1000 calls/texts, each one effectively costs $0.10
• The rate at which money leaves your pocket is very “spiky”
• But each call or text appears to have basically a constant cost:
the amortized cost per text is O(1)
*assumes unlimited calls/texts

Common Amortized Complexity
Analyze the asymptotic runtime complexity of the push operation
for a stack implemented using an array/vector
Method: push(object)
Implementation:
1. If needed, allocate a bigger array and copy data
2. Add new element at top_ptr, increment top_ptr

Container Growth Options
1. Constant Growth
– When the container fills, increase its size by c
– Amortized cost: [(1 + Θ(n)) + c · Θ(1)] / (c push operations) = Θ(n)
– Amortized linear
2. Linear Growth
– When the container fills, increase its size by n
– Amortized cost: [(1 + Θ(n)) + n · Θ(1)] / (n push operations) = Θ(1)
– Amortized constant

Complexity Analysis: Amortized Complexity
Data Structures & Algorithms

Complexity Analysis: Balance Exercise
Data Structures & Algorithms

Exercise
• You have n billiard balls. All have equal weight, except for one
which is heavier. Find the heavy ball using only a balance.
• Describe an O(n^2) algorithm
• Describe an O(n) algorithm
• Describe an O(log n) algorithm
• Describe another O(log n) algorithm

Two O(log n) solutions
• Two groups: log2(n) = O(log3 n)
• Three groups: log3(n) = O(log2 n)
• True or false? Why?

Complexity Analysis: Balance Exercise
Data Structures & Algorithms