Data Structures and Algorithms
Chapter 4
Algorithm Analysis Basic Concepts
• An algorithm is a finite sequence of steps which solves a problem.
• Efficiency of algorithms can be analyzed in terms of memory/space usage and in terms of running time.
• We will focus on running time analysis.
• Running time of an algorithm depends on the input size.
• We express running time as a function of the input size n.
Algorithm Analysis Basic Concepts
• Running times of two sorting algorithms
• •
Running times of an algorithm may be different for different inputs of the same size.
•
Often, we perform only the worst-case analysis
Algorithm Analysis Basic Concepts
For example, elapsed times of insertion sort algorithm on an array of 100,000 integers:
– Best case: 1 ms, when elements are sorted in nondecreasing order
– Average case: 12,145 ms, when elements are randomly distributed
– Worst case: 24,810 ms, when elements are sorted in the reverse order
Algorithm Analysis Mathematical Functions
• f(n) = c (constant)
• f(n)=clogn (logn)
• f(n) = cn (linear)
• f(n)=cnlogn (nlogn)
• f(n) = cn2 (quadratic)
Algorithm Analysis Rate of Growth
• When we analyze the running time of algorithms, we do not look at the actual running times.
• Instead, we focus on the rate of growth, i.e., how fast or slowly the running time grows as the input size increases.
• This is called asymptotic analysis.
• We use O (big-oh), Ω (big-omega), and Θ (big-theta) notations.
Algorithm Analysis Rate of Growth
• O(g(n)) = {f(n) : there exist positive constants c and n0 such that f(n) ≤ cg(n) for n ≥ n0}
Algorithm Analysis Rate of Growth
• f(n) = 3n + 2 => f(n) = O(n) Proof:
Let g(n) = n, c = 4, and n0 = 2. Then, f(n) = 3n + 2 ≤ 4n for all n ≥ 2, or
f(n) ≤ cg(n) for all n ≥ n0
So, f(n) = O(n)
Algorithm Analysis Rate of Growth
• f(n) = 5n3 + 2n2 + 8n + 4 => f(n) = O(n3) Proof:
f(n)=5n3 +2n2 +8n+4 ≤5n3 +2n3 +8n3 +4n3 = 19n3
If we let g(n) = n3, c = 19 and n0 = 1, then f(n) ≤ cg(n) for all n ≥ n0
So, f(n) = O(n3)
Algorithm Analysis Rate of Growth
• f(n) = 3n + 2 => O(n)
• f(n) = 5n3 + 2n2 + 8n + 4 => O(n3)
• f(n) = 2n2 + 2n log n + 2n + 4 => O(n2)
• f(n) = 2n log n + 10n – 6 => O(n log n)
• f(n) = 5n + 23log n => O(n)
• f(n)=3logn+10 =>O(logn)
Algorithm Analysis Rate of Growth
• Ω(g(n)) = {f(n) : there exist positive constants c and n0 such that f(n) ≥ cg(n) for n ≥ n0}
• f(n)=3nlogn–2n=>f(n)=Ω(nlogn) • f(n) = 5n3 + 2n2 + 8n + 4 => f(n) = Ω(n3)
Running Time
Algorithm Analysis Rate of Growth
• Θ(g(n)) = {f(n): there exist positive constants c1, c2, and n0 suchthat0≤c1g(n)≤f(n)≤c2g(n)foralln≥n0}
n0
Input Size
• f(n)=3nlogn+4n+5logn=>f(n)=Θ(nlogn) • f(n) = 5n3 + 2n2 + 8n + 4 => f(n) = Θ(n3)
c2g(n)
f(n)
c1g(n)
•
Rate of growth of different functions:
838 16 4 16 32 5 32 64 6 64
24 64
512
4096
32768
262144
2097152
16777216
134217728
256 65536 4294967296 1.84467E+19 3.40282E+38 1.15792E+77 1.3408E+154
128 7 128 256 8 256 512 9 512
64 256 160 1024 384 4096 896 16384 2048 65536
Algorithm Analysis Rate of Growth
n log n n n log n n2
n3
2n
4608 262144
1
2
3
4
5
6
7 8}
Algorithm Analysis Rate of Growth
• Example: Find the largest element
public static double arrayMax(double[] data) {
int n = data.length;
double currentMax = data[0]; for(intj=1;j
//loopexecutedn–1times // loop body takes c3
return currentMax;
// c4
• Totalrunningtime,f(n)=c1+c2+c3(n–1)+c4 • f(n) = O(n)
1
2
3
4
5
6
7 8}
for (int b : groupB) for (int c : groupC)
Algorithm Analysis Rate of Growth
• Example: Three-way set disjointness problem (solution 1)
public static boolean disjoint1(int[ ] groupA, int[ ] groupB, int[ ] groupC) { for (int a : groupA)
if ((a == b) && (b == c)) return false
return true;
• f(n) = O(n3)
1
2
3
4
5
6
7
8 9}
Algorithm Analysis Rate of Growth
• Example: Three-way set disjointness problem (solution 2)
public static boolean disjoint1(int[ ] groupA, int[ ] groupB, int[ ] groupC) { for (int a : groupA)
for (int b : groupB) if (a == b)
for (int c : groupC) if (b == c)
return false return true;
• f(n) = O(n2)
1
2
3
4
5
6
7 8}
for (int k=j+1; k < n; k++) if (data[j] == data[k])
return false; return true;
// found duplicate pair
// if we reach this, elements are unique
Algorithm Analysis Rate of Growth
• Example: Element uniqueness problem (solution 1)
public static boolean unique1(int[] data) { int n = data.length;
for (int j=0; j < n-1; j++)
• Intheworstcase,f(n)=(n–1)+(n–2)+...+1=O(n2)
Algorithm Analysis Rate of Growth
• Example: Element uniqueness problem (solution 2) 1 public static boolean unique2(int[ ] data) {
2
3
4
5
6
7
8 9}
int n = data.length;
int[ ] temp = Arrays.copyOf(data, n); // make copy of data
Arrays.sort(temp);
for (int j=0; j < n-1; j++)
// and sort the copy, O(n log n) // for loop takes O(n)
if (temp[j] == temp[j+1]) return false;
// check neighboring entries
return true;
// found duplicate pair
// if we reach this, elements are unique
• Intheworstcase,f(n)=O(nlogn)+O(n)=O(nlogn)
Proof Techniques
• To disprove a statement, it is sufficient to show a counterexample.
• To prove a statement, we must show it is true for all objects in the domain under consideration.
• Exhaustive proof, direct proof, proof by contraposition, proof by contradiction, mathematical induction
• Loop invariant method: to prove the correctness of an algorithm (or a program), which involves a loop.
Proof Techniques Proof by Contradiction
• To prove P → Q: Assume Q is false and find a contradiction.
• Example: If an even integer is added to another even integer, the result is an even integer.
• Proof:
– Letxandybetwoevenintegers.Letz=x+y.
– Let’s assume that z is an odd integer (negating the conclusion of the given statement).
– Since x is even, it can be rewritten as x = 2n, for some integer n.
– Since y is even, it can be rewritten as y = 2m, for some integer
m.
– Since z is odd (this we assumed), it can be rewritten as z = 2k + 1, for some integer k.
• Proof (continued)
– Then, we have:
– x+y=z
– 2n+2m=2k+1
– 2n+2m–2k=1
– 2(n+m–k)=1
Proof Techniques Proof by Contradiction
– This is a contradiction because the left hand side is an even number and the right hand side is 1, which is odd.
Proof Techniques Induction
• Consists of base case (or base step) and inductive step.
• To prove a predicate P(n) is true for all positive integers n.
– Base case: Show that P(1) is true
– Inductive step: Assume that P(k) is true, and prove P(k + 1) is
also true. This assumption is called inductive hypothesis. • See the example in the next slide.
Proof Techniques Induction
• Example: Prove that for any positive integer n, 2n > n Base case: n = 1
LHS = 21 = 2, RHS = 1. So, LHS > RHS Induction step: (n ≥ 1).
Inductivehypothesis:itistrueforn=k(k≥1),i.e.,2k >kfork≥1.
Weshowthatitisalsotrueforn=k+1,i.e.,2k+1 >k+1 LHS = 2k+1
= 2×2k
> 2k (by the inductive hypothesis) =k+k≥k+1
= RHS.
So, LHS > RHS.
Proof Techniques Loop Invariant
• Loop invariant:
– A property (or a statement) that is true at the
beginning of each iteration of the loop
• To prove an algorithm is correct
– (1). Identify a loop invariant and state it.
Choose one that will help prove the correctness – (2). Prove the loop invariant is true. This involves two
steps – base case and inductive step
– (3). Use the loop invariant to prove the correctness of
the algorithm.
• Insertion Sort
1
2
3
4
5
6
7
8
9
10 11 12 13 }
Proof Techniques Loop Invariant
public class InsertionSort {
public static void insertionSort(char[] data) {
int n = data.length;
for (int k = 1; k < n; k++) {
// begin with second element // save data[k] in cur
char cur = data[k];
int j = k;
while (j > 0 && data[j-1] > cur) { // thus, data[j-1] must go after cur
data[j] = data[j-1]; j–;
// shift data[j-1] rightward
// and consider previous j for cur
} // while
data[j] = cur; } // for
// this is the proper place for cur // running time: O(n2)
// find correct index j for cur
Proof Techniques Loop Invariant
• Loop invariant (L.I.): At the start of each iteration of the for loop of lines 4 – 12, the subarray data[0 . . k −1] consists of the elements originally in data[0 . . k − 1] but in sorted order.
• Base case: Just before the first iteration, k = 1. The subarray data[0 . . k − 1] has only one element data[0], and it is trivially sorted.
• Inductive step:
– Assume the L.I. is true at the start of an iteration and prove it is
also true at the start of the next iteration.
– Assume the loop invariant is true at the start of the m-th iteration, i.e. when k = m, the subarray data[0 . . m – 1] consists of the elements originally in data[0 . . m – 1] but in sorted order.
• Inductive step: (continued)
Proof Techniques Loop Invariant
– During the execution of the m-th iteration, the while loop in lines 7 – 10 moves data[m] into the correct position in data[0 . . m – 1 ].
– So, at the start of the (m+1)-th iteration, data[0 . . m] consists of the elements originally in A[0 . . m] but in sorted order.
• Theforloopendswhenk=n.
• According to the loop invariant, the subarray data[0 . . n – 1 ] consists of the elements originally in data[0 . . n – 1] but in sorted order. In other words, the entire array is sorted.
References
• M.T. Goodrich, R. Tamassia, and M.H. Goldwasser, “Data Structures and Algorithms in Java,” Sixth Edition, Wiley, 2014.