* CPSC 320: The Master theorem and Deterministic Select Solutions
1 The Master Theorem
Sections 5.1 and 5.2 of the textbook show how to derive the solution to a recurrence relation by drawing the recursion tree corresponding to that recurrence. It turns out, however, that most divide and conquer algorithms split the problems into equal-sized subproblems, which means that drawing recursion trees for every recurrence will quickly become repetitive. All such trees will be very similar, and the process to obtain a tight bound on the solution to the recurrence will also be very similar. More precisely, we mostly get three broad types of recursion trees:
1. Those for which the work done on each level is an increasing geometric series (like the example under the The Case of q > 2 Subproblems heading in section 5.2).
2. Those for which the work done on each level is constant (like the recurrence relation for Mergesort from section 5.1).
3. Those for which the work done on each level is a decreasing geometric series (like the example under the The Case of One Subproblem heading in section 5.2, although this can also happen with 2 or more recursive calls).
To avoid repeating similar calculations over and over again, Jon Bentley, Dorothea Haken, and James B. Saxe proved the following theorem:
Master theorem: Let a ≥ 1, b > 1 be real constants, let f(n) : N → R+, and let T(n) be dened by
aT(n/b) + f(n) if n ≥ n0
T(n) =
Θ(1) if n < n0
where n/b might be either ⌊n/b⌋ or ⌈n/b⌉. Then
1. If f(n) ∈ O(n(logb a)−ε) for some ε > 0 then T(n) ∈ Θ(nlogb a).
2. If f(n) ∈ Θ(nlogb a logk n) for some constant k ≥ 0, then T(n) ∈ Θ(nlogb a logk+1 n).
3. Iff(n)∈Ω(n(logba)+ε)forsomeε>0,andaf(n/b)<δf(n)forsome0<δ<1andalln large enough, then T(n) ∈ Θ(f(n)).
Case 1 of the theorem is the case where the work done at each level is an increasing geometric function; the total amount of work done is essentially determined by the number of nodes of the recursion tree. Case 2 is the case where every level does the same amount of work (hence the extra logn factor in the running time). Case 3 is the case where the amount of work done at each level decreases geometrically. It is the most complicated case because we need to ensure the function f(n) is well-behaved (badly-behaved functions do not result in the amount of work done at each level decreasing geometrically). The condition af(n/b) < δf(n) for some 0 < δ < 1 and all n large enough is called the regularity condition.
*
Copyright Notice: UBC retains the rights to this document. You may not distribute this document without permission.
1
Note that the theorem can only be used if the conditions stated at the beginning are satised. In other cases, it can not be used to derive a solution to the recurrence, even though such a solution might exist.
For each of the following recurrence relations, determine whether or not the Master Theorem can be used. If it can, give the solution to the recurrence. If it can not, explain why not. In all cases, assume that T (n) ∈ Θ(1) when n is suciently small.
√
T(n)=5T( n)+n
SOLUTION: The Master Theorem can not be used, because b is not a constant (it's
T(n)=T(n/2)+1
1.
2.
3.
4.
5.
6.
7.
2
In the previous worksheet, we developed a Θ(n) time algorithm to nd the kth smallest element of an
√
n).
SOLUTION:Herea=1andb=2,whichmeanslogba=log21=0,andf(n)=1=n0. Weare
therefore in case 2 of the Master theorem and T (n) ∈ Θ(log n).
T(n)=T(n/4)+n
SOLUTION:Herea=1andb=4,whichmeanslogba=log41=0. Alson=n1 ∈Ω(n0+1), which means the only possible case might be case 3. We still need to check the regularity condition: af(n/b)=f(n/4)=n/4,andn/4<δnforanyδintherange0.4<δ<1. Socase3applies,and T (n) ∈ Θ(n).
√
T(n)=3T(n/9)+ nlog2n
SOLUTION: Here a = 3 and b = 9, which means logba = log93 = 0.5, and f(n) =
√ n0.5 log2 n. We are therefore in case 2 of the Master theorem and T (n) ∈ Θ(√n log2 n).
T(n) = √nT(n/3) + n2
SOLUTION: The Master Theorem can not be used because a is not a constant (it is
T(n) = 9T(n/3)+nlogn
√
nlog2n =
n).
SOLUTION:Herea=9andb=3,andsologba=log39=2. Moreover,f(n)=nlogn∈O(n2−0.5)
so we are in case 1. This implies T(n) ∈ Θ(nlogb a) = Θ(n2). T(n) = 2T(n/2)+n/logn
SOLUTION:TheMasterTheoremcannotbeused: a=2andb=2,andsologba=log22=1. However there is no ε > 0 such that n/ log n ∈ O(n1−ε).
Deterministic Select
unsorted array, assuming that the rank of the pivot in our algorithm was always in the range between ⌈n⌉ 4
and ⌊3n⌋, on a problem of size n. This unreasonable assumption was relaxed in the bonus question, and we 4
asked you to prove that if we choose the pivot at random then the expected (think, average) running time of the algorithm is in Θ(n). Here, we will develop an algorithm for this problem whose worst-case running time is in Θ(n).
1. Assuming you had a somewhat powerful magic wand you could use when comes the time to choose a pivot, what would you use that wand for? Note that you can’t force the pivot to always be the element you’re searching for (your magic wand is not that powerful).
2
SOLUTION: We would use the wand to ensure that the pivot we get is not too close to either the smallest or the largest element of the array (maybe not exactly between those whose rank are ⌈n⌉
and ⌊3n⌋, but not too far from that). 4
4
2. Since magic wands don’t exist, we will need to choose our pivot the old fashioned way (through computations). Suppose you are given an array containing the values
19 35 44 13 42 23 21 62 27 24 40 53 31 16 48 25 17 13 9 26 18 50 21 30 67
Group them into groups of 5 (from left to right), and circle the median of each group. SOLUTION:
35
24
40
17
30
42 44 21 23 27 62
13 19
16 31 9 13 18 21
48 53 25 26 50 67
3. What is the median of the circled elements? SOLUTION: 30
4. How many of the elements are smaller than that median? How many are larger? SOLUTION: 14 elements are smaller, 10 are larger.
5. Now let us generalize the example from steps 2 through 4. Suppose we divide the elements into groups of 5 (unlike our previous example, up to 4 ungrouped elements may remain), nd the median of each group by running insertion sort and picking the 3rd smallest (with only 5 elements, there is need to be fancy), and then compute the median m of the group medians (recursively this time, since we have many group medians). How many groups of 5 elements will we get?
SOLUTION: We will get ⌊n/5⌋ groups.
6. How many elements of the array are guaranteed to be smaller than m?
SOLUTION: The value m is the median of ⌊n/5⌋ group medians, and so it is the ⌈⌊n/5⌋/2⌉ smallest of them. There are ⌈⌊n/5⌋/2⌉ − 1 groups with medians smaller than m, each of which contains 3 elements smaller than m (the smallest two, and their median). The group that contains m also has two elements smaller than it. So there are are least 3(⌈⌊n/5⌋/2⌉ − 1) + 2 elements smaller than m.
7. How many elements of the array are guaranteed to be larger than m?
SOLUTION: The value m is the median of ⌊n/5⌋ group medians, and so it is the ⌈⌊n/5⌋/2⌉ smallest of them. There are ⌊n/2⌋ − (⌈⌊n/5⌋/2⌉) = ⌊⌊n/5⌋/2⌋ groups with medians larger than m, each of which contains 3 elements larger than m (the largest two, and their median). The group that contains m also has two elements larger than it. So there are are least 3(⌊⌊n/5⌋/2⌋) + 2 elements larger than m.
8. Consider once again the function QuickSelect from the last worksheet, with a name change and a dierent way of choosing p:
function DeterministicSelect(A[1..n], k) // returns the element of rank k in an array A of n distinct numbers, where 1 ≤ k ≤ n
3
if n≥5then
Choose pivot element p using the procedure described above Let Lesser be an array of all elements from A less than p
Let Greater be an array of all elements from A greater than p if |Lesser| = k – 1 then
return p else
if |Lesser| > k – 1 then // all smaller elts are in Lesser; k is unchanged return DeterministicSelect(Lesser, k)
else// |Lesser| < k − 1
// subtract from k the number of smaller elts removed return DeterministicSelect(Greater, k− |Lesser| −1)
else
sort A and return A[k − 1]
What is the largest possible size of one of the Lesser or Greater lists?
SOLUTION: The Lesser list is guaranteed not to contain m or the 3(⌊⌊n/5⌋/2⌋) + 2 elements that are larger than it. So there are 3(⌊⌊n/5⌋/2⌋) + 2 + 1 = 3(⌊⌊n/5⌋/2⌋ + 1) elements that it does not contain.
The Greater list is guaranteed not to contain m or the 3(⌈⌊n/5⌋/2⌉ − 1) + 2 elements smaller than it. So there are 3(⌈⌊n/5⌋/2⌉ − 1) + 2 + 1 = 3(⌈⌊n/5⌋/2⌉ − 1) + 3 = 3⌈⌊n/5⌋/2⌉ elements that it does not contain.
The smallest of these two values is 3⌈⌊n/5⌋/2⌉, and so in the worst case the largest one of Lesser and Greater contains at most
⌊n/5⌋ ⌊n/5⌋ (n−4) 3n 12 7n 7n
n − 3 2 ≤ n − 3 2 ≤ n − 3 10 = n − 10 + 10 = 10 + 1.2 ≤ 10 + 2
elements.
9. Derive a recurrence relation for the worst-case running time T (n) of algorithm DeterministicSelect
running on an array with n elements.
SOLUTION: Splitting the array into groups of 5 and nding the median of each group takes Θ(n) time (we perform ⌊n/5⌋ constant time work), as does splitting the array into Lesser and Greater once we have found the median of the group medians. All other steps take Θ(1) time, except for the recursive call to nd the median of the group medians (T(⌊n/5⌋)) and the nal recursive call on either Lesser or Greater (at most T (7n/10 + 2)). Therefore the worst-case running time of DeterministicSelect is described by the recurrence relation
T(⌊n/5⌋)+T(7n/10+2)+cn ifn≥5. Θ(1) if n ≤ 4.
T(n) ≤
10. Prove that the function T(n) described by your recurrence relation is in Θ(n).
SOLUTION: By drawing the recursion tree, we see that the amount of work done by row i is approximately at most (9/10)icn (it's only approximately that because the +2 terms). This is almost a decreasing geometric series, and hence we can guess that T (n) ∈ O(n). The work done at the root is cn, and so T (n) ∈ Ω(n) as well. Therefore T (n) ∈ Θ(n).
It is possible to prove the upper bound formally using mathematical induction. We rst need a lemma:
4
3
Clearly T(n) ≥ cn, and so T(n) ∈ Ω(n). Now let us prove that T(n) ≤ dn as long as n is large enough. We use the strong form of induction, starting with the induction step. Consider an unspecied value of n ≥ 25. Assuming that the theorem is true for ⌊n/5⌋ and 7n/10 + 2, we have
T (n) ≤ T(⌊n/5⌋) + T(7n/10 + 2) + cn ≤ d⌊n/5⌋ + d(7n/10 + 2) + cn
≤ dn/5+7dn/10+2d+cn
≤ 9dn/10 + (cn + 2d)
Now, we know that as long as we pick d ≥ 50c, cn+2d ≤ dn/10. Hence, as long as d ≥ 50c, T (n) ≤ 9dn/10 + dn/10 = n, as required.
It remains to consider the base case. The induction step works for n ≥ 25, and assumes the theorem holds for ⌊n/5⌋, which means that we need to have n0 = 5, and to treat n = 5,6,7,...,24 as base cases. Let us suppose that the values of d that work for them are d5, d6, . . . , d24. Then we can choose d∗ = max{d5, d6, . . . , d24, 50c}, and then for every n ≥ 5, T (n) ≤ d∗n, as required. Therefore T (n) ∈ O(n).
Bonus Problems
1. Prove that if the regularity condition holds in case 3 of the Master Theorem, then f(n) ∈ Ω(nlogb a+ε)
for some ε > 0.
2. Instead of groups of 5, could our DeterministicSelect algorithm use groups of 3? What about groups of 7?
Lemma 1 If n≥25, then for every c∈R+, there exists d∈R+ for which cn+2d≤dn/10. Proof : Consider an unspecied positive integer c, and choose d ≥ 50c. Then c ≤ d/50 and so
cn+2d ≤ (d/50)n+2d
≤ (dn/10)(1/5 + 20/n)
Sincen≥25,weknowthat20/n≤4/5,andsocn+2d≤dn/10. QED Now we prove that the function T(n) described by the recurrence relation
T(n) ≤ is in Θ(n).
T(⌊n/5⌋)+T(7n/10+2)+cn ifn≥5 Θ(1) if n ≤ 4
5