COMP251: Divide-and-Conquer (2)
Jérôme Waldispühl School of Computer Science
McGill University
Based on (Kleinberg & Tardos, 2005) & (Cormen et al.,2009)
How to determine the running time of a divide-and-conquer algorithm?
The Master Theorem
Number of recursive calls
Recursive definition T(n): execution time on an input of size n.
MergeSort:𝑇𝑛 =2%𝑇 !” +𝑛 BinarySearch:𝑇 𝑛 =𝑇 !” +1 Karatsuba:𝑇𝑛 =3%𝑇 !” +𝑛
Time to merge
Size of sub- problems
Master method
Goal. Recipe for solving common divide-and-conquer recurrences: T(n)=aT n + f(n)
b
・a ≥ 1 is the number of subproblems.
・b > 0 is the factor by which the subproblem size decreases. ・f (n) = work to divide/merge subproblems.
Terms.
Recursion tree.
・k = logb n levels.
・ai = number of subproblems at level i. ・n / bi = size of subproblem at level i.
3
Case 1: total cost dominated by cost of leaves
Ex 1. If T (n) satisfies T (n) = 3 T (n / 2) + n, with T (1) = 1, then T (n) = Θ(nlg 3). T (n)
T(n/2) T(n/2) T(n/2)
T(n/4) T(n/4) T(n/4) T(n/4) T(n/4) T(n/4) T(n/4) T(n/4) T(n/4)
⋮
T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) … T(1) T(1) T(1) 3log2 n = nlog2 3
r = 3 / 2 > 1 T (n) = (1 + r + r2 + r3 + . . . + rlog2 n) n = r1+log2 n 1 n = r 1
n
3(n /2)
32 (n / 22) 3i (n / 2i)
log2 n
⋮
3log2 n(n/2log2 n)
3nlog2 3 2n
4
Case 2: total cost evenly distributed among levels
Ex 2.
If T (n) satisfies T (n) = 2 T (n / 2) + n, with T (1) = 1, then T (n) = Θ(n log n).
T (n)
n
2 (n / 2)
22 (n/22)
23(n/23) ⋮
n (1)
T (n / 2)
T (n / 2)
T(n/4)
T(n/4)
T(n/4)
T(n/4)
T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) ⋮
T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) … T(1) T(1) T(1) 2log2 n = n
log2 n
r =1
T (n) = (1 + r + r2 + r3 + . . . + rlog2 n) n = n (log2 n + 1)
5
Case 3: total cost dominated by cost of root
Ex 3. If T (n) satisfies T (n) = 3 T (n / 4) + n5, with T (1) = 1, then T (n) = Θ(n5). T (n)
T(n/4) T(n/4) T(n/4)
T(n/16) T(n/16) T(n/16) T(n/16) T(n/16) T(n/16) T(n/16) T(n/16) T(n/16)
⋮
T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) T(1) … T(1) T(1) T(1) 3log4 n = nlog4 3
log4 n
n5
3(n /4)5
32 (n / 42)5 3i (n / 4i)5
⋮
3log4 n(n/4log4 n)5
r = 3 / 45 < 1 n5 ≤ T (n) ≤ (1 + r + r 2 + r 3 + ... ) n5 ≤ 1 n5
1–r 6
Master theorem
Master theorem. Suppose that T (n) is a function on the nonnegative integers that satisfies the recurrence
T(n)=aT n + f(n) b
where n/b means either !n/b" or #n/b$. Let k = logb a. Then,
Case 1. If f(n) = O(nk – ε) for some constant ε > 0, then T(n) = Θ(nk).
E・x. T (n) = 3 T(n / 2) + n. ・a = 3, b = 2, f (n) = n,
T (n) = Θ(nlg 3).
k = log2 3.
The formula works with 𝜀 = log! 3 − 1 > 0 𝑓 𝑛 =𝑛=𝑂 𝑛”#$!%& “#$!%&’
7
Master theorem
Master theorem. Suppose that T (n) is a function on the nonnegative integers that satisfies the recurrence
T(n)=aT n + f(n) b
where n/b means either !n/b” or #n/b$. Let k = logb a. Then, Case 2. If f(n) = Θ(nk logp n), then T(n) = Θ(nk logp+1 n).
E・x. T (n) = 2 T(n / 2) + Θ(n log n).
・a=2, b=2, f(n)=𝑛17long,𝑛k=log22=1, p=1.
T (n) = Θ(n log2 n).
𝑓 𝑛 =Θ 𝑛log𝑛 =Θ(𝑛”#$!!log𝑛)
8
Master theorem
Master theorem. Suppose that T (n) is a function on the nonnegative integers that satisfies the recurrence
T(n)=aT n + f(n) b
regularity condition holds where n/b means either !n/b” or #n/b$. Let k = logb a. Then, if f(n) = Θ(nk + ε)
Case 3. If f (n) = Ω(nk + ε) for some constant ε > 0 and if a f (n / b) ≤ c f (n) for some constant c < 1 and all sufficiently large n, then T (n) = Θ ( f (n) ).
E・x. T (n) = 3 T(n / 4) + n5.
・a = 3, b = 4, T (n) = Θ(n5).
f (n) = n5,
k = log4 3.
1st property satisfied with 𝜀 = 1 − log( 3
𝑓 𝑛 = 𝑛) = Ω(𝑛"#$" %*('&"#$" %)) 2nd property satisfied with c=%( 3G 𝑛 )≤𝑐G𝑛)
4
9
Master theorem
Master theorem. Suppose that T (n) is a function on the nonnegative integers that satisfies the recurrence
T(n)=aT n + f(n) b
where n/b means either !n/b" or #n/b$. Let k = logb a. Then,
Case 1. If f(n) = O(nk – ε) for some constant ε > 0, then T(n) = Θ(nk).
Case 2. If f(n) = Θ(nk logp n), then T(n) = Θ(nk logp+1 n).
Case 3. If f (n) = Ω(nk + ε) for some constant ε > 0 and if a f (n / b) ≤ c f (n) for some constant c < 1 and all sufficiently large n, then T (n) = Θ ( f (n) ).
P・f sketch.
・Use recursion tree to sum up terms (assuming n is an exact power of b). ・Three cases for geometric series.
Deal with floors and ceilings.
10
𝑘=log!1=0;𝑓 𝑛 =2- 2- = Ω 𝑛.*"#$ !
Applications 𝑘=log!3;𝑓 𝑛 =𝑛! 𝑛! = Ω 𝑛"#$! %*(!&"#$! %)
3 G 𝑛 ! ≤ 3 G 𝑛 ! 224
1 G 2 -! ≤ 1 G 2 -
T(n) = 3 * T(n/2) + n2 ⇒ T(n) = Θ(n2)
T(n) = T(n/2) + 2n ⇒ T(n) = Θ(2n)
T(n) = 16 * T(n/4) + n ⇒ T(n) = Θ(n2)
(case 3) (case 3) (case 1)
𝑘=log(16=2;𝑓 𝑛 =𝑛 𝑛=𝑂 𝑛!&'
T(n) = 2 * T(n/2) + n log n
⇒ T(n) = n log2 n (case 2)
T(n)=2n *T(n/2)+nn
⇒Doesnotapply!! 𝑘=log!2=1;𝑓 𝑛 =𝑛log𝑛 𝑛log𝑛=Θ 𝑛'𝑙𝑜𝑔'𝑛
Akra-Bazzi theorem
Desiderata. Generalizes master theorem to divide-and-conquer algorithms where subproblems have substantially different sizes.
Theorem. [Akra-Bazzi] Given constants ai > 0 and 0 < bi ≤ 1, functions
hi (n) = O(n / log 2 n) and g(n) = O(nc), if the function T(n) satisfies the recurrence:
k i=1
n k
np 1 +
T(n) =
aiT(bin+hi(n))+g(n)
ai subproblems small perturbation to handle of size bi n floors and ceilings
Then T(n) =
g(u) du where p satisfies ai bpi 1 up+1 i=1
= 1 .
E・x. T(n) = 7/4 T(!n/2") + T(#3/4n$) + n2. ・a1 = 7/4, b1 = 1/2, a2 = 1, b2 = 3/4 ⇒ p = 2. ・h1(n) = !1/2 n" – 1/2 n, h2(n) = #3/4 n$ – 3/4 n.
g(n)=n2 ⇒ T(n)=Θ(n2 logn).
11
Another Divide-and-Conquer Algorithms
Dot product
Dot product. Given two length n vectors a and b, compute c = a ⋅ b. Grade-school. Θ(n) arithmetic operations.
€
Remark. Grade-school dot product algorithm is asymptotically optimal.
n
a ⋅ b = ∑ ai bi
i=1
a =[.70 .20 .10] € b =[.30 .40 .30]
a ⋅ b = (.70×.30) + (.20×.40) + (.10×.30) = .32
24
Matrix multiplication
Matrix multiplication. Given two n-by-n matrices A and B, compute C = AB.
Grade-school. Θ(n3) arithmetic operations.
n
cij =∑aikbkj
k=1
€
"c11 c12 c1n% $ccc'
"a11 a12 a1n% $aaa'
"b11b12b1n% $bbb'
$21 22 2n' $ '
=
$21 22 2n' $ '
×
$21 22 €2n' $ '
$ccc' #n1 n2 nn&
$aaa' #n1 n2 nn&
$bbb' #n1 n2 nn&
".59 .32 .41% ".70 .20 .10% " .80 .50%
$'$'$' .31 .36 .25 = .30 .60 .10 × .10 .10 $'$'$'
$.45 .31 .42' $.50 .10 .40' $ .10 .40' #&#&#&
.30 .40 .30
€
Q. Is grade-school matrix multiplication algorithm asymptotically optimal?
25
Block matrix multiplication
C11
A11 A12 B11
" 164 170% "0 1 2 3% "16 17 18 19% $ 548 570' $4 5 6 7' $20 21 22 23'
152 158 504 526
$'=$'×$'
$856 894 932 970' $8 9 10 11' $24 25 26 27'
$'$'$'
1208 1262 1316 1370 12 13 14 15 28 29 30 31
#&#&#&
€
€
B11
C = A ×B + A ×B = #0 1&×#16 17& + #2 3&×#24 25& = #152 158& 11 11111221%(%(%(%(%(
$4 5' $20 21' $6 7' $28 29' $504 526'
26
€
€
Matrix multiplication: warmup
T・o multiply two n-by-n matrices A and B:
・Divide: partition A and B into 1⁄2n-by-1⁄2n blocks. ・Conquer: multiply 8 pairs of 1⁄2n-by-1⁄2n matrices, recursively.
Combine: add appropriate products using 4 matrix additions.
"C11 C12% = "A11 $#C21 C22'& $#A21
A12% × "B11 B12% A22'& $#B21 B22'&
C11 = C12 = C21 = C22 =
(A11×B11)+ (A12×B21) (A11 ×B12) + (A12 ×B22) (A21 × B11) + (A22 × B21) (A21×B12)+ (A22×B22)
Running time. Apply case 1 of Master Theorem.
€
27
T(n)= 8T(n/2) + Θ(n2)
recursive calls add, form submatrices
⇒ T(n)=Θ(n3)
€
Strassen's trick
Key idea. multiply 2-by-2 blocks with only 7 multiplications. (plus 11 additions and 7 subtractions)
"C11 C12% = "A11 A12% × "B11 B12%
P ← A " (B – B ) 1 11 12 22
P2 ← (A11 + A12) " B22
P3 ← (A21 +A22)"B11
P4 ← A22 " (B21 – B11)
P5 ← (A11 +A22)"(B11 +B22) P6 ← (A12 –A22)"(B21 +B22) P7 ← (A11 –A21)"(B11 +B12)
$#C C '& $#A A '& 21 22 21 22
C11 = P5+P4–P2+P6 C12 = P1+P2
C21 = P3+P4
C22 = P1+P5–P3–P7
$#B B '& 21 22
Pf. C12 =P1 +P2
= A11 " (B12 – B22) + (A11 + A12) " B22
=A11"B12+A12"B22. ✔
28
Strassen's algorithm
assume n is a power of 2
IF (n = 1) RETURN A " B.
Partition A and B into 2-by-2 block matrices.
P1 ← STRASSEN (n / 2, A11, (B12 – B22)).
P2 ← STRASSEN (n / 2, (A11 + A12), B22).
P3 ← STRASSEN (n / 2, (A21 + A22), B11).
P4 ← STRASSEN (n / 2, A22, (B21 – B11)).
P5 ← STRASSEN (n / 2, (A11 + A22) " (B11 + B22)). P6 ← STRASSEN (n / 2, (A12 – A22) " (B21 + B22)). P7 ← STRASSEN (n / 2, (A11 – A21) " (B11 + B12)). C11 = P5 +P4 –P2 +P6.
C12 = P1+P2.
C21 = P3+P4.
C22 = P1 +P5 –P3 –P7.
STRASSEN (n, A, B) ______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
RETURN C. ______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
keep track of indices of submatrices (don't copy matrix entries)
29
Analysis of Strassen's algorithm
Theorem. Strassen's algorithm requires O(n2.81) arithmetic operations to multiply two n-by-n matrices.
Pf. Apply case 1 of the master theorem to the recurrence:
€
T(n)= 7T(n/2)+ Θ(n2)
recursive calls add, subtract
⇒ T(n)=Θ(nlog2 7)=O(n2.81)
Q. What if n is not a power of 2 ? A. Could pad matrices with zeros.
1230 1011120 84 90 96 0 4 5 6 0 13 14 15 0 = 201 216 231 0 7 8 9 0 16 17 18 0 318 342 366 0 0000 0000 0000
30
Strassen's algorithm: practice
Implementation issues.
・Sparsity.
・Caching effects.
・Numerical stability.
・Odd matrix dimensions.
・Crossover to classical algorithm when n is "small" .
Common misperception. “Strassen is only a theoretical curiosity.” ・Apple reports 8x speedup on G4 Velocity Engine when n ≈ 2,048. ・Range of instances where it's useful is a subject of controversy.
31
Linear algebra reductions
Matrix multiplication. Given two n-by-n matrices, compute their product.
problem
linear algebra
order of growth
matrix multiplication
A×B
Θ(MM(n))
matrix inversion
A –1
Θ(MM(n))
determinant
|A|
Θ(MM(n))
system of linear equations
Ax = b
Θ(MM(n))
LU decomposition
A=LU
Θ(MM(n))
least squares
min ||Ax – b ||2
Θ(MM(n))
numerical linear algebra problems with the same complexity as matrix multiplication
32
Fast matrix multiplication: theory
Q. Multiply two 2-by-2 matrices with 7 scalar multiplications?
A. Yes! [Strassen 1969] Θ(nlog2 7 ) = O(n 2.807 )
Q. Multiply two 2-by-2 matrices with 6 scalar multiplications?
A. Impossible. [Hopcroft and Kerr 1971]
Q. Multiply two 3-by-3 matrices with 21 scalar multiplications?
A. Unknown.
B・egun, the decimal wars have. [Pan, Bini et al, Schönhage, ...]
€
・Two 20-by-20 matrices with 4,460 scalar multiplications. ・Two 48-by-48 matrices with 47,217 scalar multiplications.
O(n
2.805
)
・A year later. ・December 1979.
January 1980.
€
O(n 2.7801 )
O(n 2.7799 )
€
Θ(n log2 6) = O(n 2.59 ) Θ(nlog3 21) = O(n 2.77 )
€
O(n 2.521813) € O(n 2.521801)
€ € €
33
History of asymptotic complexity of matrix multiplication
year
algorithm
order of growth
?
brute force
O (n 3 )
1969
Strassen
O (n 2.808 )
1978
Pan
O (n 2.796 )
1979
Bini
O (n 2.780 )
1981
Schönhage
O (n 2.522 )
1982
Romani
O (n 2.517 )
1982
Coppersmith-Winograd
O (n 2.496 )
1986
Strassen
O (n 2.479 )
1989
Coppersmith-Winograd
O (n 2.376 )
2010
Strother
O(n2.3737 )
2011
Williams
O(n2.3727 )
?
?
O (n 2 + ε )
number of floating-point operations to multiply two n-by-n matrices
34