Algorithms: COMP3121/3821/9101/9801
School of Computer Science and Engineering, University of New South Wales, Sydney
2. DIVIDE-AND-CONQUER
A Puzzle
An old puzzle: We are given 27 coins of the same denomination; we know that one of them is counterfeit and that it is lighter than the others. Find the counterfeit coin by weighing coins on a pan balance only three times.
Solution: Split the coins into three groups of 9 and weigh two of the groups against each other; this reveals which group of 9 contains the lighter counterfeit coin. Repeat with three groups of 3 coins from that group, and finally with three single coins.
This method is called “divide-and-conquer”.
We have already seen a prototypical “serious” algorithm designed
using such a method: the Merge-Sort.
We split the array into two, sort the two parts recursively and
then merge the two sorted arrays.
We now look at a closely related but more interesting problem of
counting inversions in an array.
Counting the number of inversions
Assume that you have m users ranking the same set of n movies. You want to determine for any two users A and B how similar their tastes are (for example, in order to make a recommender system).
How should we measure the degree of similarity of two users A and B?
Let's enumerate the movies on the ranking list of user B by assigning index 1 to the top choice of user B, index 2 to his second choice, and so on.
For the ith movie on B's list we can now look at the position (i.e., index) of that movie on A's list, denoted by a(i).
Counting the number of inversions
[Figure: user B's list of 12 movies in order 1, 2, ..., 12, with each movie connected to its position on A's list; for example a(1) = 1, a(9) = 3, a(7) = 5, a(4) = 8, a(2) = 11.]
Counting the number of inversions
A good measure of how different these two users are is the total number of inversions, i.e., the total number of pairs of movies i, j such that movie i precedes movie j on B's list but movie j is higher up on A's list than movie i.
In other words, we count the number of pairs of movies i, j such that i < j (movie i precedes movie j on B's list) but a(i) > a(j) (movie i is in position a(i) on A's list, which is after position a(j) of movie j on A's list).
[Figure: the same two lists, with B's movies 1, ..., 12 connected to their positions on A's list: a(1) = 1, a(9) = 3, a(7) = 5, a(4) = 8, a(2) = 11.]
For example, 1 and 2 do not form an inversion because a(1) < a(2) (a(1) = 1 and a(2) = 11, since movie 1 is in the first place and movie 2 is in the 11th place on A's list). However, 4 and 7 do form an inversion because a(7) < a(4) (a(7) = 5, since movie 7 is in the fifth place on A's list, while a(4) = 8).
Counting the number of inversions
An easy way to count the total number of inversions between two lists is by looking at all pairs i < j of movies on one list and determining if they are inverted in the second list, but this would produce a quadratic time algorithm, T(n) = Θ(n^2).
We now show that this can be done in a much more efficient way, in time O(n log n), by applying a Divide-And-Conquer strategy.
Clearly, since the total number of pairs is quadratic in n, we cannot afford to inspect all possible pairs.
The main idea is to tweak the Merge-Sort algorithm, by extending it to recursively both sort an array A and determine the number of inversions in A.
Counting the number of inversions
We split the array A into two (approximately) equal parts Atop = A[1 ... ⌊n/2⌋] and Abottom = A[⌊n/2⌋ + 1 ... n].
Note that the total number of inversions in array A is equal to the sum of the number of inversions I(Atop) in Atop (such as 9 and 7) plus the number of inversions I(Abottom) in Abottom (such as 8 and 2) plus the number of inversions I(Atop,Abottom) across the two halves (such as 7 and 4).
[Figure: the array A split into two halves, Atop and Abottom.]
Counting the number of inversions
We now recursively sort arrays Atop and Abottom and obtain the number of inversions I(Atop) in the sub-array Atop and the number of inversions I(Abottom) in the sub-array Abottom.
We now merge the two sorted arrays Atop and Abottom while counting the number of inversions I(Atop,Abottom) which are across the two sub-arrays.
When the next smallest element among all elements in both arrays is an element of Abottom, that element is clearly in an inversion with all the elements remaining in Atop, so we add the number of elements remaining in Atop to the current count of inversions across Atop and Abottom.
[Figure: the two halves Atop and Abottom, each sorted, being merged into a single sorted array while the inversions across the two halves are counted.]
Counting the number of inversions
Whenever the next smallest element among all elements in both arrays is an element in Atop, such an element clearly is not involved in any inversions across the two arrays (such as 1, for example).
After the merging operation is completed, we obtain the total number of inversions I(Atop,Abottom) across Atop and Abottom.
The total number of inversions I(A) in array A is finally obtained as:
I(A) = I(Atop) + I(Abottom) + I(Atop, Abottom)
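
To make the recursion concrete, here is a minimal Python sketch of the sort-and-count procedure described above; the function name sort_and_count is mine, not from the slides.

    def sort_and_count(a):
        # returns (sorted copy of a, number of inversions in a)
        if len(a) <= 1:
            return list(a), 0
        mid = len(a) // 2
        top, inv_top = sort_and_count(a[:mid])
        bottom, inv_bottom = sort_and_count(a[mid:])
        merged, inv_across = [], 0
        i = j = 0
        while i < len(top) and j < len(bottom):
            if top[i] <= bottom[j]:
                merged.append(top[i])           # no inversion contributed
                i += 1
            else:
                inv_across += len(top) - i      # bottom[j] is inverted with every element left in top
                merged.append(bottom[j])
                j += 1
        merged.extend(top[i:])
        merged.extend(bottom[j:])
        return merged, inv_top + inv_bottom + inv_across

    # e.g. sort_and_count([1, 9, 7, 3, 4, 6, 8, 2, 5])[1] == 18, computed in O(n log n) time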
Next: we study applications of divide and conquer to arithmetic
of very large integers.
Basics revisited: how do we add two numbers?
  C C C C C            carry
    X X X X X          first integer
  + X X X X X          second integer
  -------------
  X X X X X X          result
adding 3 bits can be done in constant time;
the whole algorithm runs in linear time, i.e., O(n) many steps.
can we do it faster than in linear time?
no, because we have to read every bit of the input; thus there is no asymptotically faster algorithm.
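
For illustration, a minimal Python sketch of this linear-time procedure; the representation of numbers as lists of digits, least significant digit first, is a convention of the sketch, not of the slides.

    def add(x, y, base=10):
        # x, y: lists of digits, least significant digit first
        result, carry = [], 0
        for i in range(max(len(x), len(y))):
            d = (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0) + carry
            result.append(d % base)   # one digit of the sum
            carry = d // base         # carry into the next position
        if carry:
            result.append(carry)
        return result

    # e.g. add([9, 9], [1]) == [0, 0, 1], i.e. 99 + 1 = 100: one O(1) step per digit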
Basics revisited: how do we multiply two numbers?
        X X X X      <- first input integer
      * X X X X      <- second input integer
      ---------
        X X X X
      X X X X
    X X X X
  X X X X
  ---------------
  X X X X X X X X    <- result of length 2n

  O(n^2) intermediate operations:
  O(n^2) elementary multiplications + O(n^2) elementary additions
We assume that two X’s can be multiplied in O(1) time (each X could be a bit or a digit in some other base).
Thus the above procedure runs in time O(n^2).
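
For comparison, a minimal Python sketch of this grade-school procedure, using the same least-significant-digit-first lists as the addition sketch above; the two nested loops are what make it O(n^2).

    def multiply_school(x, y, base=10):
        # x, y: lists of digits, least significant digit first
        result = [0] * (len(x) + len(y))        # the product has at most 2n digits
        for i, xi in enumerate(x):
            carry = 0
            for j, yj in enumerate(y):          # n^2 elementary multiplications and additions
                d = result[i + j] + xi * yj + carry
                result[i + j] = d % base
                carry = d // base
            result[i + len(y)] += carry
        return result

    # e.g. multiply_school([2, 1], [3, 1]) == [6, 5, 1, 0], i.e. 12 * 13 = 156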
Can we do it in LINEAR time, like addition?
No one knows!
“Simple” problems can actually turn out to be difficult!
Can we do multiplication faster than O(n^2)?
Let us try a divide-and-conquer algorithm:
take our two input numbers A and B, and split them into two halves:
    A = A1 · 2^{n/2} + A0
    B = B1 · 2^{n/2} + B0

• A0, B0 - the least significant bits; A1, B1 - the most significant bits.
• AB can now be calculated as follows:

    AB = A1 B1 · 2^n + (A1 B0 + B1 A0) · 2^{n/2} + A0 B0        (1)
What we mean is that the product AB can be calculated recursively by the following program:
    function Mult(A, B)
        if |A| = |B| = 1 then return AB
        else
            A1 ← MoreSignificantPart(A);
            A0 ← LessSignificantPart(A);
            B1 ← MoreSignificantPart(B);
            B0 ← LessSignificantPart(B);
            X ← Mult(A0, B0);
            Y ← Mult(A0, B1);
            Z ← Mult(A1, B0);
            W ← Mult(A1, B1);
            return W · 2^n + (Y + Z) · 2^{n/2} + X
        end if
    end function
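
A minimal Python rendering of this recursion on non-negative integers, splitting by bit length; the helper name mult4 and the integer representation are my own choices, not from the slides.

    def mult4(a, b):
        # multiply non-negative integers with four recursive half-size multiplications
        if a < 2 or b < 2:                            # base case: 0- or 1-bit operand
            return a * b
        half = max(a.bit_length(), b.bit_length()) // 2
        a1, a0 = a >> half, a & ((1 << half) - 1)     # most / least significant parts
        b1, b0 = b >> half, b & ((1 << half) - 1)
        x = mult4(a0, b0)
        y = mult4(a0, b1)
        z = mult4(a1, b0)
        w = mult4(a1, b1)
        return (w << 2 * half) + ((y + z) << half) + x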
How many steps does this algorithm take?
Each multiplication of two n digit numbers is replaced by four multiplications of n/2 digit numbers: A1B1, A1B0, B1A0, A0B0, plus we have a linear overhead to shift and add:
    T(n) = 4T(n/2) + c n        (2)
Can we do multiplication faster than O(n^2)?
Claim: if T(n) satisfies

    T(n) = 4T(n/2) + c n        (3)

then

    T(n) = n^2 (c + 1) − c n

Proof: By “fast” induction. We assume it is true for n/2:

    T(n/2) = (n/2)^2 (c + 1) − c (n/2)

and prove that it is also true for n:

    T(n) = 4 T(n/2) + c n = 4 ((n/2)^2 (c + 1) − (n/2) c) + c n
         = n^2 (c + 1) − 2 c n + c n = n^2 (c + 1) − c n
Can we do multiplication faster than O(n^2)?
Thus, if T(n) satisfies T(n) = 4T(n/2) + c n, then

    T(n) = n^2 (c + 1) − c n = O(n^2),

i.e., we gained nothing with our divide-and-conquer!
Is there a smarter multiplication algorithm taking less than O(n^2) many steps??
Remarkably, there is, but first some history:
In 1952, one of the most famous mathematicians of the 20th century, Andrey Kolmogorov, conjectured that you cannot multiply in less than Ω(n^2) elementary operations. In 1960, Karatsuba, then a 23-year-old student, found an algorithm (later it was called “divide and conquer”) that multiplies two n-digit numbers in Θ(n^{log_2 3}) ≈ Θ(n^{1.58...}) elementary steps, thus disproving the conjecture!! Kolmogorov was shocked!
The Karatsuba trick
How did Karatsuba do it??
Take again our two input numbers A and B, and split them into two halves:
    A = A1 · 2^{n/2} + A0
    B = B1 · 2^{n/2} + B0

• AB can now be calculated as follows:

    AB = A1 B1 · 2^n + (A1 B0 + A0 B1) · 2^{n/2} + A0 B0
       = A1 B1 · 2^n + ((A1 + A0)(B1 + B0) − A1 B1 − A0 B0) · 2^{n/2} + A0 B0

• So we have saved one multiplication at each recursion round!
Thus, the algorithm will look like this:

    function Mult(A, B)
        if |A| = |B| = 1 then return AB
        else
            A1 ← MoreSignificantPart(A);
            A0 ← LessSignificantPart(A);
            B1 ← MoreSignificantPart(B);
            B0 ← LessSignificantPart(B);
            U ← A0 + A1;
            V ← B0 + B1;
            X ← Mult(A0, B0);
            W ← Mult(A1, B1);
            Y ← Mult(U, V);
            return W · 2^n + (Y − X − W) · 2^{n/2} + X
        end if
    end function
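
A matching Python sketch of the Karatsuba version, with the same conventions as mult4 above; note that only three recursive calls remain.

    def karatsuba(a, b):
        # multiply non-negative integers with three recursive half-size multiplications
        if a < 2 or b < 2:                            # base case: 0- or 1-bit operand
            return a * b
        half = max(a.bit_length(), b.bit_length()) // 2
        a1, a0 = a >> half, a & ((1 << half) - 1)
        b1, b0 = b >> half, b & ((1 << half) - 1)
        x = karatsuba(a0, b0)                         # A0 * B0
        w = karatsuba(a1, b1)                         # A1 * B1
        y = karatsuba(a0 + a1, b0 + b1)               # (A0 + A1) * (B0 + B1)
        return (w << 2 * half) + ((y - x - w) << half) + x   # Y - X - W = A1*B0 + A0*B1

    # e.g. karatsuba(1234, 5678) == 1234 * 5678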
How fast is this algorithm?
The Karatsuba trick
Clearly, the run time T(n) satisfies the recurrence
    T(n) = 3T(n/2) + c n

and this implies (by replacing n with n/2)

    T(n/2) = 3T(n/2^2) + c n/2

and by replacing n with n/2^2

    T(n/2^2) = 3T(n/2^3) + c n/2^2

So we get

    T(n) = 3T(n/2) + c n = 3(3T(n/2^2) + c n/2) + c n
         = 3^2 T(n/2^2) + c (3/2) n + c n = 3^2 (3T(n/2^3) + c n/2^2) + c (3/2) n + c n
         = 3^3 T(n/2^3) + c (3/2)^2 n + c (3/2) n + c n = ...
The Karatsuba trick
    T(n) = 3T(n/2) + c n = 3(3T(n/2^2) + c n/2) + c n = 3^2 T(n/2^2) + c (3/2) n + c n
         = 3^2 (3T(n/2^3) + c n/2^2) + c (3/2) n + c n = 3^3 T(n/2^3) + c (3/2)^2 n + c (3/2) n + c n
         = 3^3 T(n/2^3) + c n ((3/2)^2 + 3/2 + 1)
         = 3^3 (3T(n/2^4) + c n/2^3) + c n ((3/2)^2 + 3/2 + 1)
         = 3^4 T(n/2^4) + c n ((3/2)^3 + (3/2)^2 + 3/2 + 1)
         = ...
         = 3^{⌊log_2 n⌋} T(n/2^{⌊log_2 n⌋}) + c n ((3/2)^{⌊log_2 n⌋−1} + ... + (3/2)^2 + 3/2 + 1)
         ≈ 3^{log_2 n} T(1) + c n ((3/2)^{log_2 n} − 1) / (3/2 − 1)
         = 3^{log_2 n} T(1) + 2 c n ((3/2)^{log_2 n} − 1)
The Karatsuba trick
So we got

    T(n) ≈ 3^{log_2 n} T(1) + 2 c n ((3/2)^{log_2 n} − 1)

We now use a^{log_b n} = n^{log_b a} to get:

    T(n) ≈ n^{log_2 3} T(1) + 2 c n ((3/2)^{log_2 n} − 1) = n^{log_2 3} T(1) + 2 c n (n^{log_2 3 − 1} − 1)
         = n^{log_2 3} T(1) + 2 c n^{log_2 3} − 2 c n = O(n^{log_2 3}) = O(n^{1.58...}) < n^2
Please review the basic properties of logarithms and the asymptotic notation from the review material (the first item at the class webpage under “class resources”.)
A Karatsuba style trick also works for matrices: Strassen’s algorithm for faster matrix multiplication
If we want to multiply two n × n matrices P and Q, the product will be a matrix R, also of size n × n. To obtain each of the n^2 entries in R we do n multiplications, so matrix product by brute force is Θ(n^3).
However, we can do it faster using Divide-And-Conquer;
We split each matrix into four blocks of (approximate) size n/2 × n/2:
    P = [ a  b ]      Q = [ e  f ]      R = [ r  s ]
        [ c  d ]          [ g  h ]          [ t  u ]

Then

    [ a  b ]   [ e  f ]     [ r  s ]
    [ c  d ] · [ g  h ]  =  [ t  u ]        (4)
A Karatsuba style trick also works for matrices: Strassen’s algorithm for faster matrix multiplication
We obtain:

    [ a  b ]   [ e  f ]     [ r  s ]
    [ c  d ] · [ g  h ]  =  [ t  u ]        (5)

where

    r = ae + bg;   s = af + bh;   t = ce + dg;   u = cf + dh.

Prima facie, there are 8 matrix multiplications of n/2 × n/2 blocks, each running in time T(n/2), and 4 matrix additions, each running in time O(n^2), so such a direct calculation would result in time complexity governed by the recurrence

    T(n) = 8T(n/2) + c n^2

The first case of the Master Theorem gives T(n) = Θ(n^3), so nothing gained.
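
Spelling out the Master Theorem step (a sketch; the theorem statement itself is assumed known):

    a = 8,  b = 2,  f(n) = c n^2,  n^{log_b a} = n^{log_2 8} = n^3

Since f(n) = c n^2 = O(n^{3−ε}) for ε = 1, Case 1 applies, giving T(n) = Θ(n^3), no better than the brute-force product.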
Strassen’s algorithm for faster matrix multiplication
However, we can instead evaluate:
    A = a (f − h);        B = (a + b) h;        C = (c + d) e;        D = d (g − e);
    E = (a + d)(e + h);   F = (b − d)(g + h);   H = (a − c)(e + f).

We now obtain:

    E + D − B + F = (ae + de + ah + dh) + (dg − de) − (ah + bh) + (bg − dg + bh − dh) = ae + bg = r;
    A + B = (af − ah) + (ah + bh) = af + bh = s;
    C + D = (ce + de) + (dg − de) = ce + dg = t;
    E + A − C − H = (ae + de + ah + dh) + (af − ah) − (ce + de) − (ae − ce + af − cf) = cf + dh = u.
We have obtained all 4 blocks of R using only 7 matrix multiplications and 18 matrix additions/subtractions.
Thus, the run time of such a recursive algorithm satisfies

    T(n) = 7T(n/2) + O(n^2)

and the Master Theorem yields

    T(n) = Θ(n^{log_2 7}) = O(n^{2.808}).
In practice, this algorithm beats the ordinary matrix multiplication for n > 32.
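
For illustration, a minimal Python/NumPy sketch of this recursion; it assumes, purely for simplicity, that n is a power of two (the general case pads the matrices), and the variable names follow the seven products above.

    import numpy as np

    def strassen(P, Q):
        # multiply two n x n matrices (n a power of two) with seven recursive products
        n = P.shape[0]
        if n == 1:
            return P * Q                       # 1 x 1 base case
        m = n // 2
        a, b, c, d = P[:m, :m], P[:m, m:], P[m:, :m], P[m:, m:]
        e, f, g, h = Q[:m, :m], Q[:m, m:], Q[m:, :m], Q[m:, m:]
        A = strassen(a, f - h)
        B = strassen(a + b, h)
        C = strassen(c + d, e)
        D = strassen(d, g - e)
        E = strassen(a + d, e + h)
        F = strassen(b - d, g + h)
        H = strassen(a - c, e + f)
        r = E + D - B + F                      # = ae + bg
        s = A + B                              # = af + bh
        t = C + D                              # = ce + dg
        u = E + A - C - H                      # = cf + dh
        return np.block([[r, s], [t, u]])

For example, strassen(np.ones((4, 4)), np.ones((4, 4))) returns the 4 × 4 matrix with every entry equal to 4.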
Next time:
1 Can we multiply large integers faster than O(n^{log_2 3})??
2 Can we avoid messy computations like:
    T(n) = 3T(n/2) + c n = 3(3T(n/2^2) + c n/2) + c n = 3^2 T(n/2^2) + c (3/2) n + c n
         = 3^2 (3T(n/2^3) + c n/2^2) + c (3/2) n + c n = 3^3 T(n/2^3) + c (3/2)^2 n + c (3/2) n + c n
         = 3^3 T(n/2^3) + c n ((3/2)^2 + 3/2 + 1)
         = 3^3 (3T(n/2^4) + c n/2^3) + c n ((3/2)^2 + 3/2 + 1)
         = 3^4 T(n/2^4) + c n ((3/2)^3 + (3/2)^2 + 3/2 + 1)
         = ...
         = 3^{⌊log_2 n⌋} T(n/2^{⌊log_2 n⌋}) + c n ((3/2)^{⌊log_2 n⌋−1} + ... + (3/2)^2 + 3/2 + 1)
         ≈ 3^{log_2 n} T(1) + c n ((3/2)^{log_2 n} − 1) / (3/2 − 1)
         = 3^{log_2 n} T(1) + 2 c n ((3/2)^{log_2 n} − 1)
PUZZLE!
You are given a 2^n × 2^n board with one of its cells missing (i.e., the board has a hole); the position of the missing cell can be arbitrary. You are also given a supply of L-shaped “dominoes”, each covering 3 such cells.
Your task is to design an algorithm which covers the entire board with such “dominoes” except for the hole.
Hint: Do a divide-and-conquer recursion!
That’s All, Folks!!