程序代写 COMP3121/9101

Algorithms COMP3121/9101

THE UNIVERSITY OF
NEW SOUTH WALES

Copyright By PowCoder代写 加微信 powcoder

Algorithms
COMP3121/9101

School of Computer Science and Engineering
University of Wales Sydney

5. THE FAST FOURIER TRANSFORM

COMP3121/9101 1 / 41

Our strategy to multiply polynomials fast:

Given two polynomials of degree at most n,

PA(x) = Anx
+ . . .+A0; PB(x) = Bnx

+ . . .+B0

1 convert them into value representation at 2n+ 1 distinct points
x0, x1, . . . , x2n:

PA(x) ↔ {(x0, PA(x0)), (x1, PA(x1)), . . . , (x2n, PA(x2n))}
PB(x) ↔ {(x0, PB(x0)), (x1, PB(x1)), . . . , (x2n, PB(x2n))}

2 multiply them point by point using 2n+ 1 multiplications:{
(x0, PA(x0)PB(x0)︸ ︷︷ ︸

), (x1, PA(x1)PB(x1)︸ ︷︷ ︸

), . . . , (x2n, PA(x2n)PB(x2n)︸ ︷︷ ︸

3 Convert such value representation of PC(x) to its coefficient form

PC(x) = C2nx
2n + C2n−1x

2n−1 + . . .+ C1x+ C0;

COMP3121/9101 2 / 41

Our strategy to multiply polynomials fast:

So, we need 2n+ 1 values of PA(xi) and PB(xi), 0 ≤ i ≤ 2n.

If we use 2n+ 1 integers which are the smallest by their absolute value, i.e.,

−n,−(n− 1), . . . ,−1, 0, 1, . . . , n− 1, n

among the values which we need to compute is
PA(n) = A0 +A1n+ . . .+An−1n

We saw that the trouble is that, as the degree n of the polynomials PA(x) and
PB(x) increases, the value of n

n increases very fast and causes rapid increase
of the computational complexity of the algorithm for polynomial multiplication
which we used in the generalised Karatsuba algorithm.

Key Question: What values should we take for x0, . . . , x2n to avoid
“explosion” of size when we evaluate xni while computing
PA(xi) = A0 +A1x+ . . .+Anx

COMP3121/9101 3 / 41

Complex numbers revisited

Complex numbers z = a+ ib can be represented using their modulus |z| =

and their argument, arg(z), which is an angle taking values in (−π, π] and satisfying:

z = |z|ei arg(z) = |z|(cos arg(z) + i sin arg(z)),

see figure below.

z!=!a!+!i!b!

COMP3121/9101 4 / 41

Complex numbers revisited

Recall that

|z|ei arg(z)

= |z|nei n arg(z) = |z|n(cos(n arg(z)) + i sin(n arg(z))),

see the figure.

2!arg!(z)!

COMP3121/9101 5 / 41

Complex roots of unity

Roots of unity of order n are complex numbers which satisfy zn = 1.

If zn = |z|n(cos(n arg(z)) + i sin(n arg(z))) = 1 then

|z| = 1 and n arg(z) is an integer multiple of 2π;

Thus, n arg(z) = 2π k, i.e., arg(z) =

We denote ωn = e
i 2π/n; such ωn is called a primitive root of unity of order n.

!! = !!!!!!/!!

(ω16)4 (ω16)5

1=(ω16)0 (ω16)8

Roots of unity of order 16

A root of unity ω of order n is primitive if all other roots of unity of the same
order can be obtained as its powers ωk.

COMP3121/9101 6 / 41

Complex roots of unity

For ωn = e
i 2π/n and for all k such that 0 ≤ k ≤ n− 1,

Thus, ωkn = (ωn)
k is also a root of unity (and it can be shown that it is

primitive just in case k is relatively prime with n).

Since ωkn are roots of unity for k = 0, 1, . . . , n− 1 and there are at most n
distinct roots of unity of order n (i.e., solutions to the equation xn − 1 = 0) we
conclude that every root of unity of order n must be of the form ωkn.

For the product of any two roots of unity ωkn and ω
n of the same order we

If k +m ≥ n then k +m = n+ l for l = (k +m)mod n and we have

n = 1 · ωln = ωln where 0 ≤ l < n. Thus, the product of any two roots of unity of the same order is just another root of unity of the same order. COMP3121/9101 7 / 41 Complex roots of unity So in the set of all roots of unity of order n, i.e., {1, ωn, ω2n, . . . , ωn−1n } we can multiply any two elements or raise an element to any power without going out of this set. Note that this is not true for addition, i.e., the sum of two roots of unity is NOT another root of unity! A very important property of the roots of unity is: The Cancelation Lemma: ωkmkn = ω n for all integers k,m, n. kn = (ωkn) Thus, in particular, (ωk2n) 2 = ω2k2n = (ω So the squares of the roots of unity of order 2n are just the roots of unity of COMP3121/9101 8 / 41 The Discrete Fourier Transform Let A = ⟨A0, A1, . . . , An−1⟩ be a sequence of n real or complex numbers. We can form the corresponding polynomial PA(x) = We can evaluate it at all complex roots of unity of order n, i.e., we compute n) for all 0 ≤ k ≤ n− 1. The sequence of values ⟨PA(1), PA(ωn), PA(ω2n), . . . , PA(ωn−1n )⟩, is called the Discrete Fourier Transform (DFT) of the sequence A = ⟨A0, A1, . . . , An−1⟩. The value PA(ω n) is usually denoted by Âk and the sequence of values ⟨PA(1), PA(ωn), PA(ω2n), . . . , PA(ωn−1n )⟩ is usually denoted by  = ⟨Â0, Â1, . . . , Ân−1⟩. The DFT  of a sequence A can be computed VERY FAST using a divide-and-conquer algorithm called the Fast Fourier Transform. COMP3121/9101 9 / 41 New way for fast multiplication of polynomials If we multiply a polynomial PA(x) = A0 + . . .+An−1x of degree n− 1 with a polynomial PB(x) = B0 + . . .+Bm−1x of degree m− 1 we get a polynomial C(x) = PA(x)PB(x) = C0 + . . .+ Cm+n−2x of degree n− 1 +m− 1 = m+ n− 2 with m+ n− 1 coefficients. To uniquely determine such a polynomial C(x) of degree m+ n− 2 we need m+ n− 1 many values. Thus, we will evaluate both PA(x) and PB(x) at all the roots of unity of order n+m− 1 (instead of at −(n− 1), . . . ,−1, 0, 1, . . . ,m− 1 as we would in Karatsuba’s method!) COMP3121/9101 10 / 41 New way for fast multiplication of polynomials Note that we defined the DFT of a sequence of length n as the values of the corresponding polynomial of degree n− 1 at the n roots of unity of order n, i.e., ωkn (0 ≤ k ≤ n− 1). So the DFT of a sequence A is another sequence  of exactly the same length; we do not have an operation which would evaluate a polynomial of degree n− 1 at m roots of unity of order m ̸= n. For that reason, since we need n+m− 1 values of PC(x) = PA(x)PB(x), we pad A with m− 1 zeros at the end, (A0, A1, . . . , An−1, 0, . . . , 0︸ ︷︷ ︸ ) to make it of length m+ n− 1, and similarly we pad B with n− 1 zeros at the end, (B0, B1, . . . , Bm−1, 0, . . . , 0︸ ︷︷ ︸ ) to also obtain a sequence of length n+m− 1. Note that this does not change the associated polynomials because the added higher powers have the corresponding coefficients equal to zero. COMP3121/9101 11 / 41 New way for fast multiplication of polynomials We can now compute the DFTs of the two (0 padded) sequences: DFT (⟨A0, A1, . . . , An−1, 0, . . . , 0︸ ︷︷ ︸ ⟩) = ⟨Â0, Â1, . . . , Ân+m−2) DFT (⟨B0, B1, . . . , Bm−1, 0, . . . , 0︸ ︷︷ ︸ ⟩) = ⟨B̂0, B̂1, . . . , B̂n+m−2) For each k we multiply the corresponding values Âk = PA(ω n+m−1) and B̂k = PB(ω n+m−1), thus obtaining Ĉk = ÂkB̂k = PA(ω n+m−1)PB(ω n+m−1) = PC(ω We then use the inverse transformation for DFT, called IDFT, to recover the coefficients ⟨C0, C1, . . . , Cn+m−1⟩ of the product polynomial PC(x) from the sequence ⟨Ĉ0, Ĉ1, . . . , Ĉn+m−1⟩ of its values Ĉk = PC(ωkn+m−1) at the roots of unity of order n+m− 1. COMP3121/9101 12 / 41 New way for fast multiplication of polynomials PA(x) = A0 +. . .+ An−1x + 0 · xn +. . .+ 0 · xn+m−2; PB(x) = B0 +. . .+ Bm−1x + 0 · xm +. . .+ 0 · xn+m−2 ⇓ DFT ⇓ DFT {PA(1), PA(ωn+m−1), . . . , PA(ω n+m−1)}; {PB(1), PB(ωn+m−1), . . . , PB(ω ⇓ multiplication PA(1)PB(1)︸ ︷︷ ︸ , PA(ωn+m−1)PB(ωn+m−1)︸ ︷︷ ︸ PC (ωn+m−1) , . . . , PA(ω n+m−1)PB(ω n+m−1)︸ ︷︷ ︸ PC(x) = PA(x) · PB(x) = COMP3121/9101 13 / 41 The Fast Fourier Transform (FFT) Crucial fact: the values PA(ω n) for all k such that 0 ≤ k < n can be computed in O(n logn) time! Note that a direct evaluation of a polynomial of degree n− 1 at n roots of unity of order n would take n2 many multiplications, even if we precompute all powers ωkmn , because we have to perform multiplications Am · (ωkn)m for all 0 ≤ m ≤ n− 1 and all 0 ≤ k ≤ n− 1. We can assume that n is a power of 2 - otherwise we can pad PA(x) with zero coefficients until its number of coefficients becomes equal to the nearest power of 2. Exercise: Show that for every n which is not a power of two the smallest power of 2 larger or equal to n is smaller than 2n. Hint: consider n in binary. How many bits does the nearest power of two have? COMP3121/9101 14 / 41 The Fast Fourier Transform (FFT) Problem: Given a sequence A = ⟨A0, A1, . . . , An−1⟩ compute its DFT. This amounts to finding values of PA(x) for all x = ω n, 0 ≤ k ≤ n− 1. The main idea of the FFT algorithm: divide-and-conquer by splitting the polynomial PA(x) into the even powers and the odd powers: PA(x) = (A0 +A2x 4 + . . .+An−2x n−2) + (A1x+A3x 3 + . . .+An−1x 2)2 + . . .+An−2(x 2)2 + . . .+An−1(x Let us define A[0] = ⟨A0, A2, A4, . . . An−2⟩ and A[1] = ⟨A1, A3, A5, . . . An−1⟩; then (y) = A0 +A2y +A4y 2 + . . .+An−2y (y) = A1 +A3y +A5y 2 + . . .+An−1y PA(x) = PA[0] (x Note that the number of coefficients of the polynomials P (y) is n/2 each, while the number of coefficients of the polynomial PA(x) is n. COMP3121/9101 15 / 41 The Fast Fourier Transform (FFT) Problem of size n: Evaluate a polynomial with n coefficients at n many roots of unity. Problem of size n/2: Evaluate a polynomial with n/2 coefficients at n/2 many roots of unity. We reduced evaluation of our polynomial PA(x) with n coefficients at inputs x = ω0n, x = ω n, . . . , x = ω n to evaluation of two polynomials PA[0] (y) and (y) each with n/2 coefficients, at points y = x2 for the same values of inputs x. However, as x ranges through values {ω0n, ω1n, ω2n, . . . , ω n }, the value of y = x2 ranges through {ω0 , . . . , ωn−1 }, and there are only n/2 distinct such values. Once we get these n/2 values of P (x2) and P (x2) we need n additional multiplications with numbers ωkn to obtain the values of n) = PA[0] ((ω 2) + ωkn · PA[1] ((ω (ωkn/2) + ω n · PA[1] (ω Thus, we have reduced a problem of size n to two such problems of size n/2, plus a linear overhead. COMP3121/9101 16 / 41 The Fast Fourier Transform (FFT) - a simplification Note that by the Cancelation Lemma ω = ω2 = −1. We can now simplify evaluation of n) = PA[0] (ω for n/2 ≤ k < n as follows: let k = n/2 +m where 0 ≤ m < n/2; then n ) = PA[0] (ω n PA[1] (ω ωmn/2) + ω n PA[1] (ω (ωmn/2)− ω n PA[1] (ω Compare this with PA(ω n ) = PA[0] (ω ) + ωmn PA[1] (ω ) for 0 ≤ m < n/2. COMP3121/9101 17 / 41 The Fast Fourier Transform (FFT) - a simplification So we can replace evaluations of n) = PA[0] (ω for k = 0 to k = n− 1 with such evaluations only for k = 0 to k = n/2− 1 and just let for k = 0 to k = n/2− 1 n) = PA[0] (ω n ) = PA[0] (ω We can now write a pseudo-code for our FFT algorithm: COMP3121/9101 18 / 41 FFT algorithm 1: function FFT(A) 2: n← length[A] 3: if n = 1 then return A 5: A[0] ← (A0, A2, . . . An−2); 6: A[1] ← (A1, A3, . . . An−1); 7: y[0] ← FFT (A[0]); 8: y[1] ← FFT (A[1]); 9: ωn ← ei 10: ω ← 1; % a variable to hold powers of ωn 11: for k = 0 to k = n/2− 1 do: % PA(ωkn)︸ ︷︷ ︸ (ωkn/2)︸ ︷︷ ︸ +ωkn PA[1] (ω n/2)︸ ︷︷ ︸ 12: yk ← y + ω · y[1] 13: yn/2+k ← y − ω · y[1] (ωkn/2)︸ ︷︷ ︸ −ωkn PA[1] (ω n/2)︸ ︷︷ ︸ 14: ω ← ω · ωn; 15: end for 16: return y 17: end if 18: end function COMP3121/9101 19 / 41 How fast is the Fast Fourier Transform? We have recursively reduced evaluation of a polynomial PA(x) with n coefficients at n roots of unity of order n to evaluations of two polynomials PA[0](y) and PA[1](y), each with n/2 coefficients, at n/2 many roots of unity of order n/2. Once we get these n/2 values of PA[0](y) and PA[1](y) we need n/2 additional multiplications to obtain the values of n/2)︸ ︷︷ ︸ n/2)︸ ︷︷ ︸ n/2)︸ ︷︷ ︸ −ωkn PA[1](ω n/2)︸ ︷︷ ︸ for all 0 ≤ k < n/2. Thus, we have reduced a problem of size n to two such problems of size n/2, plus a linear overhead. Consequently, our algorithm’s run time satisfies the recurrence T (n) = 2T (n/2) + c n The Master Theorem gives T (n) = Θ(n logn). COMP3121/9101 20 / 41 Matrix representation of polynomial evaluation Evaluation of a polynomial PA(x) = A0 + A1x + . . . + An−1x n−1 at roots of unity ωkn of order n can be represented in the matrix form as follows: 1 1 1 . . . 1 (n−1)(n−1) The FFT is just a method of replacing this matrix-vector multiplication taking n2 many multiplications with an n logn procedure. From PA(1), PA(ωn), PA(ω n), . . . , PA(ω n ), we get the coefficients from 1 1 1 . . . 1 (n−1)(n−1) COMP3121/9101 21 / 41 Another remarkable feature of the roots of unity: To obtain the inverse of the above matrix, all we have to do is just change the signs of the exponents and divide everything by n: 1 1 1 . . . 1 (n−1)(n−1) 1 1 1 . . . 1 −(n−1)(n−1) COMP3121/9101 22 / 41 To see this, note that if we compute the product  1 1 1 . . . 1 (n−1)(n−1)   1 1 1 . . . 1 −(n−1)(n−1)  the (i, j) entry in the product matrix is equal to a product of ith row and jth COMP3121/9101 23 / 41 We now have two possibilities: 1 i = j: then ω(i−j)kn = 2 i ̸= j: then n represents a sum of a geometric progression with the ratio ωi−jn and thus ω(i−j)kn = 1− ω(i−j)nn 1− (ωnn)i−j ω(i−j)kn = n if i = j 0 if i ̸= j COMP3121/9101 24 / 41 So we get:  1 1 1 . . . 1 (n−1)(n−1)   1 1 1 . . . 1 −(n−1)(n−1)   n 0 0 . . . 0 0 n 0 . . . 0 0 0 n . . . 0 0 0 0 . . . n   1 0 0 . . . 0 0 1 0 . . . 0 0 0 1 . . . 0 0 0 0 . . . 1   1 1 1 . . . 1 (n−1)(n−1)   1 1 1 . . . 1 1 ω−(n−1)n ω −(n−1)(n−1)  COMP3121/9101 25 / 41 We now have 1 1 1 . . . 1 (n−1)(n−1) 1 1 1 . . . 1 1 ω−(n−1)n ω −(n−1)(n−1) This means that to covert from the values ⟨PA(1), PA(ωn), PA(ω2n), . . . , PA(ω which we denoted by ⟨Â0, Â1, Â2, . . . , Ân−1⟩ back to the coefficient form PA(x) = A0 +A1x+A2x we can use the same FFT algorithm with the only change that: 1 the root of unity ωn is replaced by ωn = e 2 the resulting output values are divided by n. COMP3121/9101 26 / 41 Inverse Fast Fourier Transform (IFFT): 1: function IFFT∗(Â) 2: n← length(Â) 3: if n = 1 then return  5: Â[0] ← (Â0, Â2, . . . Ân−2); 6: Â[1] ← (Â1, Â3, . . . Ân−1); 7: y[0] ← IFFT ∗(Â[0]); 8: y[1] ← IFFT ∗(Â[1]); 9: ωn ← e−i n ; ⇐ different from FFT 10: ω ← 1; 11: for k = 0 to k = n/2− 1 do; 12: yk ← y + ω · y[1] 13: yn/2+k ← y − ω · y[1] 14: ω ← ω · ωn; 15: end for 16: return y; 17: end if 18: end function 1: function IFFT(Â) ⇐ different from FFT 2: return IFFT ∗(Â)/length(Â) 3: end function COMP3121/9101 27 / 41 Back to fast multiplication of polynomials (convolution) PA(x) = A0 + A1x + . . . + An−1x PB(x) = B0 + B1x + . . . + Bn−1x ⇓ DFT O(n logn) ⇓ DFT O(n logn) {PA(1), PA(ω2n−1), PA(ω 2n−1), . . . , PA(ω 2n−1)}; {PB(1), PB(ω2n−1), PB(ω 2n−1), . . . , PB(ω ⇓ multiplication O(n) {PA(1)PB(1), PA(ω2n−1)PB(ω2n−1), . . . , PA(ω ⇓ IDFT O(n logn) = PA(x) · PB(x) Thus, the product PC(x) = PA(x)PB(x) of two polynomials PA(x) and PB(x) can be computed in time O(n logn). COMP3121/9101 28 / 41 Computing the convolution C = A ∗B A = ⟨A0, A1, . . . , An−1⟩ B = ⟨B0, B1, . . . , Bn−1⟩ ⇓ O(n) ⇓ O(n) PA(x) = A0 + A1x + . . . + An−1x PB(x) = B0 + B1x + . . . + Bn−1x ⇓ DFT O(n logn) ⇓ DFT O(n logn) {PA(1), PA(ω2n−1), PA(ω 2n−1), . . . , PA(ω 2n−1)}; {PB(1), PB(ω2n−1), PB(ω 2n−1), . . . , PB(ω ⇓ multiplication O(n) {PA(1)PB(1), PA(ω2n−1)PB(ω2 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com