handoutE.dvi
ECS130 Scientific Computing Handout E February 13, 2017
1. The Power Method
(a) Pseudocode:
Power Iteration
Given an initial vector u0,
i = 0
repeat
ti+1 = Aui
ui+1 = ti+1/‖ti+1‖2 (approximate eigenvector)
θi+1 = u
H
i+1Aui+1 (approximate eigenvalue)
i = i+ 1
until convergence
(b) Practical stopping criterion: |θi+1 − θi| ≤ tol · |θi|.
(c) Example: Let
A =
−261 209 −49
−530 422 −98
−800 631 −144
.
and λ(A) = {10, 4, 3}. Let u0 = (1, 0, 0)
T , then
i 1 2 3 · · · 10
θi 994.49 13.0606 10.07191 · · · 10.0002
(d) Convergence analysis: Assume
A = XΛX−1
with Λ = diag(λ1, λ2, . . . , λn) and |λ1| > |λ2| ≥ . . . ≥ |λn|. Then, we can show that
• ui =
Aiu0
‖Aiu0‖
→ x1/‖x1‖, where x1 = Xe1 as i → ∞.
• θi → λ1 as i → ∞.
(e) The convergence rate depends on
|λ2|
|λ1|
. Therefore, if
|λ2|
|λ1|
is close to 1, then the power
method could be very slow convergent or doesn’t converge at all.
1
2. Inverse Iteration
(a) Purposes:
• Overcome the drawbacks of the power method (slow convergence)
• find an eigenvalue closest to a particular given number (called shift): σ
(b) Observation: if λ is an eigenvalue of A, then
• λ− σ is an eigenvalue of A− σI,
• 1
λ−σ
is an eigenvalue of (A− σI)−1.
1/(\lambda-\sigma)
\sigma
(c) Pseudocode
Inverse Iteration
Given an initial vector u0 and a shift σ
i = 0
repeat
solve (A− σI)ti+1 = ui for ti+1
ui+1 = ti+1/‖ti+1‖2 (approximate eigenvector)
θi+1 = u
H
i+1Aui+1 (approximate eigenvalue)
i = i+ 1
until convergence
(d) Convergence analysis: Assume A = XΛX−1 with Λ = diag(λ1, λ2, . . . , λn) and λk is the
eigenvalue cloest to the shift σ. It can be shown that
• ui → xk/‖xk‖ as i → ∞, where xk = Xek
• θi converges to λk i → ∞.
• Convergence rate depends on maxj 6=k
|λk−σ|
|λj−σ|
.
(e) Advantages: the ability to converge to any desired eigenvalue (the one nearest to the shift
σ). By choosing σ very close to a desired eigenvalue, the method converges very quickly
and thus not be as limited by the proximity of nearby eigenvalues as is the original power
method. The method is particularly effective when we have a good approximation to an
eigenvalue and want only its corresponding eigenvector.
(f) Drawbacks: (a) expensive in general: solving (A − σI)ti+1 = ui for ui+1. One LU
factorization of A−σI is required, which could be very expensive for large matrices, (b)
Only compute one eigenpair.
2
3. Orthogonal iteration (subspace iteration, simultaneous iteration)
(a) Purpose: compute a p-dimensional invaraint subspace, p > 1, rather than one eigenvector
at a time.
(b) Pseudocode:
Orthogonal Iteration
Given an initial n× p orthogonal matrix Z0
i = 0
repeat
Yi+1 = AZi
Yi+1 = Zi+1Ri+1 (QR decomposition)
i = i+ 1
until convergence
The use of QR decomposition keeps the vectors spanning span{AiZ0} of full rank.
(c) Convergence: under mild conditions, Zi converges to the invariant subspace spanned by
the first p eigenvectors corresponding to the p dominant eigenvalues, where
|λ1| ≥ |λ2| ≥ · · · ≥ |λp| > |λp+1| ≥ · · · ≥ |λn|.
If we let Bi = Z
T
i AZi, then
‖AZi − ZiBi‖ → 0 as i → ∞
and eigenvalues of Bi approximate the dominant eigenvalues of A.
Convergence rate depends on |λp+1|/|λp|.
(d) Example: Let Z0 = [e1, e2, e3] and
A =
-0.4326 1.1892 -0.5883 -0.0956 -0.6918 -0.3999
-1.6656 -0.0376 2.1832 -0.8323 0.8580 0.6900
0.1253 0.3273 -0.1364 0.2944 1.2540 0.8156
0.2877 0.1746 0.1139 -1.3362 -1.5937 0.7119
-1.1465 -0.1867 1.0668 0.7143 -1.4410 1.2902
1.1909 0.7258 0.0593 1.6236 0.5711 0.6686
Eigenvalues of A = -2.1659 +- 0.5560i, 2.1493, 0.2111 +- 1.9014i, -0.9548
i = 10: Eigenvalues of Z’_10*A*Z_10: -1.4383 +- 0.3479i, 2.1500
i = 30: Eigenvalues of Z’_30*A*Z_30: -2.1592 +- 0.5494i, 2.1118
i = 70: Eigenvalues of Z’_70*A*Z_70: -2.1659 +- 0.5560i, 2.1493
(e) An important special case: Let p = n and Z0 = I, then Ai = Z
T
i AZi converges to the
Schur form of A provided that
(1) all eigenvalues of A have distinct absolute values and
(2) all the principal submatrices of S have full rank, where we assume A = SΛS−1.
(f) Example: the same test matrix, numerical results of orthogonal iteration with Z0 = I:
3
A_10 =
-1.6994 -0.2201 -0.8787 1.4292 0.3847 0.0112
0.0007 1.1325 -1.2186 1.2245 -0.0867 -0.0648
0.2637 -1.9636 -0.1598 2.3959 -0.8136 -0.4311
-0.0364 -0.2346 0.5527 -0.4393 -1.9263 -1.2496
-0.4290 1.3482 1.1484 0.6121 -0.5937 -0.2416
0.0003 -0.0013 -0.0003 0.0011 -0.0014 -0.9554
A_30 =
-2.4055 -1.0586 -1.3420 0.0991 -1.1210 0.1720
0.0517 0.9645 -1.6519 0.8512 0.7215 0.7654
0.2248 -1.9947 -0.7656 -1.1876 -0.2736 0.1552
0.0029 0.0263 -0.0682 0.1381 -2.3094 -0.6765
0.0147 -0.0808 -0.0569 1.5462 0.3082 0.8476
0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548
A_70 =
-2.0800 1.6092 -0.6426 1.1025 -0.1553 0.0734
-0.1690 -1.6820 1.7425 -0.0311 0.9229 -0.3087
0.0677 1.2521 1.5795 -0.1458 1.2092 0.7088
0.0000 0.0000 -0.0005 0.5575 -1.7386 -1.0520
0.0000 0.0004 0.0003 2.1489 -0.1353 -0.3250
0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548
….
As we see Ak is converging to a quasi-upper triangular matrix as k increasing.
4
4. QR iteration
(a) Our goal is to reorganize orthogonal iteration to incorporate shifting and inverting as in
the inverse iteration. This will make it more efficient and eliminate the assumption that
eigenvalues differ in magnitude.
(b) Pseudocode
QR Iteration, without shift
A0 = A
i = 0
repeat
Ai = QiRi (QR decomposition)
Ai+1 = RiQi
i = i+ 1
until convergence
(c) Properties
• Observe that
Ai+1 = RiQi = Q
T
i QiRiQi = Q
T
i AiQi
• Ai+1 is orthogonally similar toA0 = A. Therefore Ai+1 andA have same eigenvalues:
Ai+1 = (Q0Q1 · · ·Qi−1Qi)
TA(Q0Q1 · · ·Qi−1Qi).
Note that Q0 · · ·Qi−1Qi is an orthogonal matrix since all Qj are.
• Ai computed by QR iteration is identical to the matrix Z
T
i AZi implicitly computed
by orthogonal iteration. In fact, we have
Proposition. Let Ai be the matrix computed by QR iteration. Then Ai =
ZTi AZi, where Zi is the matrix computed from orthogonal iteration starting
with Z0 = I.
Therefore, Ai converges to Schur form if all the eigenvalues have different absolute
values.
(d) Example. The same test matrix, numerical results of QR iteration
A_10 =
-1.6994 0.2201 -0.8787 -1.4292 -0.3847 0.0112
-0.0007 1.1325 1.2186 1.2245 -0.0867 0.0648
0.2637 1.9636 -0.1598 -2.3959 0.8136 -0.4311
0.0364 -0.2346 -0.5527 -0.4393 -1.9263 1.2496
0.4290 1.3482 -1.1484 0.6121 -0.5937 0.2416
0.0003 0.0013 -0.0003 -0.0011 0.0014 -0.9554
A_30 =
-2.4055 -1.0586 1.3420 -0.0991 1.1210 0.1720
0.0517 0.9645 1.6519 -0.8512 -0.7215 0.7654
-0.2248 1.9947 -0.7656 -1.1876 -0.2736 -0.1552
-0.0029 -0.0263 -0.0682 0.1381 -2.3094 0.6765
-0.0147 0.0808 -0.0569 1.5462 0.3082 -0.8476
0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548
5
A_70 =
-2.0800 1.6092 -0.6426 1.1025 -0.1553 -0.0734
-0.1690 -1.6820 1.7425 -0.0311 0.9229 0.3087
0.0677 1.2521 1.5795 -0.1458 1.2092 -0.7088
0.0000 0.0000 -0.0005 0.5575 -1.7386 1.0520
0.0000 0.0004 0.0003 2.1489 -0.1353 0.3250
0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548
Note that the results are identical to the orthogonal iteration.
6
5. QR iteration with shifts ⇒ QR Algorithm
(a) Purpose: accelerate the convergence of QR iteration by using shifts
(b) Pseudo-code
QR Iteration with Shifts
A0 = A
i = 0
repeat
Choose a shift σi
Ai − σiI = QiRi (QR decomposition)
Ai+1 = RiQi + σiI
i = i+ 1
until convergence
(c) Property: Ai and Ai+1 are orthogonally similar:
Ai+1 = Q
T
i AiQi.
Therefore, Ai+1 and A are orthogonally similar, and Ai+1 and A have the same eigen-
values.
(d) How to choose the shifts σi?
• If σi is an exact eigenvalue of A, then it can be shown that
Ai+1 = RiQi + σiI =
[
A′ a
0 σi
]
.
This means that the algorithm converges in one iteration. If more eigenvalues are
wanted, we can apply the algorithm again to the n− 1 by n− 1 matrix A′.
• In practice, a common choice of the σi is
σi = Ai(n, n).
A motivation of this choice is by observing that the convergence of the QR iteration
(without a shift), the (n, n) entry of Ai usually converges to an eigenvalue of A first.
(e) Example. The same test matrix as before. The following is the numerical result of QR
iteration with a shift.
With the shift σ0 as an exact eigenvalue σ0 = 2.1493, then
A_1 =
-1.4127 1.4420 1.0845 -0.6866 -0.1013 -0.2042
-1.2949 -0.2334 1.4047 -1.3695 1.5274 -0.7062
0.5473 0.1343 -0.7991 -0.6716 1.1585 0.0736
-0.2630 0.0284 0.5440 -1.4616 -1.5892 0.9205
-1.6063 -0.3898 0.3410 0.1623 -0.9576 -0.5795
0.0000 0.0000 0.0000 0.0000 0.0000 2.1493
7
With the shifts σi = Ai(n, n).
A_7 =
-2.4302 2.0264 -0.2799 -0.2384 0.3210 -0.0526
-0.1865 -1.4295 -1.3515 0.0812 0.8577 -0.0388
-0.1087 -0.8991 0.4491 0.4890 -1.8463 -1.2034
-0.0008 0.0511 -0.5997 -0.7839 -0.8088 -0.5188
-0.0916 -0.8273 1.6940 0.0645 -0.6698 -0.0854
0.0000 0.0000 0.0000 0.0000 0.0000 2.1493
We observe that by 7th iteration, we have found an eigenvalue of A.
(f) Note that the QR decomposition in the algorithm takes O(n3) flops. Even if the al-
gorithm took n iterations to converge, the overall cost of the algorithm will be O(n4).
This is too expensive (today, the complexity of algorithms for all standard matrix com-
putation problems is at O(n3).) However, if the matrix is initially reduced to upper
Hessenberg form, then the QR decomposition of a Hessenberg form costs O(n2) flops.
As a result, the overall cost of the algorithm is reduced to O(n3). This is referred to as
the Hessenberg QR algorithm, the method of choice for dense eigenvalue problem today,
say Matlab’s eigensolver eig use LAPACK’s implementation of the QR algorithm.
(g) QR algorithm is one of the top 10 algorithms in the 20th century.
8