Module 2: Square Linear Systems
Preliminary: Floating-Point Numbers
Absolute and Relative Errors
In numerical analysis, we use an algorithm to approximate some quantity of interest.
• We estimate the accuracy of the computed value via an absolute error or a relative error:
  $e_{\text{abs}} = A_{\text{approx}} - A_{\text{exact}}$  (absolute error),
  $e_{\text{rel}} = \dfrac{A_{\text{approx}} - A_{\text{exact}}}{A_{\text{exact}}} = \dfrac{A_{\text{approx}}}{A_{\text{exact}}} - 1$  (relative error),
where $A_{\text{exact}}$ is the exact, analytical answer and $A_{\text{approx}}$ is the approximate, numerical answer.
• If $e_{\text{abs}}$ or $e_{\text{rel}}$ is small, we say that the approximate answer is accurate.
Example: Stirling’s Formula
Stirling’s formula provides a “good” approximation to $n!$ for large $n$:
  $n! \approx \sqrt{2\pi n}\,\Big(\dfrac{n}{e}\Big)^n$.  (⋆)
• Assume that the exact value of $n!$ is found by factorial.
• Estimate $n!$ using (⋆).
• Show the accuracy of this approximation for various values of $n$.
Try in MATLAB:
n = …;
err_abs = sqrt(2*pi*n)*(n/exp(1))^n - factorial(n);
err_rel = err_abs/factorial(n);
disp(err_abs)
disp(err_rel)
Limitations of Digital Representations
A digital computer uses a finite number of bits to represent a real number, so it cannot represent all real numbers.
• The represented numbers cannot be arbitrarily large or small;
• There must be gaps between them.
So for all operations involving real numbers, it uses a subset of $\mathbb{R}$ called the floating-point numbers, $\mathbb{F}$.
Floating-Point Numbers
A floating-point number is written in the form $\pm(1 + F)\,2^E$ where
• $E$, the exponent, is an integer;
• $F$, the mantissa, is a number $F = \sum_{i=1}^{d} b_i 2^{-i}$, with $b_i = 0$ or $b_i = 1$.
Note that $F$ can be rewritten as
  $F = 2^{-d} \underbrace{\sum_{k=0}^{d-1} b_{d-k} 2^k}_{=:M}$,
where $M$ is an integer in $\mathbb{N}[0, 2^d - 1]$.
Consequently, there are $2^d$ evenly spaced numbers between $2^E$ and $2^{E+1}$ in the floating-point number system.
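A quick way to observe these gaps in MATLAB is the built-in eps(x), which returns the spacing from x to the next larger double:

format long
eps(1)      % 2^-52: spacing of doubles in [1, 2)
eps(2)      % 2^-51: the spacing doubles at each power of 2
eps(2^52)   % 1: consecutive doubles here are whole numbers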
Floating-Point Numbers – IEEE 754 Standard
• MATLAB, by default, uses double precision floating-point numbers, stored in memory in 64 bits (or 8 bytes):
  $\pm\,1.\underbrace{xx\cdots x}_{\text{mantissa (base 2): } 52+1 \text{ bits}}{}_{(2)} \times 2^{\overbrace{xx\cdots x}^{\text{exponent: } 11 \text{ bits}}{}_{(2)} - 1023}$.
• Predefined variables:
  • eps = the distance from 1.0 to the next largest double-precision number: eps $= 2^{-52} \approx 2.2204 \times 10^{-16}$.
  • realmin = the smallest positive floating-point number that is stored to full accuracy; the actual smallest is realmin/2^52.
  • realmax = the largest positive floating-point number.
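These can be inspected directly; the displayed values below hold for IEEE 754 double precision:

format long
eps            % 2.220446049250313e-16   (= 2^-52)
realmin        % 2.225073858507201e-308  (= 2^-1022)
realmin/2^52   % 4.940656458412465e-324  (smallest subnormal)
realmax        % 1.797693134862316e+308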
Machine Epsilon and Relative Errors
The IEEE standard guarantees that the relative representation error and the relative computational error have sizes smaller than eps, the machine epsilon:
• Representation: The floating-point representation $\hat{x} \in \mathbb{F}$ of $x \in \mathbb{R}$ satisfies
  $\hat{x} = x(1 + \epsilon_1)$, for some $|\epsilon_1| \le \tfrac{1}{2}\,$eps.
• Arithmetic: The floating-point representation $\hat{x} \oplus \hat{y}$ of the result of $\hat{x} + \hat{y}$ with $\hat{x}, \hat{y} \in \mathbb{F}$ satisfies
  $\hat{x} \oplus \hat{y} = (\hat{x} + \hat{y})(1 + \epsilon_2)$, for some $|\epsilon_2| \le \tfrac{1}{2}\,$eps.
Similarly with $\ominus$, $\otimes$, $\oslash$ corresponding to $-$, $\times$, $\div$, respectively.
Round-Off Errors
Computers usually CANNOT
• represent a number correctly;
• add, subtract, multiply, or divide correctly!!
Run the following and examine the answers:
format long
1.2345678901234567890   % displays only ~16 significant digits
12345678901234567890    % likewise: stored as a double
(1 + eps) - 1           % eps
(1 + .5*eps) - 1        % 0: 1 + eps/2 rounds back to 1
(1 + .51*eps) - 1       % eps: rounds up to 1 + eps
n = input(' n = '); ( n^(1/3) )^3 - n   % typically nonzero
Catastrophic Cancellation
In finite precision storage, two numbers that are close to each other may be indistinguishable. So subtraction of two nearly equal numbers on a computer can result in loss of many significant digits.

Consider two real numbers stored with 10 digits of precision:
  $e = 2.7182818284$, $b = 2.7182818272$.
• Suppose the actual numbers $e$ and $b$ have additional digits that are not stored.
• The stored numbers are good approximations of the true values.
• However, if we compute $e - b$ based on the stored numbers, we obtain $0.0000000012 = 1.2 \times 10^{-9}$, a number with only two significant digits.
Example 1: Cancellation for Large Values of x
Question
Compute $f(x) = e^x(\cosh x - \sinh x)$ at $x = 1$, $10$, $100$, and $1000$. Numerically:
format long
x = input(' x = ');
y = exp(x) * ( cosh(x) - sinh(x) );
disp([x, y])
Example 2: Cancellation for Small Values of x
Question
Compute $f(x) = \dfrac{\sqrt{1+x} - 1}{x}$ at $x = 10^{-12}$.
Numerically:
x = 1e-12;
fx = (sqrt(1+x) - 1)/x;
disp( fx )
To Avoid Such Cancellations . . .
• Unfortunately, there is no universal way to avoid loss of precision.
• One way to avoid catastrophic cancellation is to remove the source of cancellation by simplifying the given expression before computing numerically.
• For Example 1, rewrite the given expression recalling that
  $\cosh x = \dfrac{e^x + e^{-x}}{2}$, $\qquad \sinh x = \dfrac{e^x - e^{-x}}{2}$.
• For Example 2, try again after rewriting $f(x)$ as
  $f(x) = \dfrac{\sqrt{1+x} - 1}{x} \cdot \dfrac{\sqrt{1+x} + 1}{\sqrt{1+x} + 1} = \dfrac{1}{\sqrt{1+x} + 1}$.
• Do you now have improved accuracy? (A quick check for Example 2 is sketched below.)
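A minimal MATLAB check for Example 2, assuming the rewritten form above; the naive form loses about half of its significant digits here, while the rewritten one is accurate to full precision:

format long
x = 1e-12;
f_naive = (sqrt(1+x) - 1)/x;   % cancellation: only a few correct digits
f_safe  = 1/(sqrt(1+x) + 1);   % no subtraction of nearly equal numbers
disp([f_naive, f_safe])        % both near 0.5; only f_safe is trustworthy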
Preliminary: Conditioning
Problems and Conditioning
• A mathematical problem can be viewed as a function $f : X \to Y$ from a data/input space $X$ to a solution/output space $Y$.
• We are interested in changes in $f(x)$ caused by small perturbations of $x$.
• A well-conditioned problem is one with the property that all small perturbations of $x$ lead to only small changes in $f(x)$.
Condition Number
Let $f : \mathbb{R} \to \mathbb{R}$ and $\hat{x} = x(1 + \epsilon)$ be the representation of $x \in \mathbb{R}$.
• The ratio of the relative error in $f$ due to the change in $x$ to the relative error in $x$ simplifies to
  $\dfrac{|f(x) - f(x(1 + \epsilon))|}{|\epsilon f(x)|}$.
• In the limit of small error (ideal computer), we obtain
  $\kappa_f(x) := \lim_{\epsilon \to 0} \dfrac{|f(x) - f(x(1 + \epsilon))|}{|\epsilon f(x)|} = \left|\lim_{\epsilon \to 0} \dfrac{f(x + \epsilon x) - f(x)}{\epsilon x} \cdot \dfrac{x}{f(x)}\right| = \left|\dfrac{x f'(x)}{f(x)}\right|$,  (⋆)
which is called the (relative) condition number.
Example: Conditioning of Subtraction
Consider $f(x) = x - c$ where $c$ is some constant. Using the formula (⋆), we find that the associated condition number is
  $\kappa(x) = \left|\dfrac{x f'(x)}{f(x)}\right| = \left|\dfrac{x}{x - c}\right|$.
• It is large when $x \approx c$.
Example: Conditioning of Multiplication
The condition number of $f(x) = cx$ is
  $\kappa(x) = \left|\dfrac{x f'(x)}{f(x)}\right| = \left|\dfrac{x \cdot c}{cx}\right| = 1$.
• No magnification of error.
Example: Conditioning of Function Evaluation
The condition number of $f(x) = \cos(x)$ is
  $\kappa(x) = \left|\dfrac{x f'(x)}{f(x)}\right| = \left|\dfrac{-x \sin x}{\cos x}\right| = |x \tan x|$.
• The condition number is large when $x \approx (n + 1/2)\pi$, where $n \in \mathbb{Z}$. (See the numerical check below.)
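A small finite-difference check of definition (⋆) for this example (the test point x and step ep below are arbitrary choices):

x = 1.0;  ep = 1e-8;
kappa_fd = abs( (cos(x*(1+ep)) - cos(x)) / (ep*cos(x)) );  % the ratio in (*)
kappa    = abs( x*tan(x) );                                % the closed form
disp([kappa_fd, kappa])   % both approximately 1.5574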
Example: Conditioning of Root-Finding
Let $r = f(a; b, c)$ be a root of $ax^2 + bx + c = 0$. Instead of direct differentiation, use implicit differentiation,
  $r^2 + 2ar\,\dfrac{dr}{da} + b\,\dfrac{dr}{da} = 0$.
Solve for the derivative,
  $f'(a) = \dfrac{dr}{da} = -\dfrac{r^2}{2ar + b} = -\dfrac{r^2}{\pm\sqrt{b^2 - 4ac}}$,
then compute the condition number using the formula (⋆) to get
  $\kappa(a) = \left|\dfrac{a f'(a)}{f(a)}\right| = \left|\dfrac{a r^2}{\pm r\sqrt{b^2 - 4ac}}\right| = \left|\dfrac{a r}{\sqrt{b^2 - 4ac}}\right|$.
• Conditioning is poor for small discriminant, i.e., near repeated roots.
Preliminary: Stability
Algorithms
• Recall that we defined a problem as a function $f : X \to Y$.
• An algorithm can be viewed as another map $\tilde{f} : X \to Y$ between the same two spaces, which involves errors arising in
  • representing the actual input $x$ as $\hat{x}$;
  • implementing the function $f$ numerically on a computer.
Analysis – General Framework
The relative error of our interest is
  $\dfrac{|\tilde{f}(\hat{x}) - f(x)|}{|f(x)|} \le \dfrac{|\tilde{f}(\hat{x}) - f(\hat{x})|}{|f(x)|} + \dfrac{|f(\hat{x}) - f(x)|}{|f(x)|}$
  $\lesssim \underbrace{\dfrac{|\tilde{f}(\hat{x}) - f(\hat{x})|}{|f(\hat{x})|}}_{\text{numerical error}} + \underbrace{\dfrac{|f(\hat{x}) - f(x)|}{|f(x)|}}_{\text{perturbation error}} \le (\hat{\kappa}_{\text{num}} + \kappa_f)\,\text{eps}$,
where $\kappa = \kappa_f$ is the (relative) condition number of the exact problem $f$ and
  $\hat{\kappa}_{\text{num}} = \max \dfrac{|\tilde{f}(\hat{x}) - f(\hat{x})|\,/\,|f(\hat{x})|}{|\hat{x} - x|\,/\,|x|}$.
Example: Root-Finding Revisited
Consider again solving the quadratic problem $ar^2 + br + c = 0$.
• Taking $a = c = 1$ and $b = -(10^6 + 10^{-6})$, the roots can be computed exactly by hand: $r_1 = 10^6$ and $r_2 = 10^{-6}$.
• If numerically computed in MATLAB using the quadratic formula, $r_1$ is correct but $r_2$ has only 5 correct digits.
• Fix it using $r_2 = (c/a)/r_1$.
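A sketch of the computation and the fix (the coefficients are those of this example; the fix uses Vieta's relation $r_1 r_2 = c/a$):

format long
a = 1;  c = 1;  b = -(1e6 + 1e-6);
d = sqrt(b^2 - 4*a*c);
r1 = (-b + d) / (2*a);   % no cancellation: accurate
r2 = (-b - d) / (2*a);   % -b and d nearly cancel: few correct digits
r2_fixed = (c/a) / r1;   % accurate; the exact answer is 1e-6
disp([r2, r2_fixed])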
Introduction to Square Linear Systems
Polynomial Interpolation
Formal Statement
Given a set of $n$ data points $\{(x_j, y_j) \mid j \in \mathbb{N}[1, n]\}$ with distinct $x_j$'s, not necessarily sorted, find a polynomial of degree $n - 1$,
  $p(x) = c_1 + c_2 x + c_3 x^2 + \cdots + c_n x^{n-1}$,  (⋆)
which interpolates the given points, i.e.,
  $p(x_j) = y_j$, for $j = 1, 2, \ldots, n$.
• The goal is to determine the coefficients $c_1, c_2, \ldots, c_n$.
• Note that the total number of data points is 1 larger than the degree of the interpolating polynomial.
Why Do We Care?
• to find the values between the discrete data points;
• to approximate a (complicated) function by a polynomial, which makes such computations as differentiation or integration easier.
Interpolation to Linear System
Writing out the $n$ interpolating conditions $p(x_j) = y_j$:
  $c_1 + c_2 x_1 + \cdots + c_n x_1^{n-1} = y_1$
  $c_1 + c_2 x_2 + \cdots + c_n x_2^{n-1} = y_2$
  $\vdots$
  $c_1 + c_2 x_n + \cdots + c_n x_n^{n-1} = y_n$
or, as a matrix equation $Vc = y$:
  $\underbrace{\begin{bmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{bmatrix}}_{V} \underbrace{\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}}_{c} = \underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}}_{y}$
• This is a linear system of $n$ equations with $n$ unknowns.
• The matrix $V$ is called a Vandermonde matrix.
Example: Fitting Population Data
U.S. Census data are collected every 10 years.
  Year   Population (millions)
  1980   226.546
  1990   248.710
  2000   281.422
  2010   308.746
  2020   332.639

Question. How do we estimate population in other years?
• Interpolate available data to compute population in intervening years.
Example: Fitting Population Data
• Input data.
• Match up notation (optional).
• Note the shift in Line 7.
• Construct the Vandermonde matrix V by broadcasting.
• Solve the system using the backslash (\) operator.

 1  year = (1980:10:2020)';
 2  pop = [226.546;
 3         248.710;
 4         281.422;
 5         308.746;
 6         332.639];
 7  x = year - 1980;
 8  y = pop;
 9  n = length(x);
10  V = x.^(0:n-1);
11  c = V \ y;
Post-Processing
1  xx = linspace(0, 40, 100)';
2  yy = polyval(flip(c), xx);
3  clf
4  plot(1980+x, y, '.', 1980+xx, yy)
5  title('US Population'),
6  xlabel('year'), ylabel('population (millions)')
7  legend('data', 'interpolant', 'location', 'northwest')

[Figure: “US Population” — the census data points and the interpolant; x-axis “year” (1980–2020), y-axis “population (millions)”.]

• Use the polyval function to evaluate the polynomial.
• MATLAB expects coefficients to be in descending order. (flip)
Overview
Let $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$. Then the equation $Ax = b$ has the following possibilities:
• If $A$ is invertible (or nonsingular), then $Ax = b$ has a unique solution $x = A^{-1}b$, or
• If $A$ is not invertible (or singular), then $Ax = b$ has either no solution or infinitely many solutions.

The Backslash Operator “ \ ”
To solve for $x$ in MATLAB, we use the backslash symbol “ \ ”:
>> x = A \ b
This produces the solution without explicitly forming the inverse of $A$.
Warning: Even though $x = A^{-1}b$ analytically, don't use x = inv(A)*b!
Triangular Systems
Systems involving triangular matrices are easy to solve.
• A matrix $U \in \mathbb{R}^{n \times n}$ is upper triangular if all entries below the main diagonal are zero:
  $U = \begin{bmatrix} u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\ 0 & u_{22} & u_{23} & \cdots & u_{2n} \\ 0 & 0 & u_{33} & \cdots & u_{3n} \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & u_{nn} \end{bmatrix}$.
• A matrix $L \in \mathbb{R}^{n \times n}$ is lower triangular if all entries above the main diagonal are zero:
  $L = \begin{bmatrix} \ell_{11} & 0 & 0 & \cdots & 0 \\ \ell_{21} & \ell_{22} & 0 & \cdots & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ \ell_{n1} & \ell_{n2} & \ell_{n3} & \cdots & \ell_{nn} \end{bmatrix}$.
Example: Upper Triangular Systems
Solve the following $4 \times 4$ system
  $\begin{bmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}$.
General Results
• Backward Substitution. To solve a general $n \times n$ upper triangular system $Ux = b$:
  $x_n = \dfrac{b_n}{u_{nn}}$, and $x_i = \dfrac{1}{u_{ii}}\Big(b_i - \sum_{j=i+1}^{n} u_{ij} x_j\Big)$
for $i = n-1, n-2, \ldots, 1$.
• Forward Elimination. To solve a general $n \times n$ lower triangular system $Lx = b$:
  $x_1 = \dfrac{b_1}{\ell_{11}}$, and $x_i = \dfrac{1}{\ell_{ii}}\Big(b_i - \sum_{j=1}^{i-1} \ell_{ij} x_j\Big)$
for $i = 2, 3, \ldots, n$.
Implementation: Backward Substitution
function x = backsub(U,b)
% BACKSUB x = backsub(U,b)
% Solve an upper triangular linear system.
% Input:
% U upper triangular square matrix (n by n)
% b right-hand side vector (n by 1)
% Output:
% x solution of Ux=b (n by 1 vector)
n = length(U);
x = zeros(n,1); % preallocate
for i = n:-1:1
    x(i) = ( b(i) - U(i,i+1:n)*x(i+1:n) ) / U(i,i);
end
end
Implementation: Forward Elimination
Exercise. Complete the code below.
function x = forelim(L,b)
% FORELIM x = forelim(L,b)
% Solve a lower triangular linear system.
% Input:
% L lower triangular square matrix (n by n)
% b right-hand side vector (n by 1)
% Output:
% x solution of Lx=b (n by 1 vector)
end
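One possible completion (a sketch: the loop below is a direct transcription of the forward-elimination formula, to be inserted before the final end):

n = length(L);
x = zeros(n,1);   % preallocate
for i = 1:n
    x(i) = ( b(i) - L(i,1:i-1)*x(1:i-1) ) / L(i,i);   % sum is empty when i = 1
end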
Does It Always Work?
Theorem 1 (Singularity of Triangular Matrix)
A triangular matrix is singular if and only if at least one of its diagonal elements is zero.
LU Factorization
General Method: Gaussian Elimination
• Gaussian elimination is an algorithm for solving a general system of linear equations that involves a sequence of row operations performed on the associated matrix of coefficients.
• This is also known as the method of row reduction.
• There are three variations of this method:
  • G.E. without pivoting
  • G.E. with partial pivoting (that is, row pivoting)
  • G.E. with full pivoting (that is, row and column pivoting)
G.E. Without Pivoting: Example
Key Example
Solve the following system of equations.
  $\begin{cases} 2x_1 + 2x_2 + x_3 = 6 \\ -4x_1 + 6x_2 + x_3 = -8 \\ 5x_1 - 5x_2 + 3x_3 = 4 \end{cases}$
  $\xrightarrow{\text{matrix equation}}$
  $\underbrace{\begin{bmatrix} 2 & 2 & 1 \\ -4 & 6 & 1 \\ 5 & -5 & 3 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} 6 \\ -8 \\ 4 \end{bmatrix}}_{b}$

Step 1: Write the corresponding augmented matrix and row-reduce to an echelon form.
  $\begin{bmatrix} 2 & 2 & 1 & 6 \\ -4 & 6 & 1 & -8 \\ 5 & -5 & 3 & 4 \end{bmatrix} \to \begin{bmatrix} 2 & 2 & 1 & 6 \\ 0 & 10 & 3 & 4 \\ 0 & -10 & 0.5 & -11 \end{bmatrix} \to \begin{bmatrix} 2 & 2 & 1 & 6 \\ 0 & 10 & 3 & 4 \\ 0 & 0 & 3.5 & -7 \end{bmatrix}$.

Step 2: Solve for $x_3$, then $x_2$, and then $x_1$ via backward substitution:
  $x = (3, 1, -2)^T$.
G.E. without Pivoting: General Procedure
As shown in the example, G.E. without pivoting involves two steps:
1. Row reduction: Transform $Ax = b$ to $Ux = \beta$ where
  $U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ 0 & & & u_{nn} \end{bmatrix}$ and $\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}$.
2. Backward substitution: Solve $Ux = \beta$ for $x$ by
  $x_n = \dfrac{\beta_n}{u_{nn}}$ and $x_i = \dfrac{1}{u_{ii}}\Big(\beta_i - \sum_{j=i+1}^{n} u_{ij} x_j\Big)$, for $i = n-1, n-2, \ldots, 1$.
G.E. without Pivoting: MATLAB Implementation
 1  function x = GEnp(A, b)
 2  % Step 1: Row reduction to upper tri. system
 3  S = [A, b];        % augmented matrix
 4  n = size(A, 1);
 5  for j = 1:n-1
 6      for i = j+1:n
 7          mult = -S(i,j)/S(j,j);
 8          S(i,:) = S(i,:) + mult*S(j,:);
 9      end
10  end
11  % Step 2: Backward substitution
12  U = S(:,1:end-1);
13  beta = S(:,end);
14  x = backsub(U, beta);
15  end
Exercise. Rewrite Lines 6–9 without using a loop. (Think vectorized!)
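One possible loop-free replacement for Lines 6–9 (a sketch: all rows below row j are updated at once by an outer-product, rank-1 update):

mults = -S(j+1:n, j) / S(j,j);                % column vector of multipliers
S(j+1:n, :) = S(j+1:n, :) + mults * S(j, :);  % rank-1 update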
G.E. with Partial Pivoting: Procedure
In this variation of G.E., reduction to echelon form is done slightly differently.
• On the augmented matrix $[A \mid b]$,

Key Process (partial pivoting)
1. Find the entry in the first column with the largest absolute value. This entry is called the pivot.
2. Perform a row interchange, if necessary, so that the pivot is in the first diagonal position.
3. Use elementary row operations to reduce the remaining entries in the first column to zero.

• Once done, ignore the first row and first column and repeat the Key Process on the remaining submatrix.
• Continue this until the matrix is in a row-echelon form.
G.E. with Partial Pivoting: Example
Let’s solve the Key Example again, now using G.E. with partial pivoting.

1st column:
  $\begin{bmatrix} 2 & 2 & 1 & 6 \\ -4 & 6 & 1 & -8 \\ 5 & -5 & 3 & 4 \end{bmatrix} \xrightarrow{\text{pivot}} \begin{bmatrix} 5 & -5 & 3 & 4 \\ -4 & 6 & 1 & -8 \\ 2 & 2 & 1 & 6 \end{bmatrix} \xrightarrow{\text{zero}} \begin{bmatrix} 5 & -5 & 3 & 4 \\ 0 & 2 & 3.4 & -4.8 \\ 0 & 4 & -0.2 & 4.4 \end{bmatrix}$

2nd column:
  $\begin{bmatrix} 5 & -5 & 3 & 4 \\ 0 & 2 & 3.4 & -4.8 \\ 0 & 4 & -0.2 & 4.4 \end{bmatrix} \xrightarrow{\text{pivot}} \begin{bmatrix} 5 & -5 & 3 & 4 \\ 0 & 4 & -0.2 & 4.4 \\ 0 & 2 & 3.4 & -4.8 \end{bmatrix} \xrightarrow{\text{zero}} \begin{bmatrix} 5 & -5 & 3 & 4 \\ 0 & 4 & -0.2 & 4.4 \\ 0 & 0 & 3.5 & -7 \end{bmatrix}$

Now that the last matrix is upper triangular, we work up from the third equation to the second to the first and obtain the same solution as before.
G.E. with Partial Pivoting: MATLAB Implementation
Exercise
Write a MATLAB function GEpp.m which carries out G.E. with partial pivoting.
• Modify GEnp.m above to incorporate partial pivoting.
• The only part that needs to be changed is the for-loop starting at Line 5.
• Right after for j = 1:n-1, find the index of the pivot element of the jth column below the diagonal (searching S, since A itself is never updated):
[~, iM] = max(abs(S(j:end,j)));
iM = iM + j - 1;
• If the pivot element is not on the diagonal, swap rows so that it is on the diagonal:
if j ~= iM
    S([j iM], :) = S([iM j], :);
end
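Assembling these pieces gives one possible GEpp (a sketch, following the same structure as GEnp):

function x = GEpp(A, b)
% GEPP  Gaussian elimination with partial pivoting (instructional sketch).
S = [A, b];                 % augmented matrix
n = size(A, 1);
for j = 1:n-1
    [~, iM] = max(abs(S(j:end,j)));  % pivot index, offset from row j
    iM = iM + j - 1;
    if j ~= iM
        S([j iM], :) = S([iM j], :); % swap the pivot row into place
    end
    for i = j+1:n
        mult = -S(i,j)/S(j,j);
        S(i,:) = S(i,:) + mult*S(j,:);
    end
end
x = backsub(S(:,1:end-1), S(:,end));
end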
Why Is Pivoting Necessary?
Example
Given $\epsilon \ll 1$, solve the system
  $\begin{bmatrix} -\epsilon & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 - \epsilon \\ 0 \end{bmatrix}$
using Gaussian elimination with and without partial pivoting.

Without pivoting: By $R_2 \to R_2 + (1/\epsilon)R_1$, we have
  $\begin{bmatrix} -\epsilon & 1 \\ 0 & -1 + 1/\epsilon \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 - \epsilon \\ 1/\epsilon - 1 \end{bmatrix} \implies x_2 = 1, \quad x_1 = \dfrac{(1 - \epsilon) - 1}{-\epsilon}$.
• In exact arithmetic, this yields the correct solution.
• In floating-point arithmetic, the calculation of $x_1$ suffers from catastrophic cancellation.
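A sketch of this failure at machine scale (the value of ε below is an arbitrary choice smaller than eps; the two formulas are those produced by the elimination above):

ep = 1e-17;                      % well below eps
x2 = (1/ep - 1) / (-1 + 1/ep);   % = 1, as expected
x1 = ((1 - ep) - x2) / (-ep);    % = 0 in floating point: cancellation!
disp([x1, x2])                   % the exact answer is [1, 1]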
Why Is Pivoting Necessary? (Cont’)
With partial pivoting: First, swap the rows $R_1 \leftrightarrow R_2$, and then do $R_2 \to R_2 + \epsilon R_1$ to obtain
  $\begin{bmatrix} 1 & -1 \\ 0 & 1 - \epsilon \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 - \epsilon \end{bmatrix} \implies x_2 = 1, \quad x_1 = \dfrac{0 - (-1)\,x_2}{1} = 1$.
• Each of the arithmetic steps (to compute $x_1$, $x_2$) is well-conditioned.
• The solution is computed stably.
Emulation of Gaussian Elimination
In this section, we emulate the row operations required in Gaussian elimination by matrix multiplications. Two major operations:
• Row interchange $R_i \leftrightarrow R_j$:
  $P(i,j)A$, where $P(i,j)$ is an elementary permutation matrix.
• Row replacement $R_i \to R_i + cR_j$:
  $(I + c\,e_i e_j^T)A$.
See Appendix for more details.
Key Example Revisited
Let’s work out the key example from last time once again, now in matrix form $Ax = b$:
  $\begin{bmatrix} 2 & 2 & 1 \\ -4 & 6 & 1 \\ 5 & -5 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 6 \\ -8 \\ 4 \end{bmatrix}$.

[Pivot] Switch $R_1$ and $R_3$ using $P(1,3)$:
  $\underbrace{\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}}_{P(1,3)} \Bigg[\, Ax = b \,\Bigg] \;\longrightarrow\; \begin{bmatrix} 5 & -5 & 3 \\ -4 & 6 & 1 \\ 2 & 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ -8 \\ 6 \end{bmatrix}$

[Zero] Do row operations $R_2 \to R_2 + (4/5)R_1$ and $R_3 \to R_3 - (2/5)R_1$:
  $\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 4/5 & 1 & 0 \\ -2/5 & 0 & 1 \end{bmatrix}}_{G_1} \Bigg[\, \cdots \,\Bigg] \;\longrightarrow\; \begin{bmatrix} 5 & -5 & 3 \\ 0 & 2 & 3.4 \\ 0 & 4 & -0.2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ -4.8 \\ 4.4 \end{bmatrix}$
Key Example Revisited (cont’)
[Pivot] Switch $R_2$ and $R_3$ using $P(2,3)$:
  $\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}}_{P(2,3)} \Bigg[\, \cdots \,\Bigg] \;\longrightarrow\; \begin{bmatrix} 5 & -5 & 3 \\ 0 & 4 & -0.2 \\ 0 & 2 & 3.4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ 4.4 \\ -4.8 \end{bmatrix}$

[Zero] Do a row operation $R_3 \to R_3 - (1/2)R_2$:
  $\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1/2 & 1 \end{bmatrix}}_{G_2} \Bigg[\, \cdots \,\Bigg] \;\longrightarrow\; \underbrace{\begin{bmatrix} 5 & -5 & 3 \\ 0 & 4 & -0.2 \\ 0 & 0 & 3.5 \end{bmatrix}}_{U} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ 4.4 \\ -7 \end{bmatrix}$
Analysis of Example
• The previous calculations can be summarized as
  $G_2\,P(2,3)\,G_1\,P(1,3)\,A = U$.  (⋆)
• Using the noted properties of permutation matrices and GTMs, (⋆) can be written as
  $G_2\,\underbrace{P(2,3)\,G_1\,P(2,3)}_{=:\tilde{G}_1}\,\underbrace{P(2,3)\,P(1,3)}_{=:P}\,A = U$,
  where we inserted $P(2,3)P(2,3) = I$.
• The above can be summarized as $PA = LU$ where $L = (G_2 \tilde{G}_1)^{-1}$ is a lower triangular matrix.
Generalization – PLU Factorization
For an arbitrary matrix $A \in \mathbb{R}^{n \times n}$, the partial pivoting and row operations are intermixed as
  $G_{n-1}\,P(n-1, r_{n-1}) \cdots G_2\,P(2, r_2)\,G_1\,P(1, r_1)\,A = U$.
Going through the same calculations as above, it can always be written as
  $\big(\tilde{G}_{n-1} \cdots \tilde{G}_2 \tilde{G}_1\big)\,P(n-1, r_{n-1}) \cdots P(2, r_2)\,P(1, r_1)\,A = U$,
which again leads to $PA = LU$:
  $\underbrace{P(n-1, r_{n-1}) \cdots P(2, r_2)\,P(1, r_1)}_{=:P}\,A = \underbrace{\big(\tilde{G}_{n-1} \cdots \tilde{G}_2 \tilde{G}_1\big)^{-1}}_{=:L}\,U$.
This is called the PLU factorization of matrix $A$.
LU and PLU Factorization
If no pivoting is required, the previous procedure simplifies to
  $G_{n-1} \cdots G_2 G_1 A = U$,
which leads to $A = LU$:
  $A = \underbrace{(G_{n-1} \cdots G_2 G_1)^{-1}}_{=:L}\,U$.
This is called the LU factorization of matrix $A$.
Implementation of LU Factorization
function [L,U] = mylu(A)
% MYLU   LU factorization (demo only--not stable!).
% Input:
%   A     square matrix
% Output:
%   L,U   unit lower triangular and upper triangular such that LU=A
n = length(A);
L = eye(n);   % ones on diagonal
% Gaussian elimination
for j = 1:n-1
    for i = j+1:n
        L(i,j) = A(i,j) / A(j,j);   % row multiplier
        A(i,j:n) = A(i,j:n) - L(i,j)*A(j,j:n);
    end
end
U = triu(A);
end
Implementation of PLU Factorization
Exercise. Write a MATLAB function myplu for PLU factorization by modifying the previous function mylu.m.
function [L,U,P] = myplu(A)
% MYPLU   PLU factorization (demo only--not stable!).
% Input:
%   A       square matrix
% Output:
%   L,U,P   unit lower triangular, upper triangular, and permutation
%           such that LU=PA
% Your code here.
end
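One possible body for myplu (a sketch replacing the "% Your code here." placeholder; note that each row swap must also be applied to the already-computed columns of L):

n = length(A);
L = eye(n);
P = eye(n);
for j = 1:n-1
    [~, iM] = max(abs(A(j:n,j)));   % pivot row in column j
    iM = iM + j - 1;
    if iM ~= j
        A([j iM], :) = A([iM j], :);          % swap rows of A
        P([j iM], :) = P([iM j], :);          % record the swap in P
        L([j iM], 1:j-1) = L([iM j], 1:j-1);  % swap earlier multipliers
    end
    for i = j+1:n
        L(i,j) = A(i,j) / A(j,j);
        A(i,j:n) = A(i,j:n) - L(i,j)*A(j,j:n);
    end
end
U = triu(A);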
Solving a Square System Using PLU Factorization
Multiplying $Ax = b$ on the left by $P$ we obtain
  $\underbrace{PA}_{=LU}\,x = \underbrace{Pb}_{=:\beta} \;\longrightarrow\; LUx = \beta$,
which can be solved in two steps:
• Define $Ux = y$ and solve for $y$ in the equation $Ly = \beta$. (forward elimination)
• Having calculated $y$, solve for $x$ in the equation $Ux = y$. (backward substitution)
Solving a Square System Using PLU Factorization
• Using the instructional codes (backsub, forelim, myplu):
[L,U,P] = myplu(A);
x = backsub( U, forelim(L, P*b) );
• Using MATLAB's built-in functions:
[L,U,P] = lu(A);
x = U \ (L \ (P*b));
• The backslash is designed so that triangular systems are solved with the appropriate substitution.
• The most compact way: x = A \ b;
• The backslash does partial pivoting and triangular substitutions silently and automatically.
Analysis
Notation: Big-O and Asymptotic
Let $f, g$ be positive functions defined on $\mathbb{N}$.
• $f(n) = O(g(n))$ (“$f$ is big-O of $g$”) as $n \to \infty$ if
  $\dfrac{f(n)}{g(n)} \le C$, for all sufficiently large $n$.
• $f(n) \sim g(n)$ (“$f$ is asymptotic to $g$”) as $n \to \infty$ if
  $\lim_{n \to \infty} \dfrac{f(n)}{g(n)} = 1$.
Timing Vector/Matrix Operations – FLOPS
• One way to measure the “efficiency” of a numerical algorithm is to count the number of floating-point arithmetic operations (FLOPS) necessary for its execution.
• The number is usually represented by $\sim c\,n^p$ where $c$ and $p$ are given explicitly.
• We are interested in this formula when $n$ is large.
FLOPS for Major Operations
Vector/Matrix Operations
Let $x, y \in \mathbb{R}^n$ and $A, B \in \mathbb{R}^{n \times n}$. Then
• (vector-vector) $x^T y$ requires $\sim 2n$ flops.
• (matrix-vector) $Ax$ requires $\sim 2n^2$ flops.
• (matrix-matrix) $AB$ requires $\sim 2n^3$ flops.
Cost of PLU Factorization
Note that we only need to count the number of flops required to zero out elements below the diagonal of each column.
• For each $i > j$, we replace $R_i$ by $R_i + cR_j$ where $c = -a_{i,j}/a_{j,j}$. This requires approximately $2(n - j + 1)$ flops:
  • 1 division to form $c$
  • $n - j + 1$ multiplications to form $cR_j$
  • $n - j + 1$ additions to form $R_i + cR_j$
• Since $i \in \mathbb{N}[j+1, n]$, the total number of flops needed to zero out all elements below the diagonal in the $j$th column is approximately $2(n - j + 1)(n - j)$.
• Summing up over $j \in \mathbb{N}[1, n-1]$, we need about $(2/3)n^3$ flops:
  $\sum_{j=1}^{n-1} 2(n - j + 1)(n - j) \sim 2\sum_{j=1}^{n-1} (n - j)^2 = 2\sum_{j=1}^{n-1} j^2 \sim \tfrac{2}{3}n^3$.
Cost of Forward Elimination and Backward Substitution
Forward Elimination
• The calculation of $y_i = \beta_i - \sum_{j=1}^{i-1} \ell_{ij} y_j$ for $i > 1$ requires approximately $2i$ flops:
  • $i - 1$ multiplications
  • $i - 2$ additions
  • 1 subtraction
• Summing over all $i \in \mathbb{N}[2, n]$, we need about $n^2$ flops:
  $\sum_{i=2}^{n} 2i \sim 2 \cdot \dfrac{n^2}{2} = n^2$.

Backward Substitution
• The cost of backward substitution is also approximately $n^2$ flops, which can be shown in the same manner.
Cost of G.E. with Partial Pivoting
Gaussian elimination with partial pivoting involves three steps:
• PLU factorization: $\sim (2/3)n^3$ flops
• Forward elimination: $\sim n^2$ flops
• Backward substitution: $\sim n^2$ flops

Summary
The total cost of Gaussian elimination with partial pivoting is approximately
  $\tfrac{2}{3}n^3 + n^2 + n^2 \sim \tfrac{2}{3}n^3$
flops for large $n$.
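A rough empirical check of the cubic scaling (timings are machine-dependent; doubling n should multiply the time by roughly 8):

for n = [500 1000 2000]
    A = randn(n);  b = randn(n,1);
    tic;  x = A \ b;  fprintf('n = %4d: %.3f s\n', n, toc);
end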
Application: Solving Multiple Square Systems Simultaneously
To solve two systems $Ax_1 = b_1$ and $Ax_2 = b_2$:

Method 1.
• Use G.E. for both.
• It takes $\sim (4/3)n^3$ flops.

Method 2.
• Do it in two steps:
  1. Do PLU factorization $PA = LU$.
  2. Then solve $LUx_1 = Pb_1$ and $LUx_2 = Pb_2$.
• It takes $\sim (2/3)n^3$ flops.
%% method 1
x1 = A \ b1;
x2 = A \ b2;
%% method 2
[L, U, P] = lu(A);
x1 = U \ (L \ (P*b1));
x2 = U \ (L \ (P*b2));
%% compact implementation
X = A \ [b1, b2];
x1 = X(:, 1);
x2 = X(:, 2);
Further Analysis
Vector Norms
The “length” of a vector $v$ can be measured by its norm.

Definition 2 (p-Norm of a Vector)
Let $p \in [1, \infty)$. The $p$-norm of $v \in \mathbb{R}^m$ is denoted by $\|v\|_p$ and is defined by
  $\|v\|_p = \Big(\sum_{i=1}^{m} |v_i|^p\Big)^{1/p}$.
When $p = \infty$,
  $\|v\|_\infty = \max_{1 \le i \le m} |v_i|$.

The most commonly used $p$ values are 1, 2, and $\infty$:
  $\|v\|_1 = \sum_{i=1}^{m} |v_i|$, $\qquad \|v\|_2 = \sqrt{\sum_{i=1}^{m} |v_i|^2}$.
Vector Norms
In general, any function $\|\cdot\| : \mathbb{R}^m \to \mathbb{R}^+ \cup \{0\}$ is called a vector norm if it satisfies the following three properties:
1. $\|x\| = 0$ if and only if $x = 0$.
2. $\|\alpha x\| = |\alpha| \|x\|$ for any constant $\alpha$ and any $x \in \mathbb{R}^m$.
3. $\|x + y\| \le \|x\| + \|y\|$ for any $x, y \in \mathbb{R}^m$. This is called the triangle inequality.
Unit Vectors
• A vector $u$ is called a unit vector if $\|u\| = 1$.
• Depending on the norm used, unit vectors will be different.
• For instance:

[Figures 1–3: the unit “circles” $\{u : \|u\| = 1\}$ in the 1-norm, 2-norm, and $\infty$-norm.]
Matrix Norms
The “size” of a matrix $A \in \mathbb{R}^{m \times n}$ can be measured by its norm as well. As above, we say that a function $\|\cdot\| : \mathbb{R}^{m \times n} \to \mathbb{R}^+ \cup \{0\}$ is a matrix norm if it satisfies the following three properties:
1. $\|A\| = 0$ if and only if $A = 0$.
2. $\|\alpha A\| = |\alpha| \|A\|$ for any constant $\alpha$ and any $A \in \mathbb{R}^{m \times n}$.
3. $\|A + B\| \le \|A\| + \|B\|$ for any $A, B \in \mathbb{R}^{m \times n}$. This is called the triangle inequality.
Matrix Norms (Cont’)
• If, in addition to satisfying the three conditions, it satisfies
  $\|AB\| \le \|A\| \|B\|$ for all $A \in \mathbb{R}^{m \times n}$ and all $B \in \mathbb{R}^{n \times p}$,
it is said to be consistent.
• If, in addition to satisfying the three conditions, it satisfies
  $\|Ax\| \le \|A\| \|x\|$ for all $A \in \mathbb{R}^{m \times n}$ and all $x \in \mathbb{R}^n$,
then we say that it is compatible with a vector norm.
Induced Matrix Norms
Definition 3 (p-Norm of a Matrix)
Let $p \in [1, \infty]$. The $p$-norm of $A \in \mathbb{R}^{m \times n}$ is given by
  $\|A\|_p = \max_{x \ne 0} \dfrac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p$.

• The definition of this particular matrix norm is induced from the vector $p$-norm.
• By construction, the matrix $p$-norm is a compatible norm.
• Induced norms describe how the matrix stretches unit vectors with respect to the vector norm.
Induced Matrix Norms
The commonly used $p$-norms (for $p = 1, 2, \infty$) can also be calculated by
  $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|$,
  $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A)$,
  $\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$.
In words,
• The 1-norm of $A$ is the maximum of the 1-norms of all column vectors.
• The 2-norm of $A$ is the square root of the largest eigenvalue of $A^T A$.
• The $\infty$-norm of $A$ is the maximum of the 1-norms of all row vectors.
Non-Induced Matrix Norm – Frobenius Norm
Definition 4 (Frobenius Norm of a Matrix)
The Frobenius norm of $A \in \mathbb{R}^{m \times n}$ is given by
  $\|A\|_F = \Big(\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2\Big)^{1/2}$.

• This is not induced from a vector $p$-norm.
• However, both the $p$-norms and the Frobenius norm are consistent and compatible.
• For compatibility of the Frobenius norm, the vector norm must be the 2-norm, that is, $\|Ax\|_2 \le \|A\|_F \|x\|_2$.
Norms in MATLAB
• Vector p-norms can be easily computed:
norm(v, 1)       % = sum(abs(v))
norm(v, 2)       % = sqrt(v'*v) if v is a column
norm(v, 'inf')   % = max(abs(v))
• The same function norm is used to calculate matrix p-norms:
norm(A, 1)       % = max(sum(abs(A), 1))
norm(A, 2)       % = max(sqrt(eig(A'*A)))
norm(A, Inf)     % = max(sum(abs(A), 2))
• To calculate the Frobenius norm:
norm(A, 'fro')   % = sqrt(A(:)'*A(:)) = norm(A(:), 2)
Conditioning of Solving Linear Systems: Overview
• Analyze how robust (or sensitive) the solutions of $Ax = b$ are to perturbations of $A$ and $b$.
• For simplicity, consider separately the cases where
  1. $b$ changes to $b + \delta b$, while $A$ remains unchanged, that is,
     $Ax = b \;\longrightarrow\; A(x + \delta x) = b + \delta b$.
  2. $A$ changes to $A + \delta A$, while $b$ remains unchanged, that is,
     $Ax = b \;\longrightarrow\; (A + \delta A)(x + \delta x) = b$.
Sensitivity to Perturbation of RHS
Case 1. $Ax = b \;\to\; A(x + \delta x) = b + \delta b$
• Bound $\|\delta x\|$ in terms of $\|\delta b\|$:
  $Ax + A\,\delta x = b + \delta b \implies A\,\delta x = \delta b \implies \delta x = A^{-1}\delta b \implies \|\delta x\| \le \|A^{-1}\| \|\delta b\|$.
• Sensitivity in terms of relative errors:
  $\dfrac{\|\delta x\| / \|x\|}{\|\delta b\| / \|b\|} = \dfrac{\|\delta x\|}{\|\delta b\|} \cdot \dfrac{\|b\|}{\|x\|} \le \|A^{-1}\| \cdot \dfrac{\|Ax\|}{\|x\|} \le \|A^{-1}\| \|A\|$.
Sensitivity to Perturbation of Matrix
Case 2. $Ax = b \;\to\; (A + \delta A)(x + \delta x) = b$
• Bound $\|\delta x\|$ now in terms of $\|\delta A\|$:
  $Ax + A\,\delta x + (\delta A)x + (\delta A)\delta x = b \implies A\,\delta x = -(\delta A)x - (\delta A)\delta x \implies \delta x = -A^{-1}(\delta A)x - A^{-1}(\delta A)\delta x$,
so
  $\|\delta x\| \lesssim \|A^{-1}\| \|\delta A\| \|x\|$. (first-order truncation)
• Sensitivity in terms of relative errors:
  $\dfrac{\|\delta x\| / \|x\|}{\|\delta A\| / \|A\|} = \dfrac{\|\delta x\|}{\|\delta A\| \|x\|} \cdot \|A\| \lesssim \|A^{-1}\| \|A\|$.
Matrix Condition Number
• Motivated by the previous estimations, we define the matrix condition number by
  $\kappa(A) = \|A^{-1}\| \|A\|$,
where the norms can be any $p$-norm or the Frobenius norm.
• A subscript on $\kappa$ such as 1, 2, $\infty$, or F(robenius) is used if clarification is needed.
Matrix Condition Number (Cont’)
• We can write
  $\dfrac{\|\delta x\|}{\|x\|} \le \kappa(A)\dfrac{\|\delta b\|}{\|b\|}, \qquad \dfrac{\|\delta x\|}{\|x\|} \le \kappa(A)\dfrac{\|\delta A\|}{\|A\|}$,
where the second inequality is true only in the limit of infinitesimal perturbations $\delta A$.
• The matrix condition number $\kappa(A)$ is equal to the condition number of solving a linear system of equations $Ax = b$.
• The exponent of $\kappa(A)$ in scientific notation determines the approximate number of digits of accuracy that will be lost in the calculation of $x$.
• Since $1 = \|I\| = \|A^{-1}A\| \le \|A^{-1}\| \|A\| = \kappa(A)$, a condition number of 1 is the best we can hope for.
• If $\kappa(A) > \text{eps}^{-1}$, then for computational purposes the matrix is singular.
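A quick illustration of the digits-lost rule of thumb, using MATLAB's built-in ill-conditioned Hilbert matrix (the size n = 10 is an arbitrary choice, and the exact error depends on the right-hand side):

n = 10;
A = hilb(n);               % cond(A) is about 1.6e13
x_true = ones(n,1);
b = A*x_true;
x = A \ b;
disp(cond(A))
disp(norm(x - x_true)/norm(x_true))   % roughly 13 of 16 digits lost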
Condition Numbers in MATLAB
• Use cond to calculate various condition numbers:
cond(A)          % the 2-norm; or cond(A, 2)
cond(A, 1)       % the 1-norm
cond(A, Inf)     % the infinity-norm
cond(A, 'fro')   % the Frobenius norm
• A condition number estimator (in 1-norm):
condest(A)   % faster than cond
• The fastest method to estimate the condition number is to use the linsolve function as below:
[x, inv_condest] = linsolve(A, b);
fast_condest = 1/inv_condest;
Appendix: Row and Column Operations
Notation: Unit Basis Vectors
Throughout this tutorial, suppose $n \in \mathbb{N}$ is fixed. Let $I$ be the $n \times n$ identity matrix and denote by $e_j$ its $j$th column, i.e.,
  $I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$.
That is,
  $e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \cdots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$.
Let A P Rnˆn. We can view it as a concatenation of its rows or columns as visualized below.
»— a 1 1 — — a 2 1
a 1 2 a 2 2 .
̈ ̈ ̈ ̈ ̈ ̈
a 1 n fi »— fi »— a 2 n — — — —
–T1 fi –T2
. . –Tn fl
A“ — — .
–an1 an2 ̈ ̈ ̈ annfl –fl –
…
. “ — — a 1 a 2 ̈ ̈ ̈ a n “ — —
85/94
Row or Column Extraction
A row or a column of A can be extracted using columns of I.
  Operation                       Mathematics      MATLAB
  extract the ith row of A        $e_i^T A$        A(i,:)
  extract the jth column of A     $A e_j$          A(:,j)
  extract the (i,j) entry of A    $e_i^T A e_j$    A(i,j)
Elementary Permutation Matrices
Definition 5 (Elementary Permutation Matrix)
For $i, j \in \mathbb{N}[1, n]$ distinct, denote by $P(i,j)$ the $n \times n$ matrix obtained by interchanging the $i$th and $j$th rows of the $n \times n$ identity matrix. Such matrices are called elementary permutation matrices.

Example. ($n = 4$)
  $P(1,2) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad P(1,3) = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \cdots$

Notable Properties.
• $P(i,j) = P(j,i)$
• $P(i,j)^2 = I$
Row or Column Interchange
Elementary permutation matrices are useful in interchanging rows or columns.
  Operation                                      Mathematics    MATLAB
  $\alpha_i^T \leftrightarrow \alpha_j^T$ (rows)      $P(i,j)A$      A([i,j],:) = A([j,i],:)
  $a_i \leftrightarrow a_j$ (columns)                 $AP(i,j)$      A(:,[i,j]) = A(:,[j,i])
Permutation Matrices
De nition 6 (Permutation Matrix)
A permutation matrix $P \in \mathbb{R}^{n \times n}$ is a square matrix obtained from the same-sized identity matrix by re-ordering of rows.

Notable Properties.
• $P^T = P^{-1}$
• A product of elementary permutation matrices is a permutation matrix.

Row and Column Operations. For any $A \in \mathbb{R}^{n \times n}$,
• $PA$ permutes the rows of $A$.
• $AP$ permutes the columns of $A$.
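A one-line numerical check of $P^T = P^{-1}$ (the size 5 and the use of randperm are arbitrary choices):

p = randperm(5);            % a random permutation of 1..5
P = eye(5);  P = P(p, :);   % the corresponding permutation matrix
disp(norm(P'*P - eye(5)))   % 0, confirming P' = inv(P)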
Row or Column Rearrangement
Question
Let A P R6ˆ6, and suppose that it is stored in MATLAB. Rearrange rows of A by moving 1st to 2nd, 2nd to 3rd, 3rd to 5th, 4th to 6th, 5th to 4th, and 6th to
1st, that is,
»— –T1 — –T2
— — –T3 — –T4
—– –T5 –T6
fi
݄
fl
»—
— — — — —–
–T6 fi –T1
–T2 –T5
–T3 fl –T4
90/94
Row or Column Rearrangement
Solution.
• Mathematically: $PA$ where
  $P = \begin{bmatrix} e_6^T \\ e_1^T \\ e_2^T \\ e_5^T \\ e_3^T \\ e_4^T \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}$.
• MATLAB:
A = A([6 1 2 5 3 4], :)
% short for A([1 2 3 4 5 6],:) = A([6 1 2 5 3 4],:)
Elementary Row Operation and GTM
Let $1 \le j < i \le n$.
• The row operation $R_i \to R_i + cR_j$ on $A \in \mathbb{R}^{n \times n}$, for some $c \in \mathbb{R}$, can be emulated by a matrix multiplication¹
  $(I + c\,e_i e_j^T)A$.
• In the context of Gaussian elimination, the operation of introducing zeros below the $j$th diagonal entry can be done via
  $\underbrace{\Big(I + \sum_{i=j+1}^{n} c_{i,j}\, e_i e_j^T\Big)}_{=:G_j} A, \quad 1 \le j < n$.
The matrix $G_j$ is called a Gaussian transformation matrix (GTM).

¹Many linear algebra texts refer to the matrix in parentheses as an elementary matrix.
Elementary Row Operation and GTM (cont’)
• To emulate $(I + c\,e_i e_j^T)A$ in MATLAB:
A(i,:) = A(i,:) + c*A(j,:);
• To emulate $G_j A = \Big(I + \sum_{i=j+1}^{n} c_{i,j}\, e_i e_j^T\Big)A$ in MATLAB:
for i = j+1:n
    c = ....
    A(i,:) = A(i,:) + c*A(j,:);
end
This can be done without using a loop, as sketched below.
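A loop-free version (a sketch; cvec is a hypothetical $(n-j) \times 1$ column holding the multipliers $c_{j+1,j}, \ldots, c_{n,j}$):

A(j+1:n, :) = A(j+1:n, :) + cvec * A(j, :);   % rank-1 update of rows j+1..n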
Analytical Properties of GTM
• GTMs are unit lower triangular matrices.
• The product of GTMs is another unit lower triangular matrix.
• The inverse of a GTM is also a unit lower triangular matrix.