
Numerical Methods & Scientific Computing: lecture notes
Data fitting
The matrix 2-norm
The 2-norm is the natural norm for LSQ problems (minimizing $\|r\|_2$) $\implies$ we can no longer avoid the matrix 2-norm: for a square matrix $A$ (see 'MatrixNorms' for proof)
$\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)}$
where $\lambda_{\max}(A^T A)$ is the largest eigenvalue of $A^T A$
(all eigenvalues are positive since $A^T A$ is positive definite).
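A minimal MATLAB check of this identity, using an arbitrary example matrix (not from the notes):

    A = [2 1 0; 1 3 1; 0 1 1];      % any real square matrix will do
    lam = eig(A'*A);                % eigenvalues of A'A
    sqrt(max(lam))                  % square root of the largest eigenvalue
    norm(A, 2)                      % MATLAB's matrix 2-norm; same value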

Singular value decomposition SVD
It is easier to characterize the condition number in the 2-norm in terms of the singular values of $A$. To do that we need the

Definition
An $m \times n$ real matrix $A$ has the singular value decomposition
$A = U \Sigma V^T$
where
1 $U$ is an $m \times m$ orthogonal matrix
2 $\Sigma$ is an $m \times n$ diagonal real matrix
3 $V$ is an $n \times n$ orthogonal matrix

The non-negative diagonal entries $\{\sigma_k \ge 0\}$ of $\Sigma$ are called the singular values of $A$.
In our case, where $m > n$, there are $n$ positive singular values
$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$, if $A$ is of full rank.
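A minimal MATLAB sketch of the definition, on an arbitrary full-rank $3 \times 2$ example:

    A = [1 2; 3 4; 5 6];          % m = 3, n = 2, full rank
    [U, S, V] = svd(A);           % U is 3x3, S is 3x2, V is 2x2
    sigma = diag(S)               % singular values, sigma_1 >= sigma_2 > 0
    norm(U'*U - eye(3))           % ~0, so U is orthogonal
    norm(V'*V - eye(2))           % ~0, so V is orthogonal
    norm(A - U*S*V')              % ~0, so A = U*Sigma*V'
    norm(A, 2) - sigma(1)         % ~0, anticipating the result on the next slide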

The matrix 2-norm
Then from the definition above, we get
$\|A\|_2 = \sigma_1(A)$
Proof:

The pseudoinverse
The pseudoinverse $A^\dagger = (A^T A)^{-1} A^T$ can be expressed in terms of the SVD of $A$:
$A^\dagger = V \Sigma^\dagger U^T$
where $\Sigma^\dagger$ is the $n \times m$ diagonal matrix with entries $\{1/\sigma_k\}$.
Proof:
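Not the proof, just a numerical check of the two expressions for $A^\dagger$ in MATLAB, on an arbitrary full-rank example (the economy-size SVD is used, so the $\Sigma^\dagger$ below is simply an $n \times n$ diagonal matrix):

    A = [1 1; 1 2; 1 3];
    Adag_ne  = (A'*A) \ A';                  % (A'A)^{-1} A', via backslash rather than inv
    [U, S, V] = svd(A, 'econ');              % economy-size SVD: U is 3x2, S is 2x2
    Adag_svd = V * diag(1 ./ diag(S)) * U';  % V * Sigma^dagger * U'
    norm(Adag_ne - Adag_svd)                 % ~0
    norm(Adag_svd - pinv(A))                 % ~0, agrees with MATLAB's pinv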

The condition number of a rectangular matrix
By the same argument as before, we get
$\|A^\dagger\|_2 = 1/\sigma_n(A)$
so the condition number of a rectangular matrix is given by
$\kappa_2(A) = \|A\|_2 \, \|A^\dagger\|_2 = \sigma_1(A)/\sigma_n(A)$
Proof:
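Again a numerical sanity check rather than the proof, on an arbitrary example matrix:

    A = [1 1; 1 1.001; 1 2];
    s = svd(A);                        % singular values, largest first
    s(1) / s(end)                      % sigma_1 / sigma_n
    cond(A)                            % MATLAB's cond computes the same ratio
    norm(A, 2) * norm(pinv(A), 2)      % ||A||_2 * ||Adagger||_2, the same number again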

Sensitivity of the normal equations
The normal equations
$A^T A x = A^T b$
are a linear system, so the sensitivity is given by the condition number of $A^T A$. But
$\kappa_2(A^T A) = \kappa_2(A)^2$
Proof:
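Separately from the proof, the squaring is easy to see numerically; a small sketch with an arbitrary quadratic-fit design matrix:

    t = (0:0.1:1)';
    A = [ones(size(t)), t, t.^2];   % design matrix for fitting a quadratic
    cond(A)                         % kappa_2(A)
    cond(A'*A)                      % kappa_2(A'A)
    cond(A)^2                       % matches the previous line, up to rounding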

Sensitivity of the LSQ problem
It turns out that the condition number of the LSQ problem is
$\approx \kappa_2(A)$ if the fit to the data is good (not much scatter)
$\approx \kappa_2(A)^2$ if the fit to the data is poor (a lot of scatter)
$\implies$ using the normal equations worsens the conditioning of the problem (if the fit is good)
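A rough MATLAB experiment of this effect (all data below are arbitrary illustration values, and a single random perturbation only indicates the trend): perturb $A$ slightly and compare the relative change in the LSQ solution when the residual is small versus large.

    rng(1);
    t = (0:0.05:1)';
    A = [ones(size(t)), t, t.^2];            % modestly ill-conditioned design matrix
    xtrue = [1; -2; 0.5];
    for resid = [1e-6, 1]                    % small residual, then large residual
        b  = A*xtrue + resid*randn(size(t)); % good fit vs poor fit
        x  = A \ b;
        dA = 1e-8 * randn(size(A));          % tiny perturbation of the data matrix
        xp = (A + dA) \ b;
        fprintf('residual scale %g: relative change in x = %.1e\n', ...
                resid, norm(xp - x)/norm(x));
    end
    fprintf('kappa(A) = %.1e, kappa(A)^2 = %.1e\n', cond(A), cond(A)^2);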

Better ways to solve the LSQ problem
If A is of full rank, use another matrix factorization — the QR factorization
For rank-deficient matrices, use the QR factorization with column pivoting (MATLAB) aka Rank-revealing QR
or the singular value decomposition SVD
We’ll assume A is of full rank.

QR factorization
The idea of ('economy-size') QR factorization is: form a factorization
$A = QR$
where $Q$ is an $m \times n$ matrix with orthonormal columns (i.e. $Q^T Q = I_n$) and $R$ is an upper triangular $n \times n$ matrix. To solve $Ax = b$ (in the least-squares sense):
$QRx = b \implies Q^T Q R x = Q^T b \implies Rx = Q^T b$
so we solve a triangular system for $x$ in $O(n^2)$ ops!
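A minimal sketch of this recipe in MATLAB, fitting a straight line to made-up data:

    t = (0:5)';
    b = [1.1; 1.9; 3.2; 3.8; 5.1; 6.0];   % made-up measurements
    A = [ones(size(t)), t];               % fit b ~ c0 + c1*t
    [Q, R] = qr(A, 0);                    % economy-size QR: Q is 6x2, R is 2x2
    x = R \ (Q' * b)                      % back-substitution on Rx = Q'b
    norm(x - A\b)                         % ~0, agrees with MATLAB's backslash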

Gram-Schmidt process
You’ve seen a QR factorization before (in disguise) in Gram-Schmidt orthogonalization:
given a set of linearly independent vectors $\{a_1, a_2, \dots, a_n\}$ forming an $n$-dimensional subspace of $\mathbb{R}^m$, Gram-Schmidt orthogonalization produces a set of orthonormal vectors $\{q_1, q_2, \dots, q_n\}$, an orthonormal basis of the same subspace. Collecting the coefficients of the orthogonalization gives exactly
$A = QR$
i.e. it uses triangular transformations to produce an orthogonal matrix. We don't do it this way because it's numerically unstable; instead we use orthogonal transformations to turn $A$ into $R$.
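For illustration only, here is classical Gram-Schmidt written out as a QR factorization, a sketch of the unstable approach described above (save it as gram_schmidt_qr.m, a hypothetical helper name; in practice use qr):

    function [Q, R] = gram_schmidt_qr(A)
    % Classical Gram-Schmidt: orthonormalise the columns of A, recording
    % the coefficients in R so that A = Q*R. Assumes A has full column rank.
        [m, n] = size(A);
        Q = zeros(m, n);
        R = zeros(n, n);
        for k = 1:n
            v = A(:, k);
            for j = 1:k-1
                R(j, k) = Q(:, j)' * A(:, k);   % component of a_k along q_j
                v = v - R(j, k) * Q(:, j);      % remove that component
            end
            R(k, k) = norm(v);
            Q(:, k) = v / R(k, k);              % normalise what is left
        end
    end

For well-conditioned $A$, [Q, R] = gram_schmidt_qr(A) gives norm(Q'*Q - eye(n)) close to machine precision, but the loss of orthogonality grows rapidly as $A$ becomes ill-conditioned, which is the instability referred to above.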

Orthogonal transformations
Orthogonal transformations are good because:
they involve perfectly-conditioned matrices
they don't change the conditioning of the problem
they don't change the solution of the LSQ problem
Proofs:
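Not the proofs, just a numerical sanity check on arbitrary random matrices:

    A = randn(6, 3);  b = randn(6, 1);   % an arbitrary LSQ problem
    [Q, ~] = qr(randn(6));               % a random 6x6 orthogonal matrix
    cond(Q)                              % = 1: perfectly conditioned
    norm(Q*A, 2) - norm(A, 2)            % ~0: 2-norm unchanged
    cond(Q*A) - cond(A)                  % ~0: conditioning of the problem unchanged
    norm((Q*A)\(Q*b) - A\b)              % ~0: LSQ solution unchanged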

Complexity of QR
The QR factorization takes $n^2(m - n/3)$ ops,
i.e. for $m \gg n$, $\approx$ twice as expensive as a Cholesky factorization of the normal equations,
but allows us to handle a larger class of matrices.

Example
For square systems, we can use QR (normwise backward stable), which takes $2n^3/3$ ops:
twice as expensive as GEPP, but no issues re growth factor etc.
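A rough timing sketch of the $m \gg n$ comparison above (sizes are arbitrary, and wall-clock times depend on the machine and BLAS, so the ratio only loosely reflects the operation counts):

    m = 4000;  n = 400;
    A = randn(m, n);
    tic;  [Q, R] = qr(A, 0);           t_qr = toc;   % economy-size QR
    tic;  C = A'*A;  Rc = chol(C);     t_ne = toc;   % normal equations + Cholesky
    fprintf('QR: %.3f s,  normal equations: %.3f s\n', t_qr, t_ne);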

QR in MATLAB
I have described what MATLAB calls the 'economy-size' QR factorization
$A = QR$
where $Q$ is an $m \times n$ matrix with orthonormal columns and $R$ is an upper triangular $n \times n$ matrix. This is all we need for the LSQ problem.
MATLAB by default produces the 'full' QR factorization
$A = \bar{Q}\bar{R}$
where $\bar{Q}$ is an $m \times m$ orthogonal matrix and $\bar{R}$ is an $m \times n$ upper triangular matrix:
$\bar{Q} = [\,Q \mid \text{extra orthogonal columns}\,], \qquad \bar{R} = \begin{bmatrix} R \\ 0 \end{bmatrix}$
so that $\bar{Q}^T \bar{Q} = I_m$. The extra columns are never used in the LSQ problem.
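In MATLAB the two forms are obtained as follows (an arbitrary example matrix); the economy-size factors are just the leading blocks of the full ones:

    A = randn(5, 2);
    [Qfull, Rfull] = qr(A);        % full QR: Qfull is 5x5, Rfull is 5x2
    [Qecon, Recon] = qr(A, 0);     % economy-size QR: Qecon is 5x2, Recon is 2x2
    norm(Qfull(:, 1:2) - Qecon)    % ~0: Q is the first two columns of Qbar
    norm(Rfull(1:2, :) - Recon)    % ~0: R sits on top of the zero block in Rbar
    norm(Rfull(3:end, :))          % = 0: the rows below R are exactly zero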

Using QR
Since $A^T A = R^T Q^T Q R = R^T R$, $R$ is the Cholesky factor of $A^T A$. Hence to solve the LSQ problem, we could use:
1 [q,r]=qr(A,0); x=r\(q'*b);   (easiest to understand)
2 r=triu(qr(A)); x=r\(r'\(A'*b));   (better, since we never need to form Q)
3 x=A\b;
\ acting on an overdetermined system does the same as 2 (unless $A^T A$ is rank-deficient).

The rank-deficient case
Suppose the matrix $A$ has rank $k < n$, i.e. it is rank-deficient (not of full rank). This means the columns of $A$ are not linearly independent. In this case there is no unique solution, and we usually use the solution with minimum 2-norm. This solution can be found by either of:
1 QR factorization with column pivoting, aka rank-revealing QR (RR-QR), based on $AE = QR$, obtained in MATLAB by a 3-output call to qr
2 the SVD
In the latter case, let
$A = U \Sigma V^T = [\,U_1 \; U_2\,] \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} [\,V_1 \; V_2\,]^T$
then
$x_{LS} = A^\dagger b = V_1 \Sigma_1^{-1} U_1^T b$
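A sketch of the SVD route in MATLAB (the matrix and right-hand side are arbitrary examples, and the tolerance used to decide the numerical rank is a common heuristic, not the only choice):

    A = [1 0 1; 0 1 1; 1 1 2; 2 1 3];     % 4x3, rank 2 (column 3 = column 1 + column 2)
    b = [1; 2; 3; 4];
    [U, S, V] = svd(A);
    s = diag(S);
    k = sum(s > max(size(A))*eps(s(1)));   % numerical rank, here k = 2
    U1 = U(:, 1:k);  S1 = S(1:k, 1:k);  V1 = V(:, 1:k);
    x_min = V1 * (S1 \ (U1' * b))          % x_LS = V1 * inv(Sigma_1) * U1' * b
    norm(x_min - pinv(A)*b)                % ~0: pinv gives the same minimum-norm solution
    [Q, R, E] = qr(A);                     % rank-revealing QR: 3-output call, A*E = Q*R

End of Week 8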