
Euclidean spaces
Linear algebra review
Daniel Hsu (COMS 4771)
For each natural number n, the n-dimensional Euclidean space is denoted by Rn, and it is a vector space over the real field R (i.e., Rn is a real vector space). Vectors v1, …, vk ∈ Rn are linearly dependent if there exist c1, …, ck ∈ R, not all zero, such that c1v1 + ··· + ckvk = 0. If v1, …, vk ∈ Rn are not linearly dependent, then we say they are linearly independent. The span of v1, …, vk, denoted by span{v1, …, vk}, is the space of all linear combinations of v1, …, vk, i.e., span{v1, …, vk} = {c1v1 + ··· + ckvk : c1, …, ck ∈ R}. The span of a collection of vectors from Rn is a subspace of Rn, which is itself a real vector space in its own right.

If v1, …, vk ∈ Rn are linearly independent, then they form an (ordered) basis for span{v1, …, vk}. In this case, for every vector u ∈ span{v1, …, vk}, there is a unique choice of c1, …, ck ∈ R such that u = c1v1 + ··· + ckvk.
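A quick numerical illustration may help; the following sketch (using NumPy, with made-up vectors chosen only for illustration) checks linear independence by computing the rank of the matrix whose columns are the given vectors, and recovers the unique coefficients of a vector in their span.

import numpy as np

# Hypothetical example vectors in R^3 (for illustration only).
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 2.0, 1.0])
V = np.column_stack([v1, v2])          # columns are v1, v2

# v1, v2 are linearly independent iff the matrix [v1 v2] has rank 2.
print(np.linalg.matrix_rank(V) == 2)   # True

# A vector u in span{v1, v2} has a unique coefficient vector c with V c = u;
# least squares recovers it (exactly, up to round-off).
u = 3.0 * v1 - 2.0 * v2
c, *_ = np.linalg.lstsq(V, u, rcond=None)
print(np.allclose(V @ c, u), c)        # True [ 3. -2.]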
We agree on a special ordered basis e1, …, en for Rn, which we call the standard coordinate basis for Rn. This ordered basis defines a coordinate system, and we write vectors v ∈ Rn in terms of this coordinate system, as v = (v1, …, vn) = Σ_{i=1}^n vi ei. The (Euclidean) inner product (or dot product) on Rn will be written either using the transpose notation, u⊤v, or the angle bracket notation, ⟨u,v⟩. In terms of their coordinates u = (u1, …, un) and v = (v1, …, vn), we have
⟨u,v⟩ = Σ_{i=1}^n ui vi.

The (Euclidean) norm will be written as ∥v∥2 = √⟨v,v⟩. The inner product satisfies the Cauchy-Schwarz inequality,

⟨u,v⟩ ≤ ∥u∥2 ∥v∥2,   u, v ∈ Rn,

as well as the polarization identity

⟨u,v⟩ = (∥u + v∥2^2 − ∥u − v∥2^2)/4,   u, v ∈ Rn.
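As a sanity check, here is a minimal NumPy sketch (the vectors are drawn at random, purely for illustration) that verifies the coordinate formula for the inner product, the Cauchy-Schwarz inequality, and the polarization identity numerically.

import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

ip = np.sum(u * v)                      # <u, v> = sum_i u_i v_i
print(np.isclose(ip, u @ v))            # True: same as the dot product

# Cauchy-Schwarz: <u, v> <= ||u||_2 ||v||_2
print(ip <= np.linalg.norm(u) * np.linalg.norm(v))            # True

# Polarization identity: <u, v> = (||u + v||^2 - ||u - v||^2) / 4
pol = (np.linalg.norm(u + v) ** 2 - np.linalg.norm(u - v) ** 2) / 4
print(np.isclose(ip, pol))                                    # True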
The vectors e1, …, en are orthogonal, i.e., ⟨ei,ej⟩ = 0 for i ≠ j. Each ei is also a unit vector, i.e., ∥ei∥2 = 1. A collection of orthogonal unit vectors is said to be orthonormal, so the basis e1, …, en is orthonormal.

Linear maps

Linear maps A: Rn → Rm between Euclidean spaces Rn and Rm are written as matrices in Rm×n, using the standard coordinate bases in the respective Euclidean spaces:
A = [ A1,1 ··· A1,n ]
    [  ⋮    ⋱   ⋮  ]
    [ Am,1 ··· Am,n ].
The adjoint A⊤: Rm → Rn of a linear map A: Rn → Rm is written using the transpose notation:

⟨u, Av⟩ = ⟨A⊤u, v⟩,   u ∈ Rm, v ∈ Rn.

In matrix notation, we also have

A⊤ = [ A1,1 ··· Am,1 ]
     [  ⋮    ⋱   ⋮  ]
     [ A1,n ··· Am,n ].

Note that (A⊤)⊤ = A. Composition of linear maps A: Rn → Rm and B: Rp → Rn is obtained by matrix multiplication: C = AB, where

Ci,j = Σ_{k=1}^n Ai,k Bk,j,   i = 1, …, m; j = 1, …, p.

The adjoint of the composition AB is the composition of the adjoints in reverse order: (AB)⊤ = B⊤A⊤. In the context of matrix multiplication, vectors v ∈ Rn shall be regarded as column vectors, so

Av = [ A1,1 ··· A1,n ] [ v1 ]   [ Σ_{j=1}^n A1,j vj ]
     [  ⋮    ⋱   ⋮  ] [ ⋮  ] = [         ⋮         ]
     [ Am,1 ··· Am,n ] [ vn ]   [ Σ_{j=1}^n Am,j vj ].

If the j-th column of A is aj ∈ Rm (for each j = 1, …, n), then Av = Σ_{j=1}^n vj aj. Note that this is consistent with the transpose notation for inner products u⊤v. If the i-th row of A is ai⊤ for some ai ∈ Rn (for each i = 1, …, m), then Av = (a1⊤v, …, am⊤v). The outer product of vectors u ∈ Rm and v ∈ Rn refers to uv⊤ ∈ Rm×n:

uv⊤ = [ u1 ]                 [ u1v1 ··· u1vn ]
      [ ⋮  ] [ v1 ··· vn ] = [  ⋮    ⋱   ⋮  ]
      [ um ]                 [ umv1 ··· umvn ].
Fundamental subspaces
With every linear map A: Rn → Rm, we associate four fundamental subspaces: range(A), range(A⊤), null(A), and null(A⊤). The range of a linear map A: Rn → Rm, denoted range(A), is the subspace {Av : v ∈ Rn} ⊆ Rm. Its dimension is the rank of A, denoted rank(A). When A ∈ Rm×n is regarded as a matrix, the range of A is the same as the column space of A (so if the columns of A are the vectors a1, …, an ∈ Rm, then range(A) = span{a1, …, an}). The row space of A is the column space of A⊤. The null space of A, denoted null(A), is the subspace {v ∈ Rn : Av = 0} ⊆ Rn. We always have

rank(A) = rank(A⊤)

and

n = dim(null(A)) + rank(A).
In particular, if rank(A) = n, which is equivalent to the columns of A being linearly independent, then null(A) = {0}. The subspaces range(A) and null(A⊤) are orthogonal, written range(A) ⊥ null(A⊤), meaning every u ∈ range(A) and v ∈ null(A⊤) have ⟨u,v⟩ = 0. Similarly, range(A⊤) and null(A) are orthogonal.
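A small numerical check of these facts (with an illustrative rank-deficient matrix; the null-space bases here are computed from the SVD, which is one of several possible ways and not part of the notes above):

import numpy as np

# A 4x3 matrix whose third column is the sum of the first two, so rank(A) = 2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 0., 2.]])
n = A.shape[1]

r = np.linalg.matrix_rank(A)
print(r, np.linalg.matrix_rank(A.T))       # 2 2  (rank(A) = rank(A^T))

U, s, Vt = np.linalg.svd(A)
N = Vt[r:].T                               # columns form a basis for null(A)
W = U[:, r:]                               # columns form a basis for null(A^T)
print(n == N.shape[1] + r)                 # True: n = dim(null(A)) + rank(A)
print(np.allclose(A @ N, 0))               # True: A maps null(A) to 0
# range(A) is orthogonal to null(A^T): every column of A is orthogonal to every column of W
print(np.allclose(W.T @ A, 0))             # True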
We can obtain an orthonormal basis v1, v2, … for range(A) using the Gram-Schmidt orthogonalization process, which is given as follows. Let a1, …, an denote the columns of A. Then for i = 1, 2, …: (1) if all aj are zero, then stop; (2) pick a non-zero aj; (3) let vi := aj/∥aj∥2; (4) replace each aj with aj − ⟨vi, aj⟩vi.
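Here is one way the process above might be written in code (a sketch, not optimized; the column selection rule and the zero test via a small tolerance are implementation choices, and the example matrix is made up):

import numpy as np

def gram_schmidt(A, tol=1e-10):
    """Return a matrix whose columns are an orthonormal basis for range(A),
    following the Gram-Schmidt process described above."""
    A = np.array(A, dtype=float)
    basis = []
    while True:
        # (1) stop if every remaining column is (numerically) zero
        norms = np.linalg.norm(A, axis=0)
        if np.all(norms <= tol):
            break
        # (2) pick a non-zero column a_j, (3) normalize it to get v_i
        j = int(np.argmax(norms))
        v = A[:, j] / norms[j]
        basis.append(v)
        # (4) replace each column a_j with a_j - <v_i, a_j> v_i
        A -= np.outer(v, v @ A)
    return np.column_stack(basis)

# Example: the columns of Q are orthonormal and span range(A).
A = np.array([[1., 1., 2.],
              [0., 1., 1.],
              [1., 0., 1.]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))   # True: orthonormal columns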
Linear operators
A linear operator A: Rn → Rn (where range(A) is a subspace of the domain) is represented by a square matrix A ∈ Rn×n. We say A is singular if dim(null(A)) > 0; if dim(null(A)) = 0, we say A is non-singular.

The identity map is denoted by I (or sometimes In to emphasize that it is the identity operator I: Rn → Rn for Rn), and I is clearly non-singular. Its matrix representation is

I = [ 1 0 ··· 0 ]
    [ 0 1 ··· 0 ]
    [ ⋮ ⋮  ⋱  ⋮ ]
    [ 0 0 ··· 1 ],
i.e., every diagonal entry is 1 and every off-diagonal entry is 0. (A matrix is diagonal if all of its off-diagonal entries are 0.) A linear operator is non-singular if and only if it has an inverse, denoted A−1, that satisfies AA−1 = A−1A = I. (So a synonym for non-singular is invertible.) A linear operator A is self-adjoint (or equivalently, its matrix representation is symmetric) if A = A⊤. If A and B are non-singular, then so is their composition AB; the inverse of AB is (AB)−1 = B−1A−1. Also, if A is non-singular, then so is A⊤, and its inverse is denoted by A−⊤.
A linear operator A is orthogonal if A⊤ is its inverse, i.e., A⊤ = A−1. From the matrix equation A⊤A = I, we see that if a1,…,an ∈ Rn are the columns of A, then A is orthogonal if and only if the vectors a1,…,an are orthonormal. If A is orthogonal, then for any vector v ∈ Rn, we have ∥Av∥2 = ∥v∥2 (Parseval’s identity).
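The following sketch checks the inverse rules and the norm-preservation property of orthogonal operators (the random matrices and the hand-picked plane rotation are arbitrary illustrations, not part of the notes):

import numpy as np

rng = np.random.default_rng(2)

# Inverse of a composition: (AB)^{-1} = B^{-1} A^{-1}
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))        # random matrices are non-singular with probability 1
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))   # True

# A plane rotation is orthogonal: Q^T Q = I, and ||Qv||_2 = ||v||_2
t = 0.7
Q = np.array([[np.cos(t), -np.sin(t), 0.],
              [np.sin(t),  np.cos(t), 0.],
              [0.,         0.,        1.]])
print(np.allclose(Q.T @ Q, np.eye(3)))                         # True: Q^T is the inverse of Q
v = rng.standard_normal(3)
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))    # True (Parseval's identity)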
A linear operator A: Rn → Rn is a projection operator (or projector) if AA = A (i.e., A is idempotent), Ax = x for all x ∈ range(A), and every x ∈ Rn can be written uniquely as x = y + z for some y ∈ range(A) and z ∈ null(A) (i.e., Rn is the direct sum of range(A) and null(A), written Rn = range(A) ⊕ null(A)). A projector A is an orthogonal projection operator (or orthoprojector) if A = A⊤. Every subspace of Rn has a unique orthoprojector. A generic way to obtain the orthoprojector Π for a subspace S is to start with an orthonormal basis u1, u2, … for S, and then form the sum of outer products Π := Σ_i ui ui⊤. For any orthoprojector Π, we have the Pythagorean theorem
∥v∥2^2 = ∥Πv∥2^2 + ∥(I − Π)v∥2^2,   v ∈ Rn.

In particular, for any v ∈ Rn and u ∈ range(Π),

∥u − v∥2^2 = ∥Πv − u∥2^2 + ∥v − Πv∥2^2.
Put another way, the orthogonal projection of v ∈ Rn is the closest point in range(Π) to v.
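Below is a sketch of the generic construction just described: given an orthonormal basis for a subspace S (obtained here via a QR factorization of arbitrary spanning vectors, purely for convenience), form Π = Σ_i ui ui⊤ and check idempotence, symmetry, and the Pythagorean identity.

import numpy as np

rng = np.random.default_rng(3)

# An orthonormal basis u_1, u_2 for a 2-dimensional subspace S of R^4.
S = rng.standard_normal((4, 2))
U, _ = np.linalg.qr(S)                     # columns of U are orthonormal and span S

# Orthoprojector for S: Pi = sum_i u_i u_i^T = U U^T
Pi = U @ U.T
print(np.allclose(Pi @ Pi, Pi))            # True: idempotent
print(np.allclose(Pi, Pi.T))               # True: self-adjoint

# Pythagorean theorem: ||v||^2 = ||Pi v||^2 + ||(I - Pi) v||^2
v = rng.standard_normal(4)
lhs = np.linalg.norm(v) ** 2
rhs = np.linalg.norm(Pi @ v) ** 2 + np.linalg.norm(v - Pi @ v) ** 2
print(np.isclose(lhs, rhs))                # True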
Determinants
The determinant of A ∈ Rn×n, written det(A), is defined by

det(A) = Σ_σ sgn(σ) Π_{i=1}^n Ai,σ(i),

where the summation is over all permutations σ of {1, …, n}, and sgn(σ) is the sign of the permutation σ (which takes value either 1 or −1). When the n² entries of the matrix A are viewed as formal variables, the determinant can be regarded as a degree-n multivariate polynomial in these variables. A linear operator A is non-singular if and only if det(A) ≠ 0.
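For small n, the permutation-sum definition can be evaluated directly; the sketch below (with made-up matrices) compares it against np.linalg.det and checks the non-singularity criterion.

import numpy as np
from itertools import permutations

def det_by_permutations(A):
    """Evaluate det(A) from the permutation-sum definition (only sensible for tiny n)."""
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        # sgn(sigma) = (-1)^(number of inversions in sigma)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        sign = -1.0 if inversions % 2 else 1.0
        prod = 1.0
        for i in range(n):
            prod *= A[i, sigma[i]]
        total += sign * prod
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(np.isclose(det_by_permutations(A), np.linalg.det(A)))   # True

# A singular matrix has determinant 0 (here the rows are linearly dependent).
B = np.array([[1., 2.], [2., 4.]])
print(np.isclose(np.linalg.det(B), 0.0))                       # True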
Eigenvectors and eigenvalues
A scalar λ is an eigenvalue of a linear operator A: Rn → Rn if there is a non-zero vector v such that Av = λv. This vector v is an eigenvector corresponding to the eigenvalue λ. Note that corresponding eigenvectors are not unique; if v is an eigenvector corresponding to λ, then so is cv for any c ≠ 0. Every linear operator has eigenvalues. To see this, observe the following equivalences:
λ is an eigenvalue of A
⇔ there exists v ≠ 0 such that Av = λv
⇔ there exists v ≠ 0 such that (A − λI)v = 0
⇔ A − λI is singular
⇔ det(A − λI) = 0.
The function λ ↦ det(A − λI) is a degree-n polynomial in λ, and hence it has n roots (where some roots may be repeated, and some may be complex).[1] The determinant of A ∈ Rn×n is the product of its n eigenvalues.
An important special case is when A is self-adjoint (i.e., A is symmetric). In this case, all n of its eigenvalues λ1, …, λn are real, and all corresponding eigenvectors are vectors in Rn. In particular, there is a collection of n corresponding eigenvectors v1, …, vn, where vi corresponds to λi, such that v1, …, vn are orthonormal, and A = Σ_{i=1}^n λi vi vi⊤. When all of the eigenvalues are non-negative, we say A is positive semi-definite (psd); when all of the eigenvalues are positive, we say A is positive definite.
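For a symmetric matrix, np.linalg.eigh returns real eigenvalues and orthonormal eigenvectors; the sketch below (with a randomly generated psd matrix, for illustration) reconstructs A = Σ_i λi vi vi⊤ and checks positive semi-definiteness and the determinant-as-product fact.

import numpy as np

rng = np.random.default_rng(4)

# Any matrix of the form M^T M is symmetric and positive semi-definite.
M = rng.standard_normal((3, 3))
A = M.T @ M

eigvals, V = np.linalg.eigh(A)           # eigenvalues ascending; columns of V are eigenvectors
print(np.allclose(V.T @ V, np.eye(3)))   # True: eigenvectors are orthonormal
print(np.all(eigvals >= -1e-12))         # True: psd, all eigenvalues non-negative (up to round-off)

# Spectral decomposition: A = sum_i lambda_i v_i v_i^T
A_rebuilt = sum(lam * np.outer(v, v) for lam, v in zip(eigvals, V.T))
print(np.allclose(A, A_rebuilt))         # True

# det(A) is the product of the eigenvalues.
print(np.isclose(np.linalg.det(A), np.prod(eigvals)))   # True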
The trace of a matrix A ∈ Rn×n, denoted tr(A), is the sum of its diagonal entries. The sum of the n eigenvalues of A is equal to the trace of A. For symmetric matrices A, this fact can be easily deduced from the fact that tr: Rn×n → R is linear. Indeed, let v1, …, vn be orthonormal eigenvectors corresponding to the eigenvalues λ1, …, λn of A. Then
tr(A) = tr( Σ_{i=1}^n λi vi vi⊤ ) = Σ_{i=1}^n λi tr(vi vi⊤) = Σ_{i=1}^n λi.
The last step uses the fact that tr(vivi⊤) = tr(vi⊤vi) = vi⊤vi = 1. This is a special case of a more general
identity: if A and B are matrices such that both AB and BA are square, then tr(AB) = tr(BA).
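Finally, a quick numerical check of the trace facts (random matrices, for illustration only): the sum of eigenvalues equals the trace, and tr(AB) = tr(BA) even when AB and BA have different sizes.

import numpy as np

rng = np.random.default_rng(5)

# Sum of eigenvalues equals the trace (illustrated on a symmetric matrix).
M = rng.standard_normal((4, 4))
S = (M + M.T) / 2
print(np.isclose(np.trace(S), np.sum(np.linalg.eigvalsh(S))))    # True

# tr(AB) = tr(BA) whenever both products are square (here AB is 3x3, BA is 5x5).
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))              # True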
[1] If you are not a fan of determinants, you may prefer the approach from http://www.axler.net/DwD.pdf to deduce the existence of eigenvalues.