Principal Components Analysis
Chris Hansman
Empirical Finance: Methods and Applications Imperial College Business School
February 15-16
Today: Four Parts
1. Geometric Interpretation of Eigenvalues and Eigenvectors
2. Geometric Interpretation of Correlation Matrices
3. An Introduction to PCA
4. An Example of PCA
Topic 1: Geometry of Eigenvalues and Eigenvectors
1. Technical definitions of eigenvalues and eigenvectors
2. Geometry of matrix multiplication: rotate and stretch
3. Eigenvectors are only stretched
4. Length of eigenvectors doesn't matter
A Review of Eigenvalues and Eigenvectors
Consider a square n×n matrix A.

An eigenvalue λi of A is a scalar. The corresponding eigenvector v⃗i is an (n×1) vector, where λi and v⃗i satisfy:

A v⃗i = λi v⃗i
Geometric Interpretation of Eigenvalues and Eigenvectors
Consider the square n×n matrix A.
A times any (n×1) vector gives an (n×1) vector.

It is useful to think of this as a linear function that takes (n×1) vectors as inputs and gives (n×1) vectors as outputs:

f : Rⁿ → Rⁿ

Specifically, for the input vector v⃗, this is the function that outputs:

f(v⃗) = A v⃗
Geometric Interpretation of Eigenvalues and Eigenvectors
Consider the square n×n matrix A.
Think of this matrix as the function that maps vectors to vectors:
Let's say

A = [5 0; 2 3]  and  v⃗ = (2, 1)′

What is f(v⃗)? (menti.com)
The Matrix A Can be Thought of as a Function
Consider the square n×n matrix A.
Think of this matrix as the function that maps vectors to vectors:
Let's say

A = [5 0; 2 3]  and  v⃗ = (1, 0)′

What is f(v⃗)?

f(v⃗) = A v⃗ = (5, 2)′
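A quick check of this matrix-as-function idea in R (the language these slides use later for PCA); the matrix and vectors are the ones from the examples above:

```r
A <- matrix(c(5, 2, 0, 3), nrow = 2)  # rows (5 0) and (2 3)
f <- function(v) A %*% v              # f(v) = Av

f(c(1, 0))  # (5, 2)'
f(c(2, 1))  # (10, 7)' -- the earlier menti example
```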
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗ = (1, 0)′  ⇒  A v⃗ = (5, 2)′

[Figure: v⃗ = (1 0)′ and A v⃗ = (5 2)′ plotted in the plane]
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗ = (−1, 2)′  ⇒  A v⃗ = (−5, 4)′

[Figure: v⃗ = (−1 2)′ and A v⃗ = (−5 4)′ plotted in the plane]
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗2 = (0, 1)′  ⇒  A v⃗2 = (0, 3)′ = 3 v⃗2

[Figure: v⃗2 = (0 1)′ and A v⃗2 = (0 3)′ = 3 v⃗2 plotted in the plane]
For Some Vectors v⃗, Matrix A Only Stretches

Let's say

A = [5 0; 2 3]  and  v⃗2 = (0, 1)′  ⇒  A v⃗2 = (0, 3)′ = 3 v⃗2

Some vectors, like v⃗2 = (0, 1)′, have a special relationship with A: the matrix A only stretches v⃗2. No rotation!

Are there any other vectors like v⃗2 = (0, 1)′? Let's try v⃗1 = (1, 1)′.
For Some Vectors v⃗, Matrix A Only Stretches

Let's say

A = [5 0; 2 3]  and  v⃗1 = (1, 1)′  ⇒  A v⃗1 = (5, 5)′ = 5 v⃗1

[Figure: v⃗1 = (1 1)′ and A v⃗1 = (5 5)′ = 5 v⃗1 plotted in the plane]
For Some Vectors v⃗, Matrix A Only Stretches

For the matrix A, we've found two vectors with this special property:

v⃗1 = (1, 1)′  with  A v⃗1 = (5, 5)′ = 5 v⃗1
v⃗2 = (0, 1)′  with  A v⃗2 = (0, 3)′ = 3 v⃗2

We call these vectors eigenvectors of the matrix A.

Note that they get stretched by different factors: 5 for v⃗1, 3 for v⃗2. We call these stretching factors eigenvalues:

λ1 = 5,  λ2 = 3
Defining Eigenvalues and Eigenvectors
This notion of only stretching is the defining feature of eigenvalues and eigenvectors.

Eigenvalue λi and corresponding eigenvector v⃗i are λi, v⃗i such that:

A v⃗i = λi v⃗i

In our example:

A v⃗1 = λ1 v⃗1:  [5 0; 2 3] (1, 1)′ = 5 (1, 1)′

And:

A v⃗2 = λ2 v⃗2:  [5 0; 2 3] (0, 1)′ = 3 (0, 1)′
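A quick check of this example in R; note that eigen() returns eigenvectors scaled to unit length, so they come back as rescaled versions of (1, 1)′ and (0, 1)′:

```r
A <- matrix(c(5, 2, 0, 3), nrow = 2)  # rows (5 0) and (2 3)

e <- eigen(A)
e$values   # 5 and 3
e$vectors  # columns: (0.707, 0.707)' and (0, 1)' -- unit-length (1,1)' and (0,1)'
```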
Length of Eigenvector Doesn't Change Anything

Imagine multiplying an eigenvector by some constant (e.g. 1/2):

A = [5 0; 2 3]  and  v⃗1 = (0.5, 0.5)′  ⇒  A v⃗1 = (2.5, 2.5)′ = 5 v⃗1

[Figure: v⃗1 = (0.5 0.5)′ and A v⃗1 = (2.5 2.5)′ = 5 v⃗1 plotted in the plane]
Length of Eigenvector Doesn't Change Anything

Imagine multiplying an eigenvector by some constant (e.g. 2):

A = [5 0; 2 3]  and  v⃗2 = (0, 2)′  ⇒  A v⃗2 = (0, 6)′ = 3 v⃗2

[Figure: v⃗2 = (0 2)′ and A v⃗2 = (0 6)′ = 3 v⃗2 plotted in the plane]
Length of Eigenvector Doesn’t Change Anything
Any multiple of an eigenvector is also an eigenvector: if v⃗i is an eigenvector, so is c v⃗i for any scalar c.

As a result, we often normalize them so that they have unit length, i.e. v⃗i′ v⃗i = 1.

Best to think of an eigenvector v⃗i as a direction, and of the eigenvalue λi as a stretching factor.
Finding Eigenvalues of Symmetric Matrices

From here, we focus on symmetric matrices (like the covariance Σx). How do we calculate the eigenvalues?

Use a computer. But if you have to, in the 2×2 case, for

A = [a b; b d]

λ1 = [(a+d) + √((a−d)² + 4b²)] / 2
λ2 = [(a+d) − √((a−d)² + 4b²)] / 2

What are the eigenvalues of:

A = [7 0; 0 2]

(menti.com)
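A short R sketch of this 2×2 formula, checked against eigen():

```r
# Eigenvalues of a symmetric 2x2 matrix [a b; b d] via the closed form
eig2x2 <- function(a, b, d) {
  root <- sqrt((a - d)^2 + 4 * b^2)
  c((a + d + root) / 2, (a + d - root) / 2)
}

eig2x2(2, 1, 2)                            # 3 and 1 -- the correlated example used later
eigen(matrix(c(2, 1, 1, 2), 2, 2))$values  # same answer from eigen()
```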
Finding Eigenvectors of Symmetric Matrices

From here, we will focus on symmetric matrices (like Σx).

Given the eigenvalues, how do we calculate the eigenvectors? Again, use a computer. But if you have to, simply solve:

A v⃗i = λi v⃗i

Important note: Symmetric matrices have orthogonal eigenvectors, that is:

v⃗i′ v⃗j = 0  for any i ≠ j
Finding Eigenvalues of Diagonal Matrices
Diagonal matrices are a subset of symmetric matrices:

A = [a 0; 0 d]

How do we calculate the eigenvalues and eigenvectors? Here we can read them off directly: the eigenvalues are the diagonal entries a and d, with eigenvectors (1, 0)′ and (0, 1)′.
Topic 2: Geometric Interpretations of Correlation Matrices
1. Uncorrelated assets: eigenvalues are variances
2. Correlated assets: first eigenvector finds direction of maximum variance
Uncorrelated Standardized Data: z = (za, zb)′

[Figure: scatter of the standardized data, z1 against z2, both axes running from −10 to 10]

Cov(z) = Σz = [1 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, both axes running from −10 to 10]

Cov(x) = Σx = [4 0; 0 4]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, with eigenvectors V1 (along the x-axis) and V2 (along the y-axis) drawn in]

Cov(x) = Σx = [3 0; 0 1]
Eigenvalues of Σx with Uncorrelated Data

Σx = [3 0; 0 1]

What are the eigenvalues and eigenvectors of Σx?

Uncorrelated assets: the eigenvalues are the variances of each asset return! Eigenvectors:

v⃗1 = (1, 0)′,  v⃗2 = (0, 1)′

The first eigenvector points in the direction of the largest variance. We sometimes write the eigenvectors together as a matrix:

Γ = (v⃗1 v⃗2) = [1 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: the same scatter, with V1 drawn with length equal to its eigenvalue: ||V1|| = λ1 = 3]

Cov(x) = Σx = [3 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, now more spread out along the y-axis]

Cov(x) = Σx = [1 0; 0 3]
Eigenvalues of Σx with Uncorrelated Data

Σx = [1 0; 0 3]

What are the eigenvalues and eigenvectors of Σx?

With uncorrelated assets the eigenvalues are just the variances of each asset return! Eigenvectors:

v⃗1 = (0, 1)′,  v⃗2 = (1, 0)′

Note that the first eigenvector points in the direction of the largest variance. We sometimes write the eigenvectors together as a matrix:

Γ = (v⃗1 v⃗2) = [0 1; 1 0]
Correlated Data: x = (xa, xb)′

[Figure: scatter of xb against xa, tilted along the 45° line]

Cov(x) = Σx = [2 1; 1 2]
Eigenvalues of Σx with Correlated Data

Σx = [2 1; 1 2]

What are the eigenvalues and eigenvectors of Σx? With correlated assets the eigenvalues are a bit trickier. The eigenvalues are 3 and 1.

Which of the following is not an eigenvector of Σx?

r⃗ = (1/√2, 1/√2)′,  w⃗ = (−1/√2, 1/√2)′,  s⃗ = (2/√2, 1/√2)′

(menti.com)
Correlated Data: x = (xa, xb)′

[Figure: scatter of xb against xa with eigenvectors V1 = (1 1)′ and V2 = (−1 1)′ drawn along and across the 45° line]

Cov(x) = Σx = [2 1; 1 2]

The eigenvectors are Γ = (v⃗1 v⃗2) = [1/√2 −1/√2; 1/√2 1/√2]
Eigenvectors of Σx with Correlated Data

Σx = [2 1; 1 2]

Γ = (v⃗1 v⃗2) = [1/√2 −1/√2; 1/√2 1/√2]

Just as with uncorrelated data, the first eigenvector finds the direction with the most variability. The second eigenvector points in the direction that explains the maximum amount of the remaining variance.

Note that the two are perpendicular. This is the geometric implication of the fact that they are orthogonal:

v⃗i′ v⃗j = 0

The fact that they are orthogonal (and normalized to unit length) also implies:

Γ′ = Γ⁻¹
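A one-line numerical check of this in R: eigen() returns unit-length eigenvectors, so the eigenvector matrix satisfies Γ′Γ = I.

```r
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
Gamma <- eigen(Sigma_x)$vectors  # columns are unit-length eigenvectors

round(t(Gamma) %*% Gamma, 10)    # identity matrix, so Gamma' = Gamma^{-1}
```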
Eigenvalues of Σx with Correlated Data

Σx = [2 1; 1 2],  λ1 = 3,  λ2 = 1

The eigenvalues are the same as for our uncorrelated data. Note that the scatter plot looks quite similar to our uncorrelated data, just rotated a bit.

Imagine rotating the data so that the first eigenvector is lined up with the x-axis:
- The first eigenvalue is the variance (along the x-axis) of this rotated data.
- The second eigenvalue is the variance along the y-axis.
Eigenvalues Represent Variance along the Eigenvectors

[Figure: scatter of xb against xa with V1 = (1 1)′ and V2 = (−1 1)′ drawn in; a second plot shows the same data rotated so that V1 lies along the x-axis]

Cov(x) = Σx = [2 1; 1 2]
What is This Rotation?

So with a little rotation, we take our data drawn from x = (xa, xb)′ with

Σx = [2 1; 1 2]

and get back what looks like our uncorrelated data, which was generated by

Σ̃ = [3 0; 0 1]

How do we rotate x into this uncorrelated data? Γ′x
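A small R sketch of this rotation on simulated data (assuming the MASS package is available for drawing correlated normals): rotating by Γ′ produces data whose sample covariance is close to diag(3, 1).

```r
library(MASS)  # for mvrnorm

set.seed(1)
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
x <- mvrnorm(n = 10000, mu = c(0, 0), Sigma = Sigma_x)  # 10000 draws, one per row

Gamma <- eigen(Sigma_x)$vectors  # columns are v1, v2
rotated <- x %*% Gamma           # each row is (Gamma' x_t)'

round(cov(rotated), 2)           # approximately [3 0; 0 1]
```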
Topic 3: Introduction to Principal Components Analysis
Principal Components Analysis
This notion of rotation underlies the concept of Principal Components Analysis
Consider a general cross-section of returns on m assets, xt (m×1):

E[xt] = α,  Cov(xt) = Σx
Principal Components Analysis
xt (m×1),  E[xt] = α,  Cov(xt) = Σx

Define the normalized asset returns: x̃t = xt − α

Let the eigenvalues of Σx be given by:

λ1 ≥ λ2 ≥ λ3 ≥ ··· ≥ λm

Let the eigenvectors be given by:

v⃗1, v⃗2, v⃗3, ..., v⃗m
Principal Components Analysis
Cov(xt) = Σx

Note that the eigenvectors are orthogonal: v⃗i′ v⃗j = 0

Because the scaling doesn't matter, we can normalize: v⃗i′ v⃗i = 1

These scaled, normalized vectors are called orthonormal.

As before, let Γ be the matrix with eigenvectors as columns:

Γ = [v⃗1 v⃗2 ··· v⃗m]
Principal Components Analysis
Define the principal components variables as the rotation:

p = Γ′ x̃t   (m×1),  with  E[p] = 0

Or written out further:

p = ( v⃗1′(xt − α), v⃗2′(xt − α), ..., v⃗m′(xt − α) )′
Principal Components Analysis
Recall the eigendecomposition:

Σx = ΓΛΓ′

where Λ is the diagonal matrix of eigenvalues:

Λ = diag(λ1, ..., λm)

Hence:

Cov(p) = Cov(Γ′x̃t) = Γ′ Cov(x̃t) Γ = Γ′ΓΛΓ′Γ = Λ
Aside: Variance Decomposition
A nice result from linear algebra:

var(x1t) + ··· + var(xmt) = λ1 + ··· + λm

So the proportion of the total variance of xt that is explained by the i-th principal component (with eigenvalue λi) is simply:

λi / (λ1 + ··· + λm)
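This is the trace identity tr(Σx) = ∑λi; a quick R check using the covariance matrix from the earlier example:

```r
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
lambda <- eigen(Sigma_x)$values

sum(diag(Sigma_x))    # total variance: 4
sum(lambda)           # sum of eigenvalues: also 4
lambda / sum(lambda)  # share of variance explained: 0.75 and 0.25
```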
Principal Components Analysis
Our principal components variables provide a transformation of the data into variables that are:
- Uncorrelated (orthogonal)
- Ordered by how much of the total variance they explain (size of eigenvalue)

What if we have many assets m, but the first few (2, 5, 20) principal components explain most of the variation?

Idea: Use these as "factors". Dimension reduction!
Principal Components Analysis
Note that because Γ′ = Γ⁻¹:

xt = α + Γp

We can also partition Γ into the first K < m eigenvectors and the remaining m − K:

Γ = [Γ1 Γ2]

Partition p into its first K elements and the remaining m − K:

p = (p1′, p2′)′

We can then write:

xt = α + Γ1p1 + Γ2p2
Principal Components Analysis
xt = α + Γ1p1 + Γ2p2

This looks just like a factor model:

xt = α + B ft + εt

with:

B = Γ1,  ft = p1,  εt = Γ2p2

One minor difference:

Cov(εt) = Ψ = Γ2Λ2Γ2′

which is (likely) not diagonal, as assumed in factor analysis.
Implementing Principal Components Analysis
xt = α + Γ1p1 + Γ2p2

Recall the sample covariance matrix:

Σ̂x = (1/T) X̃X̃′

Calculate this, and perform the eigendecomposition (using a computer):

Σ̂x = ΓΛΓ′

We now have everything we need to compute the sample principal components at each t:

P = [p1 p2 ··· pT] = Γ′X̃   (m×T)
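A minimal R sketch of these steps. It assumes X is a T×m matrix of returns with one row per period (the transpose of the m×T X̃ on the slide), which is the usual layout in R:

```r
# X: T x m matrix of asset returns, one row per period (assumed given)
X_tilde <- scale(X, center = TRUE, scale = FALSE)  # demean each column

Sigma_hat <- cov(X_tilde)  # note: cov() uses 1/(T-1) rather than 1/T;
                           # the eigenvectors are unaffected
ed <- eigen(Sigma_hat)
Gamma <- ed$vectors        # m x m matrix of eigenvectors
lambda <- ed$values        # eigenvalues, largest first

P <- X_tilde %*% Gamma     # T x m: column j is the j-th principal component
```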
An Example of Principal Components Analysis
1. Principal Components on Yield Changes for US Treasuries
2. Most Patterns Explained by 2 Principal Components
Recall: Implementing Principal Components Analysis
In each period t, we see a vector of asset returns xt. We'll consider the covariance of asset returns:

Cov(xt) = Σx

Today, we'll consider the yields on US Treasury debt:
- Yields on T-bills, notes and bonds
- Constant Maturity Treasuries from 3 months to 20 years
The Yield Curve: May 24th, 2001
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
The Yield Curve: January 6th, 2003
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
The Yield Curve: January 4th, 2005
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
Constant Maturity Treasury Yields: 2001-2005
[Figure: time series of the nine constant-maturity yields (DGS3MO, DGS6MO, DGS1, DGS2, DGS3, DGS5, DGS7, DGS10, DGS20), 2001–2005]
Implementing Principal Components Analysis
What patterns do you see in the data? What two or three things characterize the data in a given period?

Idea behind principal components:
- Can we choose a small number of variables that characterize most of the variation in the data?
- Do these variables have an intuitive meaning that helps us understand the data itself?
Technical Point: We Will Actually Examine Differences
[Figure: the nine constant-maturity yield series in levels, 2001–2005]

xnt = Ynt − Yn,t−1
Technical Point: We Will Actually Examine Differences
[Figure: daily yield changes for the nine series, 2001–2005, ranging roughly from −0.4 to 0.2]

xnt = Ynt − Yn,t−1
Setting Up the Basics

xt is the cross-section of yield changes (at time t):

xt = ( Y3 month,t − Y3 month,t−1,  Y6 month,t − Y6 month,t−1,  ...,  Y20 year,t − Y20 year,t−1 )′

We denote the mean by α = E[xt], and will work with demeaned values: x̃t = xt − α
Implementing Principal Components Analysis
First step: compute the sample covariance matrix Σ̂x of x̃t.

Below is the (slightly more intuitive) correlation matrix, with maturities ordered 3M, 6M, 1Y, 2Y, 3Y, 5Y, 7Y, 10Y, 20Y:

1.000 0.815 0.641 0.480 0.443 0.395 0.341 0.312 0.236
0.815 1.000 0.863 0.696 0.651 0.592 0.539 0.499 0.411
0.641 0.863 1.000 0.874 0.833 0.779 0.733 0.694 0.610
0.480 0.696 0.874 1.000 0.972 0.919 0.882 0.842 0.761
0.443 0.651 0.833 0.972 1.000 0.955 0.922 0.888 0.812
0.395 0.592 0.779 0.919 0.955 1.000 0.974 0.953 0.894
0.341 0.539 0.733 0.882 0.922 0.974 1.000 0.979 0.942
0.312 0.499 0.694 0.842 0.888 0.953 0.979 1.000 0.960
0.236 0.411 0.610 0.761 0.812 0.894 0.942 0.960 1.000
Implementing Principal Components Analysis
With Σ̂x in hand, we can perform the eigendecomposition:

Σ̂x = ΓΛΓ′

where Λ is the diagonal matrix of eigenvalues:

Λ = diag(λ1, ..., λm)

and Γ is the orthogonal matrix of eigenvectors.

This is easy in R: eigen(Sigma_x)
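Putting the whole pipeline together in R. This is a sketch: it assumes a data frame yields with one row per day and one column per maturity (e.g. the FRED-style codes in the legend above); the object names are illustrative.

```r
# yields: data frame of daily constant-maturity yields, one column per maturity
X <- apply(as.matrix(yields), 2, diff)             # yield changes x_nt = Y_nt - Y_n,t-1
X_tilde <- scale(X, center = TRUE, scale = FALSE)  # demean

Sigma_hat <- cov(X_tilde)      # sample covariance of the changes
ed <- eigen(Sigma_hat)         # eigendecomposition: Gamma and Lambda
Gamma <- ed$vectors
lambda <- ed$values

P <- X_tilde %*% Gamma         # principal components, one column per component
lambda / sum(lambda)           # share of variance explained by each component

# prcomp(X, center = TRUE, scale. = FALSE) is an equivalent base-R shortcut
```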
Implementing Principal Components Analysis
We can now construct the principal components variables:

p = Γ′(xt − α)   (m×1)

This creates 9 new principal components variables. Each is a linear combination of our original 9 yield changes, and we can use them to recreate xt:

xt = α + Γp

The first variable now explains the largest proportion of the variance.
Examining the First Few Principal Components
By examining the first few factors, we get a sense of the key drivers of these yields
In particular, exploring the first few columns of Γ (the Loadings) helps us interpret/name these drivers
Examining the factors themselves gives a sense of how they have changed over time
Loadings on First Principal Component
[Figure: loadings of the first principal component on each Treasury maturity, 3 Month through 20 Year; all loadings are negative, between 0.0 and −0.4]
(Cumulative) First Principal Component: 2001-2005
[Figure: the cumulative first principal component, 2001–2005]
Constant Maturity Treasury Yields: 2001-2005
[Figure: the nine constant-maturity yield series, 2001–2005, for comparison]
First Principal Component: Interpretation
The first principal component measures level shifts in the yield curve. It is basically just a (weighted) average of all the yields:

p1t = −0.11 × CMT3Month − 0.16 × CMT6Month + ··· − 0.31 × CMT20Year

Maybe not surprising: they all move together. The highest weight is given to the middle (belly) of the yield curve.
Loadings on Second Principal Component
[Figure: loadings of the second principal component on each maturity, 3 Month through 20 Year; loadings range from about 0.4 to −0.4, with opposite signs at the short and long ends]
(Cumulative) Second Principal Component: 2001-2005
[Figure: the cumulative second principal component, 2001–2005]
Constant Maturity Treasury Yields: 2001-2005
[Figure: the nine constant-maturity yield series, 2001–2005, for comparison]
Second Principal Component: Interpretation
The second principal component measures the slope of the yield curve:
- The difference between yields on long and short maturities
- High when the yields are spread out, low when they are compressed
How much of the variation does each component explain?
Recall: the proportion of the total variance of xt that is explained by principal component i can be measured by:

λi / (λ1 + ··· + λm)

where λi is the eigenvalue associated with that principal component.
Proportion of Variance Explained by Each Component
[Figure: fraction of total variance explained by each of the nine principal components; the first bar is just under 0.85, the second close to 0.10, and the rest are negligible]
How much of the variation does each component explain?
The first principal component explains just under 85% of the variation in the data. The second explains almost another 10%. Together, these account for 95% of the variation.

Let's use these two components to predict all yields over time.
(Cumulative) First Two Principal Components: 2001-2005
[Figure: the cumulative first two principal components, 2001–2005]
Summarizing with the first two factors
Recall that we can split up

xt = α + Γp

rewriting it as:

xt = α + Γ1p1 + Γ2p2

Here:
- Γ1 is the first two columns of Γ
- p1 represents the first two principal components

Now imagine considering the following "summary" of xt:

x̂t = α + Γ1p1

We are forgetting about 7 of our principal components, but this might still do a decent job of capturing the patterns in the data.
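A sketch of this two-component summary in R, continuing from the pipeline above (Gamma, P, X, and X_tilde as computed earlier):

```r
K <- 2
Gamma1 <- Gamma[, 1:K]  # loadings on the first two components (9 x 2)
P1 <- P[, 1:K]          # first two principal components (T x 2)
alpha <- colMeans(X)    # mean yield change for each maturity

# Fitted yield changes using only two components: x-hat_t = alpha + Gamma1 p1
X_hat <- sweep(P1 %*% t(Gamma1), 2, alpha, "+")

# Share of total variation captured by the two-component summary
1 - sum((X_tilde - P1 %*% t(Gamma1))^2) / sum(X_tilde^2)
```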
Predicted Differences Using Two Principal Components
[Figure: predicted yield differences using the first two principal components]
Predicted Yields Using Two Principal Components
[Figure: predicted yields for all nine series (V1–V9) using the first two principal components, 2001–2005]
Summarizing With the First Two Components
We started with 9 variables, thousands of observations each. We used principal components analysis to summarize these with:
- Two variables
- Two columns (9×2) of loadings

We were able to capture (almost) all of the patterns in the data, with a nice interpretation of the key drivers of patterns in yields:
1. Level shifts in all yields
2. Slope of the yield curve
What Does the Third Principal Component Represent?

[Figure: loadings of the third principal component on each maturity, 3 Month through 20 Year; loadings range from about 0.50 to −0.50]
Implement PCA Yourself
On The Hub you will find yields1018.csv:
- Yields between January 2018 and December 2018

Conduct PCA (on differences) in this data. Three tasks to perform (a starter sketch in R follows below):
1. What fraction of variance in the data is explained by the first principal component?
2. Plot the yields over time: what characterizes the yields on front-end Treasuries in this period?
3. How do the loadings for the first principal component differ from the earlier data?
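A starting point in R for this exercise; the column layout of yields1018.csv (a date column followed by one column per maturity) is an assumption, so adjust to the actual file:

```r
yields <- read.csv("yields1018.csv")
Y <- as.matrix(yields[, -1])  # assumes the first column is a date

X <- apply(Y, 2, diff)        # yield changes
pr <- prcomp(X, center = TRUE, scale. = FALSE)

# 1. Fraction of variance explained by the first principal component
pr$sdev[1]^2 / sum(pr$sdev^2)

# 2. Plot the yield levels over time
matplot(Y, type = "l", lty = 1, ylab = "Yield")

# 3. Loadings on the first principal component (compare with 2001-2005)
pr$rotation[, 1]
```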
Today: Four Parts
1. Geometric Interpretation of Eigenvalues and Eigenvectors
2. Geometric Interpretation of Correlation Matrices
3. An Introduction to PCA
4. An Example of PCA