Principal Components Analysis
Chris Hansman
Empirical Finance: Methods and Applications Imperial College Business School
February 15-16
Today: Four Parts
1. Geometric Interpretation of Eigenvalues and Eigenvectors
2. Geometric Interpretation of Correlation Matrices
3. An Introduction to PCA
4. An Example of PCA
Topic 1: Geometry of Eigenvalues and Eigenvectors
1. Technical definitions of eigenvalues and eigenvectors
2. Geometry of matrix multiplication: rotate and stretch
3. Eigenvectors are only stretched
4. Length of eigenvectors doesn't matter
A Review of Eigenvalues and Eigenvectors
Consider a square n×n matrix A.

An eigenvalue λi of A is a scalar. The corresponding eigenvector v⃗i is an (n×1) vector, where λi and v⃗i satisfy:

A v⃗i = λi v⃗i
Geometric Interpretation of Eigenvalues and Eigenvectors
Consider the square n×n matrix A.
A times any (n×1) vector gives an (n×1) vector.

It is useful to think of this as a linear function that takes (n×1) vectors as inputs and gives (n×1) vectors as outputs:

f : Rⁿ → Rⁿ

Specifically, for the input vector v⃗, this is the function that outputs:

f(v⃗) = A v⃗
Geometric Interpretation of Eigenvalues and Eigenvectors
Consider the square n×n matrix A.
Think of this matrix as the function that maps vectors to vectors:
Let's say

A = [5 0; 2 3]  and  v⃗ = (2, 1)′

What is f(v⃗)? (menti.com)
The Matrix A Can be Thought of as a Function
Consider the square n×n matrix A.
Think of this matrix as the function that maps vectors to vectors:
Let's say

A = [5 0; 2 3]  and  v⃗ = (1, 0)′

What is f(v⃗)?

f(v⃗) = A v⃗ = (5, 2)′
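A quick check of this matrix-as-function idea in R (the language these slides use later for PCA); the matrix and vectors are the ones from the examples above:

```r
A <- matrix(c(5, 2, 0, 3), nrow = 2)  # rows (5 0) and (2 3)
f <- function(v) A %*% v              # f(v) = Av

f(c(1, 0))  # (5, 2)'
f(c(2, 1))  # (10, 7)' -- the earlier menti example
```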
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗ = (1, 0)′  ⇒  A v⃗ = (5, 2)′

[Figure: v⃗ = (1 0)′ and A v⃗ = (5 2)′ plotted in the plane]
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗ = (−1, 2)′  ⇒  A v⃗ = (−5, 4)′

[Figure: v⃗ = (−1 2)′ and A v⃗ = (−5 4)′ plotted in the plane]
The Matrix A Rotates and Stretches a Vector v⃗

Let's say

A = [5 0; 2 3]  and  v⃗2 = (0, 1)′  ⇒  A v⃗2 = (0, 3)′ = 3 v⃗2

[Figure: v⃗2 = (0 1)′ and A v⃗2 = (0 3)′ = 3 v⃗2 plotted in the plane]
For Some Vectors v⃗, Matrix A Only Stretches

Let's say

A = [5 0; 2 3]  and  v⃗2 = (0, 1)′  ⇒  A v⃗2 = (0, 3)′ = 3 v⃗2

Some vectors, like v⃗2 = (0, 1)′, have a special relationship with A: the matrix A only stretches v⃗2. No rotation!

Are there any other vectors like v⃗2 = (0, 1)′? Let's try v⃗1 = (1, 1)′.
For Some Vectors v⃗, Matrix A Only Stretches

Let's say

A = [5 0; 2 3]  and  v⃗1 = (1, 1)′  ⇒  A v⃗1 = (5, 5)′ = 5 v⃗1

[Figure: v⃗1 = (1 1)′ and A v⃗1 = (5 5)′ = 5 v⃗1 plotted in the plane]
For Some Vectors v⃗, Matrix A Only Stretches

For the matrix A, we've found two vectors with this special property:

v⃗1 = (1, 1)′  with  A v⃗1 = (5, 5)′ = 5 v⃗1
v⃗2 = (0, 1)′  with  A v⃗2 = (0, 3)′ = 3 v⃗2

We call these vectors eigenvectors of the matrix A.

Note that they get stretched by different factors: 5 for v⃗1, 3 for v⃗2. We call these stretching factors eigenvalues:

λ1 = 5,  λ2 = 3
Defining Eigenvalues and Eigenvectors
This notion of only stretching is the defining feature of eigenvalues and eigenvectors.

Eigenvalue λi and corresponding eigenvector v⃗i are λi, v⃗i such that:

A v⃗i = λi v⃗i

In our example:

A v⃗1 = λ1 v⃗1:  [5 0; 2 3] (1, 1)′ = 5 (1, 1)′

And:

A v⃗2 = λ2 v⃗2:  [5 0; 2 3] (0, 1)′ = 3 (0, 1)′
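A quick check of this example in R; note that eigen() returns eigenvectors scaled to unit length, so they come back as rescaled versions of (1, 1)′ and (0, 1)′:

```r
A <- matrix(c(5, 2, 0, 3), nrow = 2)  # rows (5 0) and (2 3)

e <- eigen(A)
e$values   # 5 and 3
e$vectors  # columns: (0.707, 0.707)' and (0, 1)' -- unit-length (1,1)' and (0,1)'
```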
Length of Eigenvector Doesn't Change Anything

Imagine multiplying an eigenvector by some constant (e.g. 1/2):

A = [5 0; 2 3]  and  v⃗1 = (0.5, 0.5)′  ⇒  A v⃗1 = (2.5, 2.5)′ = 5 v⃗1

[Figure: v⃗1 = (0.5 0.5)′ and A v⃗1 = (2.5 2.5)′ = 5 v⃗1 plotted in the plane]
Length of Eigenvector Doesn't Change Anything

Imagine multiplying an eigenvector by some constant (e.g. 2):

A = [5 0; 2 3]  and  v⃗2 = (0, 2)′  ⇒  A v⃗2 = (0, 6)′ = 3 v⃗2

[Figure: v⃗2 = (0 2)′ and A v⃗2 = (0 6)′ = 3 v⃗2 plotted in the plane]
Length of Eigenvector Doesn’t Change Anything
Any multiple of an eigenvector is also an eigenvector: if v⃗i is an eigenvector, so is c v⃗i for any scalar c.

As a result, we often normalize them so that they have unit length, i.e. v⃗i′ v⃗i = 1.

Best to think of an eigenvector v⃗i as a direction, and of the eigenvalue λi as a stretching factor.
Finding Eigenvalues of Symmetric Matrices

From here, we focus on symmetric matrices (like the covariance Σx). How do we calculate the eigenvalues?

Use a computer. But if you have to, in the 2×2 case, for

A = [a b; b d]

λ1 = [(a+d) + √((a−d)² + 4b²)] / 2
λ2 = [(a+d) − √((a−d)² + 4b²)] / 2

What are the eigenvalues of:

A = [7 0; 0 2]

(menti.com)
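A short R sketch of this 2×2 formula, checked against eigen():

```r
# Eigenvalues of a symmetric 2x2 matrix [a b; b d] via the closed form
eig2x2 <- function(a, b, d) {
  root <- sqrt((a - d)^2 + 4 * b^2)
  c((a + d + root) / 2, (a + d - root) / 2)
}

eig2x2(2, 1, 2)                            # 3 and 1 -- the correlated example used later
eigen(matrix(c(2, 1, 1, 2), 2, 2))$values  # same answer from eigen()
```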
Finding Eigenvectors of Symmetric Matrices

From here, we will focus on symmetric matrices (like Σx).

Given the eigenvalues, how do we calculate the eigenvectors? Again, use a computer. But if you have to, simply solve:

A v⃗i = λi v⃗i

Important note: Symmetric matrices have orthogonal eigenvectors, that is:

v⃗i′ v⃗j = 0  for any i ≠ j
Finding Eigenvalues of Diagonal Matrices
Diagonal matrices are a subset of symmetric matrices:

A = [a 0; 0 d]

How do we calculate the eigenvalues and eigenvectors? Here we can read them off directly: the eigenvalues are the diagonal entries a and d, with eigenvectors (1, 0)′ and (0, 1)′.
Topic 2: Geometric Interpretations of Correlation Matrices
1. Uncorrelated assets: eigenvalues are variances
2. Correlated assets: first eigenvector finds direction of maximum variance
Uncorrelated Standardized Data: z = (za, zb)′

[Figure: scatter of the standardized data, z1 against z2, both axes running from −10 to 10]

Cov(z) = Σz = [1 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, both axes running from −10 to 10]

Cov(x) = Σx = [4 0; 0 4]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, with eigenvectors V1 (along the x-axis) and V2 (along the y-axis) drawn in]

Cov(x) = Σx = [3 0; 0 1]
Eigenvalues of Σx with Uncorrelated Data

Σx = [3 0; 0 1]

What are the eigenvalues and eigenvectors of Σx?

Uncorrelated assets: the eigenvalues are the variances of each asset return! Eigenvectors:

v⃗1 = (1, 0)′,  v⃗2 = (0, 1)′

The first eigenvector points in the direction of the largest variance. We sometimes write the eigenvectors together as a matrix:

Γ = (v⃗1 v⃗2) = [1 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: the same scatter, with V1 drawn with length equal to its eigenvalue: ||V1|| = λ1 = 3]

Cov(x) = Σx = [3 0; 0 1]
Uncorrelated (Non-Standardized) Data: x = (xa, xb)′

[Figure: scatter of xb against xa, now more spread out along the y-axis]

Cov(x) = Σx = [1 0; 0 3]
Eigenvalues of Σx with Uncorrelated Data

Σx = [1 0; 0 3]

What are the eigenvalues and eigenvectors of Σx?

With uncorrelated assets the eigenvalues are just the variances of each asset return! Eigenvectors:

v⃗1 = (0, 1)′,  v⃗2 = (1, 0)′

Note that the first eigenvector points in the direction of the largest variance. We sometimes write the eigenvectors together as a matrix:

Γ = (v⃗1 v⃗2) = [0 1; 1 0]
Correlated Data: x = (xa, xb)′

[Figure: scatter of xb against xa, tilted along the 45° line]

Cov(x) = Σx = [2 1; 1 2]
Eigenvalues of Σx with Correlated Data

Σx = [2 1; 1 2]

What are the eigenvalues and eigenvectors of Σx? With correlated assets the eigenvalues are a bit trickier. The eigenvalues are 3 and 1.

Which of the following is not an eigenvector of Σx?

r⃗ = (1/√2, 1/√2)′,  w⃗ = (−1/√2, 1/√2)′,  s⃗ = (2/√2, 1/√2)′

(menti.com)
Correlated Data: x = (xa, xb)′

[Figure: scatter of xb against xa with eigenvectors V1 = (1 1)′ and V2 = (−1 1)′ drawn along and across the 45° line]

Cov(x) = Σx = [2 1; 1 2]

The eigenvectors are Γ = (v⃗1 v⃗2) = [1/√2 −1/√2; 1/√2 1/√2]
Eigenvectors of Σx with Correlated Data

Σx = [2 1; 1 2]

Γ = (v⃗1 v⃗2) = [1/√2 −1/√2; 1/√2 1/√2]

Just as with uncorrelated data, the first eigenvector finds the direction with the most variability. The second eigenvector points in the direction that explains the maximum amount of the remaining variance.

Note that the two are perpendicular. This is the geometric implication of the fact that they are orthogonal:

v⃗i′ v⃗j = 0

The fact that they are orthogonal (and normalized to unit length) also implies:

Γ′ = Γ⁻¹
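A one-line numerical check of this in R: eigen() returns unit-length eigenvectors, so the eigenvector matrix satisfies Γ′Γ = I.

```r
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
Gamma <- eigen(Sigma_x)$vectors  # columns are unit-length eigenvectors

round(t(Gamma) %*% Gamma, 10)    # identity matrix, so Gamma' = Gamma^{-1}
```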
Eigenvalues of Σx with Correlated Data

Σx = [2 1; 1 2],  λ1 = 3,  λ2 = 1

The eigenvalues are the same as for our uncorrelated data. Note that the scatter plot looks quite similar to our uncorrelated data, just rotated a bit.

Imagine rotating the data so that the first eigenvector is lined up with the x-axis:
- The first eigenvalue is the variance (along the x-axis) of this rotated data.
- The second eigenvalue is the variance along the y-axis.
Eigenvalues Represent Variance along the Eigenvectors

[Figure: scatter of xb against xa with V1 = (1 1)′ and V2 = (−1 1)′ drawn in; a second plot shows the same data rotated so that V1 lies along the x-axis]

Cov(x) = Σx = [2 1; 1 2]
What is This Rotation?

So with a little rotation, we take our data drawn from x = (xa, xb)′ with

Σx = [2 1; 1 2]

and get back what looks like our uncorrelated data, which was generated by

Σ̃ = [3 0; 0 1]

How do we rotate x into this uncorrelated data? Γ′x
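A small R sketch of this rotation on simulated data (assuming the MASS package is available for drawing correlated normals): rotating by Γ′ produces data whose sample covariance is close to diag(3, 1).

```r
library(MASS)  # for mvrnorm

set.seed(1)
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
x <- mvrnorm(n = 10000, mu = c(0, 0), Sigma = Sigma_x)  # 10000 draws, one per row

Gamma <- eigen(Sigma_x)$vectors  # columns are v1, v2
rotated <- x %*% Gamma           # each row is (Gamma' x_t)'

round(cov(rotated), 2)           # approximately [3 0; 0 1]
```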
Topic 3: Introduction to Principal Components Analysis
Principal Components Analysis
This notion of rotation underlies the concept of Principal Components Analysis
Consider a general cross-section of returns on m assets, xt (m×1):

E[xt] = α,  Cov(xt) = Σx
Principal Components Analysis
xt (m×1),  E[xt] = α,  Cov(xt) = Σx

Define the normalized asset returns: x̃t = xt − α

Let the eigenvalues of Σx be given by:

λ1 ≥ λ2 ≥ λ3 ≥ ··· ≥ λm

Let the eigenvectors be given by:

v⃗1, v⃗2, v⃗3, ..., v⃗m
Principal Components Analysis
Cov(xt) = Σx

Note that the eigenvectors are orthogonal: v⃗i′ v⃗j = 0

Because the scaling doesn't matter, we can normalize: v⃗i′ v⃗i = 1

These scaled, normalized vectors are called orthonormal.

As before, let Γ be the matrix with eigenvectors as columns:

Γ = [v⃗1 v⃗2 ··· v⃗m]
Principal Components Analysis
Define the principal components variables as the rotation:

p = Γ′ x̃t   (m×1),  with  E[p] = 0

Or written out further:

p = ( v⃗1′(xt − α), v⃗2′(xt − α), ..., v⃗m′(xt − α) )′
Principal Components Analysis
Recall the eigendecomposition:

Σx = ΓΛΓ′

where Λ is the diagonal matrix of eigenvalues:

Λ = diag(λ1, ..., λm)

Hence:

Cov(p) = Cov(Γ′x̃t) = Γ′ Cov(x̃t) Γ = Γ′ΓΛΓ′Γ = Λ
Aside: Variance Decomposition
A nice result from linear algebra:

var(x1t) + ··· + var(xmt) = λ1 + ··· + λm

So the proportion of the total variance of xt that is explained by the i-th principal component (with eigenvalue λi) is simply:

λi / (λ1 + ··· + λm)
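This is the trace identity tr(Σx) = ∑λi; a quick R check using the covariance matrix from the earlier example:

```r
Sigma_x <- matrix(c(2, 1, 1, 2), 2, 2)
lambda <- eigen(Sigma_x)$values

sum(diag(Sigma_x))    # total variance: 4
sum(lambda)           # sum of eigenvalues: also 4
lambda / sum(lambda)  # share of variance explained: 0.75 and 0.25
```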
Principal Components Analysis
Our principal components variables provide a transformation of the data into variables that are:
- Uncorrelated (orthogonal)
- Ordered by how much of the total variance they explain (size of eigenvalue)

What if we have many assets m, but the first few (2, 5, 20) principal components explain most of the variation?

Idea: Use these as "factors". Dimension reduction!
Principal Components Analysis
Note that because Γ′ = Γ⁻¹:

xt = α + Γp

We can also partition Γ into the first K < m eigenvectors and the remaining m − K:

Γ = [Γ1 Γ2]

Partition p into its first K elements and the remaining m − K:

p = (p1′, p2′)′

We can then write:

xt = α + Γ1p1 + Γ2p2
Principal Components Analysis
xt = α + Γ1p1 + Γ2p2

This looks just like a factor model:

xt = α + B ft + εt

with:

B = Γ1,  ft = p1,  εt = Γ2p2

One minor difference:

Cov(εt) = Ψ = Γ2Λ2Γ2′

which is (likely) not diagonal, as assumed in factor analysis.
Implementing Principal Components Analysis
xt = α + Γ1p1 + Γ2p2

Recall the sample covariance matrix:

Σ̂x = (1/T) X̃X̃′

Calculate this, and perform the eigendecomposition (using a computer):

Σ̂x = ΓΛΓ′

We now have everything we need to compute the sample principal components at each t:

P = [p1 p2 ··· pT] = Γ′X̃   (m×T)
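A minimal R sketch of these steps. It assumes X is a T×m matrix of returns with one row per period (the transpose of the m×T X̃ on the slide), which is the usual layout in R:

```r
# X: T x m matrix of asset returns, one row per period (assumed given)
X_tilde <- scale(X, center = TRUE, scale = FALSE)  # demean each column

Sigma_hat <- cov(X_tilde)  # note: cov() uses 1/(T-1) rather than 1/T;
                           # the eigenvectors are unaffected
ed <- eigen(Sigma_hat)
Gamma <- ed$vectors        # m x m matrix of eigenvectors
lambda <- ed$values        # eigenvalues, largest first

P <- X_tilde %*% Gamma     # T x m: column j is the j-th principal component
```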
An Example of Principal Components Analysis
1. Principal Components on Yield Changes for US Treasuries
2. Most Patterns Explained by 2 Principal Components
Recall: Implementing Principal Components Analysis
In each period t, we see a vector of asset returns xt. We'll consider the covariance of asset returns:

Cov(xt) = Σx

Today, we'll consider the yields on US Treasury debt:
- Yields on T-bills, notes and bonds
- Constant Maturity Treasuries from 3 months to 20 years
The Yield Curve: May 24th, 2001
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
The Yield Curve: January 6th, 2003
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
The Yield Curve: January 4th, 2005
[Figure: yields (0–6%) plotted against maturity in months (3, 6, 12, 24, 36, 60, 84, 120, 240)]
Constant Maturity Treasury Yields: 2001-2005
[Figure: time series of the nine constant-maturity yields (DGS3MO, DGS6MO, DGS1, DGS2, DGS3, DGS5, DGS7, DGS10, DGS20), 2001–2005]
Implementing Principal Components Analysis
What patterns do you see in the data? What two or three things characterize the data in a given period?

Idea behind principal components:
- Can we choose a small number of variables that characterize most of the variation in the data?
- Do these variables have an intuitive meaning that helps us understand the data itself?
Technical Point: We Will Actually Examine Differences
[Figure: the nine constant-maturity yield series in levels, 2001–2005]

xnt = Ynt − Yn,t−1
Technical Point: We Will Actually Examine Differences
[Figure: daily yield changes for the nine series, 2001–2005, ranging roughly from −0.4 to 0.2]

xnt = Ynt − Yn,t−1
Setting Up the Basics

xt is the cross-section of yield changes (at time t):

xt = ( Y3 month,t − Y3 month,t−1,  Y6 month,t − Y6 month,t−1,  ...,  Y20 year,t − Y20 year,t−1 )′

We denote the mean by α = E[xt], and will work with demeaned values: x̃t = xt − α
Implementing Principal Components Analysis
First step: compute the sample covariance matrix Σ̂x of x̃t.

Below is the (slightly more intuitive) correlation matrix, with maturities ordered 3M, 6M, 1Y, 2Y, 3Y, 5Y, 7Y, 10Y, 20Y:

1.000 0.815 0.641 0.480 0.443 0.395 0.341 0.312 0.236
0.815 1.000 0.863 0.696 0.651 0.592 0.539 0.499 0.411
0.641 0.863 1.000 0.874 0.833 0.779 0.733 0.694 0.610
0.480 0.696 0.874 1.000 0.972 0.919 0.882 0.842 0.761
0.443 0.651 0.833 0.972 1.000 0.955 0.922 0.888 0.812
0.395 0.592 0.779 0.919 0.955 1.000 0.974 0.953 0.894
0.341 0.539 0.733 0.882 0.922 0.974 1.000 0.979 0.942
0.312 0.499 0.694 0.842 0.888 0.953 0.979 1.000 0.960
0.236 0.411 0.610 0.761 0.812 0.894 0.942 0.960 1.000
Implementing Principal Components Analysis
With Σ̂x in hand, we can perform the eigendecomposition:

Σ̂x = ΓΛΓ′

where Λ is the diagonal matrix of eigenvalues:

Λ = diag(λ1, ..., λm)

and Γ is the orthogonal matrix of eigenvectors.

This is easy in R: eigen(Sigma_x)
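Putting the whole pipeline together in R. This is a sketch: it assumes a data frame yields with one row per day and one column per maturity (e.g. the FRED-style codes in the legend above); the object names are illustrative.

```r
# yields: data frame of daily constant-maturity yields, one column per maturity
X <- apply(as.matrix(yields), 2, diff)             # yield changes x_nt = Y_nt - Y_n,t-1
X_tilde <- scale(X, center = TRUE, scale = FALSE)  # demean

Sigma_hat <- cov(X_tilde)      # sample covariance of the changes
ed <- eigen(Sigma_hat)         # eigendecomposition: Gamma and Lambda
Gamma <- ed$vectors
lambda <- ed$values

P <- X_tilde %*% Gamma         # principal components, one column per component
lambda / sum(lambda)           # share of variance explained by each component

# prcomp(X, center = TRUE, scale. = FALSE) is an equivalent base-R shortcut
```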
Implementing Principal Components Analysis
We can now construct the principal components variables:

p = Γ′(xt − α)   (m×1)

This creates 9 new principal components variables. Each is a linear combination of our original 9 yield changes, and we can use them to recreate xt:

xt = α + Γp

The first variable now explains the largest proportion of the variance.
Examining the First Few Principal Components
By examining the first few factors, we get a sense of the key drivers of these yields
In particular, exploring the first few columns of Γ (the Loadings) helps us interpret/name these drivers
Examining the factors themselves gives a sense of how they have changed over time
Loadings on First Principal Component
[Figure: loadings of the first principal component on each Treasury maturity, 3 Month through 20 Year; all loadings are negative, between 0.0 and −0.4]
(Cumulative) First Principal Component: 2001-2005
[Figure: the cumulative first principal component, 2001–2005]
Constant Maturity Treasury Yields: 2001-2005
[Figure: the nine constant-maturity yield series, 2001–2005, for comparison]
First Principal Component: Interpretation
The first principal component measures level shifts in the yield curve. It is basically just a (weighted) average of all the yields:

p1t = −0.11 × CMT3Month − 0.16 × CMT6Month + ··· − 0.31 × CMT20Year

Maybe not surprising: they all move together. The highest weight is given to the middle (belly) of the yield curve.
Loadings on Second Principal Component
[Figure: loadings of the second principal component on each maturity, 3 Month through 20 Year; loadings range from about 0.4 to −0.4, with opposite signs at the short and long ends]
(Cumulative) Second Principal Component: 2001-2005
[Figure: the cumulative second principal component, 2001–2005]
Constant Maturity Treasury Yields: 2001-2005
[Figure: the nine constant-maturity yield series, 2001–2005, for comparison]
Second Principal Component: Interpretation
The second principal component measures the slope of the yield curve:
- The difference between yields on long and short maturities
- High when the yields are spread out, low when they are compressed
How much of the variation does each component explain?
Recall: the proportion of the total variance of xt that is explained by principal component i can be measured by:

λi / (λ1 + ··· + λm)

where λi is the eigenvalue associated with that principal component.
Proportion of Variance Explained by Each Component
[Figure: fraction of total variance explained by each of the nine principal components; the first bar is just under 0.85, the second close to 0.10, and the rest are negligible]
How much of the variation does each component explain?
The first principal component explains just under 85% of the variation in the data. The second explains almost another 10%. Together, these account for 95% of the variation.

Let's use these two components to predict all yields over time.
(Cumulative) First Two Principal Components: 2001-2005
[Figure: the cumulative first two principal components, 2001–2005]
Summarizing with the first two factors
Recall that we can split up

xt = α + Γp

rewriting it as:

xt = α + Γ1p1 + Γ2p2

Here:
- Γ1 is the first two columns of Γ
- p1 represents the first two principal components

Now imagine considering the following "summary" of xt:

x̂t = α + Γ1p1

We are forgetting about 7 of our principal components, but this might still do a decent job of capturing the patterns in the data.
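A sketch of this two-component summary in R, continuing from the pipeline above (Gamma, P, X, and X_tilde as computed earlier):

```r
K <- 2
Gamma1 <- Gamma[, 1:K]  # loadings on the first two components (9 x 2)
P1 <- P[, 1:K]          # first two principal components (T x 2)
alpha <- colMeans(X)    # mean yield change for each maturity

# Fitted yield changes using only two components: x-hat_t = alpha + Gamma1 p1
X_hat <- sweep(P1 %*% t(Gamma1), 2, alpha, "+")

# Share of total variation captured by the two-component summary
1 - sum((X_tilde - P1 %*% t(Gamma1))^2) / sum(X_tilde^2)
```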
Predicted Differences Using Two Principal Components
[Figure: predicted yield differences using the first two principal components]
Predicted Yields Using Two Principal Components
[Figure: predicted yields for all nine series (V1–V9) using the first two principal components, 2001–2005]
Summarizing With the First Two Components
We started with 9 variables, thousands of observations each. We used principal components analysis to summarize these with:
- Two variables
- Two columns (9×2) of loadings

We were able to capture (almost) all of the patterns in the data, with a nice interpretation of the key drivers of patterns in yields:
1. Level shifts in all yields
2. Slope of the yield curve
What Does the Third Principal Component Represent?

[Figure: loadings of the third principal component on each maturity, 3 Month through 20 Year; loadings range from about 0.50 to −0.50]
Implement PCA Yourself
On The Hub you will find yields1018.csv:
- Yields between January 2018 and December 2018

Conduct PCA (on differences) in this data. Three tasks to perform (a starter sketch in R follows below):
1. What fraction of variance in the data is explained by the first principal component?
2. Plot the yields over time: what characterizes the yields on front-end Treasuries in this period?
3. How do the loadings for the first principal component differ from the earlier data?
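A starting point in R for this exercise; the column layout of yields1018.csv (a date column followed by one column per maturity) is an assumption, so adjust to the actual file:

```r
yields <- read.csv("yields1018.csv")
Y <- as.matrix(yields[, -1])  # assumes the first column is a date

X <- apply(Y, 2, diff)        # yield changes
pr <- prcomp(X, center = TRUE, scale. = FALSE)

# 1. Fraction of variance explained by the first principal component
pr$sdev[1]^2 / sum(pr$sdev^2)

# 2. Plot the yield levels over time
matplot(Y, type = "l", lty = 1, ylab = "Yield")

# 3. Loadings on the first principal component (compare with 2001-2005)
pr$rotation[, 1]
```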
Today: Four Parts
1. Geometric Interpretation of Eigenvalues and Eigenvectors
2. Geometric Interpretation of Correlation Matrices
3. An Introduction to PCA
4. An Example of PCA