Principal Components Analysis: Part 2
Hamid Dehghani School of Computer Science Birmingham
April 2021

PCA
• What is a principal component?
– By finding the eigenvalues and eigenvectors of the covariance matrix, we find that the eigenvectors with the largest eigenvalues correspond to the dimensions that have the strongest correlation in the dataset.
• PCA is a useful statistical technique that has found application in:
– fields such as face recognition and image compression
– finding patterns in data of high dimension.

Basic Theory
• Let xi, i = 1, …, n, be a set of n N ⨉ 1 vectors

Basic Theory
• Let X be the [N ⨉ n] matrix whose columns are the mean-subtracted vectors xi − x̄, where x̄ is the mean of the xi
• Subtracting the mean is equivalent to translating the coordinate system to the location of the mean.

Basic Theory
• Let Q = XXᵀ be the [N ⨉ N] matrix. Then:
• Q is square
• Q is symmetric
• Q is the scatter matrix (the sample covariance matrix up to a factor of 1/(n − 1))
• Q can be very large (in vision, N is often the number of pixels in an image!); a small sketch of forming X and Q follows
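As a rough illustration, a minimal NumPy sketch of forming the mean-subtracted matrix X and Q = XXᵀ (the array data, its shape, and all variable names are assumptions made for this sketch, not part of the slides):

import numpy as np

# Illustrative data only: n = 20 points, each an N x 1 vector with N = 5.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 20))

x_bar = data.mean(axis=1, keepdims=True)   # mean vector x_bar, shape N x 1
X = data - x_bar                           # columns are x_i - x_bar, shape N x n
Q = X @ X.T                                # N x N, square and symmetric
print(Q.shape)                             # (5, 5)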

PCA Theorem
• Each xj can be written as:
  xj = x̄ + gj1 e1 + gj2 e2 + … + gjn en
• where ei are the n eigenvectors of Q with non-zero eigenvalues
• Remember
– The eigenvectors e1, e2, …, en span an eigenspace
– e1, e2, …, en are N ⨉ 1 orthonormal vectors (directions in N-dimensional space)
– The scalars gji are the coordinates of xj in that space: gji = ei · (xj − x̄) (see the sketch below)
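A minimal sketch of the theorem (same illustrative setup as above; nothing here is the slides' own code):

import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(5, 20))                 # N = 5 dimensions, n = 20 points
x_bar = data.mean(axis=1, keepdims=True)
X = data - x_bar                                # columns are x_j - x_bar
Q = X @ X.T

eigvals, E = np.linalg.eigh(Q)                  # Q is symmetric, so eigh applies
order = np.argsort(eigvals)[::-1]               # largest eigenvalue first
eigvals, E = eigvals[order], E[:, order]        # columns of E are e_1, e_2, ...

G = E.T @ X                                     # G[i, j] = g_ji = e_i . (x_j - x_bar)
assert np.allclose(E @ G, X)                    # each x_j - x_bar = sum_i g_ji e_i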

Using PCA to Compress Data
• Expressing x in terms of e1 e2 … en has not changed the size of the data
• BUT:
– if the points are highly correlated, many of the coordinates of x will be zero or close to zero.
– This means they lie in a lower-dimensional linear subspace
• Sort the eigenvectors ei according to their eigenvalues: λ1 ≥ λ2 ≥ … ≥ λn
• Assuming that: λi ≈ 0 for i > k
• Then: xj ≈ x̄ + gj1 e1 + … + gjk ek, so only k coordinates per point need to be kept (a short sketch follows)
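A compression sketch under the same illustrative setup (k = 2 is an arbitrary choice for the example):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
X -= X.mean(axis=1, keepdims=True)              # mean-adjusted data

eigvals, E = np.linalg.eigh(X @ X.T)
order = np.argsort(eigvals)[::-1]               # sort eigenvectors by eigenvalue
E = E[:, order]

k = 2
E_k = E[:, :k]                                  # keep only the first k eigenvectors
G_k = E_k.T @ X                                 # k coordinates per point instead of N
X_approx = E_k @ G_k                            # rank-k reconstruction
print(np.linalg.norm(X - X_approx))             # small when the dropped eigenvalues are ~ 0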

Example – STEP 1
• Take the data and subtract the mean of each dimension, so that the mean-adjusted data has zero mean

Example – STEP 2
• Calculate the covariance matrix
• Since the off-diagonal elements of this covariance matrix are positive, we should expect the x and y variables to increase together.
• Calculate the eigenvectors and eigenvalues of the covariance matrix (sketched below)
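A small 2-D sketch of this step (the numbers are generated here for illustration and are not the actual example data from the slides):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10)
y = 0.8 * x + 0.3 * rng.normal(size=10)         # y loosely follows x, so they co-vary

data = np.vstack([x - x.mean(), y - y.mean()])  # 2 x 10, mean-adjusted
cov = np.cov(data)                              # 2 x 2 covariance matrix
print(cov)                                      # positive off-diagonal elements

eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues returned in ascending order
print(eigvals)
print(eigvecs)                                  # columns are the orthonormal eigenvectors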

Example – STEP 3
• The eigenvectors are plotted as diagonal dotted lines on the plot.
• Note that they are perpendicular to each other.
• Note that one of the eigenvectors goes through the middle of the points, like a line of best fit.
• The second eigenvector captures the other, less important pattern in the data: the points all follow the main line, but are offset from it by some amount.

Example – STEP 4
• Feature Vector
– Feature Vector = (eig1 eig2 eig3 … eign)
• We can either form a feature vector with both of the eigenvectors,
• or we can leave out the smaller, less significant component and keep only a single column (see the sketch below).
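A sketch of forming the feature vector from the 2-D example above (again illustrative only):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10)
y = 0.8 * x + 0.3 * rng.normal(size=10)
data = np.vstack([x - x.mean(), y - y.mean()])

eigvals, eigvecs = np.linalg.eigh(np.cov(data))
order = np.argsort(eigvals)[::-1]               # most significant eigenvector first

feature_both = eigvecs[:, order]                # 2 x 2: keep both eigenvectors
feature_one = eigvecs[:, order[:1]]             # 2 x 1: keep only the most significant one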

Example – STEP 5
• Deriving the new data coordinates
– FinalData = RowFeatureVector ⨉ RowZeroMeanData
• RowFeatureVector is the matrix with the chosen eigenvectors in its rows, most significant eigenvector first
• RowZeroMeanData is the mean-adjusted data, i.e. the data items are in each column, with each row holding a separate dimension
• Note: this is essentially rotating the coordinate axes so that higher-variance axes come first (sketched below)
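A projection sketch for this step, continuing the same illustrative 2-D data (the variable names mirror the slide's RowFeatureVector / RowZeroMeanData and are not library APIs):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10)
y = 0.8 * x + 0.3 * rng.normal(size=10)
row_zero_mean_data = np.vstack([x - x.mean(), y - y.mean()])   # 2 x 10

eigvals, eigvecs = np.linalg.eigh(np.cov(row_zero_mean_data))
order = np.argsort(eigvals)[::-1]
row_feature_vector = eigvecs[:, order].T        # eigenvectors as rows, most significant first

final_data = row_feature_vector @ row_zero_mean_data           # 2 x 10 data in the new axes
# Keeping only the first row of row_feature_vector would give 1-D compressed data.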

Implementing PCA
• Finding the first k eigenvectors of Q:
• Q is N ⨉ N (N could be the number of pixels in an image. For a 256 x 256 image, N = 65536 !!)
• Don’t want to explicitly compute Q!!!!

Singular Value Decomposition (SVD)
• Any m ⨉ n matrix X can be written as the product of 3 matrices: X = U D Vᵀ
• where:
– U is m ⨉ m and its columns are orthonormal vectors
– V is n ⨉ n and its columns are orthonormal vectors
– D is m ⨉ n and diagonal; its diagonal elements are called the singular values of X, and are ordered so that:
– s1 ≥ s2 ≥ s3 ≥ … ≥ 0

SVD Properties
• The columns of U are the eigenvectors of XXᵀ
• The columns of V are the eigenvectors of XᵀX
• The squares of the diagonal elements of D (the singular values) are the non-zero eigenvalues of both XXᵀ and XᵀX (see the sketch below)
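A sketch of using the SVD to get the first k eigenvectors of Q = XXᵀ without ever forming the N ⨉ N matrix (the sizes here are illustrative only):

import numpy as np

rng = np.random.default_rng(2)
N, n = 4096, 50                                 # e.g. many pixels, few images
X = rng.normal(size=(N, n))
X -= X.mean(axis=1, keepdims=True)              # mean-adjust the columns

U, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = U diag(s) Vt
# Columns of U: eigenvectors of X Xᵀ; rows of Vt (columns of V): eigenvectors of Xᵀ X.
# The squared singular values s**2 are the shared non-zero eigenvalues.
k = 10
first_k_eigenvectors = U[:, :k]                 # N x k, the sought eigenvectors of Q
top_eigenvalues = s[:k] ** 2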