STA 5106: Homework Assignment #4
1. PCA and Images: Consider the problem of analyzing images. Each (gray-scale) image can be thought of as a matrix of numbers, say I ∈ R^{m1×m2}. We can rewrite this matrix as a long vector X ∈ R^{m1 m2}. Setting n = m1 m2, we want to use PCA to reduce the dimension from n to d. For the data file provided to you on the website, perform PCA and present the following results:
- (a) Show images of the first three principal directions of the data. That is, take the vectors U1, U2, and U3 and display them as images. (Use the commands below to form images from vectors.)
- (b) Take the first image in the data, and show its projection onto the principal subspace for d = 50 and d = 100. The projection of the first image onto the first d principal components is
\[ \sum_{i=1}^{d} \big( X(1,:) \cdot U_i \big)\, U_i . \]
Load the data file using “load hw4_1_data”. This will give you a 200 × 644 matrix in which each row is the vector form of an image with m1 = 28 and m2 = 23, so there are 200 images in this dataset.
For a vector v of length 644, you can form and display it as an image using:
I = reshape(v, 28, 23);
imagesc(I); colormap(gray); axis equal;
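As a point of reference, here is a minimal Matlab sketch of one way to carry out parts (a) and (b). It assumes the loaded 200 × 644 data matrix is named X, uses svd to obtain the principal directions, and follows the projection formula in part (b) literally; whether to mean-center the data first should follow the convention used in class.

load hw4_1_data                          % assumed to give the 200 x 644 data matrix X
% (If the convention in class centers the data, subtract mean(X,1) from every row first.)
[U, S, V] = svd(X', 'econ');             % columns U(:,i) are the principal directions U_i

% (a) display the first three principal directions as 28 x 23 images
for i = 1:3
    subplot(1, 3, i);
    imagesc(reshape(U(:,i), 28, 23)); colormap(gray); axis equal;
end

% (b) projection of the first image onto the first d principal components
for d = [50 100]
    proj = (X(1,:) * U(:,1:d)) * U(:,1:d)';      % 1 x 644 reconstruction
    figure; imagesc(reshape(proj, 28, 23)); colormap(gray); axis equal;
end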
2. LDA: Consider a labeled data set X with the following properties: there are m = 5 classes, each class has k = 10 observations, and each observation is a vector of size n = 3. Therefore, X can be thought of as a three-dimensional array with dimensions 3 × 5 × 10. In Matlab, X(:, i, j) denotes the jth observation vector of the ith class.
Given this data, perform linear discriminant analysis for d = 1, and find the projection U ∈ R^{n×d} that is optimal for separating the observed classes. You can use the eig function in Matlab to perform the generalized eigendecomposition. For the resulting U:
(a) Plot the original data using the command plot3.
(b) State U.
(c) Project the data X onto U to obtain Z, and plot the observations of Z.
Download X (file “hw4_2_data”) from the Blackboard website.
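A rough Matlab sketch of one way to set up the scatter matrices and the generalized eigenproblem is given below; it assumes the loaded array is named X with dimensions 3 × 5 × 10, and the names SW, SB, U, and Z are illustrative only.

load hw4_2_data                      % assumed to give the 3 x 5 x 10 array X
[n, m, k] = size(X);                 % n = 3, m = 5 classes, k = 10 observations per class

classMean = squeeze(mean(X, 3));     % n x m matrix of class means
grandMean = mean(classMean, 2);      % overall mean (classes have equal size)

SW = zeros(n); SB = zeros(n);        % within- and between-class scatter matrices
for i = 1:m
    for j = 1:k
        dw = X(:,i,j) - classMean(:,i);
        SW = SW + dw * dw';
    end
    db = classMean(:,i) - grandMean;
    SB = SB + k * (db * db');
end

% (b) generalized eigendecomposition SB*u = lambda*SW*u; the eigenvector
%     with the largest eigenvalue is the optimal direction U for d = 1
[V, D] = eig(SB, SW);
[~, idx] = max(diag(D));
U = V(:, idx);

% (a) plot the original three-dimensional data, class by class
figure; hold on;
for i = 1:m
    plot3(squeeze(X(1,i,:)), squeeze(X(2,i,:)), squeeze(X(3,i,:)), 'o');
end
grid on; hold off;

% (c) project X onto U to obtain Z (one row per class) and plot it
Z = zeros(m, k);
for i = 1:m
    Z(i,:) = U' * squeeze(X(:,i,:));
end
figure; plot(Z', 'o');               % each column of Z' holds the 10 projections of one class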
3. LLE: In the LLE framework, we minimize the reconstruction error for a point \vec{X} with K neighbors \vec{\eta}_j, subject to the constraint \sum_{j=1}^{K} W_j = 1:
\[ \Big\| \vec{X} - \sum_{j=1}^{K} W_j \vec{\eta}_j \Big\|^2 = \Big\| \sum_{j=1}^{K} W_j ( \vec{X} - \vec{\eta}_j ) \Big\|^2 = \sum_{j,k} W_j W_k G_{jk} , \]
where the Gram matrix is
\[ G_{jk} = ( \vec{X} - \vec{\eta}_j ) \cdot ( \vec{X} - \vec{\eta}_k ) . \]
Prove that the optimal reconstruction weights are
\[ W_j = \frac{\sum_k [G^{-1}]_{jk}}{\sum_{lm} [G^{-1}]_{lm}} . \]
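As a sanity check (not part of the proof), the formula can be verified numerically in Matlab for a single point; the data below are randomly generated and all names (X, eta, G, W) are illustrative assumptions.

n = 5; K = 3;                        % made-up dimensions for the check
X   = randn(n, 1);                   % the point being reconstructed
eta = randn(n, K);                   % its K neighbors as columns

% Gram matrix G(j,k) = (X - eta_j)' * (X - eta_k)
D = repmat(X, 1, K) - eta;           % columns are X - eta_j
G = D' * D;

% weights from the claimed formula: W_j = sum_k [G^{-1}]_{jk} / sum_{lm} [G^{-1}]_{lm}
Ginv = inv(G);
W = sum(Ginv, 2) / sum(Ginv(:));

% equivalent (and numerically preferable): solve G*w = 1 and normalize
w = G \ ones(K, 1);
w = w / sum(w);

% reconstruction error under the unit-sum constraint
err = norm(X - eta * W)^2;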