
MAST90138 Week 6 Lab
Problems:
Consider iid data $(X_1, Z_1), \dots, (X_n, Z_n)$, where $X_i \in \mathbb{R}^p$ and $Z_i \in \mathbb{R}$, coming from the linear regression model
\[ E(Z_i \mid X_i = x) = \alpha + \beta^T x, \]
where $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}^p$ are unknown parameters.
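As a concrete illustration, data from this model can be simulated in R as follows (the dimensions and parameter values below are arbitrary, not those of the lab data):

```r
# Simulate a small data set from the linear model E(Z|X = x) = alpha + beta^T x.
# All values here are illustrative only.
set.seed(0)
n <- 100; p <- 2
alpha <- 1; beta <- c(2, -1)
X <- matrix(rnorm(n * p), n, p)                  # rows are X_i in R^p
Z <- as.numeric(alpha + X %*% beta + rnorm(n))   # Z_i = alpha + beta^T X_i + noise
```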
1. The least squares estimators of $\alpha$ and $\beta$ are found by minimising
\[ \sum_{i=1}^n (Z_i - \alpha - \beta^T X_i)^2 \]
with respect to $\alpha$ and $\beta$. Show that the least squares estimators of $\alpha$ and $\beta$ are equal to
\[ \hat\beta = (X_C^T X_C)^{-1} X_C^T Z_C, \qquad \hat\alpha = \bar Z - \hat\beta^T \bar X, \]
where $X_C$ is the $n \times p$ matrix of centered data $X_{C,i} = X_i - \bar X$ and $Z_C = (Z_{C,1}, \dots, Z_{C,n})^T$, with $Z_{C,i} = Z_i - \bar Z$.

Solution:
\begin{align*}
\sum_{i=1}^n (Z_i - \alpha - \beta^T X_i)^2
&= \sum_{i=1}^n \big(Z_i - \bar Z + \bar Z - \alpha - \beta^T (X_i - \bar X + \bar X)\big)^2\\
&= \sum_{i=1}^n \big(Z_i - \bar Z - \beta^T (X_i - \bar X) + \bar Z - \alpha - \beta^T \bar X\big)^2\\
&= \sum_{i=1}^n \big(Z_i - \bar Z - \beta^T (X_i - \bar X)\big)^2 + n(\bar Z - \alpha - \beta^T \bar X)^2\\
&\quad + 2(\bar Z - \alpha - \beta^T \bar X)\underbrace{\sum_{i=1}^n \big(Z_i - \bar Z - \beta^T (X_i - \bar X)\big)}_{=0}.
\end{align*}
The cross term vanishes because $\sum_i (Z_i - \bar Z) = 0$ and $\sum_i (X_i - \bar X) = 0$. The first term does not involve $\alpha$ and is exactly the LS criterion for regressing the centered $Z_{C,i}$ on the centered $X_{C,i}$ without intercept, so it is minimised by $\hat\beta = (X_C^T X_C)^{-1} X_C^T Z_C$. For any $\beta$, the second term is minimised (at zero) by taking $\alpha = \bar Z - \beta^T \bar X$, which gives $\hat\alpha = \bar Z - \hat\beta^T \bar X$.
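The two formulas can be checked numerically against R's `lm()`, which fits the same model with an intercept (the simulated data and variable names below are illustrative only, not from the lab files):

```r
# Check: centered-data formulas for betahat and alphahat agree with lm().
set.seed(1)
n <- 50; p <- 3
X <- matrix(rnorm(n * p), n, p)
Z <- as.numeric(1 + X %*% c(2, -1, 0.5) + rnorm(n))

Xcent <- scale(X, scale = FALSE)   # rows X_i - Xbar
Zcent <- Z - mean(Z)               # Z_i - Zbar
betahat  <- solve(t(Xcent) %*% Xcent) %*% t(Xcent) %*% Zcent
alphahat <- mean(Z) - sum(betahat * colMeans(X))

fit <- lm(Z ~ X)                   # ordinary LS with intercept, for comparison
stopifnot(all.equal(c(alphahat, betahat), unname(coef(fit))))
```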
2. To predict a new value $Z_{\mathrm{new}}$ from a new value $X_{\mathrm{new}}$, we take
\[ \hat Z_{\mathrm{new}} = \hat\alpha + \hat\beta^T X_{\mathrm{new}} = \bar Z + \hat\beta^T (X_{\mathrm{new}} - \bar X). \tag{1} \]
Since
\[ \hat\beta = (X_C^T X_C)^{-1} X_C^T Z_C \]
is also the LS estimator of $\beta$ in the model for centered data
\[ Z_{C,i} = \beta^T X_{C,i} + \varepsilon_i, \]
for prediction one can simply estimate $\beta$ from the centered data and then compute $\hat Z_{\mathrm{new}}$ as in (1). In other words, we do not even need to estimate $\alpha$.
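This can be verified numerically: the centered-data prediction $\bar Z + \hat\beta^T (X_{\mathrm{new}} - \bar X)$ agrees with the prediction from a full fit that does estimate the intercept (simulated data; all names below are illustrative):

```r
# Check: prediction without estimating alpha matches predict() on a full LS fit.
set.seed(2)
n <- 60; p <- 4
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("X", 1:p)
Z <- as.numeric(2 + X %*% rnorm(p) + rnorm(n))
Xnew <- rnorm(p)

Xcent <- scale(X, scale = FALSE)
Zcent <- Z - mean(Z)
betahat <- solve(t(Xcent) %*% Xcent, t(Xcent) %*% Zcent)   # no alphahat needed

Zhat_centered <- mean(Z) + sum(betahat * (Xnew - colMeans(X)))

# Full fit with intercept, for comparison
df  <- data.frame(Z = Z, X)
fit <- lm(Z ~ ., data = df)
newdf <- setNames(as.data.frame(as.list(Xnew)), paste0("X", 1:p))
stopifnot(all.equal(Zhat_centered, unname(predict(fit, newdata = newdf))))
```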
3. As seen in class, when the data are centered, we have
\[ \beta^T X_C \approx \beta_{PC}^T Y_{(1)}. \]
Since $X_{\mathrm{new}} - \bar X$ is centered, this suggests that a good prediction can also be obtained by taking
\[ \hat Z_{\mathrm{new},PC} = \bar Z + \hat\beta_{PC}^T Y_{(1),\mathrm{new}}, \]
where $\hat\beta_{PC}$ is the LS estimator of $\beta_{PC}$ and
\[ Y_{(1),\mathrm{new}} = \Gamma_{(1)}^T (X_{\mathrm{new}} - \bar X). \]
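A small numerical sketch of this mapping: `prcomp()` centers the data by default, so its scores satisfy $Y = X_C \Gamma$, and a new observation is scored by $\Gamma_{(1)}^T (X_{\mathrm{new}} - \bar X)$ (simulated data; names are illustrative):

```r
# Check: prcomp scores are the centered data times the eigenvector matrix,
# and a new observation's first q scores are Gamma_(1)^T (Xnew - Xbar).
set.seed(3)
n <- 40; p <- 5; q <- 2
X  <- matrix(rnorm(n * p), n, p)
pc <- prcomp(X)                      # centered, unscaled PCA
Gam <- pc$rotation                   # columns are the eigenvectors Gamma_j

# Training scores satisfy Y = (X - Xbar) Gamma
Yc <- scale(X, scale = FALSE) %*% Gam
stopifnot(all.equal(Yc, pc$x, check.attributes = FALSE))

# Scoring a new observation: first q entries of Gamma^T (Xnew - Xbar)
Xnew <- rnorm(p)
Ynew <- as.numeric(t(Gam[, 1:q]) %*% (Xnew - colMeans(X)))
```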
The data Xtrainphoneme and Ztrainphoneme on LMS come from a linear model as above, with p = 256 and n = 300. Predict the value of Z for the 1417 new X observations in Xtestphoneme using the standard LS predictor, and using the PC predictor for q = 2, ..., 10. The true values of Z for those test data are in the file Ztestphoneme. For each predictor computed above, compare your predicted values with the true values using boxplots of the log of the squared prediction errors. Also, for LS and for the best of the PC predictors, plot the predicted values against the true values of Z. What can you conclude?
Solution:
# Read the training and test data (the last column of the X files is dropped)
X     <- as.matrix(read.table("Xtrainphoneme.txt", sep = ","))[, -257]
XXnew <- as.matrix(read.table("Xtestphoneme.txt",  sep = ","))[, -257]
Z     <- as.numeric(read.table("Ztrainphoneme.txt", sep = ","))
Znew  <- as.numeric(read.table("Ztestphoneme.txt",  sep = ","))
n    <- 300
nnew <- 1417

# Centered training data
Xcent <- scale(X, scale = FALSE)
Zcent <- Z - mean(Z)

# Standard LS predictor: Zhat_new = Zbar + betahat^T (Xnew - Xbar)
betahat <- solve(t(Xcent) %*% Xcent) %*% t(Xcent) %*% Zcent
repbarX <- matrix(rep(colMeans(X), nnew), nrow = nnew, byrow = TRUE)
ZpredLS <- mean(Z) + (XXnew - repbarX) %*% betahat

# PC predictor: regress Zcent on the first q PC scores, for q = 2, ..., 10
compo <- prcomp(X, retx = TRUE)
gam   <- compo$rotation              # eigenvectors Gamma
Y     <- compo$x                     # PC scores of the training data
YNew  <- (XXnew - repbarX) %*% gam   # PC scores of the test data
Zpred <- c()
for (q in 2:10) {
  YDATA     <- Y[, 1:q, drop = FALSE]
  betahatPC <- solve(t(YDATA) %*% YDATA) %*% t(YDATA) %*% Zcent
  Zpred     <- cbind(Zpred, mean(Z) + YNew[, 1:q] %*% betahatPC)
}

# Boxplots of the log squared prediction errors: PC with q = 2..10, then LS
par(mfrow = c(2, 2))
errs <- c(lapply(1:9, function(j) log((Zpred[, j] - Znew)^2)),
          list(log((ZpredLS - Znew)^2)))
boxplot(errs, names = c(paste0("q=", 2:10), "LS"), ylim = c(-20, 20))

# Predicted vs true values: LS and the best PC predictor (here q = 10)
plot(Znew, ZpredLS, xlab = "true Z", ylab = "LS prediction")
abline(0, 1)
plot(Znew, Zpred[, 9], xlab = "true Z", ylab = "PC prediction, q = 10")
abline(0, 1)
We conclude that the PC predictor works better than LS, especially when q is well chosen: with p = 256 close to n = 300, the LS estimator of β is very variable, whereas regressing on a small number of principal components gives much more stable predictions.