MAST90083 Computational Statistics & Data Mining Linear Regression
Tutorial & Practical 1: Linear Regression
Question 1
Given the model
y = Xβ + ε
where y ∈ Rⁿ, X ∈ Rⁿ×ᵖ has full rank p, and ε ∈ Rⁿ ∼ N(0, σ²Iₙ). Let β̂ be the estimate of β
obtained by least squares estimation and H = X(XᵀX)⁻¹Xᵀ the orthogonal projector onto
the subspace spanned by the columns of X.
• Write U = y − Xβ̂ as a function of H and ε
• Write V = X(β̂ − β) as a function of H and ε
• Using the results obtained, find Cov(U, V)
• Use this result to show that
(y − Xβ)ᵀ(y − Xβ) = (y − Xβ̂)ᵀ(y − Xβ̂) + (β̂ − β)ᵀXᵀX(β̂ − β)
• What conclusion can you draw from the left-hand side of the above equation?
• Using the above decomposition, find the distribution of (y − Xβ̂)ᵀ(y − Xβ̂)/σ²
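
As a numerical sanity check (an illustrative Python sketch, not part of the original questions; n, p, σ, the seed and all variable names are arbitrary choices), one can simulate from the model and verify the identities above:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, sigma = 200, 4, 1.5
    X = rng.standard_normal((n, p))              # full column rank (almost surely)
    beta = rng.standard_normal(p)
    eps = sigma * rng.standard_normal(n)
    y = X @ beta + eps

    H = X @ np.linalg.inv(X.T @ X) @ X.T         # orthogonal projector onto col(X)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y) # least squares estimate

    U = y - X @ beta_hat
    V = X @ (beta_hat - beta)
    print(np.allclose(U, (np.eye(n) - H) @ eps)) # U = (I − H)ε
    print(np.allclose(V, H @ eps))               # V = Hε
    print(abs(U @ V) < 1e-8)                     # U ⟂ V, consistent with Cov(U, V) = 0

    lhs = (y - X @ beta) @ (y - X @ beta)
    rhs = U @ U + (beta_hat - beta) @ (X.T @ X) @ (beta_hat - beta)
    print(np.allclose(lhs, rhs))                 # the decomposition holds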
Question 2
Let
y = 1ₙβ + ε
where y ∈ Rⁿ, 1ₙ is the n × 1 vector of ones, and ε ∈ Rⁿ ∼ N(0, σ²Iₙ).
• Find the least squares estimate of β
• What do β̂ and the associated ŷ represent?
• Provide the expression of the projector generating ŷ
• Use this projector to provide an alternative expression for ∑ⁿᵢ₌₁ (yᵢ − ȳ)²
• Show that ȳ is independent of ∑ⁿᵢ₌₁ (yᵢ − ȳ)²
• Show that ∑ⁿᵢ₌₁ (yᵢ − ȳ)²/σ² is χ²ₙ₋₁
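
A quick illustration (again an ad hoc Python sketch with arbitrary constants, not part of the sheet): build the averaging projector J = 1ₙ1ₙᵀ/n, check the alternative expression for the sum of squares, and Monte Carlo the χ²ₙ₋₁ mean:

    import numpy as np

    rng = np.random.default_rng(1)
    n, beta, sigma = 50, 2.0, 1.0
    one = np.ones(n)
    y = one * beta + sigma * rng.standard_normal(n)

    J = np.outer(one, one) / n                       # projector onto span(1ₙ)
    print(np.allclose(J @ y, y.mean()))              # ŷ = Jy = ȳ1ₙ

    ss = np.sum((y - y.mean()) ** 2)
    print(np.allclose(ss, y @ (np.eye(n) - J) @ y))  # ∑(yᵢ − ȳ)² = yᵀ(I − J)y

    # Monte Carlo: E[∑(yᵢ − ȳ)²/σ²] should be n − 1 under χ²ₙ₋₁
    sims = []
    for _ in range(5000):
        yy = one * beta + sigma * rng.standard_normal(n)
        sims.append(np.sum((yy - yy.mean()) ** 2) / sigma ** 2)
    print(np.mean(sims), n - 1)                      # ≈ 49 vs 49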
Question 3
Let
y = Xβ + ε
where y ∈ Rⁿ, X ∈ Rⁿ×ᵖ has full rank p, and ε ∈ Rⁿ ∼ N(0, σ²Iₙ).
• Give the expression of the log-likelihood ℓ(β, σ²)
• Show that the least squares estimate of β is also the maximum likelihood estimator of β
• Find the maximum likelihood estimator of σ²
• Using the expressions of β̂ and σ̂², provide the expression of ℓ(β̂, σ̂²)
• Taking θ = (βᵀ, σ²)ᵀ, derive the expression of the information matrix I = −E[∂²ℓ/∂θ∂θᵀ]
• Discuss the efficiency of β̂ and σ̂²
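
A numerical companion (an illustrative Python sketch with simulated data, not part of the sheet): code the Gaussian log-likelihood directly and confirm that the OLS estimate with σ̂² = RSS/n maximizes it, and that ℓ(β̂, σ̂²) reduces to −(n/2)(log(2πσ̂²) + 1):

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 100, 3
    X = rng.standard_normal((n, p))
    y = X @ np.array([1.0, -0.5, 2.0]) + rng.standard_normal(n)

    def loglik(beta, sig2):
        r = y - X @ beta
        return -n / 2 * np.log(2 * np.pi * sig2) - r @ r / (2 * sig2)

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS = Gaussian MLE of β
    rss = np.sum((y - X @ beta_hat) ** 2)
    sig2_hat = rss / n                            # MLE of σ² (the unbiased version divides by n − p)

    ll_max = loglik(beta_hat, sig2_hat)
    print(all(loglik(beta_hat + 0.05 * rng.standard_normal(p),
                     sig2_hat * np.exp(0.1 * rng.standard_normal())) <= ll_max
              for _ in range(1000)))              # no perturbed point does better

    # ℓ(β̂, σ̂²) = −(n/2)(log(2πσ̂²) + 1)
    print(np.allclose(ll_max, -n / 2 * (np.log(2 * np.pi * sig2_hat) + 1)))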
Question 4
Let
y = Xβ + ε
where y ∈ Rⁿ, X ∈ Rⁿ×ᵖ has full rank p, and ε ∈ Rⁿ ∼ N(0, σ²Iₙ).
In addition to the above model, β needs to satisfy a number of restrictions or constraints, described by
Aβ = c
where A is a known q × p matrix of rank q and c is a known q × 1 vector.
Find an estimate of β that minimizes the squared error subject to these constraints (use
Lagrange multipliers, one for each constraint aᵢβ = cᵢ, i = 1, …, q, where aᵢ is the i-th row of A).
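
For checking a derivation: the Lagrangian approach leads to the standard restricted least-squares form β̂_c = β̂ − (XᵀX)⁻¹Aᵀ[A(XᵀX)⁻¹Aᵀ]⁻¹(Aβ̂ − c). The sketch below (arbitrary simulated data; not part of the sheet) verifies that this candidate satisfies Aβ = c and has the smallest squared error among feasible points:

    import numpy as np

    rng = np.random.default_rng(3)
    n, p, q = 80, 5, 2
    X = rng.standard_normal((n, p))
    y = X @ rng.standard_normal(p) + rng.standard_normal(n)
    A = rng.standard_normal((q, p))              # known rank-q constraint matrix
    c = rng.standard_normal(q)

    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ X.T @ y

    # restricted LS: β̂_c = β̂ − (XᵀX)⁻¹Aᵀ[A(XᵀX)⁻¹Aᵀ]⁻¹(Aβ̂ − c)
    M = A @ XtX_inv @ A.T
    beta_c = beta_ols - XtX_inv @ A.T @ np.linalg.solve(M, A @ beta_ols - c)
    print(np.allclose(A @ beta_c, c))            # constraints are satisfied

    # β̂_c beats every other feasible point (β̂_c plus null-space directions of A)
    rss = lambda b: np.sum((y - X @ b) ** 2)
    null_A = np.linalg.svd(A)[2][q:].T           # columns span {v : Av = 0}
    trials = beta_c[:, None] + null_A @ rng.standard_normal((p - q, 200))
    print(all(rss(beta_c) <= rss(trials[:, j]) for j in range(200)))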
Question 5
Let
y = Xβ + ε
where y ∈ Rⁿ, X ∈ Rⁿ×ᵖ has rank r < p < n, and ε ∈ Rⁿ ∼ N(0, σ²Iₙ).
• Write the least squares estimate of β using the n × n matrix S, the diagonal matrix Σ = diag(σ₁, ..., σᵣ) and the p × p matrix Q, obtained from the singular value decomposition of X, X = SΣQᵀ
• Write the least squares estimate of β as a function of σᵢ, sᵢ and qᵢ, where sᵢ and qᵢ are the column vectors of S and Q respectively
• Consider the estimate of β which can be represented as a projection βₚ = Qₖz, where Qₖ = (q₁, …, qₖ) and z is obtained by minimizing ‖y − XQₖz‖². What is the relation between the least squares estimate β̂_LS and βₚ?
• Provide an expression for the residual Rₖ = ‖y − Xβₚ‖²
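
An illustrative Python sketch (not part of the sheet; the rank-r construction and all constants are arbitrary): build a rank-deficient design, recover the minimum-norm least squares estimate from the SVD components, and check the truncated-projection residual:

    import numpy as np

    rng = np.random.default_rng(4)
    n, p, r = 60, 8, 4                           # rank r < p < n
    X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))  # rank-r design
    y = rng.standard_normal(n)

    S, sv, Qt = np.linalg.svd(X)                 # X = SΣQᵀ with S n×n, Q p×p
    Q = Qt.T

    # minimum-norm LS estimate: β̂ = ∑ᵢ₌₁ʳ (sᵢᵀy/σᵢ) qᵢ
    beta_hat = sum((S[:, i] @ y / sv[i]) * Q[:, i] for i in range(r))
    print(np.allclose(beta_hat, np.linalg.pinv(X, rcond=1e-10) @ y))

    # projection estimate βₚ = Qₖz keeps only the first k ≤ r components
    k = 2
    beta_p = sum((S[:, i] @ y / sv[i]) * Q[:, i] for i in range(k))
    R_k = np.sum((y - X @ beta_p) ** 2)
    # using Xqᵢ = σᵢsᵢ and orthonormality of the sᵢ: Rₖ = ‖y‖² − ∑ᵢ₌₁ᵏ (sᵢᵀy)²
    print(np.allclose(R_k, y @ y - sum((S[:, i] @ y) ** 2 for i in range(k))))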