Simple linear regression: II. point estimation
Miaoyan Wang
Department of Statistics, UW-Madison
Simple linear regression
References:
Chapter 2 in JF (J. Faraway)
Chapters 2.1-2.9 and 2.11 in RC
Both textbooks are available in Canvas/files/textbook/
Recall: simple linear regression model
A straight line relationship between the response variable Y and the explanatory variable X:
Yi = β0 + β1 xi + εi, where E(εi) = 0.
Equal variance: Var(εi) = σ2.
Independence: Cov(εi, εi′) = 0 for i ≠ i′.
Normal distribution: εi ∼ N(0, σ2).
Q: The εi are iid. How about the Yi? iid? Not iid? It depends?
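As a quick illustration (a sketch, not from the slides; parameter values are made up), the model and its error assumptions can be simulated directly. Note that the Yi inherit independence from the εi but have different means β0 + β1 xi.

import numpy as np

# Simulate from Yi = beta0 + beta1*xi + eps_i with iid N(0, sigma^2) errors.
rng = np.random.default_rng(0)
n = 50
beta0, beta1, sigma = 2.0, 0.5, 1.0      # hypothetical population parameters
x = rng.uniform(0, 10, size=n)           # explanatory values (treated as fixed)
eps = rng.normal(0.0, sigma, size=n)     # iid N(0, sigma^2) errors
y = beta0 + beta1 * x + eps              # responses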
Model Parameters
The model parameters are β0,β1, and σ2 (population parameters). β0 and β1: regression coefficients.
β0: intercept.
When the model scope includes x = 0, β0 can be interpreted as the mean of Y at x = 0.
β1: slope.
Interpreted as the change in the mean of Y per unit increase in x.
σ2: error variance, sometimes written as σ²_ε or σ²_{Y|x}.
Q: How to estimate the model parameters based on data?
Estimation of Model Parameters
Our goal is to estimate these model parameters by estimators βˆ0,βˆ1, and σˆ2, based on data.
Two methods:
Least squares (LS).
Maximum likelihood (ML).
Additional notation:
Let Ŷi = β̂0 + β̂1 Xi denote the ith fitted value.
Let ei = Yi − Ŷi denote the ith residual.
Estimation of β0 and β1
Both LS and ML give the same estimators for β0 and β1:

β̂1 = Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^n (Xi − X̄)²

β̂0 = (1/n)(Σ_{i=1}^n Yi − β̂1 Σ_{i=1}^n Xi) = Ȳ − β̂1 X̄.

We now give two derivations of these estimators.
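A minimal Python sketch of these closed-form estimates (variable and function names are my own, not from the slides):

import numpy as np

def ls_estimates(x, y):
    """Closed-form LS/ML estimates of beta0 and beta1 for simple linear regression."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Example (using the simulated x, y from the earlier sketch):
# b0_hat, b1_hat = ls_estimates(x, y)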
Least Squares (LS) Estimation
Consider the criterion:
Q = Σ_{i=1}^n (Yi − β0 − β1 Xi)².
The LS estimators of β0 and β1 are the values β̂0 and β̂1 that minimize Q for the given observed data (X1, Y1), ..., (Xn, Yn).
Graphical interpretation?
[Figure: scatterplot of number of species versus percent forest cover.]
Brainstorm
Why do we use vertical distance to define the fitted line?
Other choices?
The sum of squared perpendicular distances.
The sum of absolute values of the distances.
Approach 1: LS Derivation
Differentiate Q with respect to β0 and β1:

(a): ∂Q/∂β0 = −2 Σ_{i=1}^n (Yi − β0 − β1 Xi)

(b): ∂Q/∂β1 = −2 Σ_{i=1}^n (Yi − β0 − β1 Xi) Xi

Set (a) and (b) equal to 0 and let the solutions to these two equations be β̂0 and β̂1.
Let β = (β0, β1)′.
Since ∂²Q/∂β∂β′ is positive definite, β̂0 and β̂1 minimize Q.
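For completeness, a sketch of the standard algebra not spelled out on the slide: setting (a) and (b) to zero gives the normal equations, and solving them yields the closed-form estimates quoted earlier.

\sum_{i=1}^n Y_i = n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^n X_i,
\qquad
\sum_{i=1}^n X_i Y_i = \hat\beta_0 \sum_{i=1}^n X_i + \hat\beta_1 \sum_{i=1}^n X_i^2,

\hat\beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2},
\qquad
\hat\beta_0 = \bar Y - \hat\beta_1 \bar X.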
Approach 2: ML Derivation
Let θ = (β0, β1, σ2)′.
We have Yi ∼ N(β0 + β1 Xi, σ2), independent across i (not identically distributed, since the means differ). Thus,

fi(yi; θ) = (1/√(2πσ2)) exp[−(1/(2σ2)) {yi − (β0 + β1 xi)}²].

The likelihood function is

L(θ; y) = Π_{i=1}^n fi(yi; θ) = Π_{i=1}^n (1/√(2πσ2)) exp[−(1/(2σ2)) {yi − (β0 + β1 xi)}²].
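A sketch of the next standard step (not shown on the slide): taking logs gives the log-likelihood, and maximizing it over (β0, β1) is equivalent to minimizing the least-squares criterion Q, which is why ML and LS agree on the coefficients.

\ell(\theta; y) = \log L(\theta; y)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^n \{y_i - (\beta_0 + \beta_1 x_i)\}^2 .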
ML Derivation (Cont.)
Solve for the parameters and obtain the ML estimates:

β̂1 = Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^n (Xi − X̄)²

β̂0 = Ȳ − β̂1 X̄

σ̃² = Σ_{i=1}^n (Yi − Ŷi)² / n
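As an optional numerical cross-check (a sketch; variable names are assumptions, and it reuses the simulated x, y from earlier), the ML estimates can also be obtained by minimizing the negative log-likelihood directly:

import numpy as np
from scipy.optimize import minimize

def neg_log_lik(theta, x, y):
    """Negative log-likelihood of the simple linear regression model."""
    b0, b1, log_sigma2 = theta               # parameterize sigma^2 on the log scale
    sigma2 = np.exp(log_sigma2)
    resid = y - (b0 + b1 * x)
    n = len(y)
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum(resid ** 2) / (2 * sigma2)

# fit = minimize(neg_log_lik, x0=np.zeros(3), args=(x, y))
# fit.x[0], fit.x[1]   -> approximately the closed-form beta0_hat, beta1_hat
# np.exp(fit.x[2])     -> approximately SSE / n (the biased ML variance estimate)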
Properties of Fitted Regression Line
For the fitted values Ŷi = β̂0 + β̂1 Xi and residuals ei = Yi − Ŷi, we have:
The regression line always goes through (X̄, Ȳ).
Σ_{i=1}^n ei² is a minimum.
Σ_{i=1}^n ei = 0.
Σ_{i=1}^n Xi ei = 0.
Σ_{i=1}^n Yi = Σ_{i=1}^n Ŷi.
Σ_{i=1}^n Ŷi ei = 0.
Exercises: Proofs of the above properties.
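These properties are also easy to check numerically. A sketch, continuing the simulated data and the ls_estimates() helper from the earlier sketches (assumed names):

import numpy as np

b0_hat, b1_hat = ls_estimates(x, y)
y_hat = b0_hat + b1_hat * x               # fitted values
e = y - y_hat                             # residuals

print(np.isclose(e.sum(), 0.0))           # sum of residuals is 0
print(np.isclose((x * e).sum(), 0.0))     # sum of Xi * ei is 0
print(np.isclose(y.sum(), y_hat.sum()))   # observed and fitted totals agree
print(np.isclose((y_hat * e).sum(), 0.0)) # fitted values are orthogonal to residuals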
Geometric Interpretation
Define the “hat matrix” (projection matrix) H = X(X′X)−1X′, so that Ŷ = Xβ̂ = HY, where X is the n × 2 design matrix whose columns are a column of 1s and (X1, ..., Xn)′, and β̂ = (β̂0, β̂1)′.
H projects Y onto the column span of X.
I − H projects Y onto the space orthogonal to the column span of X.
Exercise: What are the algebraic properties of the hat matrix H? (rank, eigenvalues and eigenvectors, positive semi-definiteness, idempotence, etc.)
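A sketch of the hat matrix for this model and a numerical check of some of its properties (continuing the earlier simulated x, y, y_hat; names are assumptions):

import numpy as np

X = np.column_stack([np.ones_like(x), x])   # n x 2 design matrix: intercept column and x
H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix H = X (X'X)^{-1} X'

print(np.allclose(H, H.T))                  # symmetric
print(np.allclose(H @ H, H))                # idempotent: HH = H
print(np.isclose(np.trace(H), 2.0))         # trace = rank = 2 (eigenvalues are 0 or 1)
print(np.allclose(H @ y, y_hat))            # HY gives the fitted values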
Estimation of σ2
Define the error sum of squares (SSE) (or residual sum of squares):

SSE = Σ_{i=1}^n (Yi − Ŷi)² = Σ_{i=1}^n ei².

Under simple linear regression, an unbiased estimate of σ2 is the error mean square (MSE) (or residual mean square):

σ̂² = MSE = SSE / (n − 2) = Σ_{i=1}^n ei² / (n − 2).

The biased ML estimate of σ2 is:

σ̃² = SSE / n = Σ_{i=1}^n ei² / n.
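A short sketch of computing both variance estimates from the residuals (continuing the earlier notation; names are assumptions):

import numpy as np

sse = np.sum(e ** 2)          # SSE = sum of squared residuals
n = len(y)
mse = sse / (n - 2)           # unbiased estimate (sigma_hat^2 = MSE)
sigma2_ml = sse / n           # biased ML estimate (sigma_tilde^2)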
Example: Wetland Species Richness
In the wetland species richness example, we have
SSE = Σ (yi − ŷi)² = Σ ei² = 479.04.
Under LS, we have
σ̂² = MSE = SSE / (n − 2) = 479.04 / 56 = 8.554.
Under ML, we have
σ̃² = SSE / n = 479.04 / 58 = 8.259.
Q: Why df = n − 2 for MSE? Which estimator is better?
Note: E(σ̃²) = ((n − 2)/n) σ².
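One way to see this note (a standard argument, sketched here): σ̃² is a rescaling of the unbiased MSE, so its expectation follows from E(σ̂²) = σ².

\tilde\sigma^2 = \frac{SSE}{n} = \frac{n-2}{n}\cdot\frac{SSE}{n-2} = \frac{n-2}{n}\,\hat\sigma^2,
\qquad
E(\tilde\sigma^2) = \frac{n-2}{n}\,E(\hat\sigma^2) = \frac{n-2}{n}\,\sigma^2 .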