Inference on the simple linear regression

1 Inference on the simple linear regression
2 Random Vectors/Matrices
3 Simple Linear Regression Model in Matrix Terms
4 Estimation of E(Yh)
5 Estimation vs. Prediction

Recall the simple linear regression (SLR) model
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad \varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \]
for all i = 1, . . . , n.
The least-squares (LS) estimates are
\[ \hat\beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2}, \qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar X, \qquad \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^n (Y_i - \hat Y_i)^2. \]
What are the sampling distributions of βˆ0, βˆ1, and σˆ2?
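The closed-form LS estimates above are easy to check numerically. A minimal sketch in Python (not from the lecture; the data-generating values β0 = 2, β1 = 3, σ = 0.5 are assumptions), cross-checked against numpy's built-in polynomial fit:

```python
# Illustrative sketch: closed-form SLR estimates on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0.0, 1.0, n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, n)   # assumed true model

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)  # slope estimate
b0 = ybar - b1 * xbar                                           # intercept estimate
sigma2_hat = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)           # residual mean square

b1_ref, b0_ref = np.polyfit(x, y, 1)  # cross-check with numpy's degree-1 fit
```

The formula-based estimates agree with `np.polyfit` to floating-point precision.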

Sampling distribution of SLR estimators
Under a simple linear regression model,
\[ \begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \end{pmatrix} \sim \mathrm{MVN}\!\left( \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix},\; \sigma^2 \begin{pmatrix} \frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \\ \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2} \end{pmatrix} \right). \]
Furthermore, let \(\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^n (Y_i - \hat Y_i)^2\) be the residual mean square. Then
\[ \frac{(n-2)\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-2}, \]
and \(\hat\sigma^2\) is independent of \(\hat\beta_0\) and \(\hat\beta_1\).
This section proves the theorem, i.e., derives the sampling distributions of \((\hat\beta_0, \hat\beta_1)'\) and \(\hat\sigma^2\).
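The chi-square claim can be checked by simulation before proving it. An illustrative Monte Carlo sketch (not from the lecture; all parameter values are assumptions): (n−2)σ̂²/σ² should have mean n−2 and variance 2(n−2).

```python
# Illustrative Monte Carlo sketch: (n-2)*sigma2_hat/sigma^2 behaves like
# a chi-square with n-2 degrees of freedom (mean n-2, variance 2(n-2)).
import numpy as np

rng = np.random.default_rng(1)
n, beta0, beta1, sigma = 20, 1.0, 2.0, 1.5   # assumed values
x = rng.uniform(0.0, 1.0, n)                 # fixed design
xbar = x.mean()
sxx = np.sum((x - xbar) ** 2)

draws = np.empty(20000)
for k in range(draws.size):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)
    b1 = np.sum((x - xbar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * xbar
    rss = np.sum((y - b0 - b1 * x) ** 2)
    draws[k] = rss / sigma ** 2              # equals (n-2)*sigma2_hat/sigma^2
```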

1 Inference on the simple linear regression
2 Random Vectors/Matrices
3 Simple Linear Regression Model in Matrix Terms
4 Estimation of E(Yh)
5 Estimation vs. Prediction

Random Vector and Matrix
A random vector or a random matrix contains elements that are random variables.
SLR: The response variables Y1, . . . , Yn can be written in the form of a random vector
\[ Y_{n\times 1} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}. \]
Alternative notation: Y = (Y1, . . . , Yn)′.

Expectation of Random Vector/Matrix
The expectation of an n × 1 random vector Y is
\[ E(Y)_{n\times 1} = [E(Y_i) : i = 1, \dots, n] = \begin{pmatrix} E(Y_1) \\ \vdots \\ E(Y_n) \end{pmatrix}. \]
SLR: What is E(Y|X)?
Since E(Y_i|X) = β0 + β1 X_i for i = 1, . . . , n,
\[ E(Y|X) = \begin{pmatrix} \beta_0 + \beta_1 X_1 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{pmatrix}. \]
In general, the expectation of an n1 × n2 random matrix Y is
\[ E(Y)_{n_1\times n_2} = [E(Y_{ii'}) : i = 1, \dots, n_1;\; i' = 1, \dots, n_2]. \]

Variance-Covariance Matrix of Random Vector
The variance-covariance matrix of an n × 1 random vector Y is
\[ \mathrm{Var}(Y) = E\left[(Y - E(Y))(Y - E(Y))'\right] = \begin{pmatrix} \mathrm{Var}(Y_1) & \mathrm{Cov}(Y_1,Y_2) & \cdots & \mathrm{Cov}(Y_1,Y_n) \\ \mathrm{Cov}(Y_2,Y_1) & \mathrm{Var}(Y_2) & \cdots & \mathrm{Cov}(Y_2,Y_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(Y_n,Y_1) & \mathrm{Cov}(Y_n,Y_2) & \cdots & \mathrm{Var}(Y_n) \end{pmatrix}. \]
Note: Var(Y) is symmetric. Why? Cov(Y_i, Y_{i′}) = Cov(Y_{i′}, Y_i).
SLR: What is Var(Y|X)?
\[ \mathrm{Var}(Y|X) = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2 I_{n\times n}. \]

Variance-Covariance Matrix of Random Vector
The random errors ε1, . . . , εn can be written in the form of a random vector
\[ \varepsilon_{n\times 1} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}. \]
SLR: What is E(ε)?
E(ε) = 0_{n×1}.
SLR: What is the variance-covariance matrix of ε?
\[ \mathrm{Var}(\varepsilon) = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2 I_{n\times n}. \]

Multivariate Normal Distribution
Let Y_{n×1} = (Y1, . . . , Yn)′ follow a multivariate normal distribution with mean
\[ \mu_{n\times 1} = (\mu_1, \dots, \mu_n)' \]
and variance
\[ \Sigma_{n\times n} = [\sigma_{ii'} : i = 1, \dots, n;\; i' = 1, \dots, n]. \]
We denote this by Y ∼ MVN(μ, Σ). The probability density function is
\[ f(Y) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (Y - \mu)' \Sigma^{-1} (Y - \mu) \right\}, \]
where |Σ| is the determinant of Σ.
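The density formula can be sanity-checked against SciPy's implementation. An illustrative sketch (the 2-dimensional mean and covariance below are assumptions):

```python
# Illustrative sketch: evaluate the MVN density formula directly and compare
# with scipy.stats.multivariate_normal.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
y = np.array([0.7, 0.1])

n = mu.size
diff = y - mu
# (2*pi)^{-n/2} |Sigma|^{-1/2} exp{-(1/2)(y-mu)' Sigma^{-1} (y-mu)}
dens_formula = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
)
dens_scipy = multivariate_normal(mean=mu, cov=Sigma).pdf(y)
```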

Preliminaries (Rencher and Schaalje, Chapter 4.4)
Properties of random vectors
For Y (n × 1 random vector), A (n × n non-random matrix), and b (n × 1 non-random vector), we have
\[ E(AY + b) = A\,E(Y) + b, \qquad \mathrm{Var}(AY + b) = A\,\mathrm{Var}(Y)\,A'. \]
Properties of derivatives
For θ (p × 1 vector of parameters), c (p × 1 vector of variables), and C (p × p symmetric matrix of variables), we have
\[ \frac{\partial(\theta' c)}{\partial \theta} = c, \qquad \frac{\partial(\theta' C \theta)}{\partial \theta} = 2C\theta. \]
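The derivative identities can be verified numerically by central finite differences. An illustrative sketch (random θ, c, and a symmetrized C are assumptions):

```python
# Illustrative sketch: verify d(theta'c)/dtheta = c and
# d(theta'C theta)/dtheta = 2*C*theta (C symmetric) via central differences.
import numpy as np

rng = np.random.default_rng(2)
p = 4
theta = rng.normal(size=p)
c = rng.normal(size=p)
A = rng.normal(size=(p, p))
C = (A + A.T) / 2.0                      # symmetrize

def grad_fd(f, t, h=1e-6):
    """Central finite-difference gradient of a scalar function f at t."""
    g = np.zeros_like(t)
    for j in range(t.size):
        e = np.zeros_like(t)
        e[j] = h
        g[j] = (f(t + e) - f(t - e)) / (2.0 * h)
    return g

g_lin = grad_fd(lambda t: t @ c, theta)       # should equal c
g_quad = grad_fd(lambda t: t @ C @ t, theta)  # should equal 2*C@theta
```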

1 Inference on the simple linear regression
2 Random Vectors/Matrices
3 Simple Linear Regression Model in Matrix Terms
4 Estimation of E(Yh)
5 Estimation vs. Prediction

Let Y_{n×1} = (Y_1, \dots, Y_n)' denote the n × 1 vector of response variables.
Let
\[ X_{n\times p} = \begin{pmatrix} 1 & X_{11} & \cdots & X_{p-1,1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1n} & \cdots & X_{p-1,n} \end{pmatrix} \]
denote the n × p design matrix of predictor variables.
Let β_{p×1} = (β_0, \dots, β_{p-1})' denote the p × 1 vector of regression coefficients.
Let ε_{n×1} = (ε_1, \dots, ε_n)' denote the n × 1 vector of random errors.

Linear Regression in Matrix Terms
The linear regression model in matrix terms is
\[ Y = X\beta + \varepsilon, \qquad \varepsilon \sim \mathrm{MVN}(0, \sigma^2 I). \]
Equivalently, we have
\[ Y \sim \mathrm{MVN}(X\beta, \sigma^2 I). \]
Why? Since E(ε) = 0_{n×1} and Var(ε) = σ²I_{n×n},
\[ E(Y) = E(X\beta + \varepsilon) = X\beta + E(\varepsilon) = X\beta, \qquad \mathrm{Var}(Y) = \mathrm{Var}(X\beta + \varepsilon) = \mathrm{Var}(\varepsilon) = \sigma^2 I. \]

Least Squares Method
Recall that the least squares method for SLR minimizes
\[ Q(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2. \]
In general, the least squares criterion for p coefficients can be written as
\[ Q(\beta) = (Y - X\beta)'(Y - X\beta) = Y'Y - \beta'X'Y - Y'X\beta + \beta'X'X\beta = Y'Y - 2\beta'X'Y + \beta'X'X\beta. \]

Normal Equations
Here \(\frac{\partial Q}{\partial \beta}\) denotes the p × 1 vector \(\left(\frac{\partial Q}{\partial \beta_0}, \dots, \frac{\partial Q}{\partial \beta_{p-1}}\right)'\).
Differentiate Q with respect to β to obtain
\[ \frac{\partial Q}{\partial \beta} = -2X'Y + 2X'X\beta. \]
Set the equation above to 0_{p×1} and obtain a set of normal equations:
\[ X'X\beta = X'Y. \]
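The normal equations can be formed and solved directly. A minimal sketch on synthetic data (the design and coefficients are assumptions), compared with numpy's least-squares routine:

```python
# Illustrative sketch: solve the normal equations X'X b = X'Y on synthetic data.
import numpy as np

rng = np.random.default_rng(3)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # intercept + 2 predictors
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0.0, 0.3, n)    # assumed true beta

beta_normal = np.linalg.solve(X.T @ X, X.T @ Y)    # solve the normal equations
beta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]  # QR/SVD-based solver
```

In practice `lstsq` (or a QR decomposition) is preferred over explicitly forming X′X, which squares the condition number; here both routes agree.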

Estimated Regression Coefficients β̂
Let β̂_{p×1} denote the least squares estimate of β. Solving the normal equations, the least squares estimate of β is
\[ \hat\beta = \underbrace{(X'X)^{-1}X'}_{\text{non-random}}\,Y, \]
assuming that the matrix X′X is nonsingular and thus invertible.
What is the distribution of Y?
What is the distribution of β̂ = (β̂_0, · · · , β̂_{p−1})′?

Mean and Variance of β̂
Recall that \(\hat\beta = \underbrace{(X'X)^{-1}X'}_{\text{non-random}}\,Y\).
What is the expectation of β̂?
\[ E(\hat\beta) = E((X'X)^{-1}X'Y) = (X'X)^{-1}X'X\beta = \beta. \]
What is the variance-covariance matrix of β̂?
\[ \mathrm{Var}(\hat\beta) = \mathrm{Var}((X'X)^{-1}X'Y) = (X'X)^{-1}X'\,\mathrm{Var}(Y)\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}. \]
In the special case of simple linear regression with p = 2:
\[ \begin{pmatrix} \mathrm{Var}(\hat\beta_0) & \mathrm{Cov}(\hat\beta_0,\hat\beta_1) \\ \mathrm{Cov}(\hat\beta_1,\hat\beta_0) & \mathrm{Var}(\hat\beta_1) \end{pmatrix} = \sigma^2 \begin{pmatrix} \frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \\ \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2} \end{pmatrix}. \]
What is the distribution of β̂ = (β̂_0, β̂_1)′?
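For p = 2 the general formula σ²(X′X)⁻¹ can be checked entry-by-entry against the closed-form expressions. An illustrative sketch (σ² and the design are assumptions):

```python
# Illustrative sketch: sigma^2 * (X'X)^{-1} vs the closed-form SLR entries.
import numpy as np

rng = np.random.default_rng(4)
n, sigma2 = 25, 2.0
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])     # SLR design matrix
V = sigma2 * np.linalg.inv(X.T @ X)      # Var(beta_hat), general formula

xbar = x.mean()
sxx = np.sum((x - xbar) ** 2)
var_b0 = sigma2 * (1.0 / n + xbar ** 2 / sxx)   # closed-form Var(beta0_hat)
var_b1 = sigma2 / sxx                           # closed-form Var(beta1_hat)
cov_b0b1 = -sigma2 * xbar / sxx                 # closed-form covariance
```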

We have proved the first part of the following theorem.
Sampling distribution of SLR estimators
Under a simple linear regression model,
\[ \begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \end{pmatrix} \sim \mathrm{MVN}\!\left( \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix},\; \sigma^2 \begin{pmatrix} \frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \\ \frac{-\bar X}{\sum_{i=1}^n (X_i - \bar X)^2} & \frac{1}{\sum_{i=1}^n (X_i - \bar X)^2} \end{pmatrix} \right). \]
Furthermore, let \(\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^n (Y_i - \hat Y_i)^2\) be the residual mean square. Then
\[ \frac{(n-2)\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-2}, \]
and \(\hat\sigma^2\) is independent of \(\hat\beta_0\) and \(\hat\beta_1\).
How do we use the above result to perform hypothesis testing?
\[ H_0: \beta_1 = 0 \quad \text{vs.} \quad H_A: \beta_1 \neq 0. \]

Sampling distribution of β̂_1
We have known that
\[ \hat\beta_1 \sim N\!\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2}\right) \iff \frac{\hat\beta_1 - \beta_1}{\sqrt{\sigma^2 / \sum_{i=1}^n (X_i - \bar X)^2}} \sim N(0,1). \]
But we do not know σ². A natural (unbiased) estimator of σ² is
\[ \hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^n e_i^2, \]
and \((n-2)\hat\sigma^2/\sigma^2 \sim \chi^2_{n-2}\), independent of \(\hat\beta_1\), by the previous theorem.
Consider the test statistic
\[ \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / \sum_{i=1}^n (X_i - \bar X)^2}} \sim T_{n-2}. \]
The denominator is also referred to as the estimated standard error of β̂_1.

Hypothesis Testing for β_1
A test of interest is:
\[ H_0: \beta_1 = 0 \quad \text{vs.} \quad H_A: \beta_1 \neq 0. \]
The test statistic is:
\[ T = \frac{\hat\beta_1}{\widehat{se}(\hat\beta_1)}, \quad \text{where } \widehat{se}(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2}}. \]
Under H_0: β_1 = 0,
\[ T = \frac{\hat\beta_1 - 0}{\widehat{se}(\hat\beta_1)} \sim T_{n-2}. \]
p-value = 2 × P(T_{n−2} > |t*|), where t* is the observed test statistic computed by plugging in the data. A similar procedure yields a CI.

Example: Wetland Species Richness
In the wetland species richness example, the summary statistics are:
x̄ = 0.5210, ȳ = 7.9483, n = 58,
\[ \sum_{i=1}^n (x_i - \bar x)(y_i - \bar y) = -10.7775, \quad \sum_{i=1}^n (x_i - \bar x)^2 = 2.3316, \quad \sum_{i=1}^n (y_i - \hat y_i)^2 = 479.03, \quad \sum_{i=1}^n (y_i - \bar y)^2 = 528.84. \]
The least squares estimated slope is:
\[ \hat\beta_1 = \frac{-10.7775}{2.3316} = -4.622. \]
The least squares estimated intercept is:
\[ \hat\beta_0 = 7.9483 - (-4.622) \times 0.5210 = 10.357. \]
The estimated error variance is:
\[ \hat\sigma^2 = \frac{479.03}{56} = 8.554. \]
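These numbers can be reproduced from the summary statistics alone. A short sketch (all values copied from the slide):

```python
# Sketch: reproduce the wetland estimates from the slide's summary statistics.
sxy, sxx = -10.7775, 2.3316   # sum (x-xbar)(y-ybar), sum (x-xbar)^2
xbar, ybar, n = 0.5210, 7.9483, 58
rss = 479.03                  # sum of squared residuals

b1 = sxy / sxx                # slope, about -4.622
b0 = ybar - b1 * xbar         # intercept, about 10.357
sigma2_hat = rss / (n - 2)    # residual mean square, about 8.554
```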

Example: Wetland Species Richness
The estimated standard error of β̂_1 is:
\[ \widehat{se}(\hat\beta_1) \overset{\text{def}}{=} \sqrt{\frac{\hat\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2}} = \sqrt{\frac{8.554}{2.3316}} = 1.915. \]
Note that t_{n−2,α/2} = t_{56,0.025} = 2.003. Thus, a 95% CI for β_1 is
\[ \hat\beta_1 \pm t_{n-2,\alpha/2}\,\widehat{se}(\hat\beta_1) = -4.622 \pm 2.003 \times 1.915 = -4.622 \pm 3.836 = [-8.459, -0.785]. \]
Interpretation: The 95% CI for β_1 is [−8.459, −0.785].
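The same CI can be recomputed with SciPy supplying the critical value. An illustrative sketch (summary numbers taken from the slide):

```python
# Sketch: recompute the slide's 95% CI for beta1 with scipy.stats.t.
import math
from scipy.stats import t

n, sxx = 58, 2.3316
b1 = -10.7775 / sxx              # slope estimate, about -4.622
sigma2_hat = 479.03 / (n - 2)    # residual mean square, about 8.554

se_b1 = math.sqrt(sigma2_hat / sxx)   # about 1.915
tcrit = t.ppf(0.975, n - 2)           # about 2.003
lo, hi = b1 - tcrit * se_b1, b1 + tcrit * se_b1
```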

Example: Wetland Species Richness
To test whether there is a linear relationship between the number of species and the percent forest cover:
\[ H_0: \beta_1 = 0 \quad \text{vs.} \quad H_A: \beta_1 \neq 0. \]
The observed test statistic is
\[ t^* = \frac{\hat\beta_1}{\widehat{se}(\hat\beta_1)} = \frac{-4.622}{1.915} = -2.413. \]
Compared with T_{56}, the p-value is
\[ 2 \times P(T_{56} > 2.413) = 0.0191. \]
Interpretation: Reject H_0 at the α = 0.05 level. There is moderate evidence of a linear relationship between the number of species and percent forest cover.
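The p-value can be reproduced with SciPy's t distribution. A short sketch (numbers from the slide):

```python
# Sketch: recompute the slide's two-sided p-value with scipy.stats.t.
from scipy.stats import t

tstar = -4.622 / 1.915            # observed test statistic, about -2.413
pval = 2 * t.sf(abs(tstar), 56)   # two-sided p-value, about 0.019
```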

Understanding the R output
[Figure: scatterplot of iris$Petal.Width (x-axis, 0.5 to 2.5) against iris$Petal.Length (y-axis).]
Sampling distribution of SLR estimators
Consider a simple linear regression model. Let \(\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^n (Y_i - \hat Y_i)^2\) be the residual mean square. Then
\[ \frac{(n-2)\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-2}, \]
and \(\hat\sigma^2\) is independent of \(\hat\beta_0\) and \(\hat\beta_1\). (Proof in next lecture.)

1 Inference on the simple linear regression
2 Random Vectors/Matrices
3 Simple Linear Regression Model in Matrix Terms
4 Estimation of E(Yh)
5 Estimation vs. Prediction

Estimation of E(Y_h)
X_h = the level of X for which we want to estimate the mean response.
X_h may or may not be one of the observed values, but should be within the range of {X_i}.
μ_h = E(Y_h) = β_0 + β_1 X_h = the mean response at X_h. The estimate of μ_h is
\[ \hat\mu_h = \hat\beta_0 + \hat\beta_1 X_h. \]
μ̂_h ∼ N(μ_h, Var(μ̂_h)). Why?

Estimation of E(Y_h)
The variance of μ̂_h is
\[ \mathrm{Var}(\hat\mu_h) = \sigma^2\left[\frac{1}{n} + \frac{(X_h - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2}\right]. \quad \text{(proof?)} \]
The estimated variance of μ̂_h is
\[ \widehat{\mathrm{Var}}(\hat\mu_h) = \hat\sigma^2\left[\frac{1}{n} + \frac{(X_h - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2}\right]. \]
It can be shown that
\[ \frac{\hat\mu_h - \mu_h}{\sqrt{\widehat{\mathrm{Var}}(\hat\mu_h)}} \sim T_{n-2}. \]
A (1 − α) CI for μ_h is
\[ \hat\mu_h \pm t_{n-2,\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat\mu_h)}. \]

Example: Wetland Species Richness
The estimated mean number of species at x_h = 0.10 is
\[ \hat\mu_h = \hat\beta_0 + \hat\beta_1 x_h = 10.357 - 4.622 \times 0.10 = 9.895. \]
The estimated variance of μ̂_h is
\[ \widehat{\mathrm{Var}}(\hat\mu_h) = \hat\sigma^2\left[\frac{1}{n} + \frac{(x_h - \bar x)^2}{\sum_{i=1}^n (x_i - \bar x)^2}\right] = 8.554\left[\frac{1}{58} + \frac{(0.10 - 0.521)^2}{2.331}\right] = 0.80. \]
The 95% CI for the mean number of species at X_h = 0.10 is
\[ \hat\mu_h \pm t_{n-2,\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\hat\mu_h)} = 9.895 \pm 2.003 \times \sqrt{0.80} = 9.895 \pm 1.789 = [8.105, 11.684]. \]
Interpretation:
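This CI for the mean response can be reproduced numerically. A short sketch using the slide's estimates:

```python
# Sketch: recompute the slide's 95% CI for the mean response at x_h = 0.10.
import math
from scipy.stats import t

n, xbar, sxx = 58, 0.5210, 2.3316
b0, b1, sigma2_hat = 10.357, -4.622, 8.554   # estimates from the slide
xh = 0.10

mu_hat = b0 + b1 * xh                                     # about 9.895
var_mu = sigma2_hat * (1.0 / n + (xh - xbar) ** 2 / sxx)  # about 0.80
half = t.ppf(0.975, n - 2) * math.sqrt(var_mu)            # about 1.79
lo, hi = mu_hat - half, mu_hat + half
```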

Example: Wetland Species Richness
[Figure: scatterplot of Number of species (y-axis) against Percent forest cover (x-axis, 0.2 to 0.8).]

1 Inference on the simple linear regression
2 Random Vectors/Matrices
3 Simple Linear Regression Model in Matrix Terms
4 Estimation of E(Yh)
5 Estimation vs. Prediction

Example: Wetland Species Richness
The fitted regression line is ŷ = 10.357 − 4.622x. The estimated error variance is σ̂² = 479.03/56 = 8.554.
Questions of interest:
1 What is the population mean number of species for a 10% forest cover around the wetland?
2 What is the number of species for a 10% forest cover around a wetland yet to be sampled?
In both cases, the estimated/predicted value is ŷ = 10.357 − 4.622 × 0.10 = 9.895.
Q: Which quantity has larger uncertainty?

Estimation vs. Prediction
Simple linear regression model
Yi =β0 +β1Xi +εi, εi ∼iidN(0,σ2), i =1,…,n.
mean response at X = 0.1: β_0 + β_1 × 0.1
"new" response at X = 0.1: β_0 + β_1 × 0.1 + ε
(sub-population vs. a single observation)

Estimation vs. Prediction
Consider a simple model (with covariate 0)
\[ Y_i = \mu + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2). \]
1 Estimate μ by \(\hat\mu = \bar Y\). What is Var(μ̂)?
\[ \mathrm{Var}(\bar Y) = \frac{\sigma^2}{n}. \]
2 Also, predict a new observation Y^{(new)} by
\[ \hat Y^{(new)} = \bar Y. \]
What is the variance of the prediction error?
\[ \mathrm{Var}(Y^{(new)} - \hat Y^{(new)}) = \mathrm{Var}(Y^{(new)}) + \mathrm{Var}(\bar Y) = \sigma^2 + \frac{\sigma^2}{n}. \]
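The two variances can be compared by simulation. An illustrative Monte Carlo sketch (μ, σ, and n below are assumed values): the variance of Ȳ should be near σ²/n, while the prediction error Y^(new) − Ȳ has variance near σ² + σ²/n.

```python
# Illustrative Monte Carlo sketch: estimation vs. prediction uncertainty
# in the mean-only model Y_i = mu + eps_i.
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, n, nsim = 3.0, 2.0, 10, 200000   # assumed values

samples = rng.normal(mu, sigma, size=(nsim, n))
ybar = samples.mean(axis=1)                 # estimator of the mean response
y_new = rng.normal(mu, sigma, size=nsim)    # new observations to predict
pred_err = y_new - ybar

var_est = ybar.var()        # near sigma^2 / n       = 0.4
var_pred = pred_err.var()   # near sigma^2 + sigma^2/n = 4.4
```

The prediction error carries the extra σ² from the new observation's own noise, which is why prediction intervals are always wider than confidence intervals for the mean.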
