Statistical Inference STAT 431
Lecture 12: Simple Regression (II): Probabilistic Model and Basic Inferences
Review of Last Lecture
• Simple regression summarizes the relationship between a predictor and a response
• The LS regression line gives a linear equation for the relationship:
  \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x, \qquad \hat{\beta}_1 = r \cdot \frac{s_y}{s_x}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
• Transformation to new coordinates allows LS regression to capture nonlinear trends as well (Tukey's bulging rule); see the sketch below
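A brief R sketch of the transformation idea (the data frame d and its columns x and y are hypothetical placeholders; the log transform is just one option suggested by the bulging rule):

fit_raw <- lm(y ~ x, data = d)        # straight-line fit in the original coordinates
fit_log <- lm(y ~ log(x), data = d)   # refit after transforming x, per the bulging rule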
A Probabilistic Model for Simple Regression
• Model: Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, \ldots, n, where \beta_0 + \beta_1 x_i is the signal and \epsilon_i is the noise (simulated in the sketch below)
• The \epsilon_i values are noises (errors) satisfying the following assumptions:
  1. Independence: the \epsilon_i's are mutually independent random variables
  2. Homoscedasticity: the \epsilon_i's have common mean 0 and common variance \sigma^2
  3. Normality: the \epsilon_i's are normally distributed
• The model assumes that there exists a large or infinite (possibly hypothetical) population
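A minimal R sketch simulating data from this model under the three assumptions (the parameter values and sample size are assumptions chosen for illustration, loosely echoing the housing example later in the lecture):

set.seed(431)
n <- 46
x <- runif(n, 1000, 4000)                  # hypothetical predictor values (square footage)
beta0 <- -26; beta1 <- 0.36; sigma <- 180  # assumed "true" parameters for the sketch
eps <- rnorm(n, mean = 0, sd = sigma)      # independent, mean-0, constant-variance, normal errors
y <- beta0 + beta1 * x + eps               # signal plus noise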
Normality Assumption
• \text{Price}_i = \beta_0 + \beta_1 \times \text{Sqft}_i + \epsilon_i, \text{ where } \epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
[Figure: normal subpopulations of Price at each value of Sqft, centered on the regression line]
Parameter Estimation
• Least squares estimators for \beta_0 and \beta_1 (what are their units?):
  \hat{\beta}_1 = r \cdot \frac{s_y}{s_x} = \frac{\sum_{i=1}^n (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{x}
• Terminology
  – Fitted values: \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \quad i = 1, \ldots, n
  – Residuals: E_i = Y_i - \hat{Y}_i, \quad i = 1, \ldots, n
  – Error sum of squares (SSE): SSE = \sum_{i=1}^n E_i^2
• Estimator for \sigma^2:
  S^2 = \frac{\sum_{i=1}^n E_i^2}{n-2} = \frac{SSE}{n-2}
  – S is called the root mean square error (RMSE), or residual standard error (see the hand computation below)
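These quantities computed by hand in R (a sketch reusing the simulated x and y from above; any paired numeric vectors work):

b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope estimate
b0 <- mean(y) - b1 * mean(x)                                     # intercept estimate
e  <- y - (b0 + b1 * x)                                          # residuals E_i
SSE <- sum(e^2)                                                  # error sum of squares
s2 <- SSE / (length(y) - 2)                                      # estimator of sigma^2
s  <- sqrt(s2)                                                   # RMSE / residual standard error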
Sampling Distributions
• Sampling distributions of \hat{\beta}_0 and \hat{\beta}_1 [derivation in class]
  – Define S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2
  – Mean and SD of \hat{\beta}_0 and \hat{\beta}_1 (checked by simulation below):
    E(\hat{\beta}_1) = \beta_1, \qquad SD(\hat{\beta}_1) = \frac{\sigma}{\sqrt{S_{xx}}}
    E(\hat{\beta}_0) = \beta_0, \qquad SD(\hat{\beta}_0) = \sigma \sqrt{\frac{\sum x_i^2}{n S_{xx}}}
  – Normality:
    \frac{\hat{\beta}_0 - \beta_0}{SD(\hat{\beta}_0)} \sim N(0, 1), \qquad \frac{\hat{\beta}_1 - \beta_1}{SD(\hat{\beta}_1)} \sim N(0, 1)
• Sampling distribution of S^2:
  \frac{(n-2)S^2}{\sigma^2} = \frac{SSE}{\sigma^2} \sim \chi^2_{n-2}
  – Important fact: S^2 is independent of both \hat{\beta}_0 and \hat{\beta}_1
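A quick Monte Carlo check of SD(\hat{\beta}_1) = \sigma / \sqrt{S_{xx}} (a sketch reusing x, beta0, beta1, and sigma from the simulation above):

Sxx <- sum((x - mean(x))^2)
b1_reps <- replicate(5000, {
  y_rep <- beta0 + beta1 * x + rnorm(length(x), 0, sigma)  # fresh noise, same x's
  sum((x - mean(x)) * (y_rep - mean(y_rep))) / Sxx         # slope estimate for this replicate
})
sd(b1_reps)        # empirical SD of the slope estimates
sigma / sqrt(Sxx)  # theoretical SD; the two should be close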
Inferences for Regression Coefficients
• Typically, we do not know \sigma, so SD(\hat{\beta}_0) and SD(\hat{\beta}_1) are estimated by
  SE(\hat{\beta}_0) = S \sqrt{\frac{\sum x_i^2}{n S_{xx}}}, \qquad SE(\hat{\beta}_1) = \frac{S}{\sqrt{S_{xx}}}
• Since S^2 is independent of both \hat{\beta}_0 and \hat{\beta}_1, we obtain the pivotal r.v.'s
  \frac{\hat{\beta}_0 - \beta_0}{SE(\hat{\beta}_0)} \sim t_{n-2}, \qquad \frac{\hat{\beta}_1 - \beta_1}{SE(\hat{\beta}_1)} \sim t_{n-2}
• Based on these pivotal r.v.'s, we can use the t distribution to construct testing procedures and CIs for \beta_0 and \beta_1
• 100(1 - \alpha)\% CIs for \beta_0 and \beta_1:
  \hat{\beta}_0 \pm t_{n-2, \alpha/2} \, SE(\hat{\beta}_0), \qquad \hat{\beta}_1 \pm t_{n-2, \alpha/2} \, SE(\hat{\beta}_1)
• Testing H_0: \beta_1 = \beta_{10} vs. H_1: \beta_1 \neq \beta_{10} at level \alpha:
  Reject H_0 if |t| = \frac{|\hat{\beta}_1 - \beta_{10}|}{SE(\hat{\beta}_1)} > t_{n-2, \alpha/2}
• Case of particular interest: H_0: \beta_1 = 0 vs. H_1: \beta_1 \neq 0
  – Test statistic: t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}
  – Reject H_0 if |t| > t_{n-2, \alpha/2} (computed by hand in the sketch below)
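The interval and test for the slope carried out by hand in R (a sketch reusing b1, s, y, and Sxx from the earlier computations; \alpha = 0.05):

se_b1 <- s / sqrt(Sxx)                     # SE of the slope
tcrit <- qt(0.975, df = length(y) - 2)     # t_{n-2, alpha/2} for alpha = 0.05
c(b1 - tcrit * se_b1, b1 + tcrit * se_b1)  # 95% CI for beta1
tstat <- b1 / se_b1                        # test statistic for H0: beta1 = 0
2 * pt(-abs(tstat), df = length(y) - 2)    # two-sided p-value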
Linear Regression in R
> regmodel <- lm(Price/1000 ~ Sqft., data = newton)
> summary(regmodel)
Call:
lm(formula = Price/1000 ~ Sqft., data = newton)
Residuals:
Min 1Q Median 3Q Max
-445.09 -125.97 36.45 107.27 281.39
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -25.96758 54.77713 -0.474 0.638
Sqft. 0.35607 0.02152 16.549 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 180.7 on 44 degrees of freedom
Multiple R-squared: 0.8616, Adjusted R-squared: 0.8584
F-statistic: 273.9 on 1 and 44 DF, p-value: < 2.2e-16
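The coefficient CIs from the previous slide can be obtained directly from the fitted model with confint(), a standard R function (usage note; output not shown on the slide):

> confint(regmodel, level = 0.95)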
Analysis of Variance for Simple Regression
• Three sums of squares
  – Total sum of squares (SST): SST = \sum_{i=1}^n (y_i - \bar{y})^2
    • Measures the variation of the y_i's around \bar{y}
  – Regression sum of squares (SSR): SSR = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2
    • Represents the variation in the responses that can be explained by the predictor
  – Error sum of squares (SSE): SSE = \sum_{i=1}^n (y_i - \hat{y}_i)^2
• Coefficient of determination: r^2 (R^2) = SSR/SST (= squared correlation coefficient)
• Pythagorean theorem: SST = SSR + SSE (geometric representation?)
  – Degrees of freedom: n - 1 (SST) = 1 (SSR) + (n - 2) (SSE) (why?)
• Fact: When \beta_1 = 0, we have F = \frac{SSR/1}{SSE/(n-2)} \sim F_{1, n-2} (verified numerically below)
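A numerical check of the decomposition and the F statistic on the fitted model (a sketch; assumes the newton data and regmodel from the R slide are in the workspace, with the response column named Price):

yobs <- newton$Price / 1000             # response as modeled (price in $1000s)
yhat <- fitted(regmodel)                # fitted values
SST <- sum((yobs - mean(yobs))^2)
SSR <- sum((yhat - mean(yobs))^2)
SSE <- sum((yobs - yhat)^2)
c(SST, SSR + SSE)                       # the two entries should agree
(SSR / 1) / (SSE / (length(yobs) - 2))  # F statistic; compare with summary(regmodel)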
• Key points of this class
  – Modeling assumptions in simple regression
  – Terminology for regression analysis
  – Estimators of the three parameters
    • Sampling distributions
    • Independence
  – Pivotal random variables for inference on the intercept and the slope
  – Three sums of squares
• Reading: parts of Sections 10.1–10.3 of the textbook
• Next class: Simple Regression (III) (parts of Section 10.3)