Multiple Linear Regression
• is used to relate a continuous response
(or dependent) variable Y to several
explanatory (or independent, or
predictor) variables X1, X2, . . . , Xk
• assumes a linear relationship between
mean of Y and the X’s with additive
normal errors
• Xij is the value of independent variable
j for subject i.
• Yi is the value of the dependent variable
for subject i, i = 1, 2, . . . , n.
• Statistical model
Yi = β0 + β1Xi1 + β2Xi2 + . . . + βkXik + εi
where the additive errors εi are assumed
to be a random sample from N(0, σ²)
• the mean of Y at X1, . . . , Xk is
β0 + β1X1 + β2X2 + . . . + βkXk
• as before β0 is the intercept, the value of
the mean when all other predictors are
zero
• βj , j = 1, . . . , k, is the partial slope for
predictor Xj , giving the change in the
mean for a unit change in Xj when all
other predictors are held fixed
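As a concrete sketch of these definitions (using NumPy rather than Minitab, with made-up data values for illustration), the intercept and partial slopes can be estimated by least squares:

```python
import numpy as np

# Toy data: n = 6 subjects, k = 2 predictors (values invented for illustration)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([0.5, 1.5, 1.0, 2.0, 2.5, 3.0])
Y  = np.array([2.1, 3.9, 4.2, 6.1, 7.0, 8.2])

# Design matrix: a leading column of 1s for the intercept β0,
# then one column per predictor
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares estimates of β0, β1, β2
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta)  # [b0, b1, b2]
```

Here b1 estimates the change in the mean of Y per unit change in X1 with X2 held fixed, and similarly for b2.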
Types of (Linear) Regression Models
• there are many possible model forms
• choosing the best one is a complicated
process
• the predictors can be continuous
variables, or counts, or indicators
• indicator or “dummy” variables take the
values 0 or 1 and are used to include
binary variables, such as gender, in the
model
• some examples are shown below
Curve
• Conc = β0 + β1t + β2t²

[Figure: fitted quadratic curve Conc = 0 + 0.00460*t − 0.00004*t^2, Conc vs. time for t from 0 to 60]
One continuous, one binary predictor
Two parallel lines
• Conc = β0 + β1time + β2X, where X =
0 for Males, 1 for Females
[Figure: two parallel lines, Conc vs. Time for t from 0 to 60; Conc = .01 + .0015*t + .2*X]
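A quick numerical check of the parallel-lines model, using the coefficients shown on the plot (a NumPy sketch, not the notes' Minitab output):

```python
import numpy as np

# Parallel-lines model from the notes: Conc = .01 + .0015*t + .2*X
# X = 0 for Males, 1 for Females; same slope, intercepts differ by b2 = .2
def conc(t, X):
    return 0.01 + 0.0015 * t + 0.2 * X

t = np.arange(0, 61, 10)
males = conc(t, 0)
females = conc(t, 1)
# With no interaction term, the Male/Female gap is .2 at every time point
print(females - males)
```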
Two nonparallel lines
• Conc = β0 + β1time + β2X + β3time*X,
where X = 0 for Males, 1 for Females

[Figure: two nonparallel lines, Conc vs. Time for t from 0 to 60; Conc = .01 + .0015*t + .12*X + .0030*t*X]
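The interaction term is what makes the lines nonparallel: the gap between the groups now depends on time. A NumPy sketch with the plotted coefficients:

```python
import numpy as np

# Nonparallel-lines model from the notes:
# Conc = .01 + .0015*t + .12*X + .0030*t*X, X = 0 Males, 1 Females
def conc(t, X):
    return 0.01 + 0.0015 * t + 0.12 * X + 0.0030 * t * X

t = np.arange(0, 61, 10)
gap = conc(t, 1) - conc(t, 0)
# The Male/Female gap grows with time: b2 + b3*t = .12 + .0030*t
print(gap)
```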
Two continuous predictors
First order
• Conc = β0 + β1time + β2Dose
• effect of dose constant over time
[Figure: two parallel lines for Dose = .01 and Dose = .10, Conc vs. Time for t from 0 to 60; Conc = .01 + .0015*t + 20*dose]
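In the first-order model the dose effect does not depend on time: the two dose curves are a constant distance apart. A sketch with the plotted coefficients:

```python
import numpy as np

# First-order model from the notes: Conc = .01 + .0015*t + 20*dose
# No interaction term, so the dose effect is the same at every time
def conc(t, dose):
    return 0.01 + 0.0015 * t + 20.0 * dose

t = np.arange(0, 61, 10)
gap = conc(t, 0.10) - conc(t, 0.01)
# gap is 20*(0.10 - 0.01) = 1.8 at every time point
print(gap)
```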
Interaction
• Conc = β0 + β1time + β2Dose + β3time*dose
• effect of dose changes with time

[Figure: two nonparallel lines for D = .01 and D = .1, Conc vs. Time for t from 0 to 60; Conc = .09 + .0015*t + 1.1*dose + .0185*t*dose]
Estimation and ANOVA
• The regression parameters are estimated
using least squares.
• That is, we choose β0, β1, . . . , βk to
minimize
SSE = Σ_{i=1}^{n} (yi − β0 − β1xi1 − . . . − βkxik)²
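The least-squares solution can be checked numerically: at the fitted coefficients, any perturbation increases SSE. A NumPy sketch on simulated data (not from the notes):

```python
import numpy as np

# Simulate data from a known model, then verify that the
# least-squares fit minimizes the sum of squared errors
rng = np.random.default_rng(0)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = lambda b: np.sum((y - X @ b) ** 2)

# Random perturbations of beta_hat never decrease SSE
assert all(sse(beta_hat + d) >= sse(beta_hat)
           for d in rng.normal(scale=0.1, size=(100, k + 1)))
```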
• Minitab can fit multiple regression
models easily
• we will soon learn a formula for these
estimates using matrices
• the error variance is estimated as before:
s² = SSE / (n − k − 1) = MSE
• The ANOVA table is similar to that for
simple linear regression, with the
degrees of freedom changed to match
the number of predictor variables.
Source d.f. SS MS
Regression k SSR MSR=SSR/k
Residual n-k-1 SSE MSE=SSE/(n-k-1)
Total n-1 SST
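The table's entries can be computed directly from a fit. A NumPy sketch on simulated data (values are illustrative, not from the notes):

```python
import numpy as np

# Simulate a fit with k = 3 predictors and build the ANOVA quantities
rng = np.random.default_rng(1)
n, k = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([2.0, 1.0, 0.5, -1.0]) + rng.normal(scale=0.4, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

sst = np.sum((y - y.mean()) ** 2)       # Total,      df = n - 1
sse = np.sum((y - fitted) ** 2)         # Residual,   df = n - k - 1
ssr = np.sum((fitted - y.mean()) ** 2)  # Regression, df = k

msr = ssr / k
mse = sse / (n - k - 1)                 # this is s^2
print(f"SSR={ssr:.3f}  SSE={sse:.3f}  SST={sst:.3f}  MSE={mse:.4f}")
```

Because the model includes an intercept, the sums of squares satisfy SSR + SSE = SST.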
• later we will see that SSR can be
partitioned into a part explained by one
set of predictors, SSR(X1), and the
remainder, SSR(X2|X1), explained by
the rest of the variables
• the coefficient of determination R² is
R² = SSR / SST
as before, and is the fraction of the total
variability in y accounted for by the
regression
• it ranges between 0 and 1
• R² = 1.00 indicates a perfect (linear) fit
• R² = 0.00 indicates a complete lack of linear fit.
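Computing R² = SSR/SST for a fitted model, as a NumPy sketch on simulated data (the data and seed are illustrative assumptions):

```python
import numpy as np

# Simulate a strongly linear relationship and compute R^2 = SSR/SST
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=40)
y = 1.0 + 0.5 * x + rng.normal(scale=0.2, size=40)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((fitted - y.mean()) ** 2)
r2 = ssr / sst
print(f"R^2 = {r2:.3f}")
```

With the small noise used here, R² comes out close to 1, matching the interpretation above.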