Properties of the least squares fit
1. the fitted line passes through (x̄, ȳ)
• see this by substituting xi = x̄
into the fitted line
ŷi = ȳ + β̂1(xi − x̄)
2. the mean of the fitted values is the
same as the mean of the observed
responses
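• the mean of the fitted values is
\[
\bar{\hat{y}} = \frac{1}{n}\sum_{i=1}^{n}\hat{y}_i
= \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{y} + \hat{\beta}_1(x_i - \bar{x})\bigr)
= \bar{y} + \frac{\hat{\beta}_1}{n}\sum_{i=1}^{n}(x_i - \bar{x})
= \bar{y}
\]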
3. the mean of the residuals is zero
• the residuals are êi = yi − ŷi
• so, using (2) above
\[
\bar{\hat{e}} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i) = \bar{y} - \bar{\hat{y}} = 0
\]
4. the residuals have zero correlation
with the predictor
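• we can show SSêX = 0
\[
SS_{\hat{e}X} = \sum_{i=1}^{n}(\hat{e}_i - \bar{\hat{e}})(x_i - \bar{x})
= \sum_{i=1}^{n}(y_i - \hat{y}_i)(x_i - \bar{x})
= \sum_{i=1}^{n}\bigl(y_i - \bar{y} - \hat{\beta}_1(x_i - \bar{x})\bigr)(x_i - \bar{x})
= SS_{XY} - \hat{\beta}_1 SS_{XX} = 0
\]
• the second step uses (3), the mean of the residuals is zero, and the last step uses β̂1 = SSXY/SSXX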
5. the residuals have zero correlation with the fitted values
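• we can show SSêŷ = 0
\[
SS_{\hat{e}\hat{y}} = \sum_{i=1}^{n}\hat{e}_i(\hat{y}_i - \bar{y})
= \sum_{i=1}^{n}(y_i - \hat{y}_i)\,\hat{\beta}_1(x_i - \bar{x})
= \sum_{i=1}^{n}\bigl(y_i - \bar{y} - \hat{\beta}_1(x_i - \bar{x})\bigr)\hat{\beta}_1(x_i - \bar{x})
= \hat{\beta}_1 SS_{XY} - \hat{\beta}_1^{2} SS_{XX}
= \hat{\beta}_1 SS_{XY} - \hat{\beta}_1 SS_{XY} = 0
\]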
Ozone example: the fitted values,
residuals, sums and crossproducts are
shown below
xi     yi     ŷi        êi = yi − ŷi    êixi
.02    242    247.563   -5.563          -.1113
.07    237    232.887    4.113           .28791
.11    231    221.146    9.854          1.0840
.15    201    209.404   -8.404         -1.2606
Sum    911    911        0               0
• the observed and fitted responses
have the same sum
• the residuals have zero sum
• the correlation between residuals and
predictors will be zero because the
sum of cross products is zero
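The properties above can be checked numerically. The following is a minimal sketch in plain Python, using only the four (x, y) pairs from the ozone table; the variable names are illustrative choices and are not part of the original example.

```python
# Ozone data copied from the table above.
x = [0.02, 0.07, 0.11, 0.15]
y = [242.0, 237.0, 231.0, 201.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross products.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares slope, then fitted values in the centred form
# y_hat_i = y_bar + b1 * (x_i - x_bar).
b1 = ss_xy / ss_xx
fitted = [y_bar + b1 * (xi - x_bar) for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Each quantity below should be 911 or 0 up to floating-point rounding.
sum_fitted = sum(fitted)                                    # property 2
sum_resid = sum(residuals)                                  # property 3
sum_resid_x = sum(e * xi for e, xi in zip(residuals, x))    # property 4
sum_resid_fit = sum(e * (f - y_bar)                         # property 5
                    for e, f in zip(residuals, fitted))

print([round(f, 3) for f in fitted])
print(sum_fitted, sum_resid, sum_resid_x, sum_resid_fit)
```

Because of floating-point arithmetic the zero sums may print as very small numbers rather than exactly 0.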
Plotting residuals to assess fit
• from (3) above, the residuals have
zero mean, and from (4) and (5)
they are uncorrelated with the
predictor x and the fitted values ŷ
• a scatterplot of the residuals versus x
should show random scatter about 0,
with no linear association with x
• the scatterplot of residuals versus
fitted values should be similar
• various problems can be revealed
from the plot of ê versus x or ŷ
– curvature indicates that the form
of the model is not correct
∗ this can be fixed by adding the term x² to the model or by transforming the response variable
– the magnitude of the residuals may increase or decrease with the predictor, sometimes called ‘fanning out’
∗ when we use least squares
and minimize SSE, we give
equal weight to all n
deviations
∗ this implicitly assumes that
the deviations are all roughly
the same size
∗ this problem can be fixed
using a weighted least
squares criterion (giving
smaller weight to the larger
deviations) or by
transformation
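A weighted least squares criterion can be sketched as follows. This is an illustrative sketch only: the function name, the data, and the choice of weights 1/x² (which assumes the size of the deviations grows roughly in proportion to x) are all assumptions made here, not part of the original notes.

```python
def weighted_least_squares(x, y, w):
    """Fit y = b0 + b1*x by minimising sum(w_i * (y_i - b0 - b1*x_i)**2)."""
    sw = sum(w)
    # Weighted means of x and y.
    x_bar_w = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar_w = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted analogues of SSXX and SSXY.
    s_xx = sum(wi * (xi - x_bar_w) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar_w) * (yi - y_bar_w)
               for wi, xi, yi in zip(w, x, y))
    b1 = s_xy / s_xx
    b0 = y_bar_w - b1 * x_bar_w
    return b0, b1


# Made-up data whose scatter grows with x (for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.3, 7.6, 11.0]

# Equal weights reproduce ordinary least squares.
ols_fit = weighted_least_squares(x, y, [1.0] * len(x))

# Smaller weight for points expected to have larger deviations.
wls_fit = weighted_least_squares(x, y, [1.0 / xi ** 2 for xi in x])

print(ols_fit, wls_fit)
```

With equal weights the function reduces to the usual formulas β̂1 = SSXY/SSXX and β̂0 = ȳ − β̂1x̄, so ordinary least squares is the special case in which every deviation counts equally.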
Example: Lumber example – useable
volume versus diameter at chest height
MTB > plot c3 c1
[Character plot: C3 (vertical axis, ticks at -25, 0 and 25) versus C1 (horizontal axis, 15.0 to 40.0)]
• there is clearly some curvature here
• one remedy is to add a quadratic
term in the equation, giving
\[
y = \beta_0 + \beta_1 x + \beta_2 x^2
\]
• MINITAB can fit this too
MTB > let c3 = c1**2
MTB > regress c2 2 c1 c3;
SUBC> residuals c4.
The regression equation is
volume = 29.7 - 5.62 diameter + 0.290 C3

Predictor      Coef     Stdev   t-ratio       p
Constant      29.74     51.39      0.58   0.570
diameter     -5.620     3.792     -1.48   0.157
C3          0.29037   0.06572      4.42   0.000

s = 14.27   R-sq = 97.8%   R-sq(adj) = 97.6%

Analysis of Variance
SOURCE       DF       SS      MS        F       p
Regression    2   156236   78118   383.54   0.000
Error        17     3463     204
Total        19   159698

SOURCE       DF   SEQ SS
diameter      1   152259
C3            1     3976
MTB > plot c4 c1
[Character plot: residuals C4 (vertical axis, ticks at -20, 0 and 20) versus diameter (horizontal axis, 15.0 to 40.0)]
• the new residual plot shows no
curvature
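The effect of adding a quadratic term can also be sketched outside MINITAB. The example below is a hedged illustration in plain Python: it solves the normal equations for y = β0 + β1x + β2x² directly, and it uses made-up data lying exactly on a quadratic (not the lumber data, which are not listed in these notes). The helper names `solve_linear_system` and `fit_quadratic` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta


def fit_quadratic(x, y):
    """Least squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up data lying exactly on y = 1 + 2x + 3x^2 (NOT the lumber data).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0 + 2.0 * xi + 3.0 * xi ** 2 for xi in x]

b0, b1, b2 = fit_quadratic(x, y)
quad_resid = [yi - (b0 + b1 * xi + b2 * xi * xi) for xi, yi in zip(x, y)]

# Straight-line fit to the same data, for comparison.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
line_resid = [yi - (y_bar + slope * (xi - x_bar)) for xi, yi in zip(x, y)]

print("line residuals:     ", [round(e, 3) for e in line_resid])
print("quadratic residuals:", [round(e, 3) for e in quad_resid])
```

On these data the straight-line residuals are positive at both ends and negative in the middle, the same curved pattern described above, while the quadratic residuals vanish up to rounding.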
Example: PCBs in lake trout
• consider the PCB concentration in
Cayuga Lake Trout, plotted against
the age of the fish
[Figure: scatterplot of PCBs (ppm) versus age; age axis from 2 to 12, PCB axis from 0 to 30]
• the fitted least squares line is
PCB = −1.45 + 1.56 age
• the residuals, however, show problems
[Figure: residuals (pcbreg$residuals) versus age (pcb.age); age axis from 2 to 12, residual axis from -10 to 15]
• the residuals are larger at larger ages
• there is some curvature in the plot
• the plot of log(PCB) versus age, with the least squares line, is shown
• the least squares fit is
log(PCB) = .03 + .259 age
[Figure: log_e(PCB) versus age with the least squares line; age axis from 2 to 12, log(PCB) axis from 0 to 3]
• the residual plot shows even spread
for all ages
[Figure: residuals from the log(PCB) fit versus age; age axis from 2 to 12, residual axis from -1.0 to 1.0]
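• the model says
\[
\mathrm{PCB} = e^{.03 + .259\,\mathrm{age}}
\]
• comparing model predictions at age and age + 1 gives
\[
\frac{\mathrm{PCB}_{\mathrm{age}+1}}{\mathrm{PCB}_{\mathrm{age}}}
= \frac{e^{.03 + .259(\mathrm{age}+1)}}{e^{.03 + .259\,\mathrm{age}}}
= e^{.259} \approx 1.3
\]
so
\[
\mathrm{PCB}_{\mathrm{age}+1} \approx 1.3\,\mathrm{PCB}_{\mathrm{age}}
\]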
• this is an example of exponential
growth
– in exponential growth, the total increases by a fixed percentage of the previous total in each period
– in linear growth, the total increases by a fixed amount in each period
– growth of bacteria and compound interest are both examples of exponential growth
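The constant growth factor can be seen by evaluating the back-transformed model at successive ages. This is a small sketch in plain Python using only the fitted coefficients .03 and .259 quoted above; the function and variable names are illustrative.

```python
import math

# Coefficients of the fitted line for log(PCB), taken from the text above.
INTERCEPT = 0.03
SLOPE = 0.259


def predicted_pcb(age):
    """Back-transformed prediction: PCB = exp(intercept + slope * age)."""
    return math.exp(INTERCEPT + SLOPE * age)


ages = range(1, 13)

# Ratio of successive predictions: the same for every age (exponential growth).
ratios = [predicted_pcb(a + 1) / predicted_pcb(a) for a in ages]

# Difference of successive predictions: grows with age, so not linear growth.
differences = [predicted_pcb(a + 1) - predicted_pcb(a) for a in ages]

growth_factor = math.exp(SLOPE)

for a, r, d in zip(ages, ratios, differences):
    print(a, round(r, 4), round(d, 3))
```

Every ratio equals e^.259, about a 30% increase per year of age, whereas the year-to-year differences keep growing.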