Residual Analysis for two-way ANOVA

The two-way model with K replicates, including interaction, is
$Y_{ijk} = \mu_{ij} + \varepsilon_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$,
with $i = 1, \dots, I$, $j = 1, \dots, J$, $k = 1, \dots, K$.

In carrying out the F tests for interaction, and for the
main effects of factors A and B, we have assumed that
the errors $\varepsilon_{ijk}$ are a sample from $N(0, \sigma^2)$.
Among other things, this means that:

• the distribution of the errors (and in particular, the
variance $\sigma^2$) does not differ depending on the level
of factor A, the level of factor B, or the mean of
the response ($\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$)

• the errors are a sample from a normal distribution

If these assumptions hold, then the p-values for the
tests of interaction and main effects are valid. If the
assumptions do not hold, then the p-values may substantially
overstate or understate the evidence against the null
hypotheses.

Residuals are usually defined as the difference
"data minus prediction".

In the two-way ANOVA model with interaction, the predicted
value of $Y_{ijk}$ is $\hat{\mu}_{ij}$, and so the residuals are

$r_{ijk} = Y_{ijk} - \hat{\mu}_{ij} = Y_{ijk} - \bar{Y}_{ij\cdot}$

(Another way of writing the residual for the two-way model
with interaction is
$r_{ijk} = Y_{ijk} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j - \hat{\gamma}_{ij}$.)
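
As a small illustration of the two ways of writing the residual, the
following Python sketch uses a made-up balanced 2 x 2 layout with three
replicates per cell. The numbers and all variable names are hypothetical
and are not part of the example analysed later. It assumes the usual
sum-to-zero estimates for a balanced design, so that the estimated
effects add back up to the cell mean.

```python
from statistics import mean

# Hypothetical balanced 2 x 2 layout, K = 3 replicates per cell (made-up data).
data = {
    (1, 1): [4.0, 5.0, 6.0],
    (1, 2): [7.0, 9.0, 8.0],
    (2, 1): [2.0, 3.0, 1.0],
    (2, 2): [9.0, 12.0, 9.0],
}
levels_a = sorted({i for i, _ in data})
levels_b = sorted({j for _, j in data})

# The cell means estimate mu_ij.
cell_mean = {cell: mean(ys) for cell, ys in data.items()}

# In a balanced design the grand mean and the marginal means are
# simple averages of the cell means.
mu_hat = mean(cell_mean.values())
alpha_hat = {i: mean(cell_mean[i, j] for j in levels_b) - mu_hat for i in levels_a}
beta_hat = {j: mean(cell_mean[i, j] for i in levels_a) - mu_hat for j in levels_b}
gamma_hat = {
    (i, j): cell_mean[i, j] - mu_hat - alpha_hat[i] - beta_hat[j]
    for (i, j) in data
}

# Form 1: observation minus cell mean.
resid_1 = {cell: [y - cell_mean[cell] for y in ys] for cell, ys in data.items()}

# Form 2: observation minus the sum of the estimated effects.
resid_2 = {
    (i, j): [y - mu_hat - alpha_hat[i] - beta_hat[j] - gamma_hat[i, j] for y in ys]
    for (i, j), ys in data.items()
}

for cell in data:
    print(cell, resid_1[cell], resid_2[cell])
```

The two forms agree up to floating-point rounding, and the residuals
within each cell sum to zero.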

If the sample size is moderately large, the residuals should
be approximately equal to the errors $\varepsilon_{ijk}$, and so we use
the residuals (which are known to us) in place of the errors
$\varepsilon_{ijk}$ (which are unknown) to assess the plausibility of the
model assumptions.

The following plots are often useful in this regard:

1. A QQ plot of the residuals is used to assess the
assumption of normality of errors

2. To assess the assumption that the distribution of
the errors (in particular the variance of the distri-
bution) does not depend on the levels of either fac-
tor A or factor B, the residuals should be plotted
against:

(a) the levels of factor A

(b) the levels of factor B

(c) the fitted values $\bar{Y}_{ij\cdot}$

Example: A two-factor experiment was carried out
in which the survival times (in units of 10 hours) were
measured for groups of four animals (replicates) randomly
allocated to each combination of three poisons and four
treatments.

The data were as follows:

Poison  Treatment  Survival times (units of 10 hours)
I       A          0.31  0.45  0.46  0.43
II      A          0.36  0.29  0.40  0.23
III     A          0.22  0.21  0.18  0.23
I       B          0.82  1.10  0.88  0.72
II      B          0.92  0.61  0.49  1.24
III     B          0.30  0.37  0.38  0.29
I       C          0.43  0.45  0.63  0.76
II      C          0.44  0.35  0.31  0.40
III     C          0.23  0.25  0.24  0.22
I       D          0.45  0.71  0.66  0.62
II      D          0.56  1.02  0.71  0.38
III     D          0.30  0.36  0.31  0.33

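The data were entered into Minitab, with the survival times
in C1, the poison (coded 1 to 3) in C2 and the treatment
(coded 1 to 4) in C3, and a two-way ANOVA was carried out,
as follows: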

MTB > print c1
0.31 0.45 0.46 0.43
0.36 0.29 0.40 0.23
0.22 0.21 0.18 0.23
0.82 1.10 0.88 0.72
0.92 0.61 0.49 1.24
0.30 0.37 0.38 0.29
0.43 0.45 0.63 0.76
0.44 0.35 0.31 0.40
0.23 0.25 0.24 0.22
0.45 0.71 0.66 0.62
0.56 1.02 0.71 0.38
0.30 0.36 0.31 0.33
MTB > set c2
DATA> 4(1 1 1 1 2 2 2 2 3 3 3 3)
DATA> set c3
DATA> 12(1) 12(2) 12(3) 12(4)
MTB > twoway c1 c2 c3;
SUBC> residuals c4;
SUBC> fits c5.

Two-way ANOVA: C1 versus C2, C3

Source       DF       SS        MS      F      P
C2            2  1.03301  0.516506  23.22  0.000
C3            3  0.92121  0.307069  13.81  0.000
Interaction   6  0.25014  0.041690   1.87  0.112
Error        36  0.80073  0.022242
Total        47  3.00508

S = 0.1491   R-Sq = 73.35%   R-Sq(adj) = 65.21%

MTB > nscores c4 c6
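
The sums of squares and F ratios in the ANOVA table above can be checked
by hand. The following Python sketch applies the standard balanced
two-way formulas to the survival times using only the standard library.
The variable names are illustrative and this is not the Minitab
implementation. P-values are omitted because the standard library has
no F distribution.

```python
from statistics import mean

# Survival times (units of 10 hours), keyed by (poison, treatment).
data = {
    ("I", "A"): [0.31, 0.45, 0.46, 0.43],
    ("II", "A"): [0.36, 0.29, 0.40, 0.23],
    ("III", "A"): [0.22, 0.21, 0.18, 0.23],
    ("I", "B"): [0.82, 1.10, 0.88, 0.72],
    ("II", "B"): [0.92, 0.61, 0.49, 1.24],
    ("III", "B"): [0.30, 0.37, 0.38, 0.29],
    ("I", "C"): [0.43, 0.45, 0.63, 0.76],
    ("II", "C"): [0.44, 0.35, 0.31, 0.40],
    ("III", "C"): [0.23, 0.25, 0.24, 0.22],
    ("I", "D"): [0.45, 0.71, 0.66, 0.62],
    ("II", "D"): [0.56, 1.02, 0.71, 0.38],
    ("III", "D"): [0.30, 0.36, 0.31, 0.33],
}
poisons = ["I", "II", "III"]
treatments = ["A", "B", "C", "D"]
I, J, K = len(poisons), len(treatments), 4

# Grand mean, cell means and marginal means (balanced design).
grand = mean(y for ys in data.values() for y in ys)
cell = {c: mean(ys) for c, ys in data.items()}
pmean = {p: mean(cell[p, t] for t in treatments) for p in poisons}
tmean = {t: mean(cell[p, t] for p in poisons) for t in treatments}

# Sums of squares for the two main effects, the interaction and error.
ss_poison = J * K * sum((pmean[p] - grand) ** 2 for p in poisons)
ss_treat = I * K * sum((tmean[t] - grand) ** 2 for t in treatments)
ss_inter = K * sum(
    (cell[p, t] - pmean[p] - tmean[t] + grand) ** 2
    for p in poisons
    for t in treatments
)
ss_error = sum((y - cell[c]) ** 2 for c, ys in data.items() for y in ys)

# Mean squares and F ratios, each tested against the error mean square.
ms_error = ss_error / (I * J * (K - 1))
f_poison = ss_poison / (I - 1) / ms_error
f_treat = ss_treat / (J - 1) / ms_error
f_inter = ss_inter / ((I - 1) * (J - 1)) / ms_error

print(f"Poison       SS = {ss_poison:.5f}  F = {f_poison:.2f}")
print(f"Treatment    SS = {ss_treat:.5f}  F = {f_treat:.2f}")
print(f"Interaction  SS = {ss_inter:.5f}  F = {f_inter:.2f}")
print(f"Error        SS = {ss_error:.5f}  MS = {ms_error:.6f}")
```

Up to rounding, the printed values agree with the SS and F columns of
the Minitab output.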

The normal scores and residual plots are as follows:

Figure 1: Normal scores plot of residuals

Figure 2: Plot of residuals vs type of poison

Figure 3: Plot of residuals vs treatment

Figure 4: Plot of residuals vs fitted values
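
For readers without Minitab, the normal scores used in a plot like
Figure 1 can be approximated as follows. This is a sketch under an
assumption: it uses Blom's plotting positions, (rank - 3/8)/(n + 1/4),
which is believed to be the formula behind the nscores command but has
not been checked against Minitab here. Ties are not averaged, and the
function name and the sample residuals are made up for illustration.

```python
from statistics import NormalDist


def normal_scores(values):
    """Return approximate normal scores, in the original order of `values`.

    Assumes Blom's formula p = (rank - 3/8) / (n + 1/4); ties are ranked
    in order of appearance rather than averaged.
    """
    n = len(values)
    # Indices of the observations, sorted from smallest to largest value.
    order = sorted(range(n), key=lambda idx: values[idx])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p = (rank - 0.375) / (n + 0.25)
        scores[idx] = NormalDist().inv_cdf(p)
    return scores


# Hypothetical residuals, for illustration only.
residuals = [0.3, -0.1, 0.0, -0.4, 0.2]
scores = normal_scores(residuals)
for r, z in zip(residuals, scores):
    print(f"{r:6.2f}  {z:7.3f}")
```

Plotting the residuals against these scores gives the QQ plot. An
approximately straight line is consistent with normal errors.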

If the QQ plot shows evidence of non-normality, or if
the distribution of the residuals appears to depend on the
levels of one or both factors, then the inferences (e.g.
p-values) concerning the model parameters may be invalid.

In this case, the QQ plot provides some suggestion of
non-normality. The plots of residuals vs factor level suggest
that the variance of the residuals is not constant across
the levels of either factor. A definite pattern can be seen in
the plot of residuals vs fitted values, in which the variance
of the residuals increases as the fitted value increases.
This suggests that the variance of Y increases with
the mean of Y. Consequently, our conclusions regarding
the significance of main effects and interactions may be in
error because the model assumptions do not hold.
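
The pattern described above can also be seen numerically. The following
Python sketch computes the mean and standard deviation of the four
survival times in each poison-treatment cell, sorts the cells by mean,
and compares the average standard deviation in the six cells with the
lowest means against the six with the highest means. Splitting the
cells into halves is an ad hoc summary chosen for this illustration,
not a formal test.

```python
from statistics import mean, stdev

# The twelve cells of four survival times, in the order of the data table.
cells = [
    [0.31, 0.45, 0.46, 0.43], [0.36, 0.29, 0.40, 0.23], [0.22, 0.21, 0.18, 0.23],
    [0.82, 1.10, 0.88, 0.72], [0.92, 0.61, 0.49, 1.24], [0.30, 0.37, 0.38, 0.29],
    [0.43, 0.45, 0.63, 0.76], [0.44, 0.35, 0.31, 0.40], [0.23, 0.25, 0.24, 0.22],
    [0.45, 0.71, 0.66, 0.62], [0.56, 1.02, 0.71, 0.38], [0.30, 0.36, 0.31, 0.33],
]

# (cell mean, cell standard deviation), sorted by increasing mean.
summary = sorted((mean(ys), stdev(ys)) for ys in cells)
for m, s in summary:
    print(f"mean = {m:.4f}   sd = {s:.4f}")

# Ad hoc comparison: average SD in the low-mean half vs the high-mean half.
half = len(summary) // 2
low_sd = mean(s for _, s in summary[:half])
high_sd = mean(s for _, s in summary[half:])
print(f"average sd, low-mean cells:  {low_sd:.4f}")
print(f"average sd, high-mean cells: {high_sd:.4f}")
```

The cells with larger means have clearly larger standard deviations,
which is the same message as the plot of residuals vs fitted values.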

In such cases, one approach which is often taken is to
try to find a transformation of the dependent variable
to a form for which the model assumptions are better
satisfied. Transformations which are sometimes tried are
to replace Y by

$\sqrt{Y}$, $\log(Y)$, or $1/Y$.

There are some results from probability and statistical
theory which provide techniques to search for so-called
variance stabilizing transformations. These ideas are
studied in some higher-level statistics courses. After careful
examination of the pattern of residuals, we are led
to consider the reciprocal transformation, $Z_{ijk} = 1/Y_{ijk}$.
(In this case, where Y are measurements of time, then
$Z = 1/Y$ are described as rates, and have units of
1/time.)
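
One rough way to compare candidate transformations is to check how
unequal the within-cell spread is on each scale. The following Python
sketch computes, for each transformation, the ratio of the largest to
the smallest cell standard deviation. A ratio nearer 1 suggests more
nearly constant variance. This ratio is a crude diagnostic introduced
here for illustration. It is not the procedure used in these notes and
does not replace the residual plots.

```python
import math
from statistics import stdev

# The twelve cells of four survival times, in the order of the data table.
cells = [
    [0.31, 0.45, 0.46, 0.43], [0.36, 0.29, 0.40, 0.23], [0.22, 0.21, 0.18, 0.23],
    [0.82, 1.10, 0.88, 0.72], [0.92, 0.61, 0.49, 1.24], [0.30, 0.37, 0.38, 0.29],
    [0.43, 0.45, 0.63, 0.76], [0.44, 0.35, 0.31, 0.40], [0.23, 0.25, 0.24, 0.22],
    [0.45, 0.71, 0.66, 0.62], [0.56, 1.02, 0.71, 0.38], [0.30, 0.36, 0.31, 0.33],
]

# Candidate transformations of the response.
transforms = {
    "Y": lambda y: y,
    "sqrt(Y)": math.sqrt,
    "log(Y)": math.log,
    "1/Y": lambda y: 1.0 / y,
}


def sd_ratio(f):
    """Largest cell SD divided by smallest cell SD after applying f."""
    sds = [stdev([f(y) for y in ys]) for ys in cells]
    return max(sds) / min(sds)


ratios = {name: sd_ratio(f) for name, f in transforms.items()}
for name, ratio in ratios.items():
    print(f"{name:8s} max/min cell sd = {ratio:.1f}")
```

On this measure the spread is far more uniform on the reciprocal scale
than on the original scale.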

A two-way model was fitted to $Z_{ijk}$, leading to the
following output:

MTB > let c7=1/c1
MTB > twoway c7 c2 c3;
SUBC> residuals c8;
SUBC> fits c9.

Two-way ANOVA: C7 versus C2, C3

Source       DF       SS       MS      F      P
C2            2  34.8771  17.4386  72.63  0.000
C3            3  20.4143   6.8048  28.34  0.000
Interaction   6   1.5708   0.2618   1.09  0.387
Error        36   8.6431   0.2401
Total        47  65.5053

S = 0.4900   R-Sq = 86.81%   R-Sq(adj) = 82.77%

MTB > nscores c8 c10

The normal scores and residual plots for the transformed
data are as follows:

Figure 5: Plot of residuals vs poison – transformed data

Figure 6: Plot of residuals vs treatment – transformed data

Figure 7: Plot of residuals vs fitted values – transformed data

Figure 8: Normal scores plot of residuals – transformed data

These residual plots suggest few departures from the
model assumptions, and so we can be confident about the
validity of our conclusions for the transformed data.

Reference: Box, G. E. P. and Cox, D. R. (1964). An analysis of
transformations. J. Roy. Stat. Soc. B, 26, 211-252.

The data were part of a larger investigation to combat
the effects of toxic agents.