One-factor ANOVA

Do three lizard populations have the same mean length?
[Figure: Island 1, Island 2, Island 3]

H0: The mean length is the same for all three islands: $\mu_1 = \mu_2 = \mu_3$.
HA: The mean length is not the same for all three islands; at least one mean differs.

The plot suggests a difference. Let's test it.
[Figure: Length (Y) for each island]

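Calculate mean squares and take their ratio to get the F statistic
$MS_{\text{Groups}} = \frac{SS_{\text{Groups}}}{\nu_{\text{Groups}}} = \frac{39.45}{2} = 19.72$
$MS_{\text{Residual}} = \frac{SS_{\text{Residual}}}{\nu_{\text{Residual}}} = \frac{22.90}{9} = 2.55$
$F = \frac{MS_{\text{Groups}}}{MS_{\text{Residual}}} = \frac{19.72}{2.55} = 7.75$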
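The arithmetic on this slide can be reproduced in a few lines. The following is a minimal Python sketch that uses only the sums of squares and degrees of freedom quoted above; `f_statistic` is a hypothetical helper name, not a library function. Because the inputs are already rounded to two decimals, the printed mean squares may differ from the slide in the last digit.

```python
def f_statistic(ss_groups, df_groups, ss_residual, df_residual):
    """Return (MS_groups, MS_residual, F) for a one-factor ANOVA."""
    ms_groups = ss_groups / df_groups          # mean square among groups
    ms_residual = ss_residual / df_residual    # mean square within groups
    return ms_groups, ms_residual, ms_groups / ms_residual


# Sums of squares and degrees of freedom from the lizard example
ms_groups, ms_residual, f_value = f_statistic(39.45, 2, 22.90, 9)
print(f"MS_groups = {ms_groups:.2f}")
print(f"MS_residual = {ms_residual:.2f}")
print(f"F = {f_value:.2f}")
```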

ANOVA table for lizard length example
[Figure: F distribution for νGroups = 2, νResidual = 9, plotted as probability density against the value of the F statistic, with the observed value marked]
Therefore, at α = 0.05, we reject the null hypothesis that all three island means are equal (ANOVA: $F_{2,9} = 7.75$, P = 0.01).
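The P-value quoted here is the upper-tail area of the F distribution beyond the observed value. As a rough check, the sketch below uses a closed form that holds only when the numerator degrees of freedom equal 2, which happens to be the case for three groups; `f_upper_tail_df2` is a hypothetical helper name. For other degrees of freedom a library routine would be needed (for example `scipy.stats.f.sf`, assuming SciPy is available).

```python
def f_upper_tail_df2(f_value, df_residual):
    """P(F >= f_value) for an F distribution with 2 numerator df.

    This closed form is valid ONLY when the numerator df is exactly 2.
    """
    return (1.0 + 2.0 * f_value / df_residual) ** (-df_residual / 2.0)


alpha = 0.05
p_value = f_upper_tail_df2(7.75, 9)   # observed F and residual df from the slide
print(f"P = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```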

Question 1
A researcher performs an F test for an ANOVA and calculates a statistic of 0.02 with 2 and 9 degrees of freedom. The probability that the statistic will be this small or smaller is 0.02. Should the researcher reject the null hypothesis?
A. Yes B. No
[Figure: F distribution (probability density against value of F statistic) with the observed value marked]

H0 is rejected only for large values of the F statistic.

In this example, the P-value is 0.98! The P-value is the probability of an F statistic this large or larger, which is 1 − 0.02 = 0.98, so the null hypothesis is not rejected (answer B).
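The distinction between the two tails in Question 1 can be made concrete. This is a small Python sketch under the same assumption of exactly 2 numerator degrees of freedom; the helper name `f_upper_tail_df2` is hypothetical.

```python
def f_upper_tail_df2(f_value, df_residual):
    """P(F >= f_value); closed form valid ONLY for 2 numerator df."""
    return (1.0 + 2.0 * f_value / df_residual) ** (-df_residual / 2.0)


upper = f_upper_tail_df2(0.02, 9)   # area to the right: the ANOVA P-value
lower = 1.0 - upper                 # area to the left: "this small or smaller"
print(f"lower tail = {lower:.2f}, P-value = {upper:.2f}")
```

A very small F statistic therefore gives a very large P-value, not a small one.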

Fill in the ANOVA table
Circadian rhythm example
$\bar{Y} = -0.43$

ANOVA table for circadian rhythm example
• Under H0, the probability that F ≥ 8.59 is 0.0007.
• At α = 0.05, we reject the null hypothesis that all four treatment means are equal (ANOVA: $F_{3,20} = 8.59$, P = 0.0007).

Goodness of fit for ANOVA

The total sum of squares measures variation due to all causes
$SS_{\text{Total}} = SS_{\text{Groups}} + SS_{\text{Residual}}$
$\nu_{\text{Total}} = \nu_{\text{Groups}} + \nu_{\text{Residual}}$
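The two identities can be verified numerically. The sketch below computes each sum of squares directly from its definition and checks that the pieces add up; the data are hypothetical lengths invented for illustration (not the lecture data), and `sums_of_squares` is a hypothetical helper name.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (SS_total, SS_groups, SS_residual) for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_residual = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return ss_total, ss_groups, ss_residual


# Hypothetical lengths for three islands (illustration only)
groups = [[4.0, 5.0, 6.0, 5.0], [6.0, 7.0, 8.0, 7.0], [8.0, 9.0, 10.0, 9.0]]
ss_total, ss_groups, ss_residual = sums_of_squares(groups)

n_total = sum(len(g) for g in groups)
df_groups = len(groups) - 1          # number of groups minus 1
df_residual = n_total - len(groups)  # observations minus number of groups
df_total = n_total - 1

print(ss_total, ss_groups + ss_residual)   # the two values should agree
print(df_total, df_groups + df_residual)   # the two values should agree
```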

How much variability is explained by differences between group means?
The coefficient of determination ($R^2$) provides a simple measure:
$R^2 = \frac{SS_{\text{Groups}}}{SS_{\text{Total}}}$
Derived from $SS_{\text{Total}} = SS_{\text{Groups}} + SS_{\text{Residual}}$

For the circadian rhythm example, $R^2 = 6.61 / 11.74 = 0.56$.
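The numbers quoted for the circadian rhythm example are enough to fill in the rest of its ANOVA table. The sketch below assumes that 6.61 is the groups sum of squares and 11.74 the total sum of squares (as the R2 ratio implies), and takes the degrees of freedom 3 and 20 from the reported F statistic. Variable names are illustrative.

```python
ss_groups, ss_total = 6.61, 11.74   # from the R2 ratio on the slide
df_groups, df_residual = 3, 20      # from the reported F(3, 20)

ss_residual = ss_total - ss_groups
ms_groups = ss_groups / df_groups
ms_residual = ss_residual / df_residual
f_value = ms_groups / ms_residual
r_squared = ss_groups / ss_total

print(f"{'Source':<10}{'SS':>8}{'df':>5}{'MS':>8}{'F':>8}")
print(f"{'Groups':<10}{ss_groups:>8.2f}{df_groups:>5d}{ms_groups:>8.2f}{f_value:>8.2f}")
print(f"{'Residual':<10}{ss_residual:>8.2f}{df_residual:>5d}{ms_residual:>8.2f}")
print(f"{'Total':<10}{ss_total:>8.2f}{df_groups + df_residual:>5d}")
print(f"R^2 = {r_squared:.2f}")
```

The recomputed F statistic and R2 agree with the values reported on the slide (8.59 and 0.56).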

Question 2
Which has a higher $R^2$?
[Figure: two panels, A and B, each plotting Length (cm) against Island (1, 2, 3)]

Similar total spread of points in the two panels implies similar SSTotal.
A larger difference between island averages in A implies a higher SSGroups.
A smaller spread of points within each island in A implies a lower SSResidual.
So A has the higher $R^2$.
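The reasoning for Question 2 can be tried out on made-up numbers. In the sketch below, `panel_a` and `panel_b` are hypothetical data sets invented to mimic the two situations (both span lengths from 2 to 10, but only A has well-separated island averages); they are not the values plotted in the lecture.

```python
from statistics import mean


def r_squared(groups):
    """R^2 = SS_groups / SS_total for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_groups / ss_total


# Hypothetical lengths (cm) on islands 1, 2, 3
panel_a = [[2.0, 3.0, 4.0], [5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]   # tight groups, distinct means
panel_b = [[2.0, 5.0, 9.0], [3.0, 6.0, 10.0], [2.0, 7.0, 10.0]]  # wide groups, similar means

r2_a = r_squared(panel_a)
r2_b = r_squared(panel_b)
print(f"R^2 for A = {r2_a:.2f}, R^2 for B = {r2_b:.2f}")
```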

Assumptions of analysis of variance

General linear models
Model: A mathematical representation of the relation between a dependent variable and one or more independent variables

Linear regression is one kind of general linear model
$Y_i = \alpha + \beta X_i + \varepsilon_i$
• $Y_i$: the ith observed value of Y
• $\alpha + \beta X_i$: expected value of Y based on explanatory variable X
• $\varepsilon_i$: random deviation from expected value due to all other causes
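To make the three parts of the model concrete, the sketch below fits a line by ordinary least squares and splits each observation into its expected value and its deviation. The data values are hypothetical, chosen only for illustration, and `fit_line` is a hypothetical helper. The fitted intercept and slope are estimates of α and β, not the true parameters.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares estimates (alpha, beta) for Y = alpha + beta * X."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data (illustration only)
x_values = [70.0, 75.0, 80.0, 85.0, 90.0]
y_values = [60.2, 61.1, 61.9, 63.0, 63.8]

alpha, beta = fit_line(x_values, y_values)
expected = [alpha + beta * xi for xi in x_values]               # alpha + beta * X_i
deviations = [yi - ei for yi, ei in zip(y_values, expected)]    # estimated epsilon_i

print(f"alpha = {alpha:.2f}, beta = {beta:.3f}")
print([round(d, 2) for d in deviations])
```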

Graphic display of linear regression
$Y_i = \alpha + \beta X_i + \varepsilon_i$
[Figure: Chirp rate (Y) plotted against Temperature (X)]

ANOVA is also a kind of general linear model
$Y_{ij} = \mu_i + \varepsilon_{ij}$
• $Y_{ij}$: length of the jth lizard on island i
• $\mu_i$: mean length of lizards on island i
• $\varepsilon_{ij}$: random deviation from island mean due to all other causes

ANOVA as a general linear model
Each length is the sum of the island mean plus a random deviation.
[Figure: Length (Y) for Island 1, 2, 3]

The key assumptions of all linear models are about the distribution of the random 𝜀 terms.
$Y_i = \alpha + \beta X_i + \varepsilon_i$
$Y_{ij} = \mu_i + \varepsilon_{ij}$

All of the random terms ε are assumed to be drawn from a normal distribution with mean zero and variance $\sigma^2_{\text{Residual}}$.
[Figure: normal distribution of ε centered at 0 with standard deviation σResidual]

Strategy for testing assumption of normality and equal variance
1. Estimate the values of ε.
2. Compare the variance of ε across different groups.
3. Test for normality of ε.

The values of ε can be estimated as the residuals of the ANOVA
Residuals are the deviations of each observation from its group average.
$\text{Residual}_{ij} = Y_{ij} - \bar{Y}_i$
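The residual calculation is a subtraction of each group's average from its own observations. A minimal Python sketch follows; the lengths are hypothetical values invented for illustration (not the lecture data), and `residuals_by_group` is a hypothetical helper name.

```python
from statistics import mean


def residuals_by_group(groups):
    """Map each group label to its residuals Y_ij - (group average)."""
    result = {}
    for label, values in groups.items():
        group_mean = mean(values)
        result[label] = [y - group_mean for y in values]
    return result


# Hypothetical lizard lengths (illustration only)
lengths = {
    "Island 1": [4.0, 5.0, 6.0],
    "Island 2": [6.0, 8.0, 7.0],
    "Island 3": [9.0, 9.5, 10.0],
}
residuals = residuals_by_group(lengths)

# Residual for the third lizard from island 1 in this made-up data set
third_lizard_island_1 = residuals["Island 1"][2]
print(third_lizard_island_1)
```

By construction, the residuals within each group sum to zero.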

Question 3
What is the residual for the third lizard from island 1?
$\text{Residual}_{ij} = Y_{ij} - \bar{Y}_i$

Calculating residuals
[Table: Island | Length ($Y_{ij}$) | Island average ($\bar{Y}_i$) | Residual ($Y_{ij} - \bar{Y}_i$)]

Carry out Levene's test on the residuals
• Question: Do groups 1 through k all have the same variance?
• Null and alternative hypotheses:
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \dots = \sigma_k^2$
$H_A$: At least one variance is different from the others.
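One way to see what Levene's test does: it takes the absolute value of each residual and runs a one-factor ANOVA on those absolute values, so groups with larger spread get larger averages. The sketch below computes the test statistic W in that way, using deviations from group means (the original form of the test). The data are hypothetical and `levene_w` is a hypothetical helper name. In practice a library routine would normally be used, for example `scipy.stats.levene`, which centers on the median by default.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W: the one-factor ANOVA F statistic computed on |Y_ij - mean_i|."""
    abs_dev = [[abs(y - mean(g)) for y in g] for g in groups]
    k = len(abs_dev)                           # number of groups
    n_total = sum(len(z) for z in abs_dev)     # total number of observations
    grand = mean([v for z in abs_dev for v in z])
    ss_groups = sum(len(z) * (mean(z) - grand) ** 2 for z in abs_dev)
    ss_residual = sum((v - mean(z)) ** 2 for z in abs_dev for v in z)
    return (ss_groups / (k - 1)) / (ss_residual / (n_total - k))


# Hypothetical data: spread increases from the first group to the third
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 5.0, 9.0]]
w = levene_w(groups)
print(f"W = {w:.3f}")   # compare with an F distribution on k - 1 and N - k df
```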

Test for normality by making a Normal Probability Plot of the residuals
[Figure: normal probability plot. Axes: Observed value of residuals; Theoretical quantiles (if residuals are normal)]
Perfectly normal data lie on a straight line.
Note: You can also test the residuals with a Shapiro-Wilk test.
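The points of a normal probability plot can be computed without any plotting library. The sketch below pairs each sorted residual with the standard-normal quantile expected at its rank. The plotting positions (i − 0.5)/n are one common convention and are an assumption here; the residuals are hypothetical, and `normal_plot_points` is a hypothetical helper name.

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: an assumed convention, others exist
    quantiles = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


# Hypothetical residuals (illustration only)
residuals = [-1.2, 0.4, -0.3, 1.1, 0.0]
points = normal_plot_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:6.2f} {observed:6.2f}")
```

If the residuals are close to normal, these pairs fall near a straight line when plotted.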

What to do when data violate assumptions?
Three possibilities
1. Rely on robustness of ANOVA.
2. Transform the data.
3. Use a nonparametric test instead.
