---
title: "Week 9 Lab, MAST20005/MAST90058"
author: "School of Mathematics and Statistics, University of Melbourne"
date: "Semester 2, 2022"
institute: "University of Melbourne"
fontsize: 11pt
classoption: a4paper
output:
  html_document:
    toc_float:
      collapsed: false
---

**Goals:** (i) Analysis of variance; (ii) Study the distribution of the test
statistics and p-values through simulations; (iii) Compare tests by simulating
power curves.

**Data for the [Analysis of variance (ANOVA)] section:** Corn data
(`corn.txt`). The yields obtained, in bushels per acre, from 4 test plots for
each of 4 different varieties of corn. The data file can be obtained from the
shared folder in the computer labs, or from the LMS.

```{r echo=FALSE}
set.seed(1409)
```

# Analysis of variance (ANOVA)

Consider the corn dataset. Let $\mu_i$ be the average yield of variety $i$ of
corn, $i = 1, 2, 3, 4$. Test the hypothesis $H_0\colon \mu_1 = \mu_2 = \mu_3
= \mu_4$ at the 5\% level of significance.

| Variety | Yield |       |       |       |
|:-------:|------:|------:|------:|------:|
| 1       | 68.82 | 76.99 | 74.30 | 78.73 |
| 2       | 86.84 | 75.69 | 77.87 | 76.18 |
| 3       | 90.16 | 78.84 | 80.65 | 83.58 |
| 4       | 61.58 | 73.51 | 74.57 | 70.75 |

Assuming the data file is in your working directory, the following
commands will load the data into R, convert `Corn` to a ‘factor’
(categorical variable) and plot the data.
```{r results="hide", fig.keep="none"}
data <- read.table("corn.txt", header = TRUE)
Corn <- factor(data[, 1])
Yield <- data[, 2]
table(Corn)
tapply(Yield, list(Corn), mean)  # group means
boxplot(Yield ~ Corn)
```

Do an ANOVA:

```{r results="hide", fig.keep="none"}
m1 <- lm(Yield ~ Corn)
qqnorm(residuals(m1))
summary(m1)
```

Hence complete the following table:

| Source    | df | SS | MS | $F$ | p-value |
|:----------|:--:|:--:|:--:|:---:|:-------:|
| Treatment |    |    |    |     |         |
| Error     |    |    |    |     |         |
| Total     |    |    |    |     |         |

Enter the command:

```{r results="hide"}
pairwise.t.test(Yield, Corn, pool.sd = FALSE, p.adjust.method = "none")
```

which gives the p-values for all possible paired t-tests. (Note for advanced
users: this version does not adjust for multiple testing.)

To confirm this table, try commands like:

```{r results="hide"}
t.test(Yield[Corn == 1], Yield[Corn == 3])
```

What are your conclusions? Can you interpret the coefficients in the `lm()`
output? What assumptions have you made? Do you think they are reasonable?

# Simulate the distribution of test statistics and p-values

In this task we explore the null distribution of test statistics. To this
end, it is useful to know how to extract various objects from the R test
output. For example:

```{r results="hide"}
x <- rnorm(5)
y <- rnorm(5)
test.result <- t.test(x, y, var.equal = TRUE)
names(test.result)
```

To extract `statistic`, we use the dollar sign operator:

```{r results="hide"}
test.result$statistic
```

We cannot learn much by looking at a single value of the statistic, so we
generate many such statistics and then look at their properties. To generate
1000 t-statistics from testing two groups of 5 standard normal random
numbers, we can use:

```{r}
ts <- replicate(1000, t.test(rnorm(5), rnorm(5), var.equal = TRUE)$statistic)
```

Under appropriate assumptions, the test statistic follows a t-distribution
with 8 degrees of freedom. How can we test if that is true? One way to check
this is to plot the theoretical density of the t-statistic we should be
seeing, and superimpose the density of our simulated statistics on top of it.

```{r fig.width=6, fig.height=4}
hist(ts, freq = FALSE, nclass = 25, col = "grey", ylim = c(0, 0.4),
     xlab = "T statistic", ylab = "Density")  # histogram
lines(density(ts), lty = 2, lwd = 2)  # smooth density estimate
curve(dt(x, df = 8), from = -4.5, to = 4.5, add = TRUE,
      col = "red", type = "l", lwd = 2)  # theoretical density
```

Another way is to use a QQ plot:

```{r fig.width=4.5, fig.height=4.5}
qqplot(ts, rt(1000, df = 8), col = 4, xlab = "Test statistics",
       ylab = "Theoretical t-distribution (simulated)")
abline(0, 1)
```

The central part of the graph seems to agree, but there are various
discrepancies near the tails. The tails of a distribution are the most
difficult part to measure. This is unfortunate, because those are often the
values that interest us most, since they provide us with enough evidence to
reject a null hypothesis. Because the tails are so important, another way to
check whether a sample follows some hypothesized distribution is to calculate
the quantiles at some tail probabilities (using the `quantile()` function)
and compare them to the theoretical quantiles of that distribution.

```{r results="hide"}
probs <- c(0.9, 0.95, 0.99)
quantile(ts, probs)
qt(probs, df = 8)
```

The quantiles agree fairly well, especially at the 0.95 and 0.99 quantiles.
Using a larger number of simulation runs (e.g. try 5000) or a larger sample
size for the two groups would probably give values even closer to the
theoretical ones.
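
If you want a more formal check, a one-sample Kolmogorov-Smirnov test can
compare the simulated statistics with the theoretical cdf. Something like the
following sketch should work (this extra check is optional and not needed for
the rest of the lab):

```{r results="hide"}
# Compare the empirical distribution of the simulated t-statistics with
# the theoretical t-distribution (df = 8). A large p-value indicates no
# evidence of a discrepancy.
ks.test(ts, "pt", df = 8)
```
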
What is the behaviour of the p-values? While it may not be immediately
apparent, p-values are actually random variables. Under the null hypothesis,
the p-values for any statistical test should follow a uniform distribution
between 0 and 1 (see the Appendix). Consider Welch's t-test and the
equal-variances t-test.

```{r fig.width=8, fig.height=4}
pvals.welch <- replicate(1000, t.test(rnorm(10), rnorm(10))$p.value)
pvals.ttest <- replicate(1000,
    t.test(rnorm(10), rnorm(10), var.equal = TRUE)$p.value)
par(mfrow = c(1, 2))
hist(pvals.welch, freq = FALSE, col = 8, xlab = "p-values", main = "Welch")
hist(pvals.ttest, freq = FALSE, col = 8, xlab = "p-values", main = "Pooled")
```

The idea that the p-values follow a uniform distribution seems reasonable.
Now, let's look at some of the quantiles:

```{r results="hide"}
probs <- c(0.5, 0.7, 0.9, 0.95, 0.99)
quantile(pvals.welch, probs)
quantile(pvals.ttest, probs)
```

For both tests, the simulated quantiles are close to the theoretical
(uniform) quantiles.

What happens to the distribution of the p-values under the alternative
hypothesis? For example, try $\mu_Y - \mu_X = 1$.

```{r fig.width=8, fig.height=4}
pvals.welch <- replicate(1000, t.test(rnorm(10), rnorm(10, 1))$p.value)
pvals.ttest <- replicate(1000,
    t.test(rnorm(10), rnorm(10, 1), var.equal = TRUE)$p.value)
par(mfrow = c(1, 2))
hist(pvals.welch, freq = FALSE, col = 8, xlab = "p-values", main = "Welch")
hist(pvals.ttest, freq = FALSE, col = 8, xlab = "p-values", main = "Pooled")
quantile(pvals.welch, probs)
quantile(pvals.ttest, probs)
```

The distribution is not uniform. Thus, the probability that the p-value is
smaller than $\alpha = 0.05$ is higher under the alternative hypothesis than
under the null hypothesis, and this effect becomes more pronounced as the
true difference between the means moves away from zero.

# Power comparisons by simulation

We carry out a sequence of simulations to compare the power curves of two
tests. We generate samples of size $n = n_X = n_Y = 5$ from $X \sim
\mathrm{N}(\mu_X, 1)$ and $Y \sim \mathrm{N}(\mu_Y, 1)$ for various values of
the true difference $\delta = \mu_Y - \mu_X$. We consider $B = 1000$
simulation runs and for each run we carry out the t-test (equal variances)
and the Wilcoxon rank-sum test.

```{r}
B <- 1000  # number of simulation runs
R <- 50  # number of power values
n <- 5  # sample sizes
delta.seq <- seq(-3, 3, length = R)  # sequence of true differences
power.t <- numeric(R)  # initialise power vectors
power.w <- numeric(R)
for (i in 1:R) {
    delta <- delta.seq[i]
    # Simulate B p-values for each test.
    pvals.t <- replicate(B,
        t.test(rnorm(n), rnorm(n, delta), var.equal = TRUE)$p.value)
    pvals.w <- replicate(B,
        wilcox.test(rnorm(n), rnorm(n, delta), exact = FALSE)$p.value)
    # Record the estimated power (proportion of rejections).
    power.t[i] <- mean(pvals.t < 0.05)
    power.w[i] <- mean(pvals.w < 0.05)
}
```

Now the vectors `power.t` and `power.w` contain simulated power values for a
range of values of $\delta$. This allows us to draw the (estimated) power
curves:

```{r fig.width=6, fig.height=4}
# Plot simulated power for the t-test and the Wilcoxon test.
plot(delta.seq, power.t, type = "l", ylim = c(0, 1),
     ylab = "Power", xlab = expression(delta))
lines(delta.seq, power.w, lty = 2, col = 2)
abline(v = 0, lty = 3)
```

This will take a few seconds, since you are running 1000 simulations for each
of the 50 values of $\delta$. The simulated power curve for the t-test is
uniformly better than that of the Wilcoxon test for any
$\vert \delta \vert > 0$.
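
Bear in mind that each simulated power value is just the proportion of
rejections across $B = 1000$ runs, so it carries some Monte Carlo error. As a
rough guide, the standard error of an estimated power $p$ is
$\sqrt{p(1 - p) / B}$, which is largest when $p = 0.5$:

```{r}
# Worst-case Monte Carlo standard error of a simulated power value
# (attained when the true power is 0.5); about 0.016 for B = 1000.
sqrt(0.5 * 0.5 / B)
```
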
How could you explain this behaviour? What is the meaning of the points where
the power curves intersect the vertical axis in the plot?
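
For Exercise 3 below, it may help to wrap this kind of simulation in a
function. The following is a minimal sketch (the helper `sim.power` and its
arguments are our own naming, not something defined in these notes); extra
arguments are passed on to `t.test()`:

```{r results="hide"}
# Hypothetical helper: estimate the power of a t-test for given sample
# sizes and a given true difference in means.
sim.power <- function(nx, ny, delta, B = 1000, alpha = 0.05, ...) {
    pvals <- replicate(B, t.test(rnorm(nx), rnorm(ny, delta), ...)$p.value)
    mean(pvals < alpha)  # proportion of rejections
}
sim.power(5, 10, 1)                    # Welch's t-test (the default)
sim.power(5, 10, 1, var.equal = TRUE)  # equal-variances t-test
```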

# Appendix: p-values and their distribution under $H_0$

The p-value, denoted here by $U$, can be seen as a random variable depending on
the sample. Suppose we are interested in testing $H_0\colon \mu=\mu_0$ vs
$H_1\colon \mu>\mu_0$ by a one-sample t-test. Then the p-value for the given
observed statistic, $t$, is
\[ u = \Pr(T \geqslant t \mid H_0) = 1 - F_0(t), \]
where $F_0$ denotes the cdf of $T$ under the null hypothesis. In other words,
the p-value is just a statistic (a function of the data) and therefore has a
sampling distribution. In particular, the above formula shows a realisation of
$U = 1 - F_0(T)$, which is the random-variable version of the p-value.

(The p-value is the probability of obtaining a test statistic at least as
extreme as the one that was actually observed, assuming that the null
hypothesis is true. Importantly, note that the p-value does not correspond to
the probability that $H_0$ is true given the data we actually observed.
Rather, it is a probability that relates to the data under hypothetical
repetitions of the experiment.)

Next, we show that the p-value, $U$, follows a uniform distribution on the unit
interval, $[0, 1]$. We will use a trick called the probability integral
transform.

Since $T$ has a continuous distribution under $H_0$, its cdf $F_0$ is
continuous and strictly increasing, so we can write:
\[ \Pr(T \geqslant t \mid H_0)
= \Pr(F_0(T) \geqslant F_0(t))
= 1 - \Pr(F_0(T) \leqslant F_0(t)). \]
This shows $\Pr(F_0(T) \leqslant F_0(t)) = F_0(t)$. But recall that a random
variable $X$ follows the uniform distribution $X \sim \mathrm{Unif}(0, 1)$ if
and only if its cdf is $\Pr(X \leqslant x) = x$ for $x \in [0, 1]$. Hence,
$F_0(T) \sim \mathrm{Unif}(0, 1)$ and also
$U = 1 - F_0(T) \sim \mathrm{Unif}(0, 1)$.
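
To see this result in action, here is a small illustration (a sketch using
the t-distribution with 8 degrees of freedom as $F_0$): the histogram of the
simulated p-values should look approximately flat.

```{r fig.width=4.5, fig.height=4}
# Draw test statistics under H0 and convert them to p-values via
# U = 1 - F0(T). The p-values should be approximately Unif(0, 1).
T0 <- rt(10000, df = 8)
U <- 1 - pt(T0, df = 8)
hist(U, freq = FALSE, col = 8, xlab = "U", main = "P-values under H0")
```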

# Exercises

*Note:* some of the R commands you’ll need to complete these exercises
were covered in the lectures but not in these particular lab notes.

1. Consider the `InsectSprays` dataset, which comes provided by
default with R. It shows the counts of insects in an agricultural experiment
that compares different insecticides (you can read more about it on its help
page by running `?InsectSprays`).

a) Do a one-way ANOVA to show that the experiment provides strong evidence
that the insect sprays that were tested vary in their effectiveness.
# Insert answer here…

b) Inspect the data with a boxplot. Is there an assumption we make in the
ANOVA that you suspect might not hold?
# Insert answer here…

c) A square root transformation is often applied to count data to make them
more compatible with the ANOVA assumptions. Take the square root of the
counts and repeat the ANOVA. Have your conclusions changed?
# Insert answer here…

2. Consider the `ToothGrowth` dataset, also provided with R, which
describes an experiment to measure the effect of vitamin C on tooth growth (see
the help page for more info).

a) Do a two-way ANOVA (do not include interaction terms) to assess the
impact of vitamin C dose and delivery method on the length of the
odontoblasts.
# Insert answer here…

b) Is there evidence for an interaction? Do an ANOVA with interaction
terms and assess the output.
# Insert answer here…

c) Draw an interaction plot for these data and interpret it.
# Insert answer here…

3. Consider the same scenario as in the [Power comparisons by simulation]
section but this time with $n_X = 5$ and $n_Y = 10$. Use simulations to
compare the power of Welch’s t-test and the equal-variances t-test under the
following conditions:

a) The variances of both groups are equal to 1.
# Insert answer here…

b) The variance of the $X$ group is 1 and that of the $Y$ group is 4.
# Insert answer here…

What do you notice?
