---
title: "Week 10 Lab, MAST20005/MAST90058"
author: "School of Mathematics and Statistics, University of Melbourne"
date: "Semester 2, 2022"
institute: "University of Melbourne"
fontsize: 11pt
classoption: a4paper
output:
  html_document:
    toc_float:
      collapsed: false
---

**Goals:** Bayesian inference for: (i) proportions; (ii) mean of a normal
distribution; (iii) simple linear regression.

```{r echo=FALSE}
set.seed(1528)
```

# Bayesian estimation of proportions

Let $Y$ be the number of successes in $n$ independent Bernoulli trials. Then
the data and prior distributions are, respectively,
\[ Y \mid \theta \sim \mathrm{Bi}(n, \theta), \quad
\theta \sim \mathrm{Beta}(\alpha, \beta). \]
The posterior pdf is obtained by multiplying the prior by the likelihood and normalising, giving,
\[ f(\theta \mid y) = \frac{\Gamma(\alpha + \beta + n)}
   {\Gamma(\alpha + y) \Gamma(\beta + n - y)}
   \theta^{\alpha + y - 1}
   (1 - \theta)^{\beta + n - y - 1},
\quad 0 < \theta < 1, \]
i.e.\ $\theta \mid y \sim \mathrm{Beta}(\alpha + y, \beta + n - y)$.

Suppose we observe $y = 8$ successes out of $n = 20$ trials. Plot the prior and posterior densities assuming $\alpha = \beta = 1$. What kind of information is given by this prior? Comment on the information about $\theta$ gained after observing the data.

```{r fig.height=3}
y <- 8   # observed successes
n <- 20  # number of trials
a <- 1   # prior parameter (alpha)
b <- 1   # prior parameter (beta)
theta <- seq(0, 1, 0.01)
prior <- dbeta(theta, a, b)
posterior <- dbeta(theta, a + y, b + n - y)
par(mar = c(4, 4, 1, 1))  # tighter margins
plot(theta, posterior, type = "l", lwd = 2, col = "blue",
     xlab = expression(theta), ylab = expression(f(theta)))
points(theta, prior, type = "l", lty = 2, lwd = 2, col = "red")
```

Note that $\mathrm{Beta}(1,1) = \mathrm{Unif}(0,1)$, meaning that we consider all values of $\theta$ to be equally plausible before observing any data. After observing the data we obtain a posterior centred around 0.4, and the uncertainty about $\theta$ is much reduced.

Find the posterior mean.

```{r}
(a + y) / (a + b + n)
```

Find the central 95% credible interval.

```{r}
qbeta(c(0.025, 0.975), a + y, b + n - y)
```

Estimate the posterior odds $\mathbb{E}\left(\frac{\theta}{1 - \theta} \mid y\right)$ by simulating from the posterior distribution.

```{r}
sample.p <- rbeta(1000, a + y, b + n - y)
mean(sample.p / (1 - sample.p))
```

In the above code, we generated 1000 observations from the posterior, $\theta_1, \dots, \theta_{1000} \sim \mathrm{Beta}(\alpha + y, \beta + n - y)$, and then obtained,
\[ \mathbb{E}\left(\frac{\theta}{1 - \theta} \mathrel{\Big|} y \right)
   \approx \frac{1}{1000} \sum_{i=1}^{1000} \frac{\theta_i}{1 - \theta_i}. \]

The package `Bolstad` contains some useful routines for simple Bayesian models (if it is not installed, run `install.packages("Bolstad")` to install it). For example, the following carries out a Bayesian analysis for proportions:

```{r message=FALSE, fig.height=4}
library(Bolstad)
par(mar = c(4, 4, 1, 1))  # tighter margins
binobp(8, 20, 5, 5, plot = TRUE)
```

Note that the third and fourth arguments in `binobp()` are $\alpha$ and $\beta$. Try different values for the prior parameters and report how the prior information affects the posterior distribution. In particular, recall that the posterior mean can be written as,
\[ \mathbb{E}\left(\theta \mid y\right)
   = \left(\frac{n}{\alpha + \beta + n}\right) \left(\frac{y}{n}\right)
   + \left(\frac{\alpha + \beta}{\alpha + \beta + n}\right)
     \left(\frac{\alpha}{\alpha + \beta}\right), \]
i.e.\ a weighted average of the prior mean $\alpha / (\alpha + \beta)$ and the MLE $y / n$.
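To see this numerically, here is a small sketch (not part of the original lab code; the prior values below are illustrative choices only) that evaluates the weighted-average formula for priors of increasing strength, with the data fixed at $y = 8$, $n = 20$:

```{r}
# Posterior mean under several Beta(alpha, beta) priors (illustrative values),
# computed via the weighted-average formula above, with y = 8 and n = 20.
y <- 8
n <- 20
ab <- rbind(c(1, 1), c(2, 2), c(5, 5), c(20, 20))
for (i in 1:nrow(ab)) {
  a <- ab[i, 1]
  b <- ab[i, 2]
  w <- n / (a + b + n)  # weight given to the MLE, y/n
  post.mean <- w * (y / n) + (1 - w) * (a / (a + b))
  cat("alpha =", a, "beta =", b, ": posterior mean =", round(post.mean, 3), "\n")
}
```

As $\alpha + \beta$ grows, more weight is placed on the prior mean $\alpha / (\alpha + \beta) = 0.5$, pulling the posterior mean away from the MLE $y/n = 0.4$.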
The estimator $\tilde{\theta} = (Y+2)/(n+4)$ is sometimes used instead of the sample proportion $\hat{\theta} = Y/n$. Note that $\tilde{\theta}$ is the posterior mean when using $\mathrm{Beta}(2,2)$ as the prior. Using this prior, compare the central 95\% credible interval with the confidence interval $\tilde{\theta} \pm 1.96 \sqrt{\tilde{\theta} (1 - \tilde{\theta}) / n}$, by simulation for different values of the true proportion $\theta$. The following code computes estimates of the coverage probability for the two types of intervals.

```{r}
theta <- 0.1  # true parameter
n <- 20       # sample size
a <- 2        # prior parameter (alpha)
b <- 2        # prior parameter (beta)
nsim <- 1000  # number of simulations

# Simulate data.
y <- rbinom(nsim, size = n, prob = theta)

# Calculate limits for credible interval.
l1 <- qbeta(0.025, a + y, b + n - y)
u1 <- qbeta(0.975, a + y, b + n - y)

# Calculate limits for confidence interval.
p.tilde <- (y + 2) / (n + 4)
se <- sqrt(p.tilde * (1 - p.tilde) / n)
l2 <- p.tilde - 1.96 * se
u2 <- p.tilde + 1.96 * se

# Calculate coverage probabilities.
mean(l1 < theta & theta < u1)
mean(l2 < theta & theta < u2)
```

Comment on the accuracy of the interval estimates obtained by the two methods for different choices of $\theta$ and $n$.

# Normal model

To illustrate how one might use the posterior distribution for inference, consider a study analysing the effect of specific training programs before a standardised test (Gelman, 2006). Assume a random sample of 240 differences of scores, $y_i = z_{2i} - z_{1i}$, where $z_{1i}$ is the score for individual $i$ before being coached, and $z_{2i}$ is the score recorded after being coached. Suppose that the sample mean of these differences is $\bar{y} = 7.87$ and that the standard deviation of the differences, $\sigma = 64.5$, is known. Specifically, we assume the data model $\bar{y} \mid \theta \sim \mathrm{N}(\theta, \sigma^2 / n)$ and prior $\theta \sim \mathrm{N}(\theta_0, \sigma_0^2)$. Then the posterior distribution is a normal distribution with mean,
\[ \mathbb{E}(\theta \mid \bar{y})
   = \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2 / n} \, \bar{y}
   + \frac{\sigma^2 / n}{\sigma_0^2 + \sigma^2 / n} \, \theta_0, \]
and variance,
\[ \mathrm{var}(\theta \mid \bar{y})
   = \frac{(\sigma^2 / n) \, \sigma_0^2}{\sigma^2 / n + \sigma_0^2}. \]

What are the implications of a large value of $\sigma_0^2$? (Discuss with classmates.)

Find a 95\% credible interval for $\theta$ when $\sigma_0^2 \rightarrow \infty$ and compare it with the confidence interval for $\theta$. Note that the posterior in this case is,
\[ \theta \mid \bar{y} \sim \mathrm{N}(\bar{y}, \sigma^2/n). \]
Thus a 95\% credible interval is obtained as follows:

```{r}
post.mean <- 7.87            # posterior mean
post.sd <- 64.5 / sqrt(240)  # posterior standard deviation
qnorm(c(0.025, 0.975), post.mean, post.sd)
```

Note that this is the same as the confidence interval $\bar{y} \pm 1.96 \, \sigma / \sqrt{n}$, which we usually calculate using:

```{r}
7.87 + c(-1, 1) * 1.96 * 64.5 / sqrt(240)
```

How does the Bayesian interval change if $\sigma_0 = 5.1$ and the prior mean is $\theta_0 = 15.2$? (Try this out, and discuss with classmates.)
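One possible starting point for that discussion is the following sketch (a direct application of the posterior mean and variance formulas above, using the prior values suggested in the question):

```{r}
# Posterior under the informative prior N(15.2, 5.1^2), via the formulas above.
y.bar <- 7.87
v.data <- 64.5^2 / 240  # sigma^2 / n
v.prior <- 5.1^2        # sigma_0^2
theta.0 <- 15.2         # prior mean
post.mean <- v.prior / (v.prior + v.data) * y.bar +
             v.data / (v.prior + v.data) * theta.0
post.sd <- sqrt(v.data * v.prior / (v.data + v.prior))
qnorm(c(0.025, 0.975), post.mean, post.sd)
```

The informative prior pulls the interval towards the prior mean $\theta_0 = 15.2$ and makes it narrower than the flat-prior interval.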
This example can be extended to illustrate how one can use a Bayesian approach to consider differences in means of multiple populations, as in ANOVA. Suppose coaching programs were implemented in 4 schools, with school averages given in the following table.

| School | $i$ | Difference ($\bar{y}_{i \cdot}$) | Squared standard error ($s_i^2 / n_i$) |
|:------:|:---:|---------------------------------:|---------------------------------------:|
| A      | 1   |                               28 |                                      15 |
| B      | 2   |                                8 |                                      10 |
| C      | 3   |                               -3 |                                      16 |
| D      | 4   |                                7 |                                      11 |

We assume the model,
\[ y_{ij} \mid \theta_i \sim \mathrm{N}(\theta_i, \sigma_i^2), \]
where $y_{ij}$ is the $j$th observation from the $i$th school, $\theta_i$ is the mean effect of school $i$ and $\sigma_i^2$ is the known variance of school $i$. Further, assume normal priors for each $\theta_i$, so that
\[ \theta_i \sim \mathrm{N}(7.87, 17.35^2) \]
for all $i = 1, \dots, 4$.

Give a graphical illustration of the prior and posterior distributions using boxplots for each school.

A graph is simply obtained by sampling from the prior distributions:

```{r fig.height=3}
A <- rnorm(100, 7.87, 17.35)
B <- rnorm(100, 7.87, 17.35)
C <- rnorm(100, 7.87, 17.35)
D <- rnorm(100, 7.87, 17.35)
par(mar = c(4, 4, 1, 1))  # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")
```

To represent the posteriors, we assume $\sigma_i^2 / n_i = s_i^2 / n_i$ and draw from the posteriors:

```{r fig.height=3}
y.bar <- c(28, 8, -3, 7)
se2 <- c(15, 10, 16, 11)
theta.0 <- rep( 7.87, 4)
sigma.0 <- rep(17.35, 4)
post.means <- (sigma.0^2 / (sigma.0^2 + se2)) * y.bar +
              (     se2  / (sigma.0^2 + se2)) * theta.0
post.sd <- sqrt(sigma.0^2 * se2 / (sigma.0^2 + se2))
A <- rnorm(100, post.means[1], post.sd[1])
B <- rnorm(100, post.means[2], post.sd[2])
C <- rnorm(100, post.means[3], post.sd[3])
D <- rnorm(100, post.means[4], post.sd[4])
par(mar = c(4, 4, 1, 1))  # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")
```

# Bayesian regression

In this section we look at the familiar regression model $Y_i \sim \mathrm{N}(\alpha + \beta(x_i - \bar{x}), \sigma^2)$ from a Bayesian perspective. To this end, one can consider priors $\alpha \sim \mathrm{N}(\mu_\alpha, \sigma_\alpha^2)$ and $\beta \sim \mathrm{N}(\mu_\beta, \sigma_\beta^2)$, while $\sigma$ is regarded as known. The function `bayes.lin.reg()` computes posterior distributions for $\alpha$ and $\beta$, and it enables us to specify either specific parameters for the prior or improper uniform ('flat') priors. If unspecified, the regression variance $\sigma^2$ is estimated as usual from the residuals.

Consider 11 measurements on heart rate vs oxygen uptake.

```{r fig.width=6, fig.height=5}
O2 <- c(0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23)
HR <- c(94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131)
plot(HR, O2, xlab = "Heart Rate", ylab = "Oxygen uptake (Percent)")
y <- O2
x <- HR - mean(HR)
coef(summary(lm(y ~ x)))  # usual regression fit
```

Next, compute posterior distributions for the parameters under flat and informative priors. Explore the effect of different choices of prior on the final inference for the regression slope.

Fit the Bayesian regression model with flat priors.

```{r}
bayes.lin.reg(y = O2, x = HR, slope.prior = "flat", intcpt.prior = "flat")
```

Re-do it with prior parameters $\mu_\alpha = 1.2$, $\sigma_\alpha = 0.05$, $\mu_\beta = 0.04$, $\sigma_\beta = 0.005$:

```{r}
bayes.lin.reg(O2, HR, slope.prior = "n", intcpt.prior = "n",
              ma0 = 1.2, mb0 = 0.04, sa0 = 0.05, sb0 = 0.005)
```

# Exercises

1. Consider question 2 from the tutorial problems. For both choices of prior, do the following:

    a) Draw a plot showing the prior and posterior distributions.

    ```{r}
    # Insert answer here...
    ```

    b) Calculate a 95\% credible interval.

    ```{r}
    # Insert answer here...
    ```

    c) For the drug to be commercially successful, you need it to be effective in at least 3\% of patients. What is your posterior probability that it will be successful?

    ```{r}
    # Insert answer here...
    ```

2. Refer back to the comparison of the two estimators $\tilde\theta$ and $\hat\theta$ from the [Bayesian estimation of proportions] section, and the resulting comparison of a credible interval and a confidence interval. Estimate, via simulation, the coverage of the two interval estimates for $n = 20$ and $\theta \in \{0.05, 0.10, \dots, 0.95\}$. Draw a plot comparing these results. What do you notice? What are the relative widths of these intervals?

    ```{r}
    # Insert answer here...
    ```