
MAST20005/MAST90058: Week 11 Lab
Goals: Bayesian inference for: (i) proportions; (ii) mean of a normal distribution; (iii) simple
linear regression.
1 Bayesian estimation of proportions
Let Y be the number of successes in n independent Bernoulli trials. Then the data and prior distributions are, respectively,
Y | θ ~ Bi(n, θ),  θ ~ Beta(α, β). The posterior pdf can be obtained as,
f(θ | y) = [Γ(α + β + n) / (Γ(α + y) Γ(β + n − y))] θ^(α+y−1) (1 − θ)^(β+n−y−1),  0 < θ < 1,

i.e. θ | y ~ Beta(α + y, β + n − y).

1. Suppose we observe y = 8 successes out of n = 20 trials. Plot the prior and posterior densities assuming α = β = 1. What kind of information is given by this prior? Comment on the information gained about θ after observing the data.

a <- 1
b <- 1
y <- 8
n <- 20
theta <- seq(0, 1, 0.01)
prior <- dbeta(theta, a, b)
posterior <- dbeta(theta, a + y, b + n - y)
par(mar = c(4, 4, 1, 1))  # tighter margins
plot(theta, posterior, type = "l", lwd = 2, col = "blue",
     xlab = expression(theta), ylab = expression(f(theta)))
points(theta, prior, type = "l", lty = 2, lwd = 2, col = "red")

[Figure: prior (dashed red) and posterior (solid blue) densities of θ.]

Note that Beta(1, 1) = Unif(0, 1), meaning that we consider all values of θ to be equally plausible before observing any data. After observing the data we obtain a posterior centred around 0.4 and the uncertainty about θ is much reduced.

2. Find the posterior mean.

(a + y) / (a + b + n)
## [1] 0.4090909

3. Find the central 95% credible interval.

qbeta(c(0.025, 0.975), a + y, b + n - y)
## [1] 0.2181969 0.6156456

4. Estimate the posterior odds E(θ / (1 − θ) | y) by simulating from the posterior distribution.

sample.p <- rbeta(1000, a + y, b + n - y)
mean(sample.p / (1 - sample.p))
## [1] 0.7544855

In the above code, we generated 1000 observations from the posterior, θ1, …, θ1000 ~ Beta(α + y, β + n − y), and then obtained

E(θ / (1 − θ) | y) ≈ (1/1000) Σᵢ θᵢ / (1 − θᵢ),  for i = 1, …, 1000.

5. The package Bolstad contains some useful routines for simple Bayesian models (if it is not installed, run install.packages("Bolstad") to install it). For example, the following carries out a Bayesian analysis for proportions:

library(Bolstad)
par(mar = c(4, 4, 1, 1))  # tighter margins
binobp(8, 20, 5, 5, plot = TRUE)

[Figure: prior and posterior densities of π from binobp().]

## Posterior Mean           : 0.4333333
## Posterior Variance       : 0.0079211
## Posterior Std. Deviation : 0.0890008
##
## Prob.   Quantile
## -----   ---------
## 0.005   0.2190745
## 0.010   0.2369340
## 0.025   0.2644553
## 0.050   0.2892715
## 0.500   0.4318325
## 0.950   0.5825361
## 0.975   0.6106372
## 0.990   0.6425740
## 0.995   0.6637734
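As a quick sanity check (ours, not part of the original handout), the summaries printed by binobp() can be reproduced directly from the Beta(α + y, β + n − y) posterior, here Beta(13, 17):

```r
# Posterior from prior Beta(5, 5) with y = 8 successes in n = 20 trials.
a <- 5; b <- 5; y <- 8; n <- 20
A <- a + y       # posterior shape 1 (= 13)
B <- b + n - y   # posterior shape 2 (= 17)

post.mean <- A / (A + B)                        # Beta mean
post.var  <- A * B / ((A + B)^2 * (A + B + 1))  # Beta variance
post.med  <- qbeta(0.5, A, B)                   # posterior median

c(post.mean, post.var, post.med)
# agrees with binobp(): mean 0.4333333, variance 0.0079211, median 0.4318325
```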
Note that the third and fourth arguments in binobp() are α and β. Try different values for the prior parameters and report how the prior information affects the posterior distribution. In particular, recall that the posterior mean can be written as,

E(θ | y) = [n / (α + β + n)] (y/n) + [(α + β) / (α + β + n)] [α / (α + β)],

i.e. a weighted average between the prior mean α/(α + β) and the MLE y/n.

6. The estimator θ̃ = (Y + 2)/(n + 4) is sometimes used instead of the sample proportion θ̂ = Y/n. Note that θ̃ is the posterior mean when using Beta(2, 2) as the prior. Using this prior, compare the central 95% credible interval with the confidence interval θ̃ ± 1.96 √(θ̃(1 − θ̃)/n), by simulation, for different values of the true proportion θ. The following code computes estimates of the coverage probability for the two types of intervals.

theta <- 0.1   # true parameter
n     <- 20    # sample size
a     <- 2     # prior parameter (alpha)
b     <- 2     # prior parameter (beta)
nsim  <- 1000  # number of simulations

# Simulate data.
y <- rbinom(nsim, size = n, prob = theta)

# Calculate limits for credible interval.
l1 <- qbeta(0.025, a + y, b + n - y)
u1 <- qbeta(0.975, a + y, b + n - y)

# Calculate limits for confidence interval.
p.tilde <- (y + 2) / (n + 4)
se <- sqrt(p.tilde * (1 - p.tilde) / n)
l2 <- p.tilde - 1.96 * se
u2 <- p.tilde + 1.96 * se

# Calculate coverage probabilities.
mean(l1 < theta & theta < u1)
## [1] 0.868
mean(l2 < theta & theta < u2)
## [1] 0.986

Comment on the accuracy of the interval estimates obtained by the two methods for different choices of θ and n.

2 Normal model

To illustrate how one might use the posterior distribution for inference, consider a study analysing the effect of specific training programs taken before a standardised test (Gelman, 2006). Assume a random sample of 240 differences of scores.
Let yᵢ = z2i − z1i, where z1i is the score for individual i before being coached, and z2i is the score recorded after being coached. Suppose that the sample mean of these differences is ȳ = 7.87 and the standard deviation of the differences, σ = 64.5, is known. In particular, we assume the data model ȳ | θ ~ N(θ, σ²/n) and prior θ ~ N(θ₀, σ₀²). Then the posterior distribution is a normal distribution with mean,

E(θ | ȳ) = [σ₀² / (σ₀² + σ²/n)] ȳ + [(σ²/n) / (σ₀² + σ²/n)] θ₀,

and variance,

var(θ | ȳ) = (σ²/n) σ₀² / (σ²/n + σ₀²).

1. What are the implications of a large value of σ₀²?

2. Find a 95% credible interval for θ when σ₀² → ∞ and compare such an interval with the confidence interval for θ. Note that the posterior in this case is, θ | ȳ ~ N(ȳ, σ²/n). Thus a 95% credible interval is obtained as follows:

post.mean <- 7.87  # posterior mean
post.mean
## [1] 7.87
post.sd <- 64.5 / sqrt(240)  # posterior standard deviation
post.sd
## [1] 4.163457
qnorm(c(0.025, 0.975), post.mean, post.sd)
## [1] -0.290226 16.030226

Note that this is the same as the confidence interval ȳ ± 1.96 σ/√n, which we usually calculate using:

7.87 + c(-1, 1) * 1.96 * 64.5 / sqrt(240)
## [1] -0.2903759 16.0303759

3. How does the Bayesian interval change if σ₀ = 5.1 and the prior mean is θ₀ = 15.2?

4. This example can be extended to illustrate how one can use a Bayesian approach to consider differences in means of multiple populations, as in ANOVA. Suppose coaching programs were implemented in 4 schools, with school averages given in the following table.

School   i   Difference (ȳᵢ·)   Squared standard error (s²ᵢ/nᵢ)
A        1   28                 15
B        2   8                  10
C        3   -3                 16
D        4   7                  11

Assume:

yᵢⱼ | θᵢ ~ N(θᵢ, σᵢ²)

where yᵢⱼ is the jth observation from the ith school, θᵢ is the mean effect of school i and σᵢ² is the known variance of school i. Further, assume normal priors for each θᵢ, so that

θᵢ ~ N(7.87, 17.35²),  for all i = 1, …, 4.

Give a graphical illustration of the prior and posterior distributions using boxplots for each school.
A graph is simply obtained by sampling from the priors:

A <- rnorm(100, 7.87, 17.35)
B <- rnorm(100, 7.87, 17.35)
C <- rnorm(100, 7.87, 17.35)
D <- rnorm(100, 7.87, 17.35)
par(mar = c(4, 4, 1, 1))  # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")

[Figure: boxplots of samples from the four prior distributions.]

To represent the posteriors, we assume σᵢ²/nᵢ = s²ᵢ/nᵢ and draw from the posteriors:

y.bar   <- c(28, 8, -3, 7)
se2     <- c(15, 10, 16, 11)
theta.0 <- rep( 7.87, 4)
sigma.0 <- rep(17.35, 4)
post.means <- (sigma.0^2 / (sigma.0^2 + se2)) * y.bar +
              (    se2   / (sigma.0^2 + se2)) * theta.0
post.sd <- sqrt(sigma.0^2 * se2 / (sigma.0^2 + se2))
A <- rnorm(100, post.means[1], post.sd[1])
B <- rnorm(100, post.means[2], post.sd[2])
C <- rnorm(100, post.means[3], post.sd[3])
D <- rnorm(100, post.means[4], post.sd[4])
par(mar = c(4, 4, 1, 1))  # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")

[Figure: boxplots of samples from the four posterior distributions.]

3 Bayesian regression

In this section we look at the familiar regression model Yᵢ ~ N(α + β(xᵢ − x̄), σ²) from a Bayesian perspective. To this end, one can consider priors α ~ N(μα, σα²) and β ~ N(μβ, σβ²), while σ is regarded as known. The function bayes.lin.reg() computes posterior distributions for α and β, and it enables us to specify either specific parameters for the prior or improper uniform ('flat') priors. If unspecified, the regression variance σ² is estimated as usual from the residuals. Consider 11 measurements of heart rate vs oxygen uptake.

O2 <- c(0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23)
HR <- c(94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131)
plot(HR, O2, xlab = "Heart Rate", ylab = "Oxygen uptake (Percent)")

[Figure: scatterplot of oxygen uptake (per cent) against heart rate.]

x <- HR - mean(HR)
y <- O2
coef(summary(lm(y ~ x)))  # usual regression fit
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.30727273 0.039282918 33.27840 9.840739e-11
## x 0.04265141 0.003379797 12.61952 5.009294e-07
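A side note (ours, not from the handout): because the predictor is centred, the least-squares intercept is simply the sample mean of the response, which explains the intercept estimate above:

```r
# The model uses x = HR - mean(HR), so sum(x) = 0 and the fitted
# intercept reduces to mean(y); compare with 1.30727273 above.
O2 <- c(0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23)
mean(O2)
## [1] 1.307273
```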
Next, compute posterior distributions for the parameters under flat and informative priors. Explore the effect of different choices of prior on the final inference for the regression slope.
1. Fit the Bayesian regression model with flat priors.
bayes.lin.reg(y = O2, x = HR, slope.prior = "flat", intcpt.prior = "flat")
## Standard deviation of residuals: 0.13
## Posterior Mean Posterior Std. Deviation
## ————– ————————
## Intercept: 1.305 0.039253
## Slope: 0.04265 0.0033798
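Note (a check we added, not part of the handout): with flat priors the posterior for the slope is centred at the least-squares estimate, with the least-squares standard error, so the Bayesian and frequentist slope summaries agree:

```r
# Refit the regression; the slope estimate and standard error match the
# posterior mean (0.04265) and sd (0.0033798) reported by bayes.lin.reg().
O2 <- c(0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23)
HR <- c(94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131)
x <- HR - mean(HR)
fit <- coef(summary(lm(O2 ~ x)))
fit["x", c("Estimate", "Std. Error")]
```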
[Figure: prior, likelihood and posterior for β (slope) and for α (intercept); predictions with 95% bounds (predicted value, 95% lower bound, 95% upper bound).]
2. Re-do it with prior parameters μα = 1.2, σα = 0.05, μβ = 0.04, σβ = 0.005:
bayes.lin.reg(O2, HR, slope.prior = "n", intcpt.prior = "n",
              ma0 = 1.2, mb0 = 0.04, sa0 = 0.05, sb0 = 0.005)
## Standard deviation of residuals: 0.13
## Posterior Mean Posterior Std. Deviation
## ————– ————————
## Intercept: 1.266 0.03089
## Slope: 0.04182 0.0028001
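The informative-prior slope can be checked by hand with the usual normal conjugate update (a sketch we added, assuming the least-squares slope summary from above plays the role of the data, with σ treated as known):

```r
# Precision-weighted combination of the prior N(0.04, 0.005^2) and the
# least-squares slope summary N(0.04265141, 0.003379797^2).
mb0   <- 0.04;       sb0 <- 0.005        # prior mean and sd for beta
b.hat <- 0.04265141; se  <- 0.003379797  # least-squares estimate and SE

w <- (1 / sb0^2) / (1 / sb0^2 + 1 / se^2)     # weight given to the prior
post.mean <- w * mb0 + (1 - w) * b.hat
post.sd   <- sqrt(1 / (1 / sb0^2 + 1 / se^2))

c(post.mean, post.sd)
# approximately 0.04182 and 0.0028, matching the bayes.lin.reg() output above
```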
[Figure: prior, likelihood and posterior for β (slope) and for α (intercept) under the informative priors; predictions with 95% bounds.]

Exercises
1. Consider question 2 from the tutorial problems. For both choices of prior, do the following:
(a) Draw a plot showing the prior and posterior distributions.
(b) Calculate a 95% credible interval.
(c) For the drug to be commercially successful, you need it to be effective in at least 3% of patients. What is your posterior probability that it will be successful?
2. Consider part 6 from Section 1. Estimate, via simulation, the coverage of the two interval estimates for n = 20 and θ ∈ {0.05, 0.10, …, 0.95}. Draw a plot comparing these results. What do you notice? What are the relative widths of these intervals?