MAST20005/MAST90058: Week 11 Lab
Goals: Bayesian inference for: (i) proportions; (ii) mean of a normal distribution; (iii) simple
linear regression.
1 Bayesian estimation of proportions
Let Y be the number of successes in n independent Bernoulli trials. Then the data and prior
distributions are, respectively,
Y | θ ∼ Bi(n, θ), θ ∼ Beta(α, β).
The posterior pdf can be obtained as
f(θ | y) = [Γ(α + β + n) / (Γ(α + y) Γ(β + n − y))] θ^(α + y − 1) (1 − θ)^(β + n − y − 1),  0 < θ < 1,
i.e. θ | y ∼ Beta(α + y, β + n − y).
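As a quick cross-check of the conjugate update (sketched in Python rather than R, purely for illustration), the posterior parameters and posterior mean follow by simple arithmetic:

```python
# Beta-binomial conjugate update: prior Beta(a, b) plus y successes
# in n trials gives posterior Beta(a + y, b + n - y).
# Numbers match the lab's example below (a = b = 1, y = 8, n = 20).
a, b, y, n = 1, 1, 8, 20
post_a, post_b = a + y, b + n - y
post_mean = post_a / (post_a + post_b)
print(post_a, post_b)        # 9 13
print(round(post_mean, 7))   # 0.4090909
```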
1. Suppose we observe y = 8 successes out of n = 20 trials. Plot the prior and posterior
densities assuming α = β = 1. What kind of information does this prior convey? Comment
on how much information about θ is gained after observing the data.
a <- 1
b <- 1
y <- 8
n <- 20
theta <- seq(0, 1, 0.01)
prior <- dbeta(theta, a, b)
posterior <- dbeta(theta, a + y, b + n - y)
par(mar = c(4, 4, 1, 1)) # tighter margins
plot(theta, posterior, type = "l", lwd = 2, col = "blue",
xlab = expression(theta),
ylab = expression(f(theta)))
lines(theta, prior, lty = 2, lwd = 2, col = "red")
[Figure: prior (red, dashed) and posterior (blue) densities of θ]
Note that Beta(1, 1) = Unif(0, 1), meaning that we consider all values of θ to be equally
plausible before observing any data. After observing the data we obtain a posterior
centered around 0.4 and the uncertainty on θ is much reduced.
2. Find the posterior mean.
(a + y) / (a + b + n)
## [1] 0.4090909
3. Find the central 95% credible interval.
qbeta(c(0.025, 0.975), a + y, b + n - y)
## [1] 0.2181969 0.6156456
4. Estimate the posterior odds E(θ/(1 − θ) | y) by simulating from the posterior distribution.
sample.p <- rbeta(1000, a + y, b + n - y)
mean(sample.p / (1 - sample.p))
## [1] 0.7544855
In the above code, we generated 1000 observations from the posterior, θ1, . . . , θ1000 ∼
Beta(α + y, β + n − y), and then obtained
E(θ/(1 − θ) | y) ≈ (1/1000) Σ_{i=1}^{1000} θi/(1 − θi).
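The same Monte Carlo estimate can be sketched outside R. For Beta(a, b) with b > 1 the odds have the closed form E(θ/(1 − θ)) = a/(b − 1), here 9/12 = 0.75, so the simulation can be checked against it. A Python sketch, for illustration only:

```python
import random

# Monte Carlo estimate of the posterior odds E(theta/(1-theta) | y)
# for theta | y ~ Beta(9, 13), mirroring the R code above.
# Exact value for Beta(a, b), b > 1: a/(b - 1) = 9/12 = 0.75.
random.seed(1)
draws = [random.betavariate(9, 13) for _ in range(100_000)]
odds = sum(t / (1 - t) for t in draws) / len(draws)
print(round(odds, 3))  # close to 0.75
```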
5. The package Bolstad contains some useful routines for simple Bayesian models (if it is not
installed, run install.packages("Bolstad") to install it). For example the following
carries out a Bayesian analysis for proportions:
library(Bolstad)
par(mar = c(4, 4, 1, 1)) # tighter margins
binobp(8, 20, 5, 5, plot = TRUE)
[Figure: prior (Beta(5, 5)) and posterior densities of π, as drawn by binobp()]
## Posterior Mean : 0.4333333
## Posterior Variance : 0.0079211
## Posterior Std. Deviation : 0.0890008
##
## Prob. Quantile
## ---------------
## 0.005 0.2190745
## 0.010 0.2369340
## 0.025 0.2644553
## 0.050 0.2892715
## 0.500 0.4318325
## 0.950 0.5825361
## 0.975 0.6106372
## 0.990 0.6425740
## 0.995 0.6637734
Note that the third and fourth arguments of binobp() are α and β. Try different values
for the prior parameters and report how the prior information affects the posterior distribution.
Particularly, recall that the posterior mean can be written as
E(θ | y) = [n/(α + β + n)] (y/n) + [(α + β)/(α + β + n)] [α/(α + β)],
i.e. a weighted average between the prior mean α/(α + β) and the MLE y/n.
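A quick numerical check (in Python, for illustration) that the weighted-average form agrees with the direct posterior mean (α + y)/(α + β + n):

```python
# Posterior mean as a weighted average of the MLE y/n and the prior
# mean a/(a + b), verified against the direct formula for the lab's
# numbers (a = b = 1, y = 8, n = 20).
a, b, y, n = 1, 1, 8, 20
w = n / (a + b + n)
weighted = w * (y / n) + (1 - w) * (a / (a + b))
direct = (a + y) / (a + b + n)
print(round(weighted, 7))  # 0.4090909
```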
6. The estimator θ̃ = (Y + 2)/(n + 4) is sometimes used instead of the sample proportion
θ̂ = Y/n. Note that θ̃ is the posterior mean when using Beta(2, 2) as the prior. Using this
prior, compare the central 95% credible interval with the confidence interval
θ̃ ± 1.96 √(θ̃(1 − θ̃)/n), by simulation for different values of the true proportion θ. The following
code computes estimates of the coverage probability for the two types of intervals.
theta <- 0.1 # true parameter
n <- 20 # sample size
a <- 2 # prior parameter (alpha)
b <- 2 # prior parameter (beta)
nsim <- 1000 # number of simulations
# Simulate data.
y <- rbinom(nsim, size = n, prob = theta)
# Calculate limits for credible interval.
l1 <- qbeta(0.025, a + y, b + n - y)
u1 <- qbeta(0.975, a + y, b + n - y)
# Calculate limits for confidence interval.
p.tilde <- (y + 2) / (n + 4)
se <- sqrt(p.tilde * (1 - p.tilde) / n)
l2 <- p.tilde - 1.96 * se
u2 <- p.tilde + 1.96 * se
# Calculate coverage probabilities.
mean(l1 < theta & theta < u1)
## [1] 0.868
mean(l2 < theta & theta < u2)
## [1] 0.986
Comment on the accuracy of the interval estimates obtained by the two methods for
different choices of θ and n.
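The confidence-interval half of this experiment can also be sketched in plain Python (simulating each binomial count as a sum of Bernoulli draws; the credible-interval half needs Beta quantiles, which are not in the standard library, so it is omitted here):

```python
import random

# Coverage simulation for the adjusted interval based on
# theta~ = (y + 2)/(n + 4), with the same settings as the R code
# above (theta = 0.1, n = 20, 1000 replications).
random.seed(42)
theta, n, nsim = 0.1, 20, 1000
covered = 0
for _ in range(nsim):
    y = sum(random.random() < theta for _ in range(n))   # one Bin(n, theta) draw
    p = (y + 2) / (n + 4)
    se = (p * (1 - p) / n) ** 0.5
    covered += (p - 1.96 * se) < theta < (p + 1.96 * se)
print(covered / nsim)  # typically well above 0.95, cf. 0.986 in the R run
```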
2 Normal model
To illustrate how one might use the posterior distribution for inference, consider a study
analysing the effect of specific training programs before a standardised test (Gelman, 2006).
Assume a random sample of 240 differences of scores. Let yi = z2i − z1i, where z1i is the score
for individual i before coaching and z2i is the score after coaching. Suppose
that the sample mean of these differences is ȳ = 7.87 and the standard deviation of the
differences, σ = 64.5, is known. In particular, we assume the data model ȳ | θ ∼ N(θ, σ²/n) and
the prior θ ∼ N(θ0, σ0²). Then the posterior distribution is a normal distribution with mean
E(θ | ȳ) = [σ0²/(σ0² + σ²/n)] ȳ + [(σ²/n)/(σ0² + σ²/n)] θ0
and variance
var(θ | ȳ) = (σ²/n) σ0² / (σ²/n + σ0²).
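These formulas can be packaged as a small helper. The sketch below (Python, for illustration) uses the lab's ȳ, σ and n with an essentially flat prior (θ0 = 0 and σ0 = 10⁶ are illustrative choices, not from the lab), so the posterior should essentially reproduce N(ȳ, σ²/n):

```python
# Normal-normal update with known sigma: the posterior mean is a
# precision-weighted average of the sample mean and the prior mean.
def normal_posterior(ybar, sigma2_over_n, theta0, sigma0_sq):
    w = sigma0_sq / (sigma0_sq + sigma2_over_n)
    mean = w * ybar + (1 - w) * theta0
    var = sigma2_over_n * sigma0_sq / (sigma2_over_n + sigma0_sq)
    return mean, var

# Coaching data: ybar = 7.87, sigma = 64.5, n = 240; near-flat prior.
m, v = normal_posterior(7.87, 64.5**2 / 240, theta0=0.0, sigma0_sq=1e6**2)
print(round(m, 4), round(v**0.5, 4))  # ~7.87 and ~4.1635
```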
1. What are the implications of a large value of σ20?
2. Find a 95% credible interval for θ when σ20 →∞ and compare such an interval with the
confidence interval for θ. Note that the posterior in this case is,
θ | ȳ ∼ N(ȳ, σ2/n).
Thus a 95% credible interval is obtained as follows:
post.mean <- 7.87 # posterior mean
post.mean
## [1] 7.87
post.sd <- 64.5 / sqrt(240) # posterior standard deviation
post.sd
## [1] 4.163457
qnorm(c(0.025, 0.975), post.mean, post.sd)
## [1] -0.290226 16.030226
Note that this is the same as the confidence interval ȳ ± 1.96 σ/√n, which we usually
calculate using:
7.87 + c(-1, 1) * 1.96 * 64.5 / sqrt(240)
## [1] -0.2903759 16.0303759
3. How does the Bayesian interval change if σ0 = 5.1 and the prior mean is θ0 = 15.2?
4. This example can be extended to illustrate how one can use a Bayesian approach to
consider differences in means of multiple populations as in ANOVA. Suppose coaching
programs were implemented in 4 schools, with school averages given in the following table.
School   i   Difference (ȳi)   Squared standard error (si²/ni)
A        1    28                15
B        2     8                10
C        3    −3                16
D        4     7                11
Assume:
yij | θi ∼ N(θi, σi²)
where yij is the jth observation from the ith school, θi is the mean effect of school i and
σi² is the known variance of school i. Further, assume normal priors for each θi so that
θi ∼ N(7.87, 17.35²)
for all i = 1, . . . , 4.
Give a graphical illustration of the prior and posterior distributions using boxplots for
each school. A graph is simply obtained by sampling from the priors:
A <- rnorm(100, 7.87, 17.35)
B <- rnorm(100, 7.87, 17.35)
C <- rnorm(100, 7.87, 17.35)
D <- rnorm(100, 7.87, 17.35)
par(mar = c(4, 4, 1, 1)) # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")
[Figure: horizontal boxplots of 100 prior draws for each school]
To represent the posteriors, we assume σi²/ni = si²/ni and draw from the posteriors:
y.bar <- c(28, 8, -3, 7)
se2 <- c(15, 10, 16, 11)
theta.0 <- rep( 7.87, 4)
sigma.0 <- rep(17.35, 4)
post.means <- (sigma.0^2 / (sigma.0^2 + se2)) * y.bar +
( se2 / (sigma.0^2 + se2)) * theta.0
post.sd <- sqrt(sigma.0^2 * se2 / (sigma.0^2 + se2))
A <- rnorm(100, post.means[1], post.sd[1])
B <- rnorm(100, post.means[2], post.sd[2])
C <- rnorm(100, post.means[3], post.sd[3])
D <- rnorm(100, post.means[4], post.sd[4])
par(mar = c(4, 4, 1, 1)) # tighter margins
boxplot(A, B, C, D, horizontal = TRUE, col = "lightblue")
[Figure: horizontal boxplots of 100 posterior draws for each school]
3 Bayesian regression
In this section we look at the familiar regression model Yi ∼ N(α + β(xi − x̄), σ²) from a Bayesian
perspective. To this end, one can consider priors α ∼ N(µα, σα²) and β ∼ N(µβ, σβ²), while σ
is regarded as known. The function bayes.lin.reg() computes posterior distributions for α
and β, and it lets us specify either specific parameters for the priors or improper uniform
('flat') priors. If unspecified, the regression variance σ² is estimated as usual from the residuals.
Consider 11 measurements on heart rate vs oxygen uptake.
O2 <- c(0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23)
HR <- c(94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131)
plot(HR, O2, xlab = "Heart Rate", ylab = "Oxygen uptake (Percent)")
[Figure: scatter plot of oxygen uptake (percent) against heart rate]
x <- HR - mean(HR)
y <- O2
coef(summary(lm(y ~ x))) # usual regression fit
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.30727273 0.039282918 33.27840 9.840739e-11
## x 0.04265141 0.003379797 12.61952 5.009294e-07
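The centred parameterisation makes these estimates easy to verify by hand: the intercept estimate is ȳ and the slope is Sxy/Sxx. A quick check (Python, for illustration):

```python
# Least-squares fit for y = O2 on centred x = HR - mean(HR):
# intercept = ybar, slope = Sxy / Sxx.
HR = [94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131]
O2 = [0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23]
xbar = sum(HR) / len(HR)
ybar = sum(O2) / len(O2)
sxx = sum((x - xbar) ** 2 for x in HR)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(HR, O2))
print(round(ybar, 7), round(sxy / sxx, 7))  # 1.3072727 0.0426514
```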
Next compute posterior distributions for the parameters under flat and informative priors.
Explore the effect of different choices of prior on the final inference for the regression slope.
1. Fit the Bayesian regression model with flat priors.
bayes.lin.reg(y = O2, x = HR, slope.prior = "flat", intcpt.prior = "flat")
## Standard deviation of residuals: 0.13
## Posterior Mean Posterior Std. Deviation
## -------------- ------------------------
## Intercept: 1.305 0.039253
## Slope: 0.04265 0.0033798
[Figure: prior, likelihood and posterior for β (slope); prior, likelihood and posterior for α (intercept); predictions with 95% bounds]
2. Re-do it with prior parameters µα = 1.2, σα = 0.05, µβ = 0.04, σβ = 0.005:
bayes.lin.reg(O2, HR, slope.prior = "n", intcpt.prior = "n",
ma0 = 1.2, mb0 = 0.04, sa0 = 0.05, sb0 = 0.005)
## Standard deviation of residuals: 0.13
## Posterior Mean Posterior Std. Deviation
## -------------- ------------------------
## Intercept: 1.266 0.03089
## Slope: 0.04182 0.0028001
[Figure: prior, likelihood and posterior for β (slope); prior, likelihood and posterior for α (intercept); predictions with 95% bounds]
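Assuming bayes.lin.reg() combines the normal prior with the least-squares summary in the standard conjugate way (precisions add, and the posterior mean is the precision-weighted average of the prior mean and the estimate), the slope results above can be reproduced by hand. The sketch below (Python, for illustration) uses the slope estimate and standard error from the flat-prior fit:

```python
# Conjugate normal update for the slope: prior N(mu0, s0^2) combined
# with the likelihood summary N(b_hat, se^2) from the flat-prior fit.
b_hat, se = 0.04265, 0.0033798   # slope estimate and standard error
mu0, s0 = 0.04, 0.005            # informative prior from part 2
prec = 1 / se**2 + 1 / s0**2
post_mean = (b_hat / se**2 + mu0 / s0**2) / prec
post_sd = prec ** -0.5
print(round(post_mean, 5), round(post_sd, 7))  # ~0.04182 ~0.0028001
```

This matches the posterior mean 0.04182 and standard deviation 0.0028001 reported by bayes.lin.reg().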
Exercises
1. Consider question 2 from the tutorial problems. For both choices of prior, do the following:
(a) Draw a plot showing the prior and posterior distributions.
(b) Calculate a 95% credible interval.
(c) For the drug to be commercially successful, you need it to be effective in at least
3% of patients. What is your posterior probability that it will be successful?
2. Consider part 6 from Section 1. Estimate, via simulations, the coverage of the two interval
estimates for n = 20 and θ ∈ {0.05, 0.10, . . . , 0.95}. Draw a plot comparing these results.
What do you notice? What are the relative widths of these intervals?