
Rejection sampling: Non-conjugate Binomial example


7 September 2018

The analysis uses the MCMCpack, arm, logitnorm and LearnBayes packages; loading them also attaches coda, Matrix and lme4. Note that logitnorm masks invlogit() and logit() from arm, and LearnBayes masks rdirichlet() from MCMCpack.
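A minimal reconstruction of the setup chunk that produces these attach messages (assuming the packages are installed) is:

library(MCMCpack)   # also loads coda
library(arm)        # also loads Matrix and lme4; provides logit() and invlogit()
library(logitnorm)  # provides dlogitnorm(); masks invlogit() and logit() from arm
library(LearnBayes) # masks rdirichlet() from MCMCpack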

Illustration of the likelihood as a function of the model parameter.

Before looking at rejection sampling we illustrate the idea of the likelihood as a function of the model parameter. Consider a study comprising 6 Binomial experiments, each with N = 10 trials; the data are entered below.
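With a single success probability \(\theta\) shared across the experiments, the likelihood and log-likelihood used throughout are (stated here for reference)

\[
\begin{aligned}
L(\theta) &= \prod_{i=1}^{6}\binom{N_{i}}{Y_{i}}\theta^{Y_{i}}(1-\theta)^{N_{i}-Y_{i}}, \\
\log L(\theta) &= \sum_{i}Y_{i}\log\theta + \sum_{i}(N_{i}-Y_{i})\log(1-\theta) + \text{constant},
\end{aligned}
\]

so the only data summaries the likelihood needs are \(\sum_{i}Y_{i}\) and \(\sum_{i}(N_{i}-Y_{i})\), which are computed below.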

Y <- c(6,7,5,5,4,8)
N <- rep(10, length(Y))

We compute some summaries typically involved in Bayesian calculations.

NmY <- N - Y
sumY <- sum(Y)
sumNmY <- sum(N - Y)
sumY

## [1] 35

sumNmY

## [1] 25

Now we set up a function to compute the log-likelihood. When doing numerical work with likelihoods it is usually best to work on the (natural) log scale.

loglike_binom <- function(theta, Y, N) {
  logl <- dbinom(Y, N, prob = theta, log = TRUE)
  return(sum(logl))
}

## Evaluate at a few points
loglike_binom(theta = 0.4, Y = Y, N = N)

## [1] -14.49359

loglike_binom(theta = 0.1, Y = Y, N = N)

## [1] -52.87726

loglike_binom(theta = 0.8, Y = Y, N = N)

## [1] -17.69874

### Plot the log-likelihood for a range of success probabilities
testtheta <- seq(from = 0.01, to = 0.99, by = 0.01)  # range of theta values to test
## set up a structure to store the log-likelihoods for plotting
loglstore <- vector(mode = "numeric", length = length(testtheta))
## loop over theta values and compute the corresponding log-likelihood
for (i in 1:length(testtheta)) {
  loglstore[i] <- loglike_binom(theta = testtheta[i], Y = Y, N = N)
}
loglstore

##  [1] -131.08499 -107.07865  -93.14378  -83.33398  -75.78573  -69.66903
##  [7]  -64.54114  -60.13782  -56.28864  -52.87726  -49.82074  -47.05783
## [13]  -44.54205  -42.23729  -40.11495  -38.15196  -36.32950  -34.63199
## [19]  -33.04639  -31.56169  -30.16850  -28.85878  -27.62555  -26.46277
## [25]  -25.36513  -24.32798  -23.34721  -22.41917  -21.54063  -20.70869
## [31]  -19.92077  -19.17453  -18.46790  -17.79900  -17.16612  -16.56774
## [37]  -16.00249  -15.46911  -14.96648  -14.49359  -14.04953  -13.63347
## [43]  -13.24470  -12.88255  -12.54647  -12.23593  -11.95052  -11.68986
## [49]  -11.45363  -11.24160  -11.05358  -10.88943  -10.74908  -10.63251
## [55]  -10.53976  -10.47093  -10.42619  -10.40574  -10.40987  -10.43894
## [61]  -10.49336  -10.57363  -10.68032  -10.81410  -10.97573  -11.16605
## [67]  -11.38605  -11.63682  -11.91958  -12.23572  -12.58679  -12.97456
## [73]  -13.40098  -13.86829  -14.37900  -14.93597  -15.54244  -16.20211
## [79]  -16.91925  -17.69874  -18.54629  -19.46852  -20.47323  -21.56968
## [85]  -22.76893  -24.08440  -25.53247  -27.13353  -28.91333  -30.90502
## [91]  -33.15229  -35.71434  -38.67425  -42.15368  -46.34134  -51.55344
## [97]  -58.38279  -68.16044  -85.13379

### Plot the log-likelihood
plot(testtheta, loglstore, main = "log-likelihood for Binomial example")
## clearer if we plot a line? It is a continuous function after all
plot(testtheta, loglstore, type = "l", lwd = 2.0, main = "log-likelihood for Binomial example")

Rejection sampler for a non-conjugate Binomial problem

\[
\begin{aligned}
Y_{i} & \stackrel{\mathrm{indep}}{\sim} \mathrm{Binomial}(10,\theta) \\
\mathrm{logit}(\theta) & \sim \mathrm{Normal}(\mu,\sigma^{2}) \\
p(\theta|\mathbf{Y},\mathbf{N}) & \propto \prod_{i}\mathrm{Binomial}(Y_{i}|10,\theta) \times \mathrm{logitN}(\theta|\mu,\sigma^{2})
\end{aligned}
\]

To get familiar with the logit-normal prior we plot its density for a few different parameter combinations.

## first define a function to simulate from the logit-normal
rlogitnormal <- function(n, mu, sigma) {
  require(arm)
  l <- rnorm(n = n, mean = mu, sd = sigma)
  theta <- invlogit(l)
  return(theta)
}
## specify the parameters of the prior
mu_prior <- logit(0.5)
sigma_prior <- 5
### Plot the prior
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta, from = 0, to = 1)
plot(density_prior, main = paste("logitN(", round(mu_prior, 2), ",", sigma_prior, ")", sep = ""))
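Since the logitnorm package is attached, the prior density can also be evaluated exactly rather than approximated from simulated draws, using dlogitnorm() with the same argument style as the calls further below; a minimal sketch, assuming mu_prior and sigma_prior as just set:

## sketch only: exact logit-normal prior density on a grid
theta_grid <- seq(from = 0.001, to = 0.999, by = 0.001)
prior_dens <- dlogitnorm(q = theta_grid, mu = mu_prior, s = sigma_prior)
plot(theta_grid, prior_dens, type = "l", lwd = 2,
     main = paste("exact logitN(", round(mu_prior, 2), ",", sigma_prior, ") density", sep = ""))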
summary(testtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.00000 0.03281 0.51585 0.50217 0.96748 1.00000

### Try another set of parameters
mu_prior <- logit(0.5)
sigma_prior <- 1
### Plot the prior
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta, from = 0, to = 1)
plot(density_prior, main = paste("logitN(", round(mu_prior, 2), ",", sigma_prior, ")", sep = ""))
summary(testtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.03173 0.33327 0.49800 0.49731 0.66036 0.98840

Rejection sampling. If the approximating density is the prior, the ratio \(r(\theta)\) of the unnormalized posterior to the approximating density is just the likelihood. Hence we accept a proposed draw with probability likelihood/M, where M is the maximum value of the likelihood, i.e. its value at the MLE. It is easier to work on the log scale, so we find \(\log(M)\) by computing the log-likelihood with \(\theta\) set to the MLE, which is \(\hat{\theta}=\sum_{i}Y_{i}/\sum_{i}N_{i}\).

logM <- loglike_binom(theta = (sum(Y)/sum(N)), Y = Y, N = N)
## specify the prior
mu_prior <- logit(0.5)
sigma_prior <- 5
## set up the rejection sampling loop
simnum <- 2000  # desired simulation sample size
storetheta <- vector(mode = "numeric", length = simnum)
accepted <- 0
rejected <- 0
while (accepted < simnum) {
  theta_prop <- rlogitnormal(n = 1, mu = mu_prior, sigma = sigma_prior)
  ## evaluate the likelihood at this proposal
  llike_prop <- loglike_binom(Y = Y, N = N, theta = theta_prop)
  U <- runif(1)
  if (log(U) < llike_prop - logM) {
    accepted <- accepted + 1
    storetheta[accepted] <- theta_prop
  } else {
    rejected <- rejected + 1
  }
}
summary(storetheta)

##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.3771  0.5416  0.5843  0.5835  0.6275  0.7553

plot(density(storetheta))
accepted

## [1] 2000

rejected

## [1] 35058

acceptance_rate <- accepted/(accepted + rejected)
acceptance_rate

## [1] 0.05396945

### plot the posterior density with the prior superimposed,
### to illustrate ideas around the quality of the approximating density
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta, from = 0, to = 1)
plot(density(storetheta), main = "rejection sampler logit-normal(0,5) prior", lwd = 2, xlim = c(0, 1))
lines(density_prior, lwd = 2, col = "red")

Now we repeat all this with an alternative, less diffuse prior, and again use the prior as the approximating density.

## specify the parameters of the prior
mu_prior <- logit(0.5)  # logit(0.5) = 0
sigma_prior <- 1
### Plot the prior
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta, from = 0, to = 1)
plot(density_prior, main = paste("logitN(", round(mu_prior, 2), ",", sigma_prior, ")", sep = ""))
summary(testtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.02314 0.33660 0.49922 0.49927 0.66422 0.97089

## Find log(M), the log of the maximum value of the ratio of the unnormalized
## posterior density to the approximating density
logM <- loglike_binom(theta = (sum(Y)/sum(N)), Y = Y, N = N)  # working with the unnormalized density
simnum <- 2000  # desired simulation sample size
storetheta <- vector(mode = "numeric", length = simnum)
accepted <- 0
rejected <- 0
while (accepted < simnum) {
  theta_prop <- rlogitnormal(n = 1, mu = mu_prior, sigma = sigma_prior)
  ## evaluate the likelihood at this proposal
  llike_prop <- loglike_binom(Y = Y, N = N, theta = theta_prop)
  U <- runif(1)
  if (log(U) < llike_prop - logM) {
    accepted <- accepted + 1
    storetheta[accepted] <- theta_prop
  } else {
    rejected <- rejected + 1
  }
}
summary(storetheta)

##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.3704  0.5366  0.5781  0.5768  0.6196  0.7594

plot(density(storetheta))
accepted

## [1] 2000

rejected

## [1] 5949

acceptance_rate <- accepted/(accepted + rejected)
acceptance_rate

## [1] 0.251604

### plot the posterior density with the prior superimposed,
### to illustrate ideas around the quality of the approximating density
plot(density(storetheta), main = "rejection sampler logit-normal(0,1) prior", lwd = 2, xlim = c(0, 1))
lines(density_prior, lwd = 2, col = "red")

Using a Beta approximation to the posterior as the approximating density. We consider again the diffuse logit-normal prior.

## go back to the original prior
mu_prior <- logit(0.5)
sigma_prior <- 5
## re-draw the logit-normal sample from the prior
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta)
plot(density(testtheta, from = 0, to = 1), main = paste("logitN(", round(mu_prior, 2), ",", sigma_prior, ")", sep = ""))
summary(testtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.00000 0.03154 0.49078 0.49650 0.96405 1.00000

We can approximate the logit-normal prior with a Beta distribution with the same mean and variance: matching the Beta mean \(m=a/(a+b)\) and variance \(v=m(1-m)/(a+b+1)\) to the sample mean and variance gives \(a=m\{m(1-m)/v-1\}\) and \(b=a(1-m)/m\), which is what the code below computes.

mean_theta <- mean(testtheta)
var_theta <- (sd(testtheta))^2
mean_theta

## [1] 0.4965012

var_theta

## [1] 0.1738804

a <- (mean_theta^2*(1 - mean_theta) - var_theta*mean_theta)/var_theta
b <- a*(1 - mean_theta)/mean_theta
a

## [1] 0.2173184

b

## [1] 0.2203812

### plot the beta density with these values
## could do this exactly using dbeta but we will simulate
apprxtheta <- rbeta(10000, a, b)
summary(apprxtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.00000 0.03064 0.48766 0.49498 0.96309 1.00000

mean(apprxtheta)

## [1] 0.4949783

(sd(apprxtheta))^2

## [1] 0.1734506

density_prior_beta <- density(apprxtheta, from = 0, to = 1)
plot(density_prior_beta, main = paste("beta(", round(a, 2), ",", round(b, 2), ")", sep = ""))

We can construct a beta approximation to the posterior using the standard conjugate updating formulae for the binomial-beta model.

apos <- a + sumY     # a + sum_i Y_i
bpos <- b + sumNmY   # b + sum_i (N_i - Y_i)
## check the approximate posterior mean
apos/(apos + bpos)

## [1] 0.5827045

pos_approx <- rbeta(10000, apos, bpos)
density_pos_betaapprx <- density(pos_approx)
plot(density(pos_approx))

We need to find the maximum of \(\mathrm{Binomial}(\mathbf{Y}|\mathbf{N},\theta)\,\mathrm{logitN}(\theta|\mu,\sigma^{2}) / \mathrm{beta}(\theta|\mathrm{apos},\mathrm{bpos})\) on the log scale. We take a simple brute-force approach by evaluating \(\log r(\theta)\) over the parameter space, which is just \((0,1)\).

### set up a function for logr()
logr <- function(phi) {
  rtheta <- (sum(Y)*log(phi) + sum(N - Y)*log(1 - phi) +
             dlogitnorm(q = phi, mu = mu_prior, s = sigma_prior, log = TRUE)) -
            dbeta(phi, shape1 = apos, shape2 = bpos, log = TRUE)
  return(rtheta)
}
## evaluate logr over the (0,1) interval
theta_test <- seq(from = 0.001, to = 0.999, by = 0.0001)
logrtheta <- logr(phi = theta_test)
summary(logrtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## -238.37  -66.69  -50.47  -60.26  -43.79  -41.87

plot(theta_test, logrtheta)
which.max(logrtheta)  # position in the logrtheta vector at which the maximum value is found

## [1] 5851

theta_maxr <- theta_test[which.max(logrtheta)]  # value of theta which maximises logrtheta
logM2 <- logrtheta[which.max(logrtheta)]
theta_maxr

## [1] 0.586

logM2

## [1] -41.86694

## check: logM2 should equal the value of logr evaluated at theta_maxr
logr(theta_maxr)

## [1] -41.86694
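As a quick cross-check on the brute-force grid search, the maximum of \(\log r(\theta)\) can also be located numerically with optimize(); a minimal sketch (not part of the original analysis), assuming the logr() function just defined:

## cross-check: maximise logr over (0,1) numerically
opt <- optimize(logr, interval = c(0.001, 0.999), maximum = TRUE)
opt$maximum    # should be close to theta_maxr, about 0.586
opt$objective  # should be close to logM2, about -41.87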
Now that we know \(\log M\) we can set up the rejection sampler.

### Set up the rejection sampler
simnum <- 2000  # desired simulation sample size
storetheta2 <- vector(mode = "numeric", length = simnum)
accepted2 <- 0
rejected2 <- 0
while (accepted2 < simnum) {
  theta_prop <- rbeta(n = 1, shape1 = apos, shape2 = bpos)
  ## evaluate the log importance ratio at this proposal
  logr_prop <- logr(phi = theta_prop)
  U <- runif(1)
  if (log(U) < logr_prop - logM2) {
    accepted2 <- accepted2 + 1
    storetheta2[accepted2] <- theta_prop
  } else {
    rejected2 <- rejected2 + 1
  }
}
summary(storetheta2)

##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.4185  0.5546  0.5846  0.5843  0.6143  0.7632

plot(density(storetheta2))
accepted2

## [1] 2000

rejected2

## [1] 779

acceptance_rate2 <- accepted2/(accepted2 + rejected2)
acceptance_rate2

## [1] 0.7196833

## superimpose the density plot for the beta approximation to illustrate that it is a good approximator
plot(density(storetheta2), main = "rejection sampler using beta density as the approximator",
     xlim = c(0, 1), lwd = 2)
lines(density_pos_betaapprx, lwd = 2, col = "red")

Next we repeat the beta-approximation approach for the model with a \(\mathrm{logitN}(0,1)\) prior.

mu_prior <- logit(0.5)
sigma_prior <- 1
## re-draw the logit-normal sample from the prior
testtheta <- rlogitnormal(n = 10000, mu = mu_prior, sigma = sigma_prior)
density_prior <- density(testtheta)
plot(density(testtheta, from = 0, to = 1), main = paste("logitN(", round(mu_prior, 2), ",", sigma_prior, ")", sep = ""))
summary(testtheta)

##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.0191  0.3359  0.5030  0.4992  0.6590  0.9824

### obtain a beta distribution with the same mean and variance as the logit-normal sample
mean_theta <- mean(testtheta)
var_theta <- (sd(testtheta))^2
mean_theta

## [1] 0.4992291

var_theta

## [1] 0.04363631

a <- (mean_theta^2*(1 - mean_theta) - var_theta*mean_theta)/var_theta
b <- a*(1 - mean_theta)/mean_theta
a

## [1] 2.360934

b

## [1] 2.368226

### plot the beta density with these values
## could do this exactly using dbeta but we will simulate
apprxtheta <- rbeta(10000, a, b)
summary(apprxtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.01475 0.33750 0.49697 0.49981 0.66027 0.99415

mean(apprxtheta)

## [1] 0.4998055

(sd(apprxtheta))^2

## [1] 0.0441892

density_prior_beta <- density(apprxtheta, from = 0, to = 1)
plot(density_prior_beta, main = paste("beta(", round(a, 2), ",", round(b, 2), ")", sep = ""))

### beta approximation to the posterior -- this will be the approximating density
### for the rejection sampler.  That is, we approximate the posterior with a
### distribution that could plausibly have been the posterior had a conjugate
### prior been adopted.
apos <- a + sumY     # a + sum_i Y_i
bpos <- b + sumNmY   # b + sum_i (N_i - Y_i)
## check the approximate posterior mean
apos/(apos + bpos)

## [1] 0.5771886

pos_approx <- rbeta(10000, apos, bpos)
density_pos_betaapprx <- density(pos_approx)
plot(density(pos_approx))

## Need to find the maximum of Binomial(Y|N,theta)*logitN(theta|mu,sigma) / beta(theta|apos,bpos),
## on the log scale.
## Function to compute logr
logr <- function(phi) {
  rtheta <- (sum(Y)*log(phi) + sum(N - Y)*log(1 - phi) +
             dlogitnorm(q = phi, mu = mu_prior, s = sigma_prior, log = TRUE)) -
            dbeta(phi, shape1 = apos, shape2 = bpos, log = TRUE)
  return(rtheta)
}
## evaluate logr over the (0,1) interval
theta_test <- seq(from = 0.001, to = 0.999, by = 0.0001)
logrtheta <- logr(phi = theta_test)
summary(logrtheta)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## -259.66  -66.91  -49.49  -60.18  -42.36  -40.31

plot(theta_test, logrtheta)
which.max(logrtheta)  # position in the logrtheta vector at which the maximum value is found

## [1] 5797

theta_maxr <- theta_test[which.max(logrtheta)]  # value of theta which maximises logrtheta
logM2 <- logrtheta[which.max(logrtheta)]
theta_maxr

## [1] 0.5806

logM2

## [1] -40.31172

## check: logM2 should equal the value of logr evaluated at theta_maxr
logr(theta_maxr)

## [1] -40.31172

Now that we know \(\log M\) we can set up the rejection sampler.

simnum <- 2000  # desired simulation sample size
storetheta3 <- vector(mode = "numeric", length = simnum)
accepted3 <- 0
rejected3 <- 0
while (accepted3 < simnum) {
  theta_prop <- rbeta(n = 1, shape1 = apos, shape2 = bpos)
  ## evaluate the log importance ratio at this proposal
  logr_prop <- logr(phi = theta_prop)
  U <- runif(1)
  if (log(U) < logr_prop - logM2) {
    accepted3 <- accepted3 + 1
    storetheta3[accepted3] <- theta_prop
  } else {
    rejected3 <- rejected3 + 1
  }
}
summary(storetheta3)

##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.4442  0.5486  0.5798  0.5787  0.6088  0.7089

plot(density(storetheta3))
accepted3

## [1] 2000

rejected3

## [1] 782

acceptance_rate3 <- accepted3/(accepted3 + rejected3)
acceptance_rate3

## [1] 0.7189073

## superimpose the density plot for the beta approximation to illustrate that it is a good approximator
plot(density(storetheta3), main = "rejection sampler using beta density as the approximator",
     xlim = c(0, 1), lwd = 2)
lines(density_pos_betaapprx, lwd = 2, col = "red")
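Since the same accept/reject loop appears three times above, it can be packaged as a reusable helper. The sketch below is illustrative only (the function name rejection_sample and its arguments are not from the original code); it assumes a proposal simulator rpropose(), a function log_ratio() returning the log of the ratio of the unnormalized posterior to the proposal density, and the log of that ratio's maximum, logM.

## generic rejection sampler (illustrative sketch): propose from rpropose(),
## accept with probability exp(log_ratio(theta) - logM)
rejection_sample <- function(simnum, rpropose, log_ratio, logM) {
  draws <- numeric(simnum)
  accepted <- 0
  rejected <- 0
  while (accepted < simnum) {
    theta_prop <- rpropose(1)
    if (log(runif(1)) < log_ratio(theta_prop) - logM) {
      accepted <- accepted + 1
      draws[accepted] <- theta_prop
    } else {
      rejected <- rejected + 1
    }
  }
  list(draws = draws, acceptance_rate = accepted/(accepted + rejected))
}

## example: reproduce the final beta-proposal sampler
## (assumes logr, apos, bpos and logM2 from above are in the workspace)
fit <- rejection_sample(simnum = 2000,
                        rpropose = function(n) rbeta(n, shape1 = apos, shape2 = bpos),
                        log_ratio = logr,
                        logM = logM2)
summary(fit$draws)
fit$acceptance_rate

Written this way, the three examples differ only in the proposal and the ratio: the acceptance rate rises from about 5% (diffuse logitN(0,5) prior as proposal) to about 25% (tighter logitN(0,1) prior) to about 72% (beta approximation to the posterior), reflecting how closely the approximating density matches the posterior.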