CS计算机代考程序代写 algorithm STAT314/461 Semester 2, 2021

STAT314/461 Semester 2, 2021
Assignment 4: Probability revision; rejection sampling;

importance sampling; Metropolis algorithm

Instructions:

1. Total Marks: This assessment will be worth up to 10% of your final
grade (7.5% for STAT461 students). Note that although the marks for
the questions below total to 25 the maximum mark that can be earned
for the assignment is 10. It is up to you which questions you submit for
assessment. Keep in mind the exam at the end of the course – working
through these questions is good practice / revision even if they don’t
directly contribute to your grade for the assignment. You are welcome
to submit all the questions

2. Due date: Due 11:55pm Wednesday October 6th 2018.

3. Your R code should be included with the submission as a separate
file(s). So if you are using R markdown to prepare your assignment
please include the .Rmd file as well as the html or pdf produced by
knitr. Otherwise include your .r file along with a pdf containing text
and graphics.

4. While you are welcome to discuss the assignment with other 314/461
students you must ultimately submit a document that persuades the
marker that you, individually, understand the material.

1. Probability practice
Suppose X and Y are discrete random variables that are exchangeable.
That is

Pr(X = a, Y = b) = Pr(X = b, Y = a),∀a, b.

Show that the marginal distributions of X and Y are identical. [ 2
marks]
( This may seem a bit obscure but we will find this result useful when
proving the convergence of the Metropolis-Hastings algorithm )

2. Likelihood and Rejection sampling Consider the simple but non-
conjugate model for the births data given in the R code contained in

births2021_assignment4.R

Yi
indep∼ Poisson(θNi), i = 1, . . . 16

θ ∼ logNormal(−4.5, 0.59) (1)

where Yi is the number of births for women aged 15-19 in the i
th region

and Ni is the number of women aged 15-19, in the i
th region. We

regard the population sizes as known constants. Under this model
E(Yi/Ni|θ) = θ, i = 1, . . . 16 That is, under the model, the expected
birth rate is identical for each of the regions and is equal to θ. The
parameter θ is sometimes referred to as the underlying rate. Under the
model the expected number of births in the ith region is E(Yi|θ) = θNi,
which varies over the regions on account of variations in population
size. The Poisson probability mass function in this case is

Pr(Yi = y|θ) = (θNi)y exp(−θNi)/y!

which can be computed in R with the dpois() function using

Pr(Yi = y|θ) = dpois(x = y, lambda = (θNi)),

with the addition of the

log=TRUE

option returning the log of the probability mass function.

logNormal() refers to the log−normal distribution which can be ac-
cessed using the lnorm family of functions in R; e.g. rlnorm(n =
1000,meanlog = 1, sdlog = 1) generates 1000 values from the log-
normal distribution with parameters (1, 1).

(a) Write a function in R to compute the log-likelihood, for any value
of θ based on the births data, and plot the log-likelihood for a
range of θ values spanning the MLE. [3 marks].
(Hint: the log-likelihood will be the sum of log-likelihood contri-
butions from each of the 16 regions, so you will need to include
a sum() statement at the end of your function in order to ensure
the overall log-likelihood for the data is returned). It would be
good practice to make the parameter θ the first argument of the
function.

(b) Write a R program to implement rejection sampling to approx-
imate the posterior for θ, using the prior as the approximating
density [3 marks]

(c) Write a R program to implement rejection sampling to approxi-
mate the posterior for θ, using a Gamma density as the approxi-
mating density [5 marks]

(d) Plot the posterior density for θ based on the sample obtained from
at least one of you rejection sampler methods with the approxi-
mating density shown also. [2 marks]

3. Importance sampling

(a) Write a progam to implement an importance sampler to simulate a
t3(0, 1) density, using a N(0, 1) as the approximating density. [3 marks]

(b) Compute the effective Monte Carlo sample size Neff [2 marks]

(c) Contrast the importance sampling estimates of mean, standard devia-
tion, and proportion less than -2.35 and contrast these with the true
values of the corresponding features of the t3(0, 1) distribution [2 marks]

(d) In light of your answers to (b) and (c) comment on the adequacy of
the choice of a N(0, 1) as the approximating density for the importance
sampler. [1 mark]

4. Metropolis algorithm for posterior inference
Write a R program to implement a Metropolis algorithm to obtain a
sample from the posterior distribution of the parameters of a logistic
regression model fitted to data from the Income Survey. You can use
the binary outcome Y = (income > $1, 250 per week) discussed in
lectures or another variable of your choosing. However, your model
should include at least two predictors and “make sense.” For example
I don’t want to see models predicting male sex from age and region.

A shell program has been provided to help get you going. This code is
in:

MH_examples2021_logisticregression_shell.r

This program reads in the Income Survey data and functions for com-
puting the log-likelihood and log- unnormalised posterior for a logistic
regression model as well as functions to help with setting priors for
logistic regression model parameters. In the main code you will need
to replace they symbol “…” with appropriate choices. In the course of
so-doing you will

(a) specify a proper prior for the parameters of your model.

(b) make clear how you choose your initial values

(d) decide whether your Metropolis algorithm has converged to the
target posterior distribution and produce evidence that the algo-
rithm has in fact converged.

(e) produce some posterior inferences for one parameter of interest in
your model: posterior density plot for the odds ratio correspond-
ing to the parameter of interest; 90% credible interval for the odds
ratio; the posterior probability that the odds ratio exceeds one.
[10 marks]

Related Posts