STAT6106
Assignment 3 (Chapters 7–9)
Due date: 6th Dec, 2021 (Please submit online on Blackboard)
1. (40 marks) Let $X_1, \dots, X_n$ be a random sample from a measurement error model, so that
$X_i = \theta + \epsilon_i$, $i = 1, \dots, n$, where the $\epsilon_i$ are measurement errors that have mean zero
and are unimodal and symmetric about 0. The parameter $\theta$ is of interest. For robustness,
the model is specified as follows:
$X_i \sim N(\theta, \sigma^2)$
$\theta \sim \text{Cauchy}(\mu, \tau)$
where $\sigma^2$, $\mu$ and $\tau$ are chosen hyperparameter values.
Remark: $Y \sim \text{Cauchy}(\mu, \tau)$ implies that $Y$ has pdf
$f(y) = \frac{1}{\pi\tau} \cdot \frac{\tau^2}{\tau^2 + (y - \mu)^2}$, $-\infty < y < \infty$.
Remark: $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$.
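a. Show that if
$\theta \mid \lambda \sim N\left(\mu, \frac{\tau^2}{\lambda}\right)$ and $\lambda \sim \text{Gamma}\left(\frac{1}{2}, \frac{1}{2}\right)$,
then the marginal distribution of $\theta$ is
$\theta \sim \text{Cauchy}(\mu, \tau)$.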
b. The posterior distribution of $\theta$ is relatively difficult to simulate from directly because it
does not have a standard distributional form. However, the result of part a. can be used to
construct a Gibbs sampler that simulates from the posterior distribution of $\theta$. Please write down the
full conditional distributions of $\theta \mid \lambda, x_1, \dots, x_n$ and $\lambda \mid \theta, x_1, \dots, x_n$, which are the
building blocks of the Gibbs sampler.
c. Your observations $\{x_1, \dots, x_{10}\}$ are {2.909, 1.756, 2.536, -0.373, 0.914, -0.039, 0.411,
1.459, 3.779, 2.534}. Choose $\sigma^2 = 1$, $\mu = 0$, $\tau = 1$.
i. Run the Gibbs sampler to generate an MCMC sample from the posterior
distribution of $\theta$.
ii. Approximate a 95% equal-tailed credible interval for $\theta$ based on your
MCMC sample.
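For Question 1, parts b and c, the following is a minimal R sketch of such a Gibbs sampler, not an official solution. It assumes that Gamma(1/2, 1/2) is in the shape-rate parameterization (the one under which the mixture in part a. yields a Cauchy), and it uses the full conditionals that follow from that assumption: $\theta \mid \lambda, x \sim N(m, 1/q)$ with $q = n/\sigma^2 + \lambda/\tau^2$ and $m = (\sum_i x_i/\sigma^2 + \lambda\mu/\tau^2)/q$, and $\lambda \mid \theta, x \sim \text{Gamma}(1, (1 + (\theta - \mu)^2/\tau^2)/2)$. The function name `gibbs_cauchy`, the chain length, the burn-in and the seed are illustrative choices, not part of the assignment.

```r
# Data and hyperparameters from part c.
x <- c(2.909, 1.756, 2.536, -0.373, 0.914, -0.039, 0.411, 1.459, 3.779, 2.534)
sigma2 <- 1
mu <- 0
tau <- 1

# Hypothetical helper: Gibbs sampler for (theta, lambda) under the
# normal scale-mixture representation of the Cauchy prior.
# Assumption: Gamma(1/2, 1/2) means shape = 1/2, rate = 1/2.
gibbs_cauchy <- function(x, sigma2, mu, tau,
                         n_iter = 10000, burn_in = 1000, theta0 = mean(x)) {
  n <- length(x)
  theta <- numeric(n_iter)
  lambda <- numeric(n_iter)
  theta_cur <- theta0
  for (t in seq_len(n_iter)) {
    # lambda | theta: Gamma with shape 1 and rate (1 + (theta - mu)^2 / tau^2) / 2
    lambda_cur <- rgamma(1, shape = 1,
                         rate = (1 + (theta_cur - mu)^2 / tau^2) / 2)
    # theta | lambda, x: normal with precision n / sigma2 + lambda / tau^2
    prec <- n / sigma2 + lambda_cur / tau^2
    m <- (sum(x) / sigma2 + lambda_cur * mu / tau^2) / prec
    theta_cur <- rnorm(1, mean = m, sd = sqrt(1 / prec))
    theta[t] <- theta_cur
    lambda[t] <- lambda_cur
  }
  keep <- (burn_in + 1):n_iter  # discard the burn-in draws
  list(theta = theta[keep], lambda = lambda[keep])
}

set.seed(6106)  # arbitrary seed, for reproducibility only
draws <- gibbs_cauchy(x, sigma2, mu, tau)

# 95% equal-tailed credible interval from the retained draws
ci <- quantile(draws$theta, probs = c(0.025, 0.975))
print(ci)
```

A trace plot such as `plot(draws$theta, type = "l")` is one simple way to check that the retained draws look stationary before reporting the interval.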
2. (40 marks) In the lecture notes for Chapter 8, we used Zellner's g-prior to fit a
linear regression model to the Oxygen uptake data. In this question, please fit the linear
regression model (consider the full model only) using the semi-conjugate prior suggested on
p.12-13 of the lecture notes. Please assume $\Sigma_0$ to be diagonal, i.e. the $\beta$'s are a priori
independent. Please provide summaries of the posterior distributions of the $\beta$'s and
provide some diagnostics to justify your MCMC simulation results. Please do this
exercise by performing Gibbs sampling in the following two ways:
a. By writing an R program
b. By using the rjags package
3. (20 marks) A common assumption when modelling genotypes of bi-allelic loci (e.g. loci with
alleles A and a) is that the population is "randomly mating". From this assumption it follows
that if $p$ and $1-p$ are the frequencies of the alleles A and a respectively, then the
genotypes AA, Aa and aa will have frequencies $p^2$, $2p(1-p)$ and $(1-p)^2$ respectively.
Now suppose that we sample 100 independent individuals and find that 45, 25 and 30 of them are
of genotypes AA, Aa and aa respectively. Consider a U(0,1) prior for $p$.
Use R to write a Metropolis algorithm to approximate the posterior distribution of $p$. Give a
summary of your simulation results. Produce some plots to check the convergence of your
algorithm.
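For Question 3, the following is a minimal R sketch of a random-walk Metropolis sampler, not an official solution. With the U(0,1) prior, the posterior is proportional to $(p^2)^{45}\,(2p(1-p))^{25}\,((1-p)^2)^{30} \propto p^{115}(1-p)^{85}$, which is a Beta(116, 86) kernel and so provides an exact check on the simulation. The names `log_post` and `metropolis_p`, the normal proposal with standard deviation `step = 0.05`, the starting value, the chain length, the burn-in and the seed are all illustrative assumptions.

```r
# Genotype counts from the question.
counts <- c(AA = 45, Aa = 25, aa = 30)

# Log posterior of p up to an additive constant (U(0,1) prior).
log_post <- function(p, counts) {
  if (p <= 0 || p >= 1) return(-Inf)  # outside the prior support
  probs <- c(p^2, 2 * p * (1 - p), (1 - p)^2)
  sum(counts * log(probs))
}

# Hypothetical helper: random-walk Metropolis with a symmetric normal proposal.
metropolis_p <- function(counts, n_iter = 20000, p0 = 0.5, step = 0.05) {
  p <- numeric(n_iter)
  cur <- p0
  cur_lp <- log_post(cur, counts)
  accepted <- 0
  for (t in seq_len(n_iter)) {
    prop <- rnorm(1, mean = cur, sd = step)
    prop_lp <- log_post(prop, counts)
    # Accept with probability min(1, posterior ratio); the proposal is symmetric.
    if (log(runif(1)) < prop_lp - cur_lp) {
      cur <- prop
      cur_lp <- prop_lp
      accepted <- accepted + 1
    }
    p[t] <- cur
  }
  list(p = p, accept_rate = accepted / n_iter)
}

set.seed(6106)  # arbitrary seed, for reproducibility only
fit <- metropolis_p(counts)
post <- fit$p[-(1:2000)]  # drop the first 2000 draws as burn-in

print(summary(post))
print(quantile(post, probs = c(0.025, 0.975)))
print(fit$accept_rate)
```

For the convergence plots, `plot(fit$p, type = "l")` gives a trace plot and `acf(post)` shows the autocorrelation of the retained draws. Comparing `hist(post, freq = FALSE)` with `curve(dbeta(x, 116, 86), add = TRUE)` checks the sample against the exact posterior.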
~~END~~