End-of-year Examinations, 2020 STAT314 / STAT461 -20S2 (C)
No electronic/communication devices are permitted.
No exam materials may be removed from the exam room.
Mathematics and Statistics
EXAMINATION
End-of-year Examinations, 2020
STAT314-20S2 (C) Bayesian Inference
STAT461-20S2 (C) Bayesian Inference
Examination Duration: 180 minutes
Exam Conditions:
Open Book exam: Students may bring in any written or printed materials.
Calculators with a ‘UC’ sticker approved.
Materials Permitted in the Exam Venue:
Open Book exam: Students may bring in any written or printed materials.
Materials to be Supplied to Students:
1 x Standard 16-page UC answer book.
Instructions to Students:
Remember to write your name and student number on ALL answer booklets.
Start each question on a new page.
This is an open book examination.
Show all working and calculations.
Family Name _____________________
First Name _____________________
Student Number |__|__|__|__|__|__|__|__|
Venue ____________________
Seat Number ________
Question 1. (5 marks)
Let v and w be continuous random variables and suppose that the following expression is the
unnormalized beta density for one of them (conditionally given the other):
$$v^{w}(1 - v)^{2w}.$$
Part (a) (1 mark)
Which random variable, v or w, is the above expression the unnormalized beta density for?
Part (b) (2 marks)
What are the parameters of the beta density in the above expression?
Part (c) (2 marks)
What are the supports of v and w?
Question 2. (5 marks)
Evaluate the following integral,
$$\int_{0}^{1} x^{3}(1 - x)^{-1/2}\,dx,$$
using only insights about statistical distributions, without doing integration by parts or using a
calculator. Show all of your working clearly and give your answer as a fraction in its simplest
(lowest) form.
The following facts about the gamma function, Γ(), may be helpful:
• Recursive property: For k > 1, Γ(k) = (k−1) Γ(k−1).
• Γ(1) = 1.
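The recursive property of the gamma function quoted above can be checked numerically. The following is a minimal sketch, not part of the exam: the helper name `gamma_by_recursion` is hypothetical, and the base value is taken from Python's standard-library `math.gamma`.

```python
import math


def gamma_by_recursion(k: float) -> float:
    """Evaluate Gamma(k) for k >= 1 by repeatedly applying
    Gamma(k) = (k - 1) * Gamma(k - 1).

    The loop peels off one factor at a time until the argument lies in (0, 1];
    math.gamma then supplies the base value (Gamma(1) = 1 for integer k).
    """
    result = 1.0
    while k > 1:
        k -= 1          # move from Gamma(k) to Gamma(k - 1) ...
        result *= k     # ... collecting the factor (k - 1)
    return result * math.gamma(k)


if __name__ == "__main__":
    # Compare the recursion with the library value for a few arguments.
    for k in (1, 2, 5, 2.5):
        print(k, gamma_by_recursion(k), math.gamma(k))
```

For integer arguments the recursion reduces to a factorial, Γ(k) = (k−1)!, which is why only Γ(1) = 1 is needed as a starting value in that case.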
Question 3. (5 marks)
Let $x_1, \dots, x_n \mid \mu, \nu \overset{iid}{\sim} N(\mu, \nu)$, where $\mu$ is the mean and $\nu$ is the precision. Suppose that we use a prior
distribution for $(\mu, \nu)$ such that the resulting posterior distribution can be factorized as
$$p(\mu, \nu \mid X_n) = p(\mu \mid \nu, X_n)\, p(\nu \mid X_n),$$
where $X_n = (x_1, \dots, x_n)$, $p(\mu \mid \nu, X_n) = N(\mu \mid m^*, s^*\nu)$, and $p(\nu \mid X_n) = gamma(\nu \mid a^*, b^*)$, and
where $m^*$, $s^*$, $a^*$ and $b^*$ are posterior parameters. Recall that the posterior predictive distribution,
$p(\tilde{x} \mid X_n)$, will be a generalized t distribution.
Now suppose that we wish to generate sample values from the posterior predictive distribution, but
we do not know how to generate from the generalized t distribution. Describe clearly how we can
obtain the required sample values by using an appropriate factorization of the joint posterior,
$p(\tilde{x}, \mu, \nu \mid X_n)$.
Question 4. (15 marks)
Let $x_1, \dots, x_n \mid p \overset{iid}{\sim}$ Bernoulli(p), $X_n = (x_1, \dots, x_n)$, and $s = \sum_{i=1}^{n} x_i$. We have seen in class that, using a
flat prior density for p, the posterior density for p is
$$f(p \mid X_n) \propto p^{s}(1 - p)^{n - s},$$
which is $beta(s + 1, n - s + 1)$.
Let $q = p^{1/2}$ in parts (b), (c) and (d) of this question.
Part (a) (3 marks)
Use a Bayesian argument to find the maximum likelihood estimate of p.
Part (b) (2 marks)
Now suppose that we re-parameterize the Bernoulli model using q. Write down the likelihood
function for the Bernoulli model with q as the parameter, i.e. what is $f(X_n \mid q)$ equal to?
Part (c) (5 marks)
With a flat prior density for p, the corresponding prior density for q is $f(q) = 2q$ for
$q \in [0, 1]$, and 0 otherwise.
Use Bayes’ rule to derive the posterior density, $f(q \mid X_n)$.
Is the posterior density, $f(q \mid X_n)$, a beta density? Explain clearly why or why not.
Part (d) (5 marks)
Use the density transformation theorem to obtain the posterior density, $f(q \mid X_n)$, directly from
$f(p \mid X_n)$.
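The prior density for q stated in Part (c) can be sanity-checked by simulation. The sketch below is illustrative only and not part of the exam: it draws p from a flat prior, sets q to the square root of p, and compares the empirical probability that q is at most t with the value t² implied by the density 2q on [0, 1]. The function name, the number of draws and the seed are all assumptions made for this illustration.

```python
import random


def empirical_cdf_of_q(t: float, n_draws: int = 100_000, seed: int = 0) -> float:
    """Estimate P(q <= t) when p ~ Uniform(0, 1) and q = p ** 0.5."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = sum(1 for _ in range(n_draws) if rng.random() ** 0.5 <= t)
    return hits / n_draws


if __name__ == "__main__":
    # Under the density f(q) = 2q on [0, 1], the CDF is F(t) = t ** 2.
    for t in (0.25, 0.5, 0.75):
        print(t, empirical_cdf_of_q(t), t ** 2)
```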
Question 5. (15 marks)
Let $x_1, \dots, x_n$ be independent and identically distributed random variables that have a uniform
distribution on the interval $[0, \theta]$, i.e. the lower boundary of the interval is zero and the upper
boundary, $\theta$, is unknown. The probability density function is given by
$$p(x_i \mid \theta) = \theta^{-1} I(\theta \ge x_i),$$
where $I(\text{condition})$ is the indicator function, which returns a value of 1 when the condition is satisfied,
and a value of 0 otherwise.
Part (a) (2 marks)
Show that the joint likelihood function for $x_1, \dots, x_n$ is
$$p(x_1, \dots, x_n \mid \theta) = \theta^{-n} I(\theta \ge x_{max}),$$
where $x_{max} = \max\{x_1, \dots, x_n\}$.
Hint: $I(\text{condition}_1) \times I(\text{condition}_2) = I(\text{condition}_1 \text{ AND } \text{condition}_2)$.
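The indicator identity in the hint can be illustrated with a small numerical check. This is a hedged sketch rather than a proof: the function names and the data values are hypothetical, chosen only to show that a product of indicators of the form I(θ ≥ x_i) agrees with a single indicator evaluated at the largest observation.

```python
from typing import Sequence


def indicator(condition: bool) -> int:
    """Return 1 when the condition is satisfied and 0 otherwise."""
    return 1 if condition else 0


def product_of_indicators(theta: float, xs: Sequence[float]) -> int:
    """Multiply I(theta >= x_i) over all observations x_i."""
    result = 1
    for x in xs:
        result *= indicator(theta >= x)
    return result


if __name__ == "__main__":
    xs = [0.8, 2.1, 1.4]  # hypothetical observations
    for theta in (1.0, 2.1, 3.0):
        # The two columns printed after theta agree for every theta.
        print(theta, product_of_indicators(theta, xs), indicator(theta >= max(xs)))
```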
Part (b) (3 marks)
Consider the following proper prior density for $\theta$:
$$p(\theta) = \alpha \beta^{\alpha} \theta^{-(\alpha + 1)} I(\theta \ge \beta),$$
where $\alpha > 0$ and $\beta > 0$ are the hyperparameters.
Derive the resulting posterior density, $p(\theta \mid x_1, \dots, x_n)$.
Part (c) (3 marks)
Is the prior density in part (b) a conjugate prior density for $\theta$? Justify your answer.
If your answer is yes, state how the posterior parameters are updated from the prior parameters, i.e.
$\alpha \to\ ?$ and $\beta \to\ ?$
Part (d) (2 marks)
Suppose that we know how to generate random values from the posterior distribution, $p(\theta \mid x_1, \dots, x_n)$.
Use pseudo-code to describe how to generate $N$ random values, $\tilde{x}_1, \dots, \tilde{x}_N$, from the posterior
predictive distribution, $p(\tilde{x} \mid x_1, \dots, x_n)$.
Part (e) (2 marks)
Now consider another prior density for $\theta$:
$$p(\theta) \propto \theta^{-1}.$$
Show whether this prior density is proper or improper.
Hint: $\int \theta^{-1}\,d\theta = \log(\theta)$, where log denotes the natural logarithm.
Part (f) (3 marks)
Derive the posterior density, $p(\theta \mid x_1, \dots, x_n)$, corresponding to the prior density in part (e).
Explain, without doing any integration, why the resulting posterior density is a proper density.
Question 6. (15 marks)
Suppose we have $K$ independent experiments, where a single scalar observation is made in each
experiment. Let the observation in experiment $k$ be $x_k$, with the following distribution:
$$p(x_k \mid \theta_k, \sigma_k^2) = N(x_k \mid \theta_k, \sigma_k^2),$$
for $k = 1, \dots, K$. Hence, $x_k$ has a normal distribution with mean, $\theta_k$, and variance, $\sigma_k^2$. Suppose that
$\theta_1, \dots, \theta_K$ are unknown, and that $\sigma_1^2, \dots, \sigma_K^2$ are assumed to be known.
Consider the following hierarchical model:
$$p(x_1, \dots, x_K \mid \theta_1, \dots, \theta_K, \sigma_1^2, \dots, \sigma_K^2) = \prod_{k=1}^{K} N(x_k \mid \theta_k, \sigma_k^2),$$
$$p(\theta_1, \dots, \theta_K \mid \mu, \tau) = \prod_{k=1}^{K} N(\theta_k \mid \mu, \tau^2),$$
$$p(\mu, \tau) = \text{constant}.$$
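The resulting posterior distributions are as follows:
$$p(\theta_k \mid \mu, \tau, x_k) = N(\theta_k \mid \hat{\theta}_k, \nu_k^2), \quad \text{for } k = 1, \dots, K,$$
$$p(\mu \mid \tau, x_1, \dots, x_K) = N(\mu \mid \hat{\mu}, \omega^2),$$
$$p(\tau \mid x_1, \dots, x_K) \propto \omega \prod_{k=1}^{K} (\sigma_k^2 + \tau^2)^{-1/2} \exp\left\{ -\frac{(x_k - \hat{\mu})^2}{2(\sigma_k^2 + \tau^2)} \right\},$$
where
$$\hat{\theta}_k = \left( \frac{1}{\sigma_k^2} + \frac{1}{\tau^2} \right)^{-1} \left( \frac{1}{\sigma_k^2} x_k + \frac{1}{\tau^2} \mu \right),$$
$$\frac{1}{\nu_k^2} = \frac{1}{\sigma_k^2} + \frac{1}{\tau^2},$$
$$\hat{\mu} = \left( \sum_{k=1}^{K} \frac{1}{\sigma_k^2 + \tau^2} \right)^{-1} \sum_{k=1}^{K} \frac{1}{\sigma_k^2 + \tau^2} x_k,$$
$$\frac{1}{\omega^2} = \sum_{k=1}^{K} \frac{1}{\sigma_k^2 + \tau^2}.$$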
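The posterior parameters defined above are precision-weighted averages, which can be easier to see when written as code. The following is a minimal sketch under stated assumptions: the function names and the example numbers are hypothetical, τ is assumed to be strictly positive, and the code only evaluates the four formulas for given inputs. It does not generate any samples.

```python
from typing import Sequence, Tuple


def conditional_theta_params(
    x_k: float, sigma2_k: float, mu: float, tau: float
) -> Tuple[float, float]:
    """Return (theta_hat_k, nu2_k), the mean and variance of p(theta_k | mu, tau, x_k)."""
    precision = 1.0 / sigma2_k + 1.0 / tau**2   # this is 1 / nu_k^2
    nu2_k = 1.0 / precision
    theta_hat_k = nu2_k * (x_k / sigma2_k + mu / tau**2)
    return theta_hat_k, nu2_k


def conditional_mu_params(
    xs: Sequence[float], sigma2s: Sequence[float], tau: float
) -> Tuple[float, float]:
    """Return (mu_hat, omega2), the mean and variance of p(mu | tau, x_1, ..., x_K)."""
    weights = [1.0 / (s2 + tau**2) for s2 in sigma2s]  # 1 / (sigma_k^2 + tau^2)
    omega2 = 1.0 / sum(weights)
    mu_hat = omega2 * sum(w * x for w, x in zip(weights, xs))
    return mu_hat, omega2


if __name__ == "__main__":
    xs = [1.0, 3.0]        # hypothetical observations
    sigma2s = [1.0, 1.0]   # hypothetical known variances
    tau = 1.0              # hypothetical value of tau
    mu_hat, omega2 = conditional_mu_params(xs, sigma2s, tau)
    print(mu_hat, omega2)
    print(conditional_theta_params(xs[0], sigma2s[0], mu_hat, tau))
```

With equal variances the weights are equal, so the mean of the conditional distribution of μ is the plain average of the observations, and each conditional mean for θ_k lies between x_k and μ.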
Part (a) (2 marks)
Draw a directed acyclic graph for the hierarchical model.
Part (b) (5 marks)
Describe how sample values can be generated from the marginal posterior distribution,
$p(\tau \mid x_1, \dots, x_K)$, using the method of grid points.
Part (c) (8 marks)
Now suppose that we have coded a function called tauPosterior to generate sample values from
$p(\tau \mid x_1, \dots, x_K)$. Use pseudo-code to describe how to generate $N$ random values, $\tilde{x}_{k,1}, \dots, \tilde{x}_{k,N}$, from
the posterior predictive distribution, $p(\tilde{x}_k \mid x_1, \dots, x_K)$, for experiment $k$.
END OF PAPER