MAST30027: Modern Applied Statistics
Assignment 3, 2021.
Due: 5pm Wednesday October 6th
• This assignment is worth 8% of your total mark.
• To get full marks, show your working, including 1) the R commands and outputs you use, 2)
mathematical derivations, and 3) a rigorous explanation of why you reach your conclusions or answers.
If you provide only final answers, you will get zero marks.
• Your assignment must be submitted on Canvas LMS as a single PDF document only (no
other formats allowed). Your answers must be clearly numbered and in the same order as
the assignment questions.
• The LMS will not accept late submissions. It is your responsibility to ensure that your
assignments are submitted correctly and on time, and problems with online submissions are
not a valid excuse for submitting a late or incorrect version of an assignment.
• We will mark a selected set of problems. We will select problems worth ≥ 50% of the full
marks listed.
• Answers including images of screen-captured R code or figures won’t be marked.
• Also, please read the “Assessments” section on the “Subject Overview” page of the LMS.
1. The file assignment3_prob1.txt contains the final exam scores of 100 students in Modern
Applied Statistics. We can read the scores as follows.
> X = scan(file="assignment3_prob1.txt", what=double())
Read 100 items
> length(X)
[1] 100
> mean(X)
[1] 75.726
Suppose that the 100 scores are independent of each other and follow a Normal distribution
with mean 75 and unknown precision $\tau$. Specifically, let $x_1, \dots, x_{100}$ be the final
exam scores, and
$$x_i \sim N\left(75, \frac{1}{\tau}\right) \quad \text{for } i = 1, \dots, 100.$$
Suppose that the precision $\tau$ has a Gamma(2, 1) prior distribution, where Gamma($\alpha$, $\beta$)
has the pdf
$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} \exp(-\beta x).$$
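As a small sanity check on the parameterisation (a sketch only, not part of the required answer): the pdf above treats β as a rate, which should correspond to R's `dgamma(x, shape = alpha, rate = beta)`. The helper name `gamma_pdf` and the grid of values are illustrative assumptions, not part of the assignment.

```r
# Gamma(alpha, beta) density written out directly from the formula in the text
# (gamma_pdf is a hypothetical helper name used only for this illustration)
gamma_pdf <- function(x, alpha, beta) {
  beta^alpha / gamma(alpha) * x^(alpha - 1) * exp(-beta * x)
}

# Illustrative grid of precision values
tau <- seq(0.01, 8, length.out = 200)

# Compare with R's built-in density using shape = alpha and rate = beta
max_abs_diff <- max(abs(gamma_pdf(tau, 2, 1) - dgamma(tau, shape = 2, rate = 1)))
max_abs_diff
```

If the difference is at the level of rounding error, the Gamma(2, 1) prior can be evaluated with `dgamma(tau, shape = 2, rate = 1)`.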
(a) (5 marks) Derive the posterior distribution of the precision $\tau$ conditioned on the final
exam scores of 100 students, $p(\tau \mid x_1, \dots, x_{100})$. Evaluate the parameters of the posterior
distribution using the data from assignment3_prob1.txt.
(b) (6 marks) Derive the posterior predictive distribution for a new score $\tilde{x}$, $p(\tilde{x} \mid x_1, \dots, x_{100})$.
Evaluate the parameters of the posterior predictive distribution using the data from
assignment3_prob1.txt.
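[Hint for (b)] A three-parameter version of a t distribution (Jackman, S. (2009)), denoted
by $t(\nu, a, b)$, has the pdf
$$p(x \mid \nu, a, b) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi \nu b}} \left(1 + \frac{1}{\nu}\frac{(x-a)^2}{b}\right)^{-\frac{\nu+1}{2}}.$$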
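The hint's density can be coded directly, which may help when checking a derived predictive distribution numerically. The sketch below assumes the function names `dt3` and `dt3_via_dt` and the parameter values shown, all of which are illustrative and not taken from the assignment data. It also compares the formula with a location-scale transform of R's standard `dt`, with location a and scale √b.

```r
# Three-parameter t density t(nu, a, b), coded directly from the hint's formula
# (dt3 is a hypothetical helper name)
dt3 <- function(x, nu, a, b) {
  gamma((nu + 1) / 2) / (gamma(nu / 2) * sqrt(pi * nu * b)) *
    (1 + (x - a)^2 / (nu * b))^(-(nu + 1) / 2)
}

# Same density via R's standard t density: location a, scale sqrt(b)
dt3_via_dt <- function(x, nu, a, b) {
  dt((x - a) / sqrt(b), df = nu) / sqrt(b)
}

# Arbitrary illustrative values, not the answer to part (b)
x <- seq(-5, 15, length.out = 101)
max_gap <- max(abs(dt3(x, nu = 5, a = 4, b = 2.5) -
                   dt3_via_dt(x, nu = 5, a = 4, b = 2.5)))
max_gap
```

The two forms should agree up to rounding error, so quantiles and random draws from $t(\nu, a, b)$ can be obtained as `a + sqrt(b) * qt(p, nu)` and `a + sqrt(b) * rt(n, nu)`.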
2. (9 marks) Gamma random variables can be used to simulate chi-square, t, F, beta, and
Dirichlet distributions, as well as being useful in their own right. Hence it is important to
be able to generate gamma r.v.s as efficiently as possible. In this assignment we investigate
a popular algorithm due to Marsaglia and Tsang, for α ≥ 1.¹
(1) If X ∼ Gamma(α, 1) then X/λ ∼ Gamma(α, λ).
Note that Gamma($\alpha$, $\beta$) has the pdf
$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} \exp(-\beta x).$$
(2) Assume that $h(x)$ is strictly increasing and maps the range of $x$ onto $[0, \infty)$. If $X$ has
density $h(x)^{\alpha-1} e^{-h(x)} h'(x)/\Gamma(\alpha)$ then $Y = h(X) \sim$ Gamma($\alpha$, 1).
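(3) Given $\alpha \ge 1$, put $d = \alpha - 1/3$ and $c = 1/\sqrt{9d}$, and then define $h : [-1/c, \infty) \to [0, \infty)$
by $h(x) = d(1 + cx)^3$. Then
$$h(x)^{\alpha-1} e^{-h(x)} h'(x) \propto \exp(g(x)), \quad \text{where} \quad g(x) = d \log\left((1 + cx)^3\right) - d(1 + cx)^3 + d,$$
and
$$\exp(g(x)) \le \exp(-x^2/2) \quad \text{on } [-1/c, \infty).$$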
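Facts (1) and (3) can be checked numerically for particular parameter values. The following is a rough sketch, not a proof and not the requested algorithm. The values α = 1.2 and λ = 3, the grid ranges, and the names `c0`, `scale_gap` and `envelope_gap` are illustrative assumptions.

```r
alpha <- 1.2
lambda <- 3

# Fact (1): quantiles of Gamma(alpha, 1) divided by lambda should match
# the quantiles of Gamma(alpha, lambda) (rate parameterisation)
p <- seq(0.01, 0.99, by = 0.01)
scale_gap <- max(abs(qgamma(p, shape = alpha, rate = 1) / lambda -
                     qgamma(p, shape = alpha, rate = lambda)))

# Fact (3): g(x) <= -x^2 / 2 on [-1/c, Inf), checked on a finite grid only
d <- alpha - 1 / 3
c0 <- 1 / sqrt(9 * d)   # the constant c from the text (named c0 to avoid masking c())
g <- function(x) d * log((1 + c0 * x)^3) - d * (1 + c0 * x)^3 + d

x <- seq(-1 / c0 + 1e-6, 10, length.out = 2001)
envelope_gap <- max(g(x) + x^2 / 2)   # should not exceed 0, up to rounding

c(scale_gap = scale_gap, envelope_gap = envelope_gap)
```

A grid check like this only covers the points evaluated; it illustrates the inequality rather than establishing it.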
Using these facts (checking them is not required for this assignment), come up with
an algorithm for simulating from Gamma(α, λ) for α ≥ 1. You may assume that you can
already simulate from the standard normal distribution. 1) Provide a brief description of
your algorithm (e.g., which algorithm you use; if you use a rejection method, explain which
function is used as the envelope). 2) Code up your algorithm and use it to generate 1000
Gamma(1.2, 3) pseudo-random variables. Demonstrate that your algorithm is working using
a q-q plot.
The following R commands show how to make a q-q plot for 1000 samples generated by
rgamma(1000, 1.2, 3). For your assignment, instead of rgamma(1000, 1.2, 3), you should
write your own code which implements your algorithm.
> g <- rgamma(1000, 1.2, 3)
> plot(qgamma(1:1000/1001, 1.2, 3), sort(g))
> abline(0, 1, col="red")
3. (2 marks) Read the “Assessments” section of the “Subject Overview” page of the LMS. Provide
the requested information on the first page of your assignment.
¹ G. Marsaglia and W. W. Tsang, A simple method for generating gamma variables. ACM Trans. Math. Software,
26:363–371, 2000.