
The University of Sydney School of Mathematics and Statistics
Solutions to Tutorial Week 8
STAT3023: Statistical Inference Semester 2, 2021
Lecturers:


1. For the simple prediction problem where Y has a strictly increasing, continuous CDF F(·), μ = E(Y) exists and is finite, and the decision space is D = R, determine the decision d that minimises the risk
R(d) = E[L(d|Y)] for the asymmetric piecewise-linear loss function given by
L(d|y) = { p(y − d)        for d < y,
         { (1 − p)(d − y)  for d > y,

and some 0 < p < 1 (hint: we have already seen the case p = 0.5).

Solution: Write f(y) = dF(y)/dy for the density (PDF) of Y. Then

R(d) = (1 − p) ∫_{−∞}^d (d − y) f(y) dy + p ∫_d^∞ (y − d) f(y) dy
     = (1 − p) d F(d) − p d [1 − F(d)] + p μ − [(1 − p) + p] ∫_{−∞}^d y f(y) dy
     = d F(d) − p d + p μ − ∫_{−∞}^d y f(y) dy,

using ∫_d^∞ y f(y) dy = μ − ∫_{−∞}^d y f(y) dy in the second line. The derivative is

R′(d) = d f(d) + F(d) − p − [d f(d)] = F(d) − p.

This is negative (so R(d) decreases) for d such that F(d) < p, and R(d) increases for d such that F(d) > p; the risk is thus minimised at d = F^{−1}(p), the p-th quantile of F(·). The case p = 0.5 (which we saw in lectures) gives the “population median”.
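As a quick numerical sanity check (an addition, not part of the original solution), the following Python sketch estimates R(d) by Monte Carlo over a grid of candidate decisions for a standard exponential Y; the choice of distribution, p, sample size and grid are all arbitrary. The empirical minimiser should land close to the p-th quantile F^{−1}(p) = −log(1 − p).

    import numpy as np

    # Monte Carlo check that the asymmetric linear loss is minimised at the p-th quantile.
    rng = np.random.default_rng(0)
    p = 0.8
    y = rng.exponential(size=200_000)          # Y ~ Exp(1), so F^{-1}(p) = -log(1 - p)

    def risk(d):
        # empirical analogue of R(d) = E[L(d|Y)]
        return np.mean(np.where(d < y, p * (y - d), (1 - p) * (d - y)))

    grid = np.linspace(0.0, 4.0, 401)
    risks = np.array([risk(d) for d in grid])

    print("empirical minimiser:", grid[np.argmin(risks)])
    print("p-th quantile      :", -np.log(1 - p))   # approx 1.609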
2. Determine the optimal decision d ∈ D = R for the simple prediction problem where Y has a continuous distribution on (0,∞) with density f(·) satisfying
• f(x) = 0 for x ≤ 0;
• f(x) > 0 and decreasing in x for x > 0,
and the loss function L(d|y) is given by

L(d|y) = { 0  if |d − y| ≤ C,
         { 1  if |d − y| > C,

for some known 0 < C < ∞.
Solution: The risk is simply the non-coverage probability of the “prediction interval” d ± C. The optimal choice is d = C, yielding the prediction interval [0,2C]. To see why, note that the risk is
R(d) = 1 − P{|d − Y| ≤ C}
     = 1 − P(d − C ≤ Y ≤ d + C)
     = P(Y < d − C) + P(Y > d + C)
     = F(d − C) + 1 − F(d + C).

The derivative is then
R′(d) = f(d − C) − f(d + C).
So long as 0 < d − C < d + C < ∞, i.e. d > C, this difference is positive (because f(·) is decreasing) and so R(d) increases in d for d > C. For −C < d < C the difference is negative (the first term f(d − C) is zero in that case, the second term f(d + C) is positive) and so R(d) decreases on −C < d < C. For d ≤ −C both terms are zero and so the risk is constant. The risk is therefore minimised at d = C.

3. Suppose Z ∼ N(0, 1).

(a) Show that for any constant c,

E{|c + Z|} = c[1 − 2Φ(−c)] + 2 e^{−c²/2}/√(2π).

Solution:

E{|c + Z|} = ∫_{−∞}^∞ |c + z| (1/√(2π)) e^{−z²/2} dz
           = ∫_{−∞}^{−c} [−(c + z)] (1/√(2π)) e^{−z²/2} dz + ∫_{−c}^∞ (c + z) (1/√(2π)) e^{−z²/2} dz
           = −c Φ(−c) − ∫_{−∞}^{−c} z (1/√(2π)) e^{−z²/2} dz + c[1 − Φ(−c)] + ∫_{−c}^∞ z (1/√(2π)) e^{−z²/2} dz
           = c[1 − 2Φ(−c)] + (1/√(2π)) { [−e^{−z²/2}]_{−c}^∞ − [−e^{−z²/2}]_{−∞}^{−c} }
           = c[1 − 2Φ(−c)] + 2 e^{−c²/2}/√(2π).

(b) Suppose cn → 0 as n → ∞. Determine lim_{n→∞} E{|cn + Z|}.

Solution: As cn → 0,
• cn[1 − 2Φ(−cn)] → 0;
• e^{−cn²/2} → 1.

Hence

lim_{n→∞} E{|cn + Z|} = lim_{n→∞} { cn[1 − 2Φ(−cn)] + 2 e^{−cn²/2}/√(2π) } = 2/√(2π) = √(2/π).

4. Suppose X = (X1, ..., Xn) consists of iid N(θ, 1) random variables and that it is desired to determine Bayes procedures using the weight function/prior given by w(θ) ≡ 1 (the “flat prior”). Show that the resultant posterior density is the normal density with mean X̄ = (1/n) ∑_{i=1}^n Xi and variance 1/n.

Solution: The likelihood is

f_θ(X) = ∏_{i=1}^n (1/√(2π)) e^{−(Xi − θ)²/2}
       = (2π)^{−n/2} e^{−(1/2) ∑_{i=1}^n (Xi − θ)²}
       = (2π)^{−n/2} exp{ −(1/2) ∑_{i=1}^n Xi² + θ ∑_{i=1}^n Xi − nθ²/2 }
       = (2π)^{−n/2} exp{ −(1/2)(∑_{i=1}^n Xi² − nX̄²) } exp{ −(1/2)(nX̄² − 2nθX̄ + nθ²) }
       = (2π)^{−n/2} e^{−(1/2) ∑_{i=1}^n (Xi − X̄)²} e^{−(n/2)(θ − X̄)²}
       = n^{−1/2} (2π)^{−(n−1)/2} e^{−(1/2) ∑_{i=1}^n (Xi − X̄)²} · √(n/(2π)) e^{−(n/2)(θ − X̄)²},

where the final factor √(n/(2π)) e^{−(n/2)(θ − X̄)²}, as a function of θ, is the N(X̄, 1/n) density and so integrates to 1. As a function of θ, f_θ(X) is therefore the N(X̄, 1/n) density times a “constant” (an expression involving the Xi’s and n, but not θ). Multiplying by the weight function/prior w(θ) ≡ 1 and “integrating out” θ gives

m(X) = ∫_{−∞}^∞ w(θ) f_θ(X) dθ = n^{−1/2} (2π)^{−(n−1)/2} e^{−(1/2) ∑_{i=1}^n (Xi − X̄)²},

and thus the posterior density is

w(θ) f_θ(X) / m(X) = √(n/(2π)) e^{−(n/2)(θ − X̄)²},

which is indeed the N(X̄, 1/n) density.

5. Suppose X = (X1, ..., Xn) consists of iid N(θ, 1) random variables and that it is desired to determine Bayes procedures using the weight function/prior w(·) given by the N(μ0, σ0²) density, that is

w(θ) = (1/(σ0 √(2π))) e^{−(θ − μ0)²/(2σ0²)}.

Show that the resultant posterior density is the normal density with mean

(1/(1 + nσ0²)) μ0 + (nσ0²/(1 + nσ0²)) X̄

and variance

σ0²/(1 + nσ0²).

Solution: The product of the weight function (prior) and the likelihood is (writing “const.” for any expression not involving θ)

w(θ) f_θ(X) = (1/(σ0 √(2π))) e^{−(θ − μ0)²/(2σ0²)} (2π)^{−n/2} e^{−(1/2) ∑_{i=1}^n (Xi − θ)²}
            = const. exp{ −(1/2) [ θ²/σ0² − 2θμ0/σ0² + nθ² − 2θnX̄ ] }
            = const. exp{ −(1/2) ((1 + nσ0²)/σ0²) [ θ² − 2θ (μ0 + nX̄σ0²)/(1 + nσ0²) ] }
            = const. exp{ −(1/2) ((1 + nσ0²)/σ0²) [ θ − (μ0 + nX̄σ0²)/(1 + nσ0²) ]² },

which is a constant multiple of the desired normal density (mean (μ0 + nX̄σ0²)/(1 + nσ0²), variance σ0²/(1 + nσ0²)), so when renormalised that normal density becomes the posterior density.
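The following Python sketch (an addition, not from the original solutions) checks the question 5 calculation numerically: it normalises the prior-times-likelihood on a grid and compares the resulting mean and variance with the closed-form expressions above; the simulated data, μ0, σ0² and the grid are arbitrary choices, and question 4 corresponds to the flat-prior analogue.

    import numpy as np

    # Brute-force grid check of the normal-prior posterior from question 5.
    rng = np.random.default_rng(1)
    n, theta_true = 20, 0.7
    x = rng.normal(theta_true, 1.0, size=n)
    xbar = x.mean()

    mu0, s0sq = -1.0, 2.0                      # arbitrary prior hyperparameters
    grid = np.linspace(-5.0, 5.0, 20_001)
    dx = grid[1] - grid[0]

    # log of prior times likelihood, up to constants not involving theta
    log_post = (-(grid - mu0) ** 2 / (2 * s0sq)
                - 0.5 * ((x[:, None] - grid[None, :]) ** 2).sum(axis=0))
    post = np.exp(log_post - log_post.max())
    post /= post.sum() * dx                    # renormalise numerically

    grid_mean = (grid * post).sum() * dx
    grid_var = ((grid - grid_mean) ** 2 * post).sum() * dx

    print(grid_mean, (mu0 + n * s0sq * xbar) / (1 + n * s0sq))   # posterior mean
    print(grid_var, s0sq / (1 + n * s0sq))                       # posterior variance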
6. Suppose X = (X1, ..., Xn) consists of iid N(θ, 1) random variables. We are interested in finding Bayes decisions/procedures for various loss functions using each of the two weight functions/priors used in questions 4 and 5 above: the “flat prior” and the “normal prior” respectively.

(a) When the loss function is L(d|θ) = (d − θ)² the Bayes procedure in each case is the posterior mean. Determine for both decisions d(·)

(i) the risk R(θ|d) = Eθ[L(d(X)|θ)];

Solution: For d(X) = X̄, the risk is just Varθ(X̄) = 1/n. For

d(X) = (1/(1 + nσ0²)) μ0 + (nσ0²/(1 + nσ0²)) X̄,

since the risk is the mean-squared error (MSE) we use the identity MSE = Var + (Bias)². The bias is

Eθ[d(X)] − θ = [μ0 + nσ0²θ − (1 + nσ0²)θ]/(1 + nσ0²) = (μ0 − θ)/(1 + nσ0²),

and the variance is

Varθ[d(X)] = (nσ0²/(1 + nσ0²))² Varθ(X̄) = nσ0⁴/(1 + nσ0²)².

Thus the MSE is

[nσ0⁴ + (μ0 − θ)²]/(1 + nσ0²)².

(ii) the limiting risk lim_{n→∞} n Eθ[L(d(X)|θ)].

Solution: Multiplying both risks by n gives, for d(X) = X̄,

n Eθ[(X̄ − θ)²] ≡ 1.

For the second estimator we get

n [nσ0⁴ + (μ0 − θ)²]/(1 + nσ0²)² = [1 + (μ0 − θ)²/(nσ0⁴)] / [1/(nσ0²) + 1]² → 1,

since both the numerator and the denominator tend to 1. Note that both procedures have the same limiting risk, and that the limiting risks do not depend on θ.

(b) When the loss function is L(d|θ) = |d − θ| the Bayes procedure in each case is the posterior median. Determine for both decisions d(·)

Hint: in each case write the risk in the form kn Eθ[|cn + √n(X̄ − θ)|] for sequences {kn} and {cn} and use question 3 above.

(i) the risk R(θ|d) = Eθ[L(d(X)|θ)];

Solution: For d(X) = X̄, the risk is Eθ[|X̄ − θ|]. Since X̄ ∼ N(θ, 1/n), √n(X̄ − θ) ∼ N(0, 1). Therefore, using question 3 with c = 0, the risk is

Eθ[|X̄ − θ|] = (1/√n) Eθ[√n |X̄ − θ|] = (1/√n) √(2/π).

For the second estimator, the risk is

Eθ[ |μ0 + nσ0²X̄ − (1 + nσ0²)θ| / (1 + nσ0²) ] = Eθ[ |(μ0 − θ) + nσ0²(X̄ − θ)| ] / (1 + nσ0²)
    = (σ0²√n/(1 + nσ0²)) Eθ[ |(μ0 − θ)/(σ0²√n) + √n(X̄ − θ)| ].

This is of the form kn E(|cn + Z|) for Z ∼ N(0, 1), with

kn = σ0²√n/(1 + nσ0²) and cn = (μ0 − θ)/(σ0²√n),

and so by question 3 this is

kn { cn[1 − 2Φ(−cn)] + 2 e^{−cn²/2}/√(2π) }.

(ii) the limiting risk lim_{n→∞} √n Eθ[L(d(X)|θ)].

Solution: For d(X) = X̄,

√n Eθ[|X̄ − θ|] = √(2/π).

For the second case, since

√n kn = nσ0²/(1 + nσ0²) → 1 and cn = (μ0 − θ)/(σ0²√n) → 0,

using part (b) of question 3 the limiting risk is

lim_{n→∞} √n Eθ{|d(X) − θ|} = √(2/π).

Again, both procedures have the same limiting risk, which does not depend on θ!
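As a further check (an addition, not in the original), here is a short Monte Carlo sketch for part (b): the simulated absolute-error risks are compared with √(2/π)/√n for the flat-prior rule and with kn{cn[1 − 2Φ(−cn)] + 2e^{−cn²/2}/√(2π)} for the normal-prior rule; the values of θ, μ0, σ0², n and the replication count are arbitrary.

    import numpy as np
    from scipy.stats import norm

    # Monte Carlo check of the absolute-error risks in part (b).
    rng = np.random.default_rng(2)
    theta, mu0, s0sq, n, reps = 1.5, 0.0, 0.5, 25, 400_000

    xbar = rng.normal(theta, 1.0 / np.sqrt(n), size=reps)   # X-bar ~ N(theta, 1/n)
    d_flat = xbar
    d_norm = (mu0 + n * s0sq * xbar) / (1 + n * s0sq)

    # flat prior: risk should be sqrt(2/pi)/sqrt(n)
    print(np.abs(d_flat - theta).mean(), np.sqrt(2 / np.pi) / np.sqrt(n))

    # normal prior: risk should be k_n*( c_n[1 - 2*Phi(-c_n)] + 2*exp(-c_n^2/2)/sqrt(2*pi) )
    kn = s0sq * np.sqrt(n) / (1 + n * s0sq)
    cn = (mu0 - theta) / (s0sq * np.sqrt(n))
    formula = kn * (cn * (1 - 2 * norm.cdf(-cn)) + 2 * np.exp(-cn ** 2 / 2) / np.sqrt(2 * np.pi))
    print(np.abs(d_norm - theta).mean(), formula)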
(c) When the loss function is L(d|θ) = 1{|d − θ| > C/√n} the Bayes procedure in each case is the level set of the posterior density of width 2C/√n. Because the posterior density is symmetric about the posterior mean/median (and unimodal) in each case, this is simply of the form posterior mean ± C/√n.
Determine for both decisions d(·)
(i) the risk R(θ|d) = Eθ [L (d(X)|θ)];
Solution: The risk is the probability of non-coverage. For either choice of d(X) this can be written as
Eθ[L(d(X)|θ)] = 1 − Pθ(|d(X) − θ| ≤ C/√n)
             = Pθ(d(X) < θ − C/√n) + Pθ(d(X) > θ + C/√n).   (1)

When d(X) = X̄, by symmetry this can be written as

2 Pθ(X̄ > θ + C/√n) = 2 Pθ(√n(X̄ − θ) > C) = 2[1 − Φ(C)].

Since this doesn’t depend on n, it is also the limiting risk.
In a similar way, for the second choice
d(X) = (1/(1 + nσ0²)) μ0 + (nσ0²/(1 + nσ0²)) X̄,

we have

d(X) − θ = [(μ0 − θ) + nσ0²(X̄ − θ)]/(1 + nσ0²).

Then we can write the first probability in (1) as
Pθ(d(X) − θ < −C/√n) = Pθ( nσ0²(X̄ − θ) < −(C/√n)(1 + nσ0²) − (μ0 − θ) )
                     = Pθ( √n(X̄ − θ) < −C − C/(nσ0²) − (μ0 − θ)/(σ0²√n) ).

In a similar way the second probability in (1) may be written as

Pθ( √n(X̄ − θ) > C + C/(nσ0²) − (μ0 − θ)/(σ0²√n) ).
Thus the exact risk is
Φ( −C − C/(nσ0²) − (μ0 − θ)/(σ0²√n) ) + 1 − Φ( C + C/(nσ0²) − (μ0 − θ)/(σ0²√n) )
    = [1 − Φ( C + C/(nσ0²) + (μ0 − θ)/(σ0²√n) )] + [1 − Φ( C + C/(nσ0²) − (μ0 − θ)/(σ0²√n) )].
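As a sanity check on the algebra above (an addition, not part of the original solution), the sketch below simulates the non-coverage probability of the normal-prior interval d(X) ± C/√n and compares it with the exact expression just derived; the values of C, θ, μ0, σ0² and n are arbitrary.

    import numpy as np
    from scipy.stats import norm

    # Monte Carlo check of the exact non-coverage risk of the normal-prior interval.
    rng = np.random.default_rng(3)
    C, theta, mu0, s0sq, n, reps = 1.2, 0.4, -0.6, 0.8, 30, 500_000

    xbar = rng.normal(theta, 1.0 / np.sqrt(n), size=reps)
    d = (mu0 + n * s0sq * xbar) / (1 + n * s0sq)
    sim_risk = np.mean(np.abs(d - theta) > C / np.sqrt(n))

    a = C + C / (n * s0sq)                       # C + C/(n*sigma0^2)
    b = (mu0 - theta) / (s0sq * np.sqrt(n))      # (mu0 - theta)/(sigma0^2*sqrt(n))
    exact_risk = (1 - norm.cdf(a + b)) + (1 - norm.cdf(a - b))

    print(sim_risk, exact_risk)                  # should agree up to Monte Carlo error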
(ii) the limiting risk lim_{n→∞} Eθ[L(d(X)|θ)].
Solution: See above for d(X) = X ̄. For the second choice note that since
C + C/(nσ0²) ± (μ0 − θ)/(σ0²√n) → C
as n → ∞, the limiting risk in the second case is also 2[1−Φ(C)]; so again both methods have the same limiting risks which are free of θ.
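To illustrate this convergence numerically (again an addition, using the same arbitrary parameter choices as in the sketch above), the exact risk of the normal-prior interval can be evaluated for increasing n and compared with 2[1 − Φ(C)].

    import numpy as np
    from scipy.stats import norm

    # The exact risk of the normal-prior interval tends to 2[1 - Phi(C)],
    # the risk of the flat-prior interval, as n grows.
    C, theta, mu0, s0sq = 1.2, 0.4, -0.6, 0.8

    def exact_risk(n):
        a = C + C / (n * s0sq)
        b = (mu0 - theta) / (s0sq * np.sqrt(n))
        return (1 - norm.cdf(a + b)) + (1 - norm.cdf(a - b))

    for n in (10, 100, 1_000, 10_000):
        print(n, exact_risk(n))

    print("2[1 - Phi(C)] =", 2 * (1 - norm.cdf(C)))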
