Answer all questions in the Answer Booklet. Each question is worth 25 marks. (Total marks: 100)
Question 1: Ordinary Least Squares (25 marks)
Consider the classical linear regression model
yi = x′iβ + εi, i = 1, …, N (1)
E[εi|xi] = 0
where xi comprises K regressors.
(i) Show that the conditional moment restriction implies E[εi] = 0 and E[xiεi] = 0. (4 marks)
By the law of iterated expectations, E[εi] = E[E[εi|xi]] = E[0] = 0 and E[xiεi] = E[E[xiεi|xi]] = E[xiE[εi|xi]] = E[xi · 0] = 0.
(ii) In matrix form, (1) can be expressed as:
y = Xβ + ε, E[ε|X] = 0
where y = (y1, y2, …, yN )′ and ε = (ε1, ε2, …, εN )′. What are the dimensions of X? (1 mark)
N×K
(iii) Derive β̂ = arg min_{β ∈ R^K} ε′Wε using matrix algebra, where W is an N × N symmetric positive definite matrix and X has rank K. (6 marks)
We have
ε′Wε = (y − Xβ)′W(y − Xβ) = β′X′WXβ + y′Wy − 2y′WXβ
The FOC is
2X′WXβ̂ − 2X′Wy = 0
X′WXβ̂ = X′Wy
If X is rank K then X′WX is invertible and
β̂ = (X′WX)⁻¹X′Wy
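The closed form above can be checked numerically. The following is a minimal sketch, assuming NumPy and simulated data; the function name `weighted_ls`, the sample sizes and the randomly generated W are illustrative choices, not part of the exam.

```python
import numpy as np

def weighted_ls(X, y, W):
    """Return argmin_b (y - Xb)' W (y - Xb) = (X'WX)^{-1} X'Wy."""
    XtW = X.T @ W
    # Solve the normal equations X'WX b = X'Wy instead of forming an inverse.
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(0)
N, K = 50, 3
X = rng.normal(size=(N, K))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=N)

# Any symmetric positive definite W works; this one is random, for illustration.
M = rng.normal(size=(N, N))
W = M @ M.T + N * np.eye(N)

beta_hat = weighted_ls(X, y, W)
resid = y - X @ beta_hat
foc = X.T @ W @ resid  # first-order condition X'W(y - X beta_hat), numerically zero
```

Setting `W = np.eye(N)` in this sketch gives the OLS estimator as a special case.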
(iv) Derive E[β̂|X] and VAR[β̂|X]. You may treat W as fixed (i.e. non-random). (6 marks)
Semester One Final Examinations, SAMPLE PAPER ECON6300/7320/8300 Advanced Microeconometrics
β̂ = (X′WX)⁻¹X′Wy
= (X′WX)⁻¹X′W(Xβ + ε)
= β + (X′WX)⁻¹X′Wε
E[β̂|X] = β + E[(X′WX)⁻¹X′Wε|X]
= β + (X′WX)⁻¹X′WE[ε|X] = β
VAR[β̂|X] = VAR[(X′WX)⁻¹X′Wε|X]
= (X′WX)⁻¹X′WVAR[ε|X]WX(X′WX)⁻¹ = (X′WX)⁻¹X′WΩWX(X′WX)⁻¹
where Ω ≡ E[εε′|X] = VAR[ε|X], since E[ε|X] = 0.
(v) Now suppose that W = Ω⁻¹ where Ω = E[εε′|X]. Show that VAR[β̂|X] = (X′Ω⁻¹X)⁻¹. (3 marks)
VAR[β̂|X] = (X′WX)⁻¹X′WΩWX(X′WX)⁻¹
= (X′Ω⁻¹X)⁻¹X′Ω⁻¹ΩΩ⁻¹X(X′Ω⁻¹X)⁻¹
= (X′Ω⁻¹X)⁻¹(X′Ω⁻¹X)(X′Ω⁻¹X)⁻¹ = (X′Ω⁻¹X)⁻¹
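The simplification of the sandwich formula when W = Ω⁻¹ can be verified numerically. This is a sketch under assumed inputs: a diagonal (heteroskedastic) Ω and simulated X; `sandwich_variance` is a hypothetical helper name. It also compares the result against W = I (OLS), for which the variance is never smaller in the matrix sense.

```python
import numpy as np

def sandwich_variance(X, W, Omega):
    """VAR[beta_hat | X] = (X'WX)^{-1} X'W Omega W X (X'WX)^{-1}."""
    bread = np.linalg.inv(X.T @ W @ X)
    meat = X.T @ W @ Omega @ W @ X
    return bread @ meat @ bread

rng = np.random.default_rng(1)
N, K = 40, 2
X = rng.normal(size=(N, K))
Omega = np.diag(rng.uniform(0.5, 3.0, size=N))  # assumed diagonal Omega
Omega_inv = np.linalg.inv(Omega)

V_gls = sandwich_variance(X, Omega_inv, Omega)      # W = Omega^{-1}
V_target = np.linalg.inv(X.T @ Omega_inv @ X)       # (X' Omega^{-1} X)^{-1}
V_ols = sandwich_variance(X, np.eye(N), Omega)      # W = I
gap_eigs = np.linalg.eigvalsh(V_ols - V_gls)        # nonnegative up to rounding
```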
(vi) To which estimator does this choice of W correspond? How does one implement this estimator in practice? (5 marks)
This is the GLS estimator. It is infeasible in practice because Ω is unknown. We can implement the FGLS estimator, which involves the following steps:
(1) Parameterize Ω = Ω(θ), where θ is a finite-dimensional parameter whose dimension does not depend on N.
(2) Obtain ε̂ = y − Xβ̂_OLS.
(3) Use ε̂ to estimate θ, treating ε̂ as if it were ε.
(4) Set W = Ω(θ̂)⁻¹ and obtain β̂.
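The four FGLS steps can be sketched in code. The parameterization Ω(θ) = diag(exp(z′iθ)) used here is an assumption chosen for illustration (the exam leaves Ω(θ) unspecified), as are the function names and the simulated data.

```python
import numpy as np

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

def fgls_exp_variance(X, y, Z):
    """FGLS assuming Omega(theta) = diag(exp(Z @ theta)) (illustrative choice)."""
    # Step 2: OLS residuals.
    e_hat = y - X @ ols(X, y)
    # Step 3: estimate theta by regressing log squared residuals on Z.
    # The intercept is biased by a constant, which only rescales W and
    # therefore leaves beta_hat unchanged.
    theta_hat = ols(Z, np.log(e_hat**2))
    # Step 4: W = Omega(theta_hat)^{-1} is diagonal, so apply it as weights.
    w = np.exp(-Z @ theta_hat)
    Xw = X * w[:, None]
    beta_hat = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return beta_hat, theta_hat

rng = np.random.default_rng(2)
N = 2000
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
Z = X.copy()  # assumption: the variance depends on the same regressor
theta_true = np.array([0.5, 1.0])
eps = rng.normal(size=N) * np.exp(0.5 * (Z @ theta_true))
y = X @ np.array([1.0, 2.0]) + eps

beta_fgls, theta_hat = fgls_exp_variance(X, y, Z)
```

If the assumed form of Ω(θ) is wrong, FGLS generally remains consistent for β under E[ε|X] = 0, but it loses the efficiency property and the simple variance formula from part (v).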
Question 2: Finite Mixtures (25 marks)
Consider the mixture model y | x, d ∼ N(α_d + xβ_d, σ_d²),
where y is an outcome of interest, x is a continuous regressor and d ∈ {0,1} denotes class membership.
(i) Suppose that you obtain a random sample {(yi, xi, di)}, i = 1, …, N. Derive an estimator of θ = (α0, α1, β0, β1, σ0², σ1²). (7 marks)
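θ can be estimated by maximum likelihood. The likelihood is:
$$
\begin{aligned}
L &= \prod_{i=1}^{N} f(y_i \mid d_i, x_i, \theta) \\
  &= \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_{d_i}^2}} \exp\left(-\frac{(y_i - \alpha_{d_i} - \beta_{d_i} x_i)^2}{2\sigma_{d_i}^2}\right) \\
  &= \prod_{i:\,d_i=0} \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left(-\frac{(y_i - \alpha_0 - \beta_0 x_i)^2}{2\sigma_0^2}\right) \prod_{i:\,d_i=1} \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\left(-\frac{(y_i - \alpha_1 - \beta_1 x_i)^2}{2\sigma_1^2}\right)
\end{aligned}
$$
The log likelihood is:
$$
\ln L = -\frac{N}{2}\ln 2\pi - \frac{N_0}{2}\ln\sigma_0^2 - \frac{N_1}{2}\ln\sigma_1^2 - \frac{1}{2\sigma_0^2}\sum_{i:\,d_i=0}(y_i - \alpha_0 - \beta_0 x_i)^2 - \frac{1}{2\sigma_1^2}\sum_{i:\,d_i=1}(y_i - \alpha_1 - \beta_1 x_i)^2
$$
where $N_1 = \sum_{i=1}^{N} d_i$ and $N_0 = N - N_1$. The FOC are:
$$
\begin{aligned}
\sum_{d_i=0}(y_i - \hat\alpha_0 - \hat\beta_0 x_i) &= 0, & \sum_{d_i=0}(y_i - \hat\alpha_0 - \hat\beta_0 x_i)x_i &= 0 \\
\sum_{d_i=1}(y_i - \hat\alpha_1 - \hat\beta_1 x_i) &= 0, & \sum_{d_i=1}(y_i - \hat\alpha_1 - \hat\beta_1 x_i)x_i &= 0 \\
\frac{1}{\hat\sigma_0^2}\sum_{d_i=0}(y_i - \hat\alpha_0 - \hat\beta_0 x_i)^2 &= N_0, & \frac{1}{\hat\sigma_1^2}\sum_{d_i=1}(y_i - \hat\alpha_1 - \hat\beta_1 x_i)^2 &= N_1
\end{aligned}
$$
The estimators are
$$
\begin{aligned}
\hat\alpha_0 &= \bar y_0 - \hat\beta_0 \bar x_0, & \hat\beta_0 &= \frac{\sum_{d_i=0}(y_i - \bar y_0)(x_i - \bar x_0)}{\sum_{d_i=0}(x_i - \bar x_0)^2} \\
\hat\alpha_1 &= \bar y_1 - \hat\beta_1 \bar x_1, & \hat\beta_1 &= \frac{\sum_{d_i=1}(y_i - \bar y_1)(x_i - \bar x_1)}{\sum_{d_i=1}(x_i - \bar x_1)^2} \\
\hat\sigma_0^2 &= \frac{1}{N_0}\sum_{d_i=0}(y_i - \hat\alpha_0 - \hat\beta_0 x_i)^2, & \hat\sigma_1^2 &= \frac{1}{N_1}\sum_{d_i=1}(y_i - \hat\alpha_1 - \hat\beta_1 x_i)^2
\end{aligned}
$$
where $\bar y_0 = \frac{1}{N_0}\sum_{d_i=0} y_i$ and $\bar y_1 = \frac{1}{N_1}\sum_{d_i=1} y_i$, with $\bar x_0$ and $\bar x_1$ defined analogously. These are the MLE obtained by splitting the sample into its two classes.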
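The split-sample estimators above translate directly into code. This is a minimal sketch on simulated data; the function name `class_mle` and the true parameter values are assumptions made for illustration.

```python
import numpy as np

def class_mle(y, x, d, k):
    """MLE of (alpha_k, beta_k, sigma_k^2) using only observations with d == k."""
    yk, xk = y[d == k], x[d == k]
    beta = np.sum((yk - yk.mean()) * (xk - xk.mean())) / np.sum((xk - xk.mean()) ** 2)
    alpha = yk.mean() - beta * xk.mean()
    sigma2 = np.mean((yk - alpha - beta * xk) ** 2)  # divides by N_k, as in the MLE
    return alpha, beta, sigma2

rng = np.random.default_rng(5)
N = 5000
x = rng.normal(size=N)
d = (rng.random(N) < 0.5).astype(int)
# Assumed truth: class 0 is N(1 + x, 0.25), class 1 is N(2 - x, 1).
y = np.where(d == 1,
             2.0 - 1.0 * x + rng.normal(size=N),
             1.0 + 1.0 * x + 0.5 * rng.normal(size=N))

theta_hat = {k: class_mle(y, x, d, k) for k in (0, 1)}
```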
(ii) Suppose that you observe only {(yi, xi)}, i = 1, …, N. Propose an estimator of θ. (6 marks)
The estimator in (i) is not feasible because it depends on di, which is latent. The parameters can instead be estimated using a Finite Mixture model with two latent classes. The EM algorithm can be used, which, for θ = (α0, α1, β0, β1, σ0², σ1², π) and π = Prob[di = 1], is given by iterating between:
(a) Computing the expectation of the log-likelihood as a function of the parameter values from the previous iteration: E_{d|y,x,θ^(s−1)}[ln L(θ|y, x, d)]
(b) Maximizing with respect to θ to obtain θ^(s).
Many starting values should be used to reduce the risk of stopping at a local rather than the global maximum.
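MORE DETAILS ON EM (NOT EXPECTED IN YOUR ANSWER)
The intuition for EM is that if we know d we can obtain θ (as in part (i)), and if we know θ we can obtain d. So we iterate between these two steps. We make use of the mixture density:
$$
f(y, d \mid x, \theta) = [\pi g(y \mid x, \alpha_1, \beta_1, \sigma_1)]^{d}\,[(1-\pi) g(y \mid x, \alpha_0, \beta_0, \sigma_0)]^{1-d}
$$
where g(y|x, α, β, σ) is the pdf of N(α + βx, σ²). This gives us:
$$
\begin{aligned}
f(y \mid x, \theta) &= \sum_{d=0,1} f(y, d \mid x, \theta) = \pi g(y \mid x, \alpha_1, \beta_1, \sigma_1) + (1-\pi) g(y \mid x, \alpha_0, \beta_0, \sigma_0) \\
f(y \mid d, x, \theta) &= \frac{f(y, d \mid x, \theta)}{p(d \mid x, \theta)} = \frac{f(y, d \mid x, \theta)}{\pi^{d}(1-\pi)^{1-d}} = g(y \mid x, \alpha_1, \beta_1, \sigma_1)^{d}\, g(y \mid x, \alpha_0, \beta_0, \sigma_0)^{1-d} \\
p(d \mid y, x, \theta) &= \frac{f(y, d \mid x, \theta)}{f(y \mid x, \theta)} = \frac{[\pi g(y \mid x, \alpha_1, \beta_1, \sigma_1)]^{d}\,[(1-\pi) g(y \mid x, \alpha_0, \beta_0, \sigma_0)]^{1-d}}{\pi g(y \mid x, \alpha_1, \beta_1, \sigma_1) + (1-\pi) g(y \mid x, \alpha_0, \beta_0, \sigma_0)}
\end{aligned}
$$
Notice that we used f(y|d,x,θ) to construct the likelihood in part (i).
Here's the EM algorithm:
Set θ^(0) as a starting value, and s = 1, then iterate the following until convergence:
(a) Let
$$
\pi_i^{(s)} = E[d_i \mid y_i, x_i, \theta^{(s-1)}] = 1 \times p(1 \mid y_i, x_i, \theta^{(s-1)}) + 0 \times p(0 \mid y_i, x_i, \theta^{(s-1)}) = p(1 \mid y_i, x_i, \theta^{(s-1)})
$$
Then
$$
\begin{aligned}
E_{d \mid y, x, \theta^{(s-1)}}[\ln L(\theta \mid y, x, d)] &= E_{d \mid y, x, \theta^{(s-1)}}\left[\ln \prod_{i=1}^{N} f(y_i, d_i \mid x_i, \theta)\right] \\
&= C + \sum_{i=1}^{N}\left\{ \pi_i^{(s)}\left[\ln\pi - \frac{\ln\sigma_1^2}{2} - \frac{(y_i - \alpha_1 - \beta_1 x_i)^2}{2\sigma_1^2}\right] + (1-\pi_i^{(s)})\left[\ln(1-\pi) - \frac{\ln\sigma_0^2}{2} - \frac{(y_i - \alpha_0 - \beta_0 x_i)^2}{2\sigma_0^2}\right] \right\}
\end{aligned}
$$
where $C = -\frac{N}{2}\ln(2 \times 3.142\ldots)$.
(b) Maximize with respect to θ to obtain θ^(s). For example, it’s easy to check that
$$
\pi^{(s)} = \frac{1}{N}\sum_{i=1}^{N} \pi_i^{(s)}
$$
(c) s = s + 1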
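The iteration can be sketched in code as follows. The text states only the update for π; the weighted least squares updates for (α, β, σ²) used below are the standard maximizers of the expected log-likelihood and are filled in here as an assumption. Function names, starting values and simulated data are likewise illustrative.

```python
import numpy as np

def normal_pdf(y, mean, var):
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def weighted_fit(y, x, w):
    """Maximize sum_i w_i * ln g(y_i | x_i, alpha, beta, sigma) over (alpha, beta, sigma^2)."""
    xbar, ybar = np.average(x, weights=w), np.average(y, weights=w)
    beta = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    alpha = ybar - beta * xbar
    sigma2 = np.sum(w * (y - alpha - beta * x) ** 2) / np.sum(w)
    return alpha, beta, sigma2

def em_mixture(y, x, theta0, n_iter=500, tol=1e-10):
    a0, a1, b0, b1, s0, s1, pi = theta0
    loglik_path = []
    for _ in range(n_iter):
        # (a) E-step: pi_i = p(d_i = 1 | y_i, x_i, theta^(s-1)).
        f1 = pi * normal_pdf(y, a1 + b1 * x, s1)
        f0 = (1 - pi) * normal_pdf(y, a0 + b0 * x, s0)
        loglik_path.append(np.sum(np.log(f1 + f0)))  # observed-data log-likelihood
        pi_i = f1 / (f1 + f0)
        # (b) M-step: closed-form maximizers of the expected log-likelihood.
        pi = pi_i.mean()
        a1, b1, s1 = weighted_fit(y, x, pi_i)
        a0, b0, s0 = weighted_fit(y, x, 1 - pi_i)
        # (c) Stop once the log-likelihood no longer improves.
        if len(loglik_path) > 1 and loglik_path[-1] - loglik_path[-2] < tol:
            break
    return (a0, a1, b0, b1, s0, s1, pi), loglik_path

rng = np.random.default_rng(4)
N = 2000
x = rng.normal(size=N)
d = rng.random(N) < 0.4  # latent class, used only to simulate y
y = np.where(d, 3.0 - 1.0 * x, 0.0 + 1.0 * x) + 0.5 * rng.normal(size=N)

theta_hat, loglik_path = em_mixture(y, x, theta0=(0.5, 2.5, 0.5, -0.5, 1.0, 1.0, 0.5))
```

A single starting value is used here for brevity. In practice the routine would be run from several starting values and the run with the highest final log-likelihood kept.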
(iii) Now suppose that you obtain the regression output below for a sample with N = 1,000 observations. Which functions of α0, α1, β0, β1 are identifiable from such regressions? (6 marks)
The regression models identify
E[y|x] = πα1 + (1 − π)α0 + (πβ1 + (1 − π)β0)x
and
E[y|x, d = 0] = α0 + β0x
Provided that VAR[x] > 0, we are thus able to identify α0, πα1 + (1 − π)α0, β0 and πβ1 + (1 − π)β0.
Estimates for these are 1.06, 1.33, 1.03 and 1.31 respectively.
(iv) Suppose now that you obtain the additional information VAR[d] = 0.25. Use this information and the regression results from (iii) to obtain estimates of α0, α1, β0, β1. (6 marks)
Since d follows a Bernoulli distribution, we know that VAR[d] = π(1 − π) = 0.25, i.e. (π − 0.5)² = 0. Hence, we know π = 0.5. Using our estimates from (iii), we have
α0 = 1.06
0.5α1 + 0.5α0 = 1.33
β0 = 1.03
0.5β1 + 0.5β0 = 1.31
Solving yields α1 = 2(1.33) − 1.06 = 1.60 and β1 = 2(1.31) − 1.03 = 1.59.
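The arithmetic can be reproduced in a few lines. This sketch uses only the four estimates quoted in the answer to (iii); the variable names are illustrative.

```python
# Bernoulli variance pi * (1 - pi) = 0.25 has the single root pi = 0.5.
pi = 0.5

alpha0, beta0 = 1.06, 1.03        # from the regression on the d = 0 subsample
mix_alpha, mix_beta = 1.33, 1.31  # from the pooled regression of y on x

# Invert pi * a1 + (1 - pi) * a0 = mix for the class-1 parameters.
alpha1 = (mix_alpha - (1 - pi) * alpha0) / pi
beta1 = (mix_beta - (1 - pi) * beta0) / pi
```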
Question 3: Bayesian methods (25 marks)
Consider the random variable Y which follows a Bernoulli distribution with parameter θ
$$
f(y \mid \theta) = \theta^{y}(1-\theta)^{1-y}\,1(y \in \{0, 1\})
$$
(i) You decide to specify the prior $\pi(\theta) = \frac{1}{b-a}1(\theta \in [a, b])$. Which value(s) of a and b are appropriate? Why? (3 marks)
θ is a probability, so any 0 ≤ a < b ≤ 1 (a < b is needed for the prior density to be well defined).
(ii) Now suppose that you choose a = 0 and b = 0.5. Derive the posterior distribution π(θ|y1) after observing a realization y1 = 1. (5 marks)
$$
\pi(\theta \mid y_1 = 1) \propto 2\theta\,1(\theta \in [0, 0.5])
$$
so
$$
\pi(\theta \mid y_1 = 1) = c\,\theta\,1(\theta \in [0, 0.5])
$$
Since it must integrate to 1,
$$
\int_0^{0.5} c\,\theta\,d\theta = 1 \Rightarrow c = 8
$$
(iii) Update the posterior distribution after observing a second realization y2 = 1. (5 marks)
$$
\pi(\theta \mid y_1 = 1, y_2 = 1) \propto \theta^{2}\,1(\theta \in [0, 0.5])
$$
so
$$
\pi(\theta \mid y_1 = 1, y_2 = 1) = c\,\theta^{2}\,1(\theta \in [0, 0.5])
$$
Since it must integrate to 1,
$$
\int_0^{0.5} c\,\theta^{2}\,d\theta = 1 \Rightarrow c = 24
$$
(iv) Now suppose that you have a sample of 1,000 observations and the following STATA output. Interpret your results. (7 marks)
The trace does not appear to be white noise, and the histogram suggests a mass point at the boundary of the support of the prior. The autocorrelation in the chain is not a cause for concern, and the density plot suggests that the burn-in is sufficiently long. Proceeding to the output, the acceptance rate suggests that the chain has worked well, though the efficiency is a little low. The estimated posterior mean and median are close to the boundary of the prior support, and the .95 credible interval is very narrow, suggestive of a mass point at the boundary. The posterior is highly skewed.
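As a check on parts (ii) and (iii), the normalizing constants can be computed exactly. This sketch uses the fact that the integral of θ^k over [0, b] is b^(k+1)/(k+1); the function names are illustrative, and the posterior means are an added by-product, not part of the exam answer.

```python
from fractions import Fraction

def normalizing_constant(k, b=Fraction(1, 2)):
    """c such that c * theta**k integrates to 1 over [0, b]."""
    return (k + 1) / b ** (k + 1)

def posterior_mean(k, b=Fraction(1, 2)):
    """Mean of the density c * theta**k on [0, b]."""
    return normalizing_constant(k, b) * b ** (k + 2) / (k + 2)

c_one = normalizing_constant(1)   # after y1 = 1
c_two = normalizing_constant(2)   # after y1 = 1 and y2 = 1
mean_one = posterior_mean(1)
mean_two = posterior_mean(2)
```

Here `c_one` and `c_two` equal 8 and 24, and the posterior mean moves from 1/3 to 3/8, towards the upper bound of the prior support.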
(v) Comment on how the specification of the prior may lead the posterior distribution to be inconsistent for the true parameter θ0. (5 marks)
The posterior is proportional to the prior multiplied by the likelihood: π(θ|y) ∝ π(θ)L(θ|y) ∝ 1(θ ∈ [0, 0.5])L(θ|y). This means that π(θ|y) = 0 for θ > 0.5 whatever the data, and so, if the true value θ0 exceeds 0.5, the posterior cannot degenerate at θ0 as the sample size grows; it piles up near the boundary 0.5 instead.
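The boundary behaviour can be illustrated by evaluating the posterior on a grid. This is a sketch under assumed values (θ0 = 0.7, N = 1,000, a uniform prior on [0, 0.5]); the function name and grid size are arbitrary choices.

```python
import numpy as np

def posterior_on_grid(successes, n, a=0.0, b=0.5, grid_size=20001):
    """Grid approximation to the posterior under a uniform prior on [a, b]."""
    theta = np.linspace(a, b, grid_size)[1:]  # drop the left endpoint to avoid log(0)
    log_post = successes * np.log(theta) + (n - successes) * np.log1p(-theta)
    weights = np.exp(log_post - log_post.max())  # subtract the max for stability
    return theta, weights / weights.sum()

rng = np.random.default_rng(3)
theta0 = 0.7                      # assumed true value, outside the prior support
y = rng.random(1000) < theta0     # Bernoulli(theta0) draws

theta, w = posterior_on_grid(y.sum(), y.size)
post_mean = np.sum(theta * w)
mass_near_boundary = w[theta > 0.49].sum()
```

In this setup almost all posterior mass sits within 0.01 of the upper bound, which matches the pattern described in the answer to (iv): a posterior mean close to the boundary and a very narrow credible interval.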
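Question 4: Structural models (25 marks)
Consider the two equation structural model:
Y = AY + BX + ε
E[εX′] = 0
where Y = (Y1, Y2)′, X = (X1, X2)′ and ε = (ε1, ε2)′ are 2 × 1 vectors,
$$
A = \begin{pmatrix} 0 & \alpha \\ \alpha & 0 \end{pmatrix}, \quad |\alpha| < 1, \quad B = \begin{pmatrix} \beta & 0 \\ 0 & \beta \end{pmatrix}
$$
and α, β are parameters. Note also that a 2 × 2 matrix
$$
C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}
$$
is invertible if $C_{11}C_{22} - C_{21}C_{12} \neq 0$, in which case:
$$
C^{-1} = (C_{11}C_{22} - C_{21}C_{12})^{-1}\begin{pmatrix} C_{22} & -C_{12} \\ -C_{21} & C_{11} \end{pmatrix}
$$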
(i) Under which conditions is E[XX′] invertible? (3 marks)
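It is invertible if X1 and X2 are linearly independent, i.e. there is no c ≠ 0 such that c′X = 0 with probability one. If X1 and X2 have mean zero and positive variances, this is equivalent to their correlation being less than 1 in absolute value.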
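(ii) Assuming that E[XX′]⁻¹ exists, compute Π ≡ E[YX′]E[XX′]⁻¹ as a function of A, B. (3 marks)
YX′ = AYX′ + BXX′ + εX′
(I2 − A)YX′ = BXX′ + εX′
Taking expectations and using E[εX′] = 0,
(I2 − A)E[YX′] = BE[XX′]
Since |α| < 1, det(I2 − A) = 1 − α² ≠ 0, so I2 − A is invertible and
E[YX′] = (I2 − A)⁻¹BE[XX′]
Π = E[YX′]E[XX′]⁻¹ = (I2 − A)⁻¹B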
(iii) Compute each element of Π as a function of α, β. (4 marks)
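$$
\Pi = \begin{pmatrix} \dfrac{\beta}{1-\alpha^2} & \dfrac{\alpha\beta}{1-\alpha^2} \\[2ex] \dfrac{\alpha\beta}{1-\alpha^2} & \dfrac{\beta}{1-\alpha^2} \end{pmatrix}
$$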
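The elements of Π can be checked against (I2 − A)⁻¹B numerically. This is a minimal sketch; the values α = 0.4 and β = 1.5 are hypothetical and `reduced_form` is an illustrative name.

```python
import numpy as np

def reduced_form(alpha, beta):
    """Pi = (I - A)^{-1} B for the two-equation model."""
    A = np.array([[0.0, alpha], [alpha, 0.0]])
    B = beta * np.eye(2)
    return np.linalg.solve(np.eye(2) - A, B)

alpha, beta = 0.4, 1.5  # hypothetical parameter values
Pi = reduced_form(alpha, beta)
Pi_closed = np.array([[beta, alpha * beta],
                      [alpha * beta, beta]]) / (1 - alpha**2)
```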
(iv) Are α, β identified if β = 0? Are α, β identified if β ≠ 0? (4 marks)
If β = 0 then Π = 0, and so we know β = 0; hence β is identified. However, any α ∈ (−1, 1) is compatible with Π = 0, and so α is not identified. If β ≠ 0, then we can solve to obtain $\alpha = \Pi_{12}/\Pi_{11}$ and $\beta = \Pi_{11}(1 - \Pi_{12}^2/\Pi_{11}^2)$, and so α, β are identified.
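Both cases can be illustrated with a round trip between (α, β) and Π. This sketch uses the closed form for Π from part (iii); the function names and parameter values are hypothetical.

```python
import numpy as np

def pi_from_structural(alpha, beta):
    """Closed-form reduced-form matrix from part (iii)."""
    return beta / (1 - alpha**2) * np.array([[1.0, alpha], [alpha, 1.0]])

def structural_from_pi(Pi):
    """Invert the mapping; requires Pi[0, 0] != 0, i.e. beta != 0."""
    alpha = Pi[0, 1] / Pi[0, 0]
    beta = Pi[0, 0] * (1 - alpha**2)
    return alpha, beta

# beta != 0: the mapping is invertible, so (alpha, beta) is recovered.
alpha_rec, beta_rec = structural_from_pi(pi_from_structural(0.4, 1.5))

# beta = 0: different values of alpha give the same Pi, so alpha is not identified.
Pi_a = pi_from_structural(0.2, 0.0)
Pi_b = pi_from_structural(-0.7, 0.0)
```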
(v) Suppose now that we have a sample of (Yi, Xi) for i = 1, ..., N. Propose a consistent estimator of Π. (3 marks)
The OLS estimator is a consistent estimator of Π:
$$
\hat\Pi = \left(\sum_{i} Y_i X_i'\right)\left(\sum_{i} X_i X_i'\right)^{-1}
$$
(vi) Using your consistent estimator of Π, propose consistent estimators of α and β. (3 marks)
$\hat\alpha = \hat\Pi_{12}/\hat\Pi_{11}$ and $\hat\beta = \hat\Pi_{11}(1 - \hat\Pi_{12}^2/\hat\Pi_{11}^2)$
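The estimators in (v) and (vi) can be tried on simulated data. This is a sketch under assumptions not in the exam: standard normal X and ε, α = 0.4, β = 1.5 and N = 20,000. The function names are illustrative.

```python
import numpy as np

def simulate_structural(n, alpha, beta, rng):
    """Draw (Y_i, X_i) from Y = AY + BX + eps, stored with one observation per row."""
    A = np.array([[0.0, alpha], [alpha, 0.0]])
    X = rng.normal(size=(n, 2))
    eps = rng.normal(size=(n, 2))
    # Reduced form: Y_i = (I - A)^{-1} (beta * X_i + eps_i).
    Y = (beta * X + eps) @ np.linalg.inv(np.eye(2) - A).T
    return Y, X

def estimate(Y, X):
    # Pi_hat = (sum_i Y_i X_i') (sum_i X_i X_i')^{-1}
    Pi_hat = (Y.T @ X) @ np.linalg.inv(X.T @ X)
    alpha_hat = Pi_hat[0, 1] / Pi_hat[0, 0]
    beta_hat = Pi_hat[0, 0] * (1 - alpha_hat**2)
    return Pi_hat, alpha_hat, beta_hat

rng = np.random.default_rng(6)
Y, X = simulate_structural(20000, alpha=0.4, beta=1.5, rng=rng)
Pi_hat, alpha_hat, beta_hat = estimate(Y, X)
```

Consistency of the plug-in estimators follows from the consistency of the OLS estimator of Π and the continuity of the mapping from Π to (α, β) wherever Π11 ≠ 0.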
(vii) Explain how you would construct a 0.95 confidence interval for β. (5 marks)
Use the non-parametric bootstrap. For b = 1, ..., 200, draw a bootstrap sample of N observations (Yi, Xi) with replacement and estimate α̂^(b), β̂^(b) as in (v)-(vi). A confidence interval is obtained by taking its lower bound to be the 2.5 percentile of the β̂^(b) and its upper bound to be the 97.5 percentile of the β̂^(b).
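The percentile bootstrap described above can be sketched as follows. The data-generating values (α = 0.4, β = 1.5, N = 2,000, standard normal X and ε) are assumptions for the simulation, and the function names are illustrative.

```python
import numpy as np

def estimate_beta(Y, X):
    """Plug-in estimator of beta from parts (v)-(vi)."""
    Pi_hat = (Y.T @ X) @ np.linalg.inv(X.T @ X)
    alpha_hat = Pi_hat[0, 1] / Pi_hat[0, 0]
    return Pi_hat[0, 0] * (1 - alpha_hat**2)

def bootstrap_ci_beta(Y, X, n_boot=200, level=0.95, seed=0):
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample (Y_i, X_i) pairs with replacement
        draws[b] = estimate_beta(Y[idx], X[idx])
    tail = 100 * (1 - level) / 2
    return np.percentile(draws, [tail, 100 - tail])

# Simulated sample, for illustration only.
rng = np.random.default_rng(7)
n, alpha, beta = 2000, 0.4, 1.5
A = np.array([[0.0, alpha], [alpha, 0.0]])
X = rng.normal(size=(n, 2))
Y = (beta * X + rng.normal(size=(n, 2))) @ np.linalg.inv(np.eye(2) - A).T

beta_point = estimate_beta(Y, X)
ci_low, ci_high = bootstrap_ci_beta(Y, X)
```

Resampling whole (Yi, Xi) pairs keeps the dependence between Y and X intact within each bootstrap sample.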
END OF EXAMINATION