ECON61001: 2020-21 Econometric Methods
Solutions to Mock Exam
1.(a) x′Ax > 0 for all x ≠ 0.
1.(b) rank(A) = m; otherwise there exists x ≠ 0 such that Ax = 0, and hence x′Ax = 0, which violates A being p.d.
1.(c) For a symmetric 2×2 matrix A, if tr(A) > 0 and |A| > 0 then A is p.d. Here: tr(A1) = 4 and |A1| = 3, so A1 is p.d.; tr(A2) = 10 and |A2| = −4, so A2 is not p.d.
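The trace and determinant check in 1.(c) can be sketched in a few lines of Python. The matrices A1 and A2 below are hypothetical: they are chosen only to reproduce the trace and determinant values quoted above, and are not the matrices from the exam paper.

```python
def is_pd_2x2(a):
    """Return True if the symmetric 2x2 matrix `a` is positive definite.

    Uses the criterion from 1.(c): tr(A) > 0 and |A| > 0.
    """
    (a11, a12), (a21, a22) = a
    if a12 != a21:
        raise ValueError("the criterion is stated for symmetric matrices")
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    return trace > 0 and det > 0


# Hypothetical matrices (assumptions, not taken from the exam paper).
A1 = [[2.0, 1.0], [1.0, 2.0]]    # tr = 4,  |A| = 3
A2 = [[0.0, 2.0], [2.0, 10.0]]   # tr = 10, |A| = -4

print(is_pd_2x2(A1))
print(is_pd_2x2(A2))
```

For A2 as chosen here, x = (1, 0)′ gives x′A2x = 0, so the definition in 1.(a) fails directly.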
2.(a) Given this specification, the conditional distribution function of y given x is: P(y = 1|x) = Λ(x′β0).
Therefore, we have:
P(y = 0|x) = 1 − Λ(x′β0).
E[y|x] = 1 × Λ(x′β0) + 0 × (1 − Λ(x′β0)) = Λ(x′β0).
Using the Law of Iterated Expectations, E[y] = E[E[y|x]] = E[Λ(x′β0)]. Finally, we have
Var[y|x] = E[(y − E[y|x])²|x] = (1 − Λ(x′β0))² Λ(x′β0) + Λ(x′β0)² (1 − Λ(x′β0)) = Λ(x′β0)(1 − Λ(x′β0)).
2.(b)(i) Using Λ(z) = e^z/(1 + e^z) and ∂e^{x_i′β0}/∂x_{i,j} = β0,j e^{x_i′β0}, we have:
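\[
\frac{\partial \Lambda(x_i'\beta_0)}{\partial x_{i,j}}
= \frac{\partial e^{x_i'\beta_0}}{\partial x_{i,j}} \cdot \frac{1}{1+e^{x_i'\beta_0}}
+ (-1)\,\frac{\partial e^{x_i'\beta_0}}{\partial x_{i,j}} \cdot \frac{e^{x_i'\beta_0}}{(1+e^{x_i'\beta_0})^{2}}
= \Lambda(x_i'\beta_0)\bigl(1-\Lambda(x_i'\beta_0)\bigr)\beta_{0,j}.
\]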
2.(b)(ii) lim_{x′β0→∞} Λ(x′β0)(1 − Λ(x′β0))β0,j = 0. Since the probability depends monotonically on the index, as the index gets large the event happens with probability tending to one, and so a small change in x_{i,j} has no impact on the probability of the event.
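As a rough numerical check of 2.(b), the sketch below compares the analytic marginal effect Λ(1 − Λ)β0,j with a central finite difference, and evaluates the effect at increasingly large index values. The vectors x and beta are hypothetical values chosen for illustration only.

```python
import math


def logistic(z):
    """Logistic cdf Lambda(z) = e^z / (1 + e^z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def marginal_effect(index, beta_j):
    """Analytic marginal effect Lambda(index) * (1 - Lambda(index)) * beta_j."""
    p = logistic(index)
    return p * (1.0 - p) * beta_j


def numerical_effect(x, beta, j, h=1e-6):
    """Central finite difference of Lambda(x'beta) with respect to x[j]."""
    def prob(v):
        return logistic(sum(vi * bi for vi, bi in zip(v, beta)))

    up = list(x)
    up[j] += h
    down = list(x)
    down[j] -= h
    return (prob(up) - prob(down)) / (2.0 * h)


# Hypothetical regressor and coefficient values (assumptions for illustration).
x = [1.0, 0.5, -1.2]
beta = [0.3, 0.8, -0.4]
index = sum(xi * bi for xi, bi in zip(x, beta))

print(marginal_effect(index, beta[1]), numerical_effect(x, beta, 1))

# The effect shrinks towards zero as the index x'beta grows.
for big_index in (5.0, 10.0, 50.0):
    print(big_index, marginal_effect(big_index, beta[1]))
```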
3. This is not true. Unbiasedness is a statement about E[θ̂_T], the mean of the sampling distribution of θ̂_T. Consistency means lim_{T→∞} P(∥θ̂_T − θ0∥ < ε) = 1 for all ε > 0, and so implies that the sampling distribution of θ̂_T collapses to a spike at θ0. A sufficient condition for consistency is that both bias(θ̂_T) → 0 and Var[θ̂_T] → 0 as T → ∞; bias(θ̂_T) → 0 alone does not imply the collapse of the sampling distribution onto a single point.
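The distinction can be illustrated with a small simulation. This is a sketch under assumed settings (normal data with mean θ0 = 2 and unit variance), comparing two hypothetical estimators of the mean. The first observation alone is unbiased for every T but its variance never shrinks. The sample mean plus 1/T is biased, but both its bias and its variance vanish.

```python
import random


def coverage(estimator, T, theta0=2.0, eps=0.25, reps=1000, seed=0):
    """Monte Carlo estimate of P(|estimator - theta0| < eps) at sample size T."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(theta0, 1.0) for _ in range(T)]
        if abs(estimator(sample) - theta0) < eps:
            hits += 1
    return hits / reps


def first_obs(sample):
    """Unbiased for every T, but its variance does not shrink with T."""
    return sample[0]


def shifted_mean(sample):
    """Biased (bias = 1/T), but bias and variance both tend to zero."""
    return sum(sample) / len(sample) + 1.0 / len(sample)


for T in (10, 400):
    print(T, coverage(first_obs, T), coverage(shifted_mean, T))
```

With these settings the coverage probability of the first-observation estimator should stay roughly constant in T, while that of the shifted mean should approach one.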
4.(a) The conditions are: (i) E[ziui] = 0; (ii) E[zixi] ≠ 0. Condition (i) cannot be tested directly as it depends on ui, which is unobservable. Condition (ii) can be tested via the first-stage regression, that is, by regressing xi on zi and testing the joint significance of the coefficients on zi with an F-statistic: estimate xi = zi′γ + error and test H0: γ = 0 (zi not relevant) versus H1: γ ≠ 0 (zi relevant).
4.(b) • xi is an endogenous regressor if E[xiui] ≠ 0. Given the model for xi, we have: E[xiui] = E[(zi′γ0 + vi)ui] = γ0′E[ziui] + E[uivi].
Using the Law of Iterated Expectations (LIE) and E[ui|zi] = 0 (given), it follows that E[ziui] = E[E[ui|zi]zi] = 0.
Using the Law of Iterated Expectations, E[ui|zi] = 0 (given) and Cov[ui, vi|zi] = Ω1,2, the (1, 2) element of Ω0, it follows that
E[uivi] = E[E[uivi|zi]] = E[Cov[ui, vi|zi]] = E[Ω1,2] = Ω1,2.
So the condition for xi to be endogenous is Ω1,2 ≠ 0.
• Orthogonality condition: Using LIE and E[ui|zi] = 0 (given), E[ziui] = E[E[ui|zi]zi] = 0.
• Relevance condition: Using LIE and E[vi|zi] = 0 (given), we have
E[zixi] = E[zi(zi′γ0 + vi)] = Mzzγ0 + E[E[vi|zi]zi] = Mzzγ0. Since Mzz is p.d. (given), the condition for relevance is γ0 ≠ 0.
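The three moment conditions in 4.(b) can be checked by simulation. The sketch below assumes a single standard normal instrument (so Mzz = 1), jointly normal errors with unit variances and covariance Ω1,2 = 0.5, and γ0 = 0.8. All of these values are illustrative assumptions rather than part of the exam question.

```python
import math
import random


def simulate_moments(n=50_000, gamma0=0.8, omega12=0.5, seed=1):
    """Return sample analogues of E[z u], E[x u] and E[z x]."""
    rng = random.Random(seed)
    s_zu = s_xu = s_zx = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        u = e1
        # v has unit variance and Cov[u, v] = omega12, independently of z.
        v = omega12 * e1 + math.sqrt(1.0 - omega12 ** 2) * e2
        x = gamma0 * z + v
        s_zu += z * u
        s_xu += x * u
        s_zx += z * x
    return s_zu / n, s_xu / n, s_zx / n


m_zu, m_xu, m_zx = simulate_moments()
print("E[z u] ~", m_zu)   # orthogonality: close to 0
print("E[x u] ~", m_xu)   # endogeneity: close to omega12
print("E[z x] ~", m_zx)   # relevance: close to Mzz * gamma0
```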
5.(a) A scalar time series {vt} is covariance stationary if its first two moments do not depend on t, that is, E[vt] = μ, Var[vt] = γ0 and Cov[vt, vt−j] = γ|j|. We are given that et is white noise and so: E[et] = 0 for all t; Var[et] = E[et²] = σ² for all t; Cov[et, es] = E[etes] = 0 for all t ≠ s.
(i) ut = et and so, using the properties stated above, E[ut] = 0, Var[ut] = σ², Cov[ut, us] = 0 for t ≠ s ⇒ ut is covariance stationary.
(ii) E[vt] = (−1)^t + E[et] = (−1)^t as E[et] = 0, and so the mean depends on time: for example, E[v1] = −1 and E[v2] = 1. Therefore, vt is not covariance stationary.
(iii) Using the properties of white noise stated above, we have: E[wt] = (−1)^t E[et] = 0; Var[wt] = (−1)^{2t} E[et²] = σ²; Cov[wt, ws] = E[wtws] = (−1)^{t+s} E[etes] = 0 for t ≠ s. Therefore, wt is covariance stationary.
5.(b) vt is not strictly stationary because it is not covariance stationary; ut and wt may or may not be strictly stationary but we have insufficient information about {et} in order to assess this.
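A short simulation can make the time dependence of the means visible. This is a sketch that assumes ut = et, vt = (−1)^t + et and wt = (−1)^t et, which is what the moment calculations in 5.(a) indicate, and further assumes Gaussian white noise. The exam only states that et is white noise.

```python
import random


def parity_means(T=20_000, sigma=1.0, seed=2):
    """Average u_t, v_t and w_t separately over even and odd t.

    Returns {name: (mean over even t, mean over odd t)}.
    """
    rng = random.Random(seed)
    sums = {"u": [0.0, 0.0], "v": [0.0, 0.0], "w": [0.0, 0.0]}
    counts = [0, 0]
    for t in range(1, T + 1):
        e = rng.gauss(0.0, sigma)          # Gaussian white noise (assumption)
        sign = -1.0 if t % 2 else 1.0      # (-1)^t
        k = t % 2                          # 0 for even t, 1 for odd t
        sums["u"][k] += e
        sums["v"][k] += sign + e
        sums["w"][k] += sign * e
        counts[k] += 1
    return {name: (s[0] / counts[0], s[1] / counts[1]) for name, s in sums.items()}


for name, (even_mean, odd_mean) in parity_means().items():
    print(name, "even t:", round(even_mean, 3), "odd t:", round(odd_mean, 3))
```

Only vt should show means that differ between even and odd t; the means of ut and wt should be close to zero in both cases.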
6.(a) Substituting for y in the formula for the estimator, we have
\[
\tilde{\beta}_T = (Z'X)^{-1}Z'y = \beta_0 + (Z'X)^{-1}Z'u, \tag{1}
\]
and so
\[
E[\tilde{\beta}_T] = E\bigl[\beta_0 + (Z'X)^{-1}Z'u\bigr].
\]
Using X, Z constant and E[u] = 0, it follows that
\[
E[\tilde{\beta}_T] = \beta_0 + (Z'X)^{-1}Z'E[u] = \beta_0.
\]
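6.(b) We have
\[
\mathrm{Var}[\tilde{\beta}_T] = E\bigl[(\tilde{\beta}_T - E[\tilde{\beta}_T])(\tilde{\beta}_T - E[\tilde{\beta}_T])'\bigr].
\]
Using part (a) and (1), it follows that
\[
\mathrm{Var}[\tilde{\beta}_T] = E\bigl[(Z'X)^{-1}Z'uu'Z(X'Z)^{-1}\bigr].
\]
Since Z, X are fixed in repeated samples (given), we can bring the expectation operator through:
\[
\mathrm{Var}[\tilde{\beta}_T] = (Z'X)^{-1}Z'E[uu']Z(X'Z)^{-1}.
\]
Given that E[u] = 0 and Var[u] = σ0² I_T, it follows that E[uu′] = σ0² I_T and thus that
\[
\mathrm{Var}[\tilde{\beta}_T] = (Z'X)^{-1}Z'\sigma_0^2 I_T Z(X'Z)^{-1}
= \sigma_0^2 (Z'X)^{-1}Z'I_T Z(X'Z)^{-1}
= \sigma_0^2 \bigl(X'Z(Z'Z)^{-1}Z'X\bigr)^{-1},
\]
where we have also used (ABC)^{-1} = C^{-1}B^{-1}A^{-1}.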
6.(c) Cov[β̃_{T,i}, β̃_{T,j}] = σ0² {(X′Z(Z′Z)^{-1}Z′X)^{-1}}_{i,j}, where {A}_{i,j} denotes the (i, j)th element of any matrix A.
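The unbiasedness and variance results in 6.(a) and 6.(b) can be checked by Monte Carlo in the simplest case of one regressor and one instrument, where the variance formula reduces to σ0² (Σ z²)/(Σ z x)². The sketch below uses a hypothetical fixed design, normal errors, and illustrative values β0 = 1.5 and σ0 = 2; none of these come from the exam.

```python
import random


def iv_monte_carlo(reps=5000, T=40, beta0=1.5, sigma0=2.0, seed=4):
    """Monte Carlo mean and variance of the IV estimator with fixed x and z.

    Returns (mc_mean, mc_var, var_formula).
    """
    # Hypothetical design, held fixed across replications
    # ("fixed in repeated samples").
    z = [1.0 + (t % 5) for t in range(T)]
    x = [2.0 * z[t] + (t % 3) for t in range(T)]
    zx = sum(zt * xt for zt, xt in zip(z, x))
    zz = sum(zt * zt for zt in z)

    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        u = [rng.gauss(0.0, sigma0) for _ in range(T)]
        y = [beta0 * xt + ut for xt, ut in zip(x, u)]
        # Scalar version of (Z'X)^{-1} Z'y.
        draws.append(sum(zt * yt for zt, yt in zip(z, y)) / zx)

    mc_mean = sum(draws) / reps
    mc_var = sum((d - mc_mean) ** 2 for d in draws) / (reps - 1)
    # Scalar version of sigma0^2 (Z'X)^{-1} Z'Z (X'Z)^{-1}.
    var_formula = sigma0 ** 2 * zz / zx ** 2
    return mc_mean, mc_var, var_formula


mc_mean, mc_var, var_formula = iv_monte_carlo()
print("mean of estimates:", mc_mean)
print("Monte Carlo variance:", mc_var, "formula:", var_formula)
```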
6.(d) From (1), with u ∼ Normal and X, Z, β0 constants, β̃_T is a linear combination of normal random variables and so has a normal distribution.
6.(e) No. β̃_T is a linear (in y) unbiased estimator of β0, and under the conditions in the question the OLS estimator is the efficient estimator within the class of linear unbiased estimators.
6.(f) It is not unbiased, because (Z′X)^{-1}Z′u depends on both X and Z, and so we cannot use the Law of Iterated Expectations and E[u|Z] = 0 to deduce that E[(Z′X)^{-1}Z′u] is zero. However, the estimator is consistent, because consistency rests on E[ztut] = 0, which is implied by E[u|Z] = 0. So the estimator is consistent subject to additional regularity conditions for the WLLN.
7.(a) First need E[ut]. We have E[ut] = E[wt] + φE[wt−2] = 0 as E[wt] = 0 (given). Let Ωt,s denote the (t, s) element of Ω. By definition Ωt,t = Var[ut] = E[ut²] (as E[ut] = 0) and Ωt,s = Cov[ut, us] = E[utus] (as E[ut] = 0). Noting that wt ∼ iid(0, σ²) (given), we have: E[wt²] = Var[wt] = σ² and E[wtws] = Cov[wt, ws] = 0 for all s ≠ t. Using these properties we have:
\[
\begin{aligned}
\mathrm{Var}[u_t] &= E\bigl[(w_t + \phi w_{t-2})^2\bigr] \\
&= E\bigl[w_t^2 + \phi^2 w_{t-2}^2 + 2\phi w_t w_{t-2}\bigr] \\
&= E[w_t^2] + \phi^2 E[w_{t-2}^2] + 2\phi E[w_t w_{t-2}] \\
&= \sigma^2 + \phi^2\sigma^2, \quad \text{because } E[w_t w_{t-2}] = 0, \\
&= (1 + \phi^2)\sigma^2 = \gamma_0,
\end{aligned}
\]
and
\[
\begin{aligned}
\mathrm{Cov}[u_t, u_{t-s}] &= E\bigl[(w_t + \phi w_{t-2})(w_{t-s} + \phi w_{t-s-2})\bigr] \\
&= E\bigl[w_t w_{t-s} + \phi w_{t-2} w_{t-s} + \phi w_t w_{t-s-2} + \phi^2 w_{t-2} w_{t-s-2}\bigr] = \gamma_s, \text{ say.}
\end{aligned}
\]
From the previous equation we obtain: γ1 = 0; γ2 = φE[w²_{t−2}] = φσ²; and for s > 2, γs = 0, because Cov[wt, wt−j] = 0 for all j ≠ 0. So Ω is a matrix whose elements are zero apart from: Ωt,t = γ0 for t = 1, 2, …, T, and Ωt,s = γ2 for all (t, s) such that t = j, s = j + 2 for j = 1, 2, …, T − 2, and t = j, s = j − 2 for j = 3, 4, …, T.
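The banded structure of Ω described in 7.(a) can be written down directly. The following sketch builds Ω for given T, φ and σ²; the numerical values used in the example are illustrative assumptions.

```python
def ma_covariance_matrix(T, phi, sigma2):
    """Covariance matrix of u_t = w_t + phi * w_{t-2}, t = 1, ..., T.

    Diagonal entries equal gamma_0 = (1 + phi^2) * sigma2, entries two places
    off the diagonal equal gamma_2 = phi * sigma2, and all others are zero.
    """
    gamma0 = (1.0 + phi ** 2) * sigma2
    gamma2 = phi * sigma2
    omega = [[0.0] * T for _ in range(T)]
    for t in range(T):
        omega[t][t] = gamma0
        if t + 2 < T:
            omega[t][t + 2] = gamma2
            omega[t + 2][t] = gamma2
    return omega


# Illustrative values (assumptions): T = 5, phi = 0.5, sigma^2 = 2.
for row in ma_covariance_matrix(5, 0.5, 2.0):
    print(row)
```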
7.(b)(i) yt−1 is contemporaneously exogenous if E[ut|yt−1] = 0, which would imply via LIE that E[utyt−1] = E[E[ut|yt−1]yt−1] = 0. So if E[utyt−1] ≠ 0 then yt−1 is not contemporaneously exogenous. Using the hint, yt−1 = ψ0ut−1 + ψ1ut−2 + f(ut−3, ut−4, …), we have:
E[utyt−1] = E[ut(ψ0ut−1 + ψ1ut−2 + f(ut−3, ut−4, …))]
= ψ0E[utut−1] + ψ1E[utut−2] + E[utf(ut−3, ut−4, …)].
Consider the terms on the right-hand side:
• Since E[ut] = 0 and γ1 = 0 from part (a), we have E[utut−1] = γ1 = 0.
• Since E[ut] = 0 and γ2 ≠ 0 from part (a), we have E[utut−2] = γ2 ≠ 0.
• Since ut is a function of {wt, wt−2}, {ut−3, ut−4, …} is a function of {wt−3, wt−4, …} and wt ∼ i.i.d., it follows that ut is independent of {ut−3, ut−4, …} and so, using E[ut] = 0 (from part (a)), we have E[utf(ut−3, ut−4, …)] = E[ut]E[f(ut−3, ut−4, …)] = 0.
Therefore, given ψ1 ≠ 0, it follows that E[utyt−1] = ψ1γ2 ≠ 0 and so yt−1 cannot be contemporaneously exogenous in this model.
7.(b)(ii) yt−1 is strictly exogenous if E[ut|yT−1, yT−2, …, y0] = 0. This implies that E[ysut] = 0 for all s = 1, 2, …, T − 1. From part (i), we already know this is not the case, and so it is not strictly exogenous.
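The failure of contemporaneous exogeneity can be illustrated by simulation. The exam's model for yt is not reproduced in these solutions, so the sketch below assumes, purely for illustration, an AR(1) model yt = ρyt−1 + ut with ut = wt + φwt−2 and Gaussian wt. Under that assumption the hint holds with ψ0 = 1 and ψ1 = ρ, so the argument in part (i) gives E[utyt−1] = ρφσ².

```python
import random


def sample_cross_moment(T=100_000, rho=0.5, phi=0.5, sigma=1.0, seed=3):
    """Sample average of u_t * y_{t-1} in the assumed AR(1) model."""
    rng = random.Random(seed)
    w_lag1 = rng.gauss(0.0, sigma)   # w_{t-1}
    w_lag2 = rng.gauss(0.0, sigma)   # w_{t-2}
    y_prev = 0.0                     # start-up value; its effect dies out
    total = 0.0
    for _ in range(T):
        w = rng.gauss(0.0, sigma)
        u = w + phi * w_lag2
        total += u * y_prev
        y_prev = rho * y_prev + u
        # Shift the lags: the old w_{t-1} becomes w_{t-2}, w becomes w_{t-1}.
        w_lag2, w_lag1 = w_lag1, w
    return total / T


rho, phi, sigma = 0.5, 0.5, 1.0
print("simulated E[u_t y_{t-1}]:", sample_cross_moment(rho=rho, phi=phi, sigma=sigma))
print("psi_1 * gamma_2 = rho * phi * sigma^2:", rho * phi * sigma ** 2)
```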
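8.(a) We can write
\[
\hat{\beta}_N = \Bigl(\sum_{i=1}^{N} x_i x_i'\Bigr)^{-1} \sum_{i=1}^{N} x_i y_i
\]
and substituting for yi this yields:
\[
\hat{\beta}_N = \beta_0 + \Bigl(\sum_{i=1}^{N} x_i x_i'\Bigr)^{-1} \sum_{i=1}^{N} x_i u_i.
\]
Therefore, we have:
\[
N^{1/2}(\hat{\beta}_N - \beta_0) = \Bigl(N^{-1}\sum_{i=1}^{N} x_i x_i'\Bigr)^{-1} N^{-1/2}\sum_{i=1}^{N} x_i u_i.
\]
Using the WLLN, we have N^{-1} Σ_{i=1}^N xix′i →p E[xix′i] = Q. Since Q is p.d., we can apply Slutsky's Theorem to deduce:
\[
\Bigl(N^{-1}\sum_{i=1}^{N} x_i x_i'\Bigr)^{-1} \to_p Q^{-1}.
\]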
To use the CLT, we need to evaluate E[xiui] and Ω = lim_{N→∞} Var[N^{-1/2} Σ_{i=1}^N xiui]. Via LIE, we have E[xiui] = E[xiE[ui|xi]] = 0 as E[ui|xi] = 0 (given). Multiplying out, we have:
\[
\mathrm{Var}\Bigl[N^{-1/2}\sum_{i=1}^{N} x_i u_i\Bigr] = N^{-1}\sum_{i=1}^{N}\sum_{j=1}^{N} \mathrm{Cov}[x_i u_i, x_j u_j].
\]
For i = j, Cov[xiui, xjuj] = Var[xiui]. For i ≠ j, Cov[xiui, xjuj] = 0 because (ui, x′2,i)′ ∼ iid implies that xiui and xjuj are independent for i ≠ j. Therefore, ΩN = N^{-1} Σ_{i=1}^N Var[xiui]. To find Var[xiui], use E[xiui] = 0 from above and so via LIE
\[
\mathrm{Var}[x_i u_i] = E[u_i^2 x_i x_i'] = E\bigl[E[u_i^2 | x_i]\, x_i x_i'\bigr] = E[h(x_i) x_i x_i'] = \Omega_h.
\]
Hence, by the CLT, N^{-1/2} Σ_{i=1}^N xiui →d N(0, Ωh). Using Lemma 3.5 from the Lecture Notes (if MT →p M, finite and p.d., and bT →d N(0, V), then MTbT →d N(0, MVM′)), it follows that N^{1/2}(β̂N − β0) →d N(0, Vh), where Vh = Q^{-1}ΩhQ^{-1}.
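The limiting variance can be checked by Monte Carlo in a deliberately simple case. The sketch below assumes a single regressor xi ∼ N(0, 1) with no intercept and ui = xi εi with εi ∼ N(0, 1), so that h(xi) = xi², Q = 1, Ωh = E[xi⁴] = 3 and Q^{-1}ΩhQ^{-1} = 3. These choices are illustrative assumptions and not the exam's model.

```python
import random


def sandwich_check(reps=1500, N=200, beta0=1.0, seed=5):
    """Monte Carlo variance of N^{1/2} (beta_hat - beta0) under heteroskedasticity."""
    rng = random.Random(seed)
    stats = []
    for _rep in range(reps):
        sxx = 0.0
        sxu = 0.0
        for _obs in range(N):
            x = rng.gauss(0.0, 1.0)
            u = x * rng.gauss(0.0, 1.0)   # E[u|x] = 0 and E[u^2|x] = h(x) = x^2
            sxx += x * x
            sxu += x * u
        # OLS written as beta0 + (sum x x')^{-1} sum x u.
        beta_hat = beta0 + sxu / sxx
        stats.append(N ** 0.5 * (beta_hat - beta0))
    mean = sum(stats) / reps
    return sum((s - mean) ** 2 for s in stats) / (reps - 1)


Q = 1.0           # E[x^2] for x ~ N(0, 1)
OMEGA_H = 3.0     # E[h(x) x^2] = E[x^4] = 3
V_H = OMEGA_H / Q ** 2

print("Monte Carlo variance:", sandwich_check())
print("Q^-1 Omega_h Q^-1:", V_H)
```

In this design the homoskedastic formula σ²Q^{-1} would give 1, since E[ui²] = 1, so the simulation also shows how far off the conventional variance can be.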
8.(b)(i) Errors are heteroskedastic but serially uncorrelated. Hence we can perform a t-test with either White or Newey-West standard errors, as both are consistent estimators of the true s.e. in this case. Let s.e.W(·) and s.e.NW(·) denote the White and Newey-West standard errors of the coefficient estimator in the parentheses. We have that
\[
\frac{\hat{\beta}_{0,1} - 0}{\mathrm{s.e.}_{W}(\hat{\beta}_{0,1})}
\quad\text{and}\quad
\frac{\hat{\beta}_{0,1} - 0}{\mathrm{s.e.}_{NW}(\hat{\beta}_{0,1})}
\]
are both asymptotically N(0,1) under H0. Here we perform the test with White standard errors, as they are a more efficient estimate of the true standard errors under heteroskedasticity.
Here β̂0,1 = 0.389 and s.e.W(β̂0,1) = 1.362, so (β̂0,1 − 0)/s.e.W(β̂0,1) = 0.389/1.362 = 0.286. The critical value of the two-sided 5% test is 1.96, and hence we cannot find evidence to reject the null.
8.(b)(ii) Errors are heteroskedastic and serially correlated. Hence we can only construct a valid t-test with Newey-West standard errors, and so use the test statistic
\[
\frac{\hat{\beta}_{0,2} - 0}{\mathrm{s.e.}_{NW}(\hat{\beta}_{0,2})}.
\]
Performing the test in this case, (β̂0,2 − 0)/s.e.NW(β̂0,2) = 0.336/0.321 = 1.05, where the c.v. is 2.58, and hence there is no evidence to reject the null.
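The arithmetic in 8.(b)(i) and 8.(b)(ii) can be reproduced as follows. This is a minimal sketch using the estimates and standard errors quoted above; the helper name `two_sided_t_test` is hypothetical, and the 1% level for part (ii) is inferred from the quoted critical value of 2.58.

```python
from statistics import NormalDist


def two_sided_t_test(estimate, std_error, alpha):
    """Return (statistic, critical value, reject?) for H0: coefficient = 0.

    The critical value comes from the standard normal distribution, which is
    the asymptotic distribution of the statistic under H0.
    """
    stat = estimate / std_error
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return stat, crit, abs(stat) > crit


# Part (i): White standard error, two-sided 5% test.
t1 = two_sided_t_test(0.389, 1.362, 0.05)
# Part (ii): Newey-West standard error; 1% level assumed from the c.v. of 2.58.
t2 = two_sided_t_test(0.336, 0.321, 0.01)

print("part (i): ", t1)
print("part (ii):", t2)
```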
8.(b)(iii) Errors are spherical and hence we can perform a joint test based on the F-statistic with any of the s.e.'s. We would prefer the formula based on OLS s.e.'s as it is more efficient; in any case, given the information provided, this is the only version of the test we can perform.
The test statistic in this case is (R²/2)/((1 − R²)/97), which has an exact F(2, 97) distribution. Here (97 × R²)/(2 × (1 − R²)) = 40.5, where the c.v. is 2.15, and hence there is evidence to reject the null.
9.(a) From notes, the conditional log likelihood function takes the form:
\[
CLLF_N = \sum_{i=1}^{N} l_i(\beta),
\]
where
\[
l_i(\beta) = y_i \ln[\Phi(x_i'\beta)] + (1 - y_i)\ln[1 - \Phi(x_i'\beta)].
\]
In this question xi = 1 and so, specializing to the model here, we have:
\[
\begin{aligned}
LLF_N(\beta) &= \sum_{i=1}^{N} \bigl\{ y_i \ln[\Phi(\beta)] + (1 - y_i)\ln[1 - \Phi(\beta)] \bigr\} \\
&= N_1 \ln[\Phi(\beta)] + (N - N_1)\ln[1 - \Phi(\beta)].
\end{aligned}
\]
The score function is:
\[
s(\beta) = \frac{\partial LLF_N(\beta)}{\partial \beta} = N_1 \frac{\phi(\beta)}{\Phi(\beta)} - (N - N_1)\frac{\phi(\beta)}{1 - \Phi(\beta)}.
\]
The MLE of β0 is obtained by solving the first order condition s(β̂) = 0, which in this case is:
\[
N_1 \frac{\phi(\hat{\beta})}{\Phi(\hat{\beta})} - (N - N_1)\frac{\phi(\hat{\beta})}{1 - \Phi(\hat{\beta})} = 0.
\]
Since φ(β) ≠ 0, the MLE is also the solution to
\[
N_1\bigl(1 - \Phi(\hat{\beta})\bigr) - (N - N_1)\Phi(\hat{\beta}) = 0,
\]
from which it follows that Φ(β̂) = N1/N. Therefore, β̂ = Φ^{-1}(N1/N). Since yi ∈ {0, 1}, it follows that ȳ = Σ_{i=1}^N yi/N = N1/N and so β̂ = Φ^{-1}(ȳ).
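The closed form β̂ = Φ^{-1}(ȳ) can be checked numerically against the score and the log likelihood. The sketch below uses N = 95 and N1 = 59, counts borrowed from the arithmetic in 9.(b)(iii). Treating 59 rather than 36 as the number of ones is an assumption, and it does not affect the maximised log likelihood.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def llf(beta, n, n1):
    """Log likelihood of the intercept-only probit model."""
    p = STD_NORMAL.cdf(beta)
    return n1 * math.log(p) + (n - n1) * math.log(1.0 - p)


def score(beta, n, n1):
    """Derivative of the log likelihood with respect to beta."""
    p = STD_NORMAL.cdf(beta)
    d = STD_NORMAL.pdf(beta)
    return n1 * d / p - (n - n1) * d / (1.0 - p)


def mle(n, n1):
    """Closed-form MLE: the inverse standard normal cdf of the sample mean."""
    return STD_NORMAL.inv_cdf(n1 / n)


N, N1 = 95, 59   # illustrative counts (assumption: N1 = 59)
beta_hat = mle(N, N1)

print("beta_hat:", beta_hat)
print("score at beta_hat:", score(beta_hat, N, N1))
print("log likelihood at beta_hat:", llf(beta_hat, N, N1))
```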
9.(b)(i) Let βp be the coefficient on ptcon in the probit model. The decision rule is to reject H0: βp = 0 in favour of HA: βp ≠ 0 at the 100α% significance level if |β̂p/s.e.(β̂p)| > z_{1−α/2}, where z_{1−α/2} is the 100(1 − α/2)th percentile of the standard normal distribution. Here the test statistic is |−2.4217/1.0982| = 2.2052. From tables, z0.975 = 1.96 and z0.995 = 2.576, so we can reject the null hypothesis at the 5% but not the 1% level.
9.(b)(ii) Let βli be the coefficient on loginc in the probit model. The marginal response is βliφ(x′iβ), where φ(v) is the pdf of the standard normal, and so depends on xi. However, the sign of the estimate gives the sign of the marginal response, and so the marginal response is positive here. The response to income has the same sign, as log is a monotonic increasing transformation. To test whether the effect is different from zero, the decision rule is to reject H0: βli = 0 in favour of HA: βli ≠ 0 at the 100α% significance level if |β̂li/s.e.(β̂li)| > z_{1−α/2}, where z_{1−α/2} is the 100(1 − α/2)th percentile of the standard normal distribution. From the output, the test statistic equals |2.434/0.821| = 2.96 and so, using the critical values given in part (b)(i), we can reject at the 1% significance level the null that loginc has no effect on the probability of approval.
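The two z-tests in 9.(b)(i) and 9.(b)(ii) can also be summarised through their two-sided p-values. The following sketch uses the coefficient estimates and standard errors quoted above; the helper name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(estimate, std_error):
    """Return (z statistic, two-sided p-value) under the N(0, 1) null distribution."""
    z = estimate / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


z_ptcon, p_ptcon = two_sided_p_value(-2.4217, 1.0982)
z_loginc, p_loginc = two_sided_p_value(2.434, 0.821)

print("ptcon: ", z_ptcon, p_ptcon)     # significant at 5% but not at 1%
print("loginc:", z_loginc, p_loginc)   # significant at 1%
```

A p-value between 0.01 and 0.05 corresponds to rejecting at the 5% level but not the 1% level, matching the comparison with z0.975 and z0.995 in part (b)(i).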
9.(b)(iii) LR test: LR = 2{LLF(β̂U) − LLF(β̂R)}, where β̂U is the (unrestricted) MLE, β̂R is the (restricted) MLE subject to Rβ0 = r, and LLF is the log likelihood function. Here R equals rows 2 to 9 of I9 and r is an 8 × 1 vector of zeros. The decision rule is to reject at the 100α% significance level if LR > c_{1−α}(df), where c_{1−α}(df) is the 100(1 − α)th percentile of the χ²_df distribution and df equals the number of restrictions. From the output, LLF(β̂U) = −52.84. Using part (a), LLF(β̂R) = 59 × ln(59/95) + 36 × ln(36/95) = −63.04. Therefore LR = 20.39 and, as c0.99(8) = 20.09, we can reject H0 at the 1% significance level.
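The likelihood ratio calculation can be reproduced in a few lines. This sketch takes the unrestricted log likelihood and the χ² critical value as given in the solution above, and computes the restricted log likelihood from the intercept-only formula in part (a).

```python
import math


def restricted_llf(n1, n0):
    """Maximised log likelihood of the intercept-only model with n1 ones and n0 zeros."""
    n = n1 + n0
    return n1 * math.log(n1 / n) + n0 * math.log(n0 / n)


LLF_U = -52.84           # unrestricted log likelihood, as reported in the output
CRIT_1PCT_DF8 = 20.09    # chi-squared(8) 1% critical value, as quoted above

llf_r = restricted_llf(59, 36)
lr = 2.0 * (LLF_U - llf_r)
reject = lr > CRIT_1PCT_DF8

print("restricted LLF:", round(llf_r, 2))
print("LR statistic:", round(lr, 2))
print("reject H0 at 1%:", reject)
```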