ECON 61001: Lecture 10
Alastair R. Hall
The University of Manchester
Alastair R. Hall ECON 61001: Lecture 10 1 / 17
Overview of course
During the course we have considered inference in:
Linear regression model
Binary response models
Developed large sample inference procedures based on assumptions about how the data are generated.
Overview of course
Linear regression model
Spherical errors (& $E[x_t u_t] = 0$): OLS – efficient (CS/TS)
Non-spherical errors (& $E[x_t u_t] = 0$): OLS – inefficient
CS: heteroscedasticity robust inference (“White’s se’s”)
TS: serial correlation robust inference (“Newey-West se’s”)
GLS
Efficient but need model for Σ – CS/TS
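The GLS point can be illustrated with a minimal sketch, assuming Σ is diagonal with known variances. The variance function `exp(x_t)`, the parameter values and all variable names are illustrative assumptions, not taken from the lecture. It shows GLS coinciding with OLS on data rescaled by 1/σ_t (that is, WLS).

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
beta0 = np.array([1.0, 2.0])           # illustrative true parameters

# Assumed (known) heteroscedastic error variances: Sigma = diag(sigma2_t)
sigma2_t = np.exp(X[:, 1])
u = np.sqrt(sigma2_t) * rng.standard_normal(T)
y = X @ beta0 + u

# GLS: (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y
Xw = X / sigma2_t[:, None]             # rows of Sigma^{-1} X
beta_gls = np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Equivalent WLS: OLS after scaling each observation by 1/sigma_t
s = 1.0 / np.sqrt(sigma2_t)
beta_wls, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)

print(beta_gls, beta_wls)
```

In practice Σ is unknown, which is why a model for Σ is needed before GLS can be applied.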
Overview of course
Linear regression model
$E[x_t u_t] \neq 0$ → IV
CS: heteroscedasticity robust inference
TS: serial correlation robust inference
Binary response model
LPM – OLS with heteroscedasticity robust inference
Logit/Probit – MLE
Common structure – OLS, GLS, WLS & IV
Consistency:
$\hat{\beta} = \beta_0 + M_T h_T$
$M_T$ is a random matrix: WLLN + Slutsky ⇒ $M_T \xrightarrow{p} M$, a finite constant
$h_T$ is a random vector: WLLN ⇒ $h_T \xrightarrow{p} 0$
Using Slutsky’s Theorem,
$\hat{\beta} \xrightarrow{p} \beta_0$
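As a concrete instance of this decomposition, OLS can be written in exactly this form. The sketch below assumes the linear model $y_t = x_t'\beta_0 + u_t$; the symbol $y_t$ is notation introduced here, not taken from the slide.

```latex
% OLS in the (assumed) model y_t = x_t' \beta_0 + u_t
\hat{\beta}
  = \Big( \sum_{t=1}^{T} x_t x_t' \Big)^{-1} \sum_{t=1}^{T} x_t y_t
  = \beta_0
    + \underbrace{\Big( T^{-1} \sum_{t=1}^{T} x_t x_t' \Big)^{-1}}_{M_T}
      \; \underbrace{T^{-1} \sum_{t=1}^{T} x_t u_t}_{h_T}
```

Here $h_T \xrightarrow{p} 0$ is the sample counterpart of $E[x_t u_t] = 0$.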
Common structure – OLS, GLS, WLS & IV
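$T^{1/2}(\hat{\beta} - \beta_0) = M_T n_T$
$M_T$ is a random matrix: WLLN + Slutsky ⇒ $M_T \xrightarrow{p} M$, a finite constant
$n_T = T^{1/2} h_T$ is a random vector: CLT ⇒ $n_T \xrightarrow{d} N(0, \Omega)$, where $\Omega$ is pd, constant
⇒ $T^{1/2}(\hat{\beta} - \beta_0) \xrightarrow{d} N(0, V_\beta)$ where $V_\beta = M \Omega M'$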
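The limiting distribution can be checked informally by simulation. The following is a minimal Monte Carlo sketch under assumed conditions that are not from the lecture: a single regressor with no intercept, x_t and u_t independent standard normal, so that M = 1, Ω = 1 and hence V_β = 1. Sample size, replication count and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, R = 500, 2000          # sample size and number of replications (arbitrary)
beta0 = 1.5               # illustrative true parameter

# Each row is one simulated sample of length T
x = rng.standard_normal((R, T))
u = rng.standard_normal((R, T))
y = beta0 * x + u

# OLS estimate in each replication: sum(x*y) / sum(x*x)
beta_hat = (x * y).sum(axis=1) / (x * x).sum(axis=1)

# Scaled estimation error T^{1/2} (beta_hat - beta0)
z = np.sqrt(T) * (beta_hat - beta0)

# Under the assumed design, M = 1/E[x^2] = 1 and Omega = E[x^2 u^2] = 1
V_beta = 1.0
print("mean:", z.mean(), "variance:", z.var(), "theory:", V_beta)
```

The simulated mean and variance of z should be close to 0 and V_β respectively, up to simulation noise.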
Common structure – OLS, GLS, WLS & IV
For inference we need a consistent estimator of $V_\beta$: use $\hat{V}_\beta = M_T \hat{\Omega} M_T'$. The form of $\hat{\Omega}$ depends on the assumptions about the data:
use knowledge of Σ, i.e. conventional OLS se’s or GLS
unknown form of heteroscedasticity → White-type estimator
unknown form of serial correlation → HAC estimator
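To make the sandwich form concrete, here is a minimal sketch for OLS comparing conventional standard errors with a White-type (HC0) version. The function name `ols_with_ses`, the simulated design and the choice of the HC0 variant are assumptions made for illustration; the lecture does not specify which White-type variant is used.

```python
import numpy as np

def ols_with_ses(X, y):
    """OLS estimates with conventional and White-type (HC0) standard errors."""
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat

    # Conventional: s^2 (X'X)^{-1}, valid under spherical errors
    s2 = resid @ resid / (T - k)
    V_conv = s2 * XtX_inv

    # White-type: (X'X)^{-1} [sum_t e_t^2 x_t x_t'] (X'X)^{-1},
    # i.e. M_T Omega_hat M_T' / T with M_T = (X'X/T)^{-1}
    meat = (X * resid[:, None] ** 2).T @ X
    V_white = XtX_inv @ meat @ XtX_inv

    return beta_hat, np.sqrt(np.diag(V_conv)), np.sqrt(np.diag(V_white))

# Illustrative heteroscedastic data: error scale depends on |x_t|
rng = np.random.default_rng(0)
T = 500
x = rng.standard_normal(T)
X = np.column_stack([np.ones(T), x])
u = np.abs(x) * rng.standard_normal(T)
y = X @ np.array([1.0, 2.0]) + u

beta_hat, se_conv, se_white = ols_with_ses(X, y)
print(beta_hat, se_conv, se_white)
```

With this design the error variance rises with |x_t|, so the White-type standard error on the slope would typically differ noticeably from the conventional one. A HAC estimator would add weighted autocovariance terms to `meat`; that is not shown here.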
And binary response
Logit/probit models estimated via MLE.
Score equations (FOC) cannot be solved to obtain an explicit formula for $\hat{\beta}$ as a function of the data, so the proof strategy for consistency differs from that for OLS/GLS/IV.
But given $\hat{\beta} \xrightarrow{p} \beta_0$, a first-order Taylor series argument applied to the score equations shows that
$T^{1/2}(\hat{\beta} - \beta_0) = M_T n_T + \xi_T$, with large sample behaviour determined by $M_T n_T$,
where
$M_T$ is a random matrix: WLLN + Slutsky ⇒ $M_T \xrightarrow{p} M$, constant
$n_T$ is a random vector: CLT ⇒ $n_T \xrightarrow{d} N(0, \Omega)$, where $\Omega$ is pd, constant
And binary response
So similar arguments to OLS/GLS/IV
⇒ $T^{1/2}(\hat{\beta} - \beta_0) \xrightarrow{d} N(0, V_{ML})$,
where
$V_{ML} = M \Omega M' = f(\text{Information matrix})$
This generic structure,
$T^{1/2}(\hat{\beta} - \beta_0) = M_T n_T + \xi_T$,
holds in many nonlinear models.
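Because the score equations have no closed-form solution, the MLE is computed numerically. The sketch below uses Newton-Raphson for a logit model; the function name `logit_mle`, the simulated data and the use of the inverse information matrix for the standard errors are illustrative assumptions, not the lecture's own implementation.

```python
import numpy as np

def logit_mle(X, y, tol=1e-10, max_iter=50):
    """Logit MLE by Newton-Raphson; returns estimates and standard errors."""
    T, k = X.shape
    beta = np.zeros(k)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                          # gradient of log-likelihood
        hessian = -(X * (p * (1 - p))[:, None]).T @ X  # second derivative matrix
        step = np.linalg.solve(hessian, score)
        beta = beta - step                             # Newton update
        if np.max(np.abs(step)) < tol:
            break
    # Information matrix evaluated at the estimate
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = (X * (p * (1 - p))[:, None]).T @ X
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    return beta, se

# Illustrative data from a logit model with assumed true parameters
rng = np.random.default_rng(0)
T = 1000
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
beta0 = np.array([0.5, -1.0])
prob = 1.0 / (1.0 + np.exp(-X @ beta0))
y = (rng.random(T) < prob).astype(float)

beta_hat, se = logit_mle(X, y)

# The score equations should hold (numerically) at the estimate
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
score_at_hat = X.T @ (y - p_hat)
print(beta_hat, se, score_at_hat)
```

The final check confirms that the estimate solves the score equations to numerical precision, even though no explicit formula for it exists.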
Estimation based on population moment conditions
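OLS based on $E[x_t u_t(\beta_0)] = 0$
IV based on $E[z_t u_t(\beta_0)] = 0$
MLE solves score equations
$\left. \frac{\partial LLF_T(\theta)}{\partial \theta} \right|_{\theta = \hat{\theta}} = 0$
– if data are iid then
$\frac{\partial LLF_T(\theta)}{\partial \theta} = \sum_{t=1}^{T} \frac{\partial \ln[p(v_t, \theta)]}{\partial \theta}$
So MLE is MoM based on
$E\left[ \left. \frac{\partial \ln[p(v_t, \theta)]}{\partial \theta} \right|_{\theta = \theta_0} \right] = 0$
(see Lecture Notes Ch 6.4)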
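As a worked instance of the score as a moment condition, consider the logit model. This sketch treats $p$ as the conditional probability of $y_t$ given $x_t$ and writes $\Lambda(\cdot)$ for the logistic cdf; the symbols $y_t$ and $\Lambda$ are notation assumed here, not taken from the slide.

```latex
% Logit: P(y_t = 1 | x_t) = \Lambda(x_t'\beta), with \Lambda(a) = 1/(1 + e^{-a})
\ln p(v_t, \beta)
  = y_t \ln \Lambda(x_t'\beta) + (1 - y_t) \ln\big[ 1 - \Lambda(x_t'\beta) \big]

\frac{\partial \ln p(v_t, \beta)}{\partial \beta}
  = \big[ y_t - \Lambda(x_t'\beta) \big] x_t

% Population moment condition implied by the score
E\Big[ \big( y_t - \Lambda(x_t'\beta_0) \big) x_t \Big] = 0
```

This has the same form as the OLS condition $E[x_t u_t(\beta_0)] = 0$, with $y_t - \Lambda(x_t'\beta_0)$ playing the role of the error.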
Generalized Method of Moments
So OLS, IV and MLE can all be interpreted as estimation based on the information in a population moment condition.
These are all examples of a more general approach to estimation called Generalized Method of Moments (GMM).
GMM provides a method to translate the information about $\theta_0$, a $p \times 1$ vector of parameters, in the Population Moment Condition
$E[f(v_t, \theta_0)] = 0$
into an estimator of $\theta_0$.
Generalized Method of Moments
Hansen (1982) defines the GMM estimator as:
$\hat{\theta}_{GMM} = \arg\min_{\theta \in \Theta} Q_T(\theta)$
where
$Q_T(\theta) = \left[ T^{-1} \sum_{t=1}^{T} f(v_t, \theta) \right]' W_T \left[ T^{-1} \sum_{t=1}^{T} f(v_t, \theta) \right]$,
$W_T$ is known as the weighting matrix and is chosen to satisfy:
$W_T$ is positive semi-definite (psd),
$W_T \xrightarrow{p} W$, a pd matrix of constants.
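The definition can be sketched in code for the special case of linear IV moments, $f(v_t, \theta) = z_t (y_t - x_t'\theta)$, where the minimiser of $Q_T$ has a closed form. The function names, the simulated over-identified design (two instruments, one parameter) and the choice $W_T = (Z'Z/T)^{-1}$ are assumptions for illustration only.

```python
import numpy as np

def sample_moment(theta, y, X, Z):
    """T^{-1} sum_t z_t (y_t - x_t' theta): the sample moment vector."""
    return Z.T @ (y - X @ theta) / len(y)

def Q_T(theta, y, X, Z, W):
    """GMM objective: quadratic form in the sample moment."""
    g = sample_moment(theta, y, X, Z)
    return g @ W @ g

def linear_gmm(y, X, Z, W):
    """Closed-form minimiser of Q_T when the moment is linear in theta."""
    A = X.T @ Z @ W @ Z.T @ X
    b = X.T @ Z @ W @ Z.T @ y
    return np.linalg.solve(A, b)

# Illustrative data: q = 2 instruments, p = 1 endogenous regressor
rng = np.random.default_rng(1)
T = 1000
Z = rng.standard_normal((T, 2))
v = rng.standard_normal(T)
x = Z @ np.array([1.0, 1.0]) + v
u = 0.5 * v + rng.standard_normal(T)   # correlated with x through v
y = 2.0 * x + u
X = x[:, None]

W = np.linalg.inv(Z.T @ Z / T)          # one possible pd weighting matrix
theta_hat = linear_gmm(y, X, Z, W)
q_min = Q_T(theta_hat, y, X, Z, W)
print(theta_hat, q_min)
```

For moments that are nonlinear in θ, `linear_gmm` would be replaced by a numerical minimiser applied to `Q_T`.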
Generalized Method of Moments
Note:
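$W_T$ is positive semi-definite (psd) ⇒
$Q_T(\theta) \geq 0$
$Q_T(\hat{\theta}_{GMM}) = 0$ if $T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_{GMM}) = 0$
$W_T \xrightarrow{p} W$ (pd) ⇒
$Q_T(\hat{\theta}_{GMM}) = 0$ iff $T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_{GMM}) = 0$, in the limit as $T \to \infty$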
Comparison of GMM to Method of Moments (MM)
$\hat{\theta}_{MM}$ is the solution to $T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_{MM}) = 0$.
MM only works in general if the number of moments, q say, equals the number of parameters, p: if q > p then, due to sampling variation, the sample moment equations generally have no solution (even though the moment condition holds in the population at $\theta_0$).
GMM works if q ≥ p, for if:
q = p then $\hat{\theta}_{GMM} = \hat{\theta}_{MM}$.
q > p then $\hat{\theta}_{GMM}$ is the value of θ that is closest to solving the sample moment conditions.
This is the sense in which GMM generalizes MM.
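The q = p case can be checked numerically. In the sketch below, which assumes linear IV moments with an intercept so that q = p = 2, the GMM estimate is the same for two different weighting matrices and equals the MM solution. The data design, the function name `linear_gmm` and the two weighting matrices are illustrative assumptions.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Minimiser of the GMM objective for moments z_t (y_t - x_t' theta)."""
    A = X.T @ Z @ W @ Z.T @ X
    b = X.T @ Z @ W @ Z.T @ y
    return np.linalg.solve(A, b)

# Illustrative just-identified design: q = p = 2
rng = np.random.default_rng(2)
T = 500
z = rng.standard_normal(T)
v = rng.standard_normal(T)
x = z + v
y = 1.0 + 2.0 * x + 0.5 * v + rng.standard_normal(T)
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z])

# MM: solve the q = p sample moment equations Z'(y - X theta) = 0 exactly
theta_mm = np.linalg.solve(Z.T @ X, Z.T @ y)

# GMM with two different pd weighting matrices
theta_w1 = linear_gmm(y, X, Z, np.eye(2))
theta_w2 = linear_gmm(y, X, Z, np.array([[2.0, 0.5], [0.5, 1.0]]))

print(theta_mm, theta_w1, theta_w2)
```

When q > p the sample moments cannot all be set to zero, and the estimate then depends on the weighting matrix.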
GMM
It can be shown that under certain conditions:
$\hat{\theta}_{GMM} \xrightarrow{p} \theta_0$
$T^{1/2}(\hat{\theta}_{GMM} - \theta_0) \xrightarrow{d} N(0, V_{GMM})$ (see next slide)
There is a large array of GMM-based inference procedures available and the method is widely applied in empirical analysis.
GMM
By manipulating the FOC of GMM estimation, it can be shown that
$T^{1/2}(\hat{\theta}_{GMM} - \theta_0) = M_T n_T + \xi_T$, with large sample behaviour determined by $M_T n_T$,
where
$M_T$ is a random matrix: WLLN + Slutsky ⇒ $M_T \xrightarrow{p} M$, constant
$n_T$ is a random vector: CLT ⇒ $n_T \xrightarrow{d} N(0, \Omega)$, where $\Omega$ is pd, constant
So $V_{GMM} = M \Omega M'$, where $M$ depends on the weighting matrix and the Jacobian matrix (the derivative of the sample moment wrt θ), and $\Omega$ is the (LR) variance of $f(v_t, \theta_0)$.
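For reference, the standard textbook forms of these objects can be written out as follows. The symbols $G_T$ and $G$ for the sample and population Jacobians are notation introduced here, not taken from the slides, and the expressions hold under the usual regularity conditions.

```latex
% Jacobian of the sample moment and its probability limit (assumed notation)
G_T(\theta) = T^{-1} \sum_{t=1}^{T} \frac{\partial f(v_t, \theta)}{\partial \theta'},
\qquad
G = E\left[ \frac{\partial f(v_t, \theta_0)}{\partial \theta'} \right]

% Components of the expansion
n_T = T^{-1/2} \sum_{t=1}^{T} f(v_t, \theta_0),
\qquad
M = -\,(G' W G)^{-1} G' W

% Asymptotic variance
V_{GMM} = M \Omega M' = (G' W G)^{-1} G' W \,\Omega\, W G \,(G' W G)^{-1}
```

When q = p and $G$ is nonsingular, this reduces to $G^{-1} \Omega (G')^{-1}$, which does not depend on $W$.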
Further reading on GMM
Material on GMM is non-examinable but FYI: see Hall (1993,2015) papers on BB
Greene: Chapters 13.4-13.6