ECONOMETRICS I ECON GR5411
Lecture 11 – Omitted Variables, Latent Variables, Conditional Mean Independence
by Seyhan Erden, Columbia University MA in Economics
Omitted Variables in Regression
True model:
$$y = X\beta + Z\gamma + u \qquad (1)$$
Fitting the wrong model:
$$y = X\beta + \varepsilon \qquad (2)$$
Assume $E[u \mid X, Z] = 0$ and that $Z$ belongs in (1). (This is one of the conditions for an omitted variable to "cause" bias: if $Z$ is part of the error when omitted, then $Z$ must be a determinant of the dependent variable $y$.)
From (2) we get
$$\hat{\beta} = (X'X)^{-1}X'y$$
But the true $y$ is given in (1), so
$$\hat{\beta} = (X'X)^{-1}X'(X\beta + Z\gamma + u) = \beta + (X'X)^{-1}X'(Z\gamma + u) = \beta + (X'X)^{-1}X'Z\gamma + (X'X)^{-1}X'u$$
Taking conditional expectations,
$$E[\hat{\beta} \mid X, Z] = \beta + (X'X)^{-1}X'Z\gamma + (X'X)^{-1}X'E[u \mid X, Z]$$
so $E[\hat{\beta} \mid X, Z] \neq \beta$ unless $X'Z = 0$ (this is the other condition for an omitted variable to "cause" bias). Note that the last term is zero only if $u$ does not contain $Z$.
Thus, $\hat{\beta}$ is biased.
Similarly, in large samples
$$\hat{\beta} \xrightarrow{p} \beta + Q_{XX}^{-1}Q_{XZ}\gamma + Q_{XX}^{-1}\cdot 0 = \beta + Q_{XX}^{-1}Q_{XZ}\gamma$$
Thus, $\hat{\beta}$ is not consistent unless $Q_{XZ} = 0$ or $\gamma = 0$.
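To see the inconsistency numerically, here is a minimal simulation sketch (the design, coefficient values, and variable names are illustrative assumptions, not part of the derivation): data are generated from the true model (1) with $X$ and $Z$ correlated, the short regression (2) is fit, and $\hat{\beta}$ lands near $\beta + Q_{XX}^{-1}Q_{XZ}\gamma$ rather than $\beta$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                      # large n to approximate the probability limit
beta, gamma = 1.0, 2.0           # illustrative true coefficients

x = rng.normal(size=n)           # E[x_i^2] = Q_XX = 1
z = 0.5 * x + rng.normal(size=n) # E[x_i z_i] = Q_XZ = 0.5, so X'Z != 0
u = rng.normal(size=n)
y = beta * x + gamma * z + u     # true model (1)

# Short regression (2): y on x alone (no intercept needed, means are zero)
beta_hat = (x @ y) / (x @ x)

print(f"beta_hat = {beta_hat:.3f}")                    # ~ 2.0, not 1.0
print(f"plim     = {beta + (0.5 / 1.0) * gamma:.3f}")  # beta + Q_XX^{-1} Q_XZ gamma
```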
When the true model is $y = X\beta + Z\gamma + u$ but we mistakenly fit $y = X\beta + \varepsilon$, what happens to the efficiency of $\hat{\beta}$?
For the long regression,
$$Var(\hat{\beta} \mid X, Z) = \sigma^2\left(X'X - X'Z(Z'Z)^{-1}Z'X\right)^{-1}$$
Fact:
$$Var(\hat{\beta} \mid X)^{-1} - Var(\hat{\beta} \mid X, Z)^{-1} = \sigma^{-2}X'X - \sigma^{-2}\left(X'X - X'Z(Z'Z)^{-1}Z'X\right) = \sigma^{-2}X'Z(Z'Z)^{-1}Z'X \geq 0$$
(in the positive semidefinite sense), so
$$Var(\hat{\beta} \mid X) \leq Var(\hat{\beta} \mid X, Z)$$
Although $\hat{\beta}$ from the short regression is biased and inconsistent, it may mean-square dominate the estimator from the long regression: adding $Z$ will never make the conditional variance of $\hat{\beta}$ smaller.
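The variance ordering is easy to verify numerically. A sketch with made-up data (the design below is an illustrative assumption): compute both conditional variance matrices and check that their difference is positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 500, 1.0
X = rng.normal(size=(n, 2))
Z = 0.6 * X[:, :1] + rng.normal(size=(n, 1))   # Z correlated with X, so X'Z != 0

# Var(beta_hat | X): short regression, Z omitted
var_short = sigma2 * np.linalg.inv(X.T @ X)

# Var(beta_hat | X, Z): long regression, via the partitioned-regression formula
XZ = X.T @ Z
var_long = sigma2 * np.linalg.inv(X.T @ X - XZ @ np.linalg.inv(Z.T @ Z) @ XZ.T)

# The difference is positive semidefinite: all eigenvalues (numerically) >= 0
eigvals = np.linalg.eigvalsh(var_long - var_short)
print(np.all(eigvals >= -1e-12))   # True: adding Z never shrinks Var(beta_hat)
```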
Latent variables:
Let
$$y_i^* = x_i'\beta + \varepsilon_i$$
and $E[x_i\varepsilon_i] = 0$, but we do not observe $y_i^*$; we observe $y_i$ instead. Hence, $y_i = y_i^* + u_i$, where $u_i$ is the measurement error satisfying
$$E[x_iu_i] = 0 \qquad (1)$$
$$E[y_i^*u_i] = 0 \qquad (2)$$
Let $\hat{\beta}$ be the OLS coefficient from the regression of $y_i$ on $x_i$. Is $\hat{\beta}$ unbiased? Consistent? What is $\hat{\beta}$'s asymptotic distribution?
Yes, $\hat{\beta}$ is unbiased:
$$\hat{\beta} = \left(\sum_{i=1}^{n}x_ix_i'\right)^{-1}\sum_{i=1}^{n}x_iy_i = \left(\sum_{i=1}^{n}x_ix_i'\right)^{-1}\sum_{i=1}^{n}x_i\left(y_i^* + u_i\right) = \left(\sum_{i=1}^{n}x_ix_i'\right)^{-1}\sum_{i=1}^{n}x_i\left(x_i'\beta + \varepsilon_i + u_i\right)$$
$$E[\hat{\beta} \mid x_i] = \beta$$
due to the assumptions $E[x_iu_i] = 0$ and $E[x_i\varepsilon_i] = 0$.
Yes, $\hat{\beta}$ is consistent:
$$\hat{\beta} = \left(\frac{1}{n}\sum_{i=1}^{n}x_ix_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}x_i\left(x_i'\beta + \varepsilon_i + u_i\right) = \beta + \left(\frac{1}{n}\sum_{i=1}^{n}x_ix_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}x_i\varepsilon_i + \left(\frac{1}{n}\sum_{i=1}^{n}x_ix_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}x_iu_i$$
We know that
$$\left(\frac{1}{n}\sum_{i=1}^{n}x_ix_i'\right)^{-1} \xrightarrow{p} \left(E[x_ix_i']\right)^{-1} = Q_{XX}^{-1}$$
and
$$\frac{1}{n}\sum_{i=1}^{n}x_i\varepsilon_i \xrightarrow{p} E[x_i\varepsilon_i] = 0$$
and
$$\frac{1}{n}\sum_{i=1}^{n}x_iu_i \xrightarrow{p} E[x_iu_i] = 0$$
Hence,
$$\hat{\beta} \xrightarrow{p} \beta + Q_{XX}^{-1}\cdot 0 + Q_{XX}^{-1}\cdot 0 = \beta$$
Thus, $\hat{\beta}$ is consistent.
Now, the asymptotic distribution of $\hat{\beta}$:
$$\sqrt{n}\left(\hat{\beta} - \beta\right) = \left(\frac{1}{n}\sum_{i=1}^{n}x_ix_i'\right)^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}x_i\left(\varepsilon_i + u_i\right)$$
Assuming observations are i.i.d.: by the weak LLN the first term converges in probability to $Q_{XX}^{-1}$.
For the second term, if we have
$$\Omega = E\left[x_ix_i'\left(\varepsilon_i + u_i\right)^2\right] < \infty$$
for each element, we can apply the CLT and conclude that this term converges in distribution to $N(0, \Omega)$.
Thus, by Slutsky's theorem,
$$\sqrt{n}\left(\hat{\beta} - \beta\right) \xrightarrow{d} N\left(0,\; Q_{XX}^{-1}\Omega Q_{XX}^{-1}\right)$$
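Because the measurement error sits in the dependent variable and is uncorrelated with $x_i$, OLS stays consistent; the only cost is a larger composite error, $\varepsilon_i + u_i$ in place of $\varepsilon_i$, which inflates $\Omega$ in the sandwich variance. A minimal simulation sketch (variable names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 50_000, 1.5
x = rng.normal(size=n)
eps = rng.normal(size=n)               # structural error, E[x_i eps_i] = 0
u = rng.normal(scale=2.0, size=n)      # measurement error in y, E[x_i u_i] = 0

y_star = beta * x + eps                # latent outcome y*
y = y_star + u                         # observed outcome

b_latent = (x @ y_star) / (x @ x)      # infeasible regression on y*
b_obs = (x @ y) / (x @ x)              # feasible regression on observed y

# Both are close to beta; the observed-y estimate is just noisier, reflecting
# the larger Omega = E[x_i x_i' (eps_i + u_i)^2] in the sandwich variance.
print(f"latent: {b_latent:.3f}, observed: {b_obs:.3f}, true: {beta}")
```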
Conditional Mean Independence:
$$E[u \mid X, W] = E[u \mid W]$$
that is, for each observation, $E[u_i \mid x_i, w_i] = E[u_i \mid w_i] \neq 0$ in general.
Consider the multiple regression form
$$y = X\beta + W\gamma + u$$
where $X$ and $W$ are, respectively, $n\times k_1$ and $n\times k_2$ matrices of regressors. Let $x_i'$ and $w_i'$ denote the $i$th rows of $X$ and $W$.
Assumptions:
1. $E[u_i \mid x_i, w_i] = w_i'\delta$, where $\delta$ is a $k_2\times 1$ vector of unknown parameters.
Assumptions (cont'd):
2. $(x_i, w_i, y_i)$ are i.i.d.
3. $(x_i, w_i, u_i)$ have four finite nonzero moments.
4. There is no perfect multicollinearity.
Basically, the only new assumption here is the first one, namely the conditional mean independence assumption replacing the exogeneity (conditional mean zero) assumption.
Under conditional mean independence, we will show that the LS estimator of $\beta$, $\hat{\beta}$, is still consistent, but $\hat{\gamma}$ is not.
Using the FWL theorem, we can write $\hat{\beta}$ as
$$\hat{\beta} = (X'M_WX)^{-1}X'M_Wy = (X'M_WX)^{-1}X'M_W(X\beta + W\gamma + u) = \beta + (X'M_WX)^{-1}X'M_WW\gamma + (X'M_WX)^{-1}X'M_Wu = \beta + (X'M_WX)^{-1}X'M_Wu$$
due to the orthogonality $M_WW = 0$.
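As a quick check of the FWL step (a sketch with an illustrative data-generating process): the coefficients on $X$ from the full regression of $y$ on $(X, W)$ coincide with those from regressing $M_Wy$ on $M_WX$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
X = rng.normal(size=(n, 2))
W = rng.normal(size=(n, 1))
y = X @ np.array([1.0, -0.5]) + W @ np.array([2.0]) + rng.normal(size=n)

# Full regression: y on (X, W) jointly
coef_full, *_ = np.linalg.lstsq(np.hstack([X, W]), y, rcond=None)

# FWL: annihilate W from y and X with M_W, then regress residuals on residuals
M_W = np.eye(n) - W @ np.linalg.inv(W.T @ W) @ W.T
coef_fwl, *_ = np.linalg.lstsq(M_W @ X, M_W @ y, rcond=None)

print(np.allclose(coef_full[:2], coef_fwl))   # True: identical beta_hat
```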
Using $M_W = I - P_W$ and $P_W = W(W'W)^{-1}W'$, we can get
$$n^{-1}X'M_WX = n^{-1}X'(I - P_W)X = n^{-1}X'X - n^{-1}X'P_WX$$
$$= n^{-1}X'X - \left(n^{-1}X'W\right)\left(n^{-1}W'W\right)^{-1}\left(n^{-1}W'X\right)$$
First consider
$$n^{-1}X'X = \frac{1}{n}\sum_{i=1}^{n}x_ix_i'$$
This convergence* holds for every element of $n^{-1}X'X$, so
$$n^{-1}X'X = \frac{1}{n}\sum_{i=1}^{n}x_ix_i' \xrightarrow{p} E[x_ix_i'] = Q_{XX}$$
* The next slide explains this.
* You are not responsible for this detail, but for a complete proof I am adding this slide:
The $(j, l)$ element of this matrix is
$$\frac{1}{n}\sum_{i=1}^{n}x_{ji}x_{li}$$
By assumption 3 above, each element of $x_i$ has four moments, so by the Cauchy–Schwarz inequality $x_{ji}x_{li}$ has two moments:
$$E\left[x_{ji}^2x_{li}^2\right] \leq \sqrt{E\left[x_{ji}^4\right]E\left[x_{li}^4\right]} < \infty$$
Because $x_{ji}x_{li}$ is i.i.d. with two moments, $\frac{1}{n}\sum_{i=1}^{n}x_{ji}x_{li}$ obeys the LLN:
$$\frac{1}{n}\sum_{i=1}^{n}x_{ji}x_{li} \xrightarrow{p} E\left[x_{ji}x_{li}\right]$$
Applying the same reasoning and using assumptions 2 and 3,
$$n^{-1}W'W = \frac{1}{n}\sum_{i=1}^{n}w_iw_i' \xrightarrow{p} E[w_iw_i'] = Q_{WW}$$
$$n^{-1}X'W = \frac{1}{n}\sum_{i=1}^{n}x_iw_i' \xrightarrow{p} E[x_iw_i'] = Q_{XW}$$
$$n^{-1}W'X = \frac{1}{n}\sum_{i=1}^{n}w_ix_i' \xrightarrow{p} E[w_ix_i'] = Q_{WX}$$
From assumption 3, we know $Q_{XX}$, $Q_{WW}$, $Q_{XW}$ and $Q_{WX}$ are all finite and nonzero, so Slutsky's theorem implies
$$n^{-1}X'X - \left(n^{-1}X'W\right)\left(n^{-1}W'W\right)^{-1}\left(n^{-1}W'X\right) \xrightarrow{p} Q_{XX} - Q_{XW}Q_{WW}^{-1}Q_{WX}$$
which is finite and invertible. The conditional expectation
$$E[U \mid X, W] = \begin{pmatrix} E[u_1 \mid X, W] \\ E[u_2 \mid X, W] \\ \vdots \\ E[u_n \mid X, W] \end{pmatrix} = \begin{pmatrix} E[u_1 \mid x_1', w_1'] \\ E[u_2 \mid x_2', w_2'] \\ \vdots \\ E[u_n \mid x_n', w_n'] \end{pmatrix}$$
$$= \begin{pmatrix} E[u_1 \mid w_1'] \\ E[u_2 \mid w_2'] \\ \vdots \\ E[u_n \mid w_n'] \end{pmatrix} = \begin{pmatrix} w_1'\delta \\ w_2'\delta \\ \vdots \\ w_n'\delta \end{pmatrix} = \begin{pmatrix} w_1' \\ w_2' \\ \vdots \\ w_n' \end{pmatrix}\delta = W\delta$$
In the limit,
$$n^{-1}X'M_WU \xrightarrow{p} E[X'M_WU \mid X, W] = X'M_WE[U \mid X, W] = X'M_WW\delta = 0_{k_1\times 1}$$
using $M_WW = 0$.
$n^{-1}X'M_WX$ converges in probability to a finite invertible matrix, and $n^{-1}X'M_WU$ converges in probability to a zero vector. Applying Slutsky's theorem,
$$\hat{\beta} - \beta = \left(n^{-1}X'M_WX\right)^{-1}\left(n^{-1}X'M_WU\right) \xrightarrow{p} 0$$
This implies
$$\hat{\beta} \xrightarrow{p} \beta$$
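To illustrate both halves of the claim, here is a simulation sketch under assumption 1 (the design and parameter values are illustrative): the error contains $w_i\delta$, so it is correlated with $w_i$ but, conditional on $w_i$, mean-independent of $x_i$. The estimate of $\beta$ is consistent while the estimate of $\gamma$ converges to $\gamma + \delta$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
beta, gamma, delta = 1.0, 0.5, 0.8
w = rng.normal(size=n)
x = 0.7 * w + rng.normal(size=n)      # x correlated with w
u = delta * w + rng.normal(size=n)    # E[u_i | x_i, w_i] = w_i * delta (assumption 1)
y = beta * x + gamma * w + u

# OLS of y on (x, w): effectively fits y = beta*x + (gamma + delta)*w + noise
coef, *_ = np.linalg.lstsq(np.column_stack([x, w]), y, rcond=None)

print(f"beta_hat  = {coef[0]:.3f}  (true beta = {beta})")      # consistent
print(f"gamma_hat = {coef[1]:.3f}  (plim = {gamma + delta})")  # inconsistent
```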