
ECONOMETRICS I ECON GR5411 Lecture 8 – Multicollinearity and
FWL Theorem
by Seyhan Erden, Columbia University, MA in Economics

Non-spherical errors:
$E[\hat\beta \mid X] = \beta$: the estimator is unbiased provided $E[\varepsilon \mid X] = 0$. It is no longer efficient when the errors $\varepsilon$ are not spherical.
$$Var(\hat\beta \mid X) = E\left[(\hat\beta - \beta)(\hat\beta - \beta)' \mid X\right]$$
$$= E\left[(X'X)^{-1}X'\varepsilon \left((X'X)^{-1}X'\varepsilon\right)' \mid X\right]$$
$$= E\left[(X'X)^{-1}X'\varepsilon\varepsilon' X (X'X)^{-1} \mid X\right]$$
$$= (X'X)^{-1}X'E[\varepsilon\varepsilon' \mid X]X(X'X)^{-1} = (X'X)^{-1}(X'\Omega X)(X'X)^{-1}$$
$$\neq \sigma^2 (X'X)^{-1}$$
$$\hat\beta \mid X \sim N\left(\beta,\ (X'X)^{-1}(X'\Omega X)(X'X)^{-1}\right)$$
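The sandwich form above can be checked numerically. A minimal sketch with simulated data (the heteroskedastic variance function $\omega_i = e^{x_{i1}}$ is an assumption chosen only for illustration):

```python
# Sketch: with non-spherical errors, Var(b|X) is the "sandwich"
# (X'X)^{-1} X' Omega X (X'X)^{-1}, not sigma^2 (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
omega = np.exp(X[:, 0])            # hypothetical heteroskedastic variances
Omega = np.diag(omega)

XtX_inv = np.linalg.inv(X.T @ X)
sandwich = XtX_inv @ X.T @ Omega @ X @ XtX_inv   # true Var(b-hat | X)
spherical = omega.mean() * XtX_inv               # what sigma^2 (X'X)^{-1} would give

print(np.diag(sandwich))
print(np.diag(spherical))   # generally differs from the sandwich diagonal
```

The two diagonals disagree, which is exactly the inequality $\neq \sigma^2(X'X)^{-1}$ on the slide.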
10/5/20
Lecture 7 GR5411 by Seyhan Erden 2

Estimating error variance:
The error variance $\sigma^2 = E[\varepsilon_i^2]$ is a moment, so a natural estimator is a moment estimator. If the $\varepsilon_i$'s were observed, we would estimate $\sigma^2$ by
$$\tilde\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i^2 = \frac{1}{n}\varepsilon'\varepsilon$$
However, the $\varepsilon_i$'s are not observed, hence we have to estimate them:
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\hat\varepsilon_i^2 = \frac{1}{n}\hat\varepsilon'\hat\varepsilon = \frac{1}{n}\varepsilon' M \varepsilon$$

Estimating error variance:
Since
$$\hat\varepsilon = My = M(X\beta + \varepsilon) = MX\beta + M\varepsilon = M\varepsilon$$
where $M = I - X(X'X)^{-1}X'$ and $MX = 0$. Then $\hat\varepsilon'\hat\varepsilon = \varepsilon' M \varepsilon$ since $M$ is idempotent.
Then we can show
$$\tilde\sigma^2 - \hat\sigma^2 = \frac{1}{n}\varepsilon'\varepsilon - \frac{1}{n}\varepsilon' M \varepsilon = \frac{1}{n}\varepsilon' P \varepsilon \geq 0$$
meaning the feasible estimator is smaller than the idealized estimator (because $P$ is positive semi-definite and $\varepsilon' P \varepsilon$ is a quadratic form).
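This inequality is easy to verify with simulated data (the data-generating values below are assumptions for illustration only):

```python
# Check: the infeasible moment estimator (true errors) exceeds the feasible
# one (residuals) by exactly (1/n) eps'P eps >= 0.
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 4
X = rng.normal(size=(n, k))
beta = np.ones(k)                    # hypothetical true coefficients
eps = rng.normal(size=n)
y = X @ beta + eps

P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P
eps_hat = M @ y                      # OLS residuals (M y = M eps since MX = 0)

sigma2_tilde = eps @ eps / n         # infeasible: uses true errors
sigma2_hat = eps_hat @ eps_hat / n   # feasible: uses residuals
gap = eps @ P @ eps / n

print(sigma2_tilde - sigma2_hat, gap)  # equal, and nonnegative
```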

Estimating 𝜎5:
Definition: The LS estimator of $\sigma^2$ is
$$s^2 = \frac{\hat\varepsilon'\hat\varepsilon}{n-k}$$
Theorem: $s^2$ is unbiased.
Proof:
$$\hat\varepsilon = My = M(X\beta + \varepsilon) = MX\beta + M\varepsilon = M\varepsilon$$
where $M = I - X(X'X)^{-1}X'$, hence $MX = 0$.
Then $\hat\varepsilon'\hat\varepsilon = \varepsilon' M \varepsilon$ since $M$ is idempotent.

LS estimator of $\sigma^2$:
$$E(\hat\varepsilon'\hat\varepsilon) = E(\varepsilon' M \varepsilon)$$
Question: How do you take the expected value of a quadratic form?
Answer: By taking its trace.
$$E[\varepsilon' M \varepsilon]$$
$$= E[\mathrm{tr}(\varepsilon' M \varepsilon)]$$
$$= E[\mathrm{tr}(M\varepsilon\varepsilon')]$$
$$= \mathrm{tr}\left(M\,E(\varepsilon\varepsilon')\right)$$
$$= \mathrm{tr}(\sigma^2 M)$$
$$= \sigma^2\,\mathrm{tr}\left(I - X(X'X)^{-1}X'\right) = \sigma^2\left(\mathrm{tr}(I_n) - \mathrm{tr}\left((X'X)^{-1}X'X\right)\right)$$
$$= \sigma^2(n-k)$$
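The key trace fact, $\mathrm{tr}(M) = n - k$, can be confirmed directly (the design matrix below is simulated purely for illustration):

```python
# Check of the trace step above: tr(M) = n - tr((X'X)^{-1} X'X) = n - k.
import numpy as np

rng = np.random.default_rng(2)
n, k = 50, 5
X = rng.normal(size=(n, k))
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

print(np.trace(M))   # n - k = 45, up to floating-point error
```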

LS estimator of $\sigma^2$:
Hence the variance estimator of the disturbance is unbiased:
$$E[s^2] = E\left[\frac{\hat\varepsilon'\hat\varepsilon}{n-k}\right] = \frac{1}{n-k}\,\sigma^2(n-k) = \sigma^2$$
Also,
$$E[s^2 \mid X] = E\left[\frac{\hat\varepsilon'\hat\varepsilon}{n-k}\,\Big|\,X\right] = \frac{1}{n-k}\,\sigma^2(n-k) = \sigma^2$$
Thus $s^2$ is an unbiased estimator of $\sigma^2$.
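A small Monte Carlo illustrates the unbiasedness (all simulation settings here, including $\sigma^2 = 2$, are assumptions for the sketch, not from the slides):

```python
# Monte Carlo sketch: s^2 = e'e/(n-k) averages to sigma^2, while e'e/n
# is biased downward by the factor (n-k)/n.
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma2, reps = 30, 5, 2.0, 2000
X = rng.normal(size=(n, k))
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

s2_draws, sig2hat_draws = [], []
for _ in range(reps):
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    sse = eps @ M @ eps              # e_hat'e_hat = eps'M eps
    s2_draws.append(sse / (n - k))   # unbiased estimator
    sig2hat_draws.append(sse / n)    # biased moment estimator

print(np.mean(s2_draws))       # approx. sigma^2 = 2.0
print(np.mean(sig2hat_draws))  # approx. sigma^2 * (n-k)/n
```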

We covered all finite sample properties of $\hat\beta$:
Under A6,
$$\hat\beta \mid X \sim N\left(\beta,\ \sigma^2(X'X)^{-1}\right)$$
Ø $E[\hat\beta \mid X] = E[\hat\beta] = \beta$
Ø $E[s^2 \mid X] = E[s^2] = \sigma^2$
Ø $Var(\hat\beta \mid X) = \sigma^2(X'X)^{-1}$
Ø $Var(\hat\beta) = \sigma^2 E\left[(X'X)^{-1}\right]$
Ø $\hat\beta$ is B.L.U.E. under A6

Note that $Var(\hat\beta \mid X) = \sigma^2(X'X)^{-1}$ holds only under no strict multicollinearity:
In a regression $y = X\beta + \varepsilon$, if $X'X$ is singular, then $(X'X)^{-1}$ does not exist and $\hat\beta$ is not defined. This situation is called perfect or strict multicollinearity. It happens when the columns of $X$ are linearly dependent, meaning sets of regressors are identically related. These situations are not commonly encountered in applied econometric analysis. A more common case is near multicollinearity, simply called multicollinearity, which arises when regressors are highly correlated.

Multicollinearity:
An implication of near multicollinearity is that individual coefficient estimates will be imprecise. This is not necessarily a problem for econometric analysis as the imprecision will be reflected in the standard errors, but it is still important to understand how highly correlated regressors can result in a lack of precision of individual coefficient estimates.

Multicollinearity:
Take a homoskedastic linear regression,
$$y_i = x_{1i}\beta_1 + x_{2i}\beta_2 + e_i$$
where $x_{ji} = X_{ji} - \bar X_j$ for $j = 1,2$, and
$$\frac{1}{n}X'X = \frac{1}{n}\begin{pmatrix} x_{11} & \cdots & x_{n1} \\ x_{12} & \cdots & x_{n2} \end{pmatrix}\begin{pmatrix} x_{11} & x_{12} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{pmatrix} = \frac{1}{n}\begin{pmatrix} \sum_{i=1}^{n} x_{i1}^2 & \sum_{i=1}^{n} x_{i1}x_{i2} \\ \sum_{i=1}^{n} x_{i2}x_{i1} & \sum_{i=1}^{n} x_{i2}^2 \end{pmatrix}$$

Given the variances are one,
$$\frac{1}{n}\sum_{i=1}^{n} x_{i1}^2 = \frac{1}{n}\sum_{i=1}^{n} \left(X_{1i} - \bar X_1\right)^2 = 1$$
the covariance is then the same as the correlation and can be denoted as
$$\rho = \frac{1}{n}\sum_{i=1}^{n} \left(X_{1i} - \bar X_1\right)\left(X_{2i} - \bar X_2\right)$$
so
$$\frac{1}{n}X'X = \frac{1}{n}\begin{pmatrix} \sum_{i=1}^{n} x_{i1}^2 & \sum_{i=1}^{n} x_{i1}x_{i2} \\ \sum_{i=1}^{n} x_{i2}x_{i1} & \sum_{i=1}^{n} x_{i2}^2 \end{pmatrix} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

Then we can write, for the homoskedastic linear regression above,
$$\frac{1}{n}X'X = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$
In this case
$$Var(\hat\beta \mid X) = \frac{\sigma^2}{n}\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}^{-1} = \frac{\sigma^2}{n(1-\rho^2)}\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}$$
$\rho$ is the correlation between the two regressors; as $\rho \to 1$, the matrix becomes singular.

It is easy to see that as $\rho \to 1$,
$$\frac{\sigma^2}{n(1-\rho^2)} \to \infty$$
Thus the more "collinear" the regressors are, the worse the precision of the individual coefficient estimates.
What is happening is that when the regressors are highly dependent, it is statistically difficult to disentangle the impact of $\beta_1$ from that of $\beta_2$. As a consequence, the precision of individual estimates is reduced. The imprecision, however, will be reflected in large standard errors, so there is no distortion in inference: multicollinearity introduces no bias.
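The blow-up of the variance as $\rho \to 1$ can be traced numerically with the two-regressor formula above (sample size $n = 100$ and $\sigma^2 = 1$ are illustrative assumptions):

```python
# Sketch of the formula above: with standardized regressors of correlation rho,
# Var(beta_j | X) = sigma^2 / (n (1 - rho^2)), which diverges as rho -> 1.
import numpy as np

sigma2, n = 1.0, 100
for rho in [0.0, 0.9, 0.99, 0.999]:
    XtX_over_n = np.array([[1.0, rho], [rho, 1.0]])
    var = (sigma2 / n) * np.linalg.inv(XtX_over_n)
    print(rho, var[0, 0])   # equals sigma^2 / (n (1 - rho^2))
```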

Partitioned and Partial Regression
$$y = X\beta + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon$$
The normal equations are
$$\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} X_1'y \\ X_2'y \end{pmatrix} \qquad \begin{matrix} (1) \\ (2) \end{matrix}$$
First solve (1) for $\beta_1$:
$$X_1'X_1\beta_1 + X_1'X_2\beta_2 = X_1'y$$
$$\beta_1 = (X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\beta_2 = (X_1'X_1)^{-1}X_1'\left(y - X_2\beta_2\right)$$

Partitioned and Partial Regression
$$\beta_1 = (X_1'X_1)^{-1}X_1'\left(y - X_2\beta_2\right)$$
Similarly,
$$\beta_2 = (X_2'X_2)^{-1}X_2'\left(y - X_1\beta_1\right)$$
Orthogonal Partitioned Regression:
If $X_1$ and $X_2$ are orthogonal, $X_1'X_2 = 0$, then
$$\beta_1 = (X_1'X_1)^{-1}X_1'y, \qquad \beta_2 = (X_2'X_2)^{-1}X_2'y$$
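The orthogonal case can be verified with simulated data (the construction of an exactly orthogonal $X_2$ below is an illustrative device, not part of the lecture):

```python
# Check: when X1'X2 = 0, the joint OLS coefficients coincide with the
# coefficients from two separate regressions of y on X1 and on X2.
import numpy as np

rng = np.random.default_rng(4)
n = 60
X1 = rng.normal(size=(n, 2))
Z = rng.normal(size=(n, 2))
# Orthogonalize Z against X1 so that X1'X2 = 0 (up to rounding)
X2 = Z - X1 @ np.linalg.solve(X1.T @ X1, X1.T @ Z)
y = rng.normal(size=n)

X = np.hstack([X1, X2])
b_joint = np.linalg.solve(X.T @ X, X.T @ y)
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)
b2 = np.linalg.solve(X2.T @ X2, X2.T @ y)

print(np.allclose(b_joint, np.concatenate([b1, b2])))  # True
```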

Frisch–Waugh–Lovell Theorem:
The least-squares estimators $\hat\beta_1, \hat\beta_2$ for
$$y = X\beta + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon$$
have the algebraic solution (we will show this soon)
$$\hat\beta_1 = (X_1'M_2X_1)^{-1}X_1'M_2y$$
$$\hat\beta_2 = (X_2'M_1X_2)^{-1}X_2'M_1y$$
where
$$M_1 = I - P_1 = I - X_1(X_1'X_1)^{-1}X_1', \qquad M_2 = I - P_2 = I - X_2(X_2'X_2)^{-1}X_2'$$

Frisch–Waugh–Lovell theorem:
In a model such as
$$y = X\beta + e = X_1\beta_1 + X_2\beta_2 + e$$
the OLS estimator of $\beta_2$ and the OLS residuals $\hat e$ may be equivalently computed by either the OLS regression
$$y = X\hat\beta + \hat e = X_1\hat\beta_1 + X_2\hat\beta_2 + \hat e$$
or via the following algorithm:
Ø Regress $y$ on $X_1$; get the residuals, say $\tilde y$ $(= M_1y)$
Ø Regress $X_2$ on $X_1$; get the residuals, say $X_2^*$ $(= M_1X_2)$
Ø Regress $\tilde y$ on $X_2^*$ to obtain $\tilde\beta_2$ (same as regressing $M_1y$ on $M_1X_2$)
The FWL theorem says $\hat\beta_2 = \tilde\beta_2$.
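The three-step algorithm can be sketched directly (the data-generating design below, including the coefficient values, is made up for illustration):

```python
# Demonstration of FWL: the coefficient on X2 from the full regression equals
# the coefficient from regressing the residuals M1*y on the residuals M1*X2.
import numpy as np

rng = np.random.default_rng(5)
n = 80
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 1))
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([3.0]) + rng.normal(size=n)

# Full regression on [X1 X2]
X = np.hstack([X1, X2])
b_full = np.linalg.solve(X.T @ X, X.T @ y)

# FWL algorithm
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
y_tilde = M1 @ y        # step 1: residuals of y on X1
X2_star = M1 @ X2       # step 2: residuals of X2 on X1
b2_tilde = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_tilde)  # step 3

print(b_full[2], b2_tilde[0])   # identical
```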

Partitioned and Partial Regression
Note that if we regress $M_1y$ on $M_1X_2$, the resulting least squares estimator would be
$$\tilde\beta_2 = (X_2'M_1X_2)^{-1}X_2'M_1y$$
To prove the FWL theorem, we must show that this is the same result as $\hat\beta_2$.

Proof of FWL Theorem:
$$\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} X_1'y \\ X_2'y \end{pmatrix} \qquad \begin{matrix} (1) \\ (2) \end{matrix}$$
From eq. (2) we can write
$$X_2'X_1\beta_1 + X_2'X_2\beta_2 = X_2'y \qquad (3)$$
From eq. (1),
$$X_1'X_1\beta_1 + X_1'X_2\beta_2 = X_1'y$$
Solving this for $\beta_1$,
$$\beta_1 = (X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\beta_2$$

Proof of FWL Theorem:
$$\beta_1 = (X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\beta_2$$
Now plug this back into eq. (3):
$$X_2'X_1\left[(X_1'X_1)^{-1}X_1'y - (X_1'X_1)^{-1}X_1'X_2\beta_2\right] + X_2'X_2\beta_2 = X_2'y$$
then
$$X_2'X_1(X_1'X_1)^{-1}X_1'y - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\beta_2 + X_2'X_2\beta_2 = X_2'y$$
$$\left[X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right]\beta_2 = X_2'y - X_2'X_1(X_1'X_1)^{-1}X_1'y$$

Proof of FWL Theorem:
$$\left[X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\right]\beta_2 = X_2'y - X_2'X_1(X_1'X_1)^{-1}X_1'y$$
$$X_2'\left[I - X_1(X_1'X_1)^{-1}X_1'\right]X_2\beta_2 = X_2'\left[I - X_1(X_1'X_1)^{-1}X_1'\right]y$$
$$X_2'(I - P_1)X_2\beta_2 = X_2'(I - P_1)y$$
$$X_2'M_1X_2\beta_2 = X_2'M_1y$$

Proof of FWL Theorem:
$$X_2'M_1X_2\beta_2 = X_2'M_1y$$
Hence,
$$\beta_2 = (X_2'M_1X_2)^{-1}X_2'M_1y$$
Recall that $M_1$ is the residual maker, so that $M_1X_2$ is the matrix of residuals from regressing $X_2$ on $X_1$.
Let $X_2^* = M_1X_2$; since $M_1$ is symmetric and idempotent, we can write $\beta_2$ as
$$\beta_2 = (X_2'M_1'M_1X_2)^{-1}X_2'M_1'M_1y$$
Thus,
$$\beta_2 = (X_2^{*\prime}X_2^*)^{-1}X_2^{*\prime}y$$

Proof of FWL Theorem:
$$\beta_2 = (X_2^{*\prime}X_2^*)^{-1}X_2^{*\prime}y$$
This is known as "netting out" or "partialing out" the effect of $X_1$.
Implications of the FWL Theorem:
1. $\hat\beta_2$ is the same as the estimate obtained when $y$ is regressed on $X_2$ alone only when $X_1$ and $X_2$ are orthogonal, i.e. $X_1'X_2 = 0$.
2. The slope parameters $\beta_2$ can be obtained by including an intercept or by performing the regression on demeaned data, introducing the $M^0$ matrix (known as the "centering matrix").

Explaining the 2nd implication:
Recall the demeaning formula
$$M^0y = \left[I - i(i'i)^{-1}i'\right]y = y - i\bar y$$
where $M^0$ is the centering matrix.
Let $X_1$ be the vector of ones and $X_2$ the matrix of regressors in $X = [X_1\ X_2]$; with this $X_1$, $M_1 = M^0$.
Observe that $M^0X_2 = X_2 - \bar X_2$ and $M^0y = y - i\bar y$ are the demeaned variables. The FWL theorem says that $\hat\beta_2$ is the OLS regression of $M^0y$ on $M^0X_2$:
$$\hat\beta_2 = (X_2'M^0X_2)^{-1}X_2'M^0y$$
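A minimal numerical sketch of this implication (the data below are simulated only for illustration): the slopes from an OLS fit that includes an intercept match the slopes from regressing demeaned $y$ on demeaned $X_2$.

```python
# Check of implication 2: intercept regression slopes == demeaned-data slopes.
import numpy as np

rng = np.random.default_rng(6)
n = 50
X2 = rng.normal(size=(n, 2))
y = 1.0 + X2 @ np.array([0.5, -0.7]) + rng.normal(size=n)

# With an intercept column (X1 = vector of ones)
X = np.column_stack([np.ones(n), X2])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Demeaned data: M0 y = y - ybar, M0 X2 = X2 - X2bar
yd = y - y.mean()
X2d = X2 - X2.mean(axis=0)
b2 = np.linalg.solve(X2d.T @ X2d, X2d.T @ yd)

print(np.allclose(b[1:], b2))  # True: slopes coincide, as FWL predicts
```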