ECONOMETRICS I ECON GR5411 Lecture 23 – Panel Data II
by
Seyhan Erden Columbia University MA in Economics
The Fixed Effects Model:
Therefore an LS regression of $\ddot{y}$ on $\ddot{X}$ is equivalent to a regression of $\ddot{y}_{it} = y_{it} - \bar{y}_i$ on $\ddot{x}_{it} = x_{it} - \bar{x}_i$.
In terms of the within-transformed data,
$$\hat{\beta}_{LSDV} = \hat{\beta}_{FE} = (\ddot{X}'\ddot{X})^{-1}\ddot{X}'\ddot{y} = (X'M_D X)^{-1}X'M_D y$$
The above transformation is also known as the within transformation. We also refer to $\ddot{y}$ and $\ddot{X}$ as demeaned values or deviations from entity means. Also, $\hat{\beta}_{FE}$ is known as the within estimator.
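The last equality can be spelled out as follows. This is a sketch under assumptions not stated on the slide: a balanced panel with observations stacked by entity, $D$ the $nT \times n$ matrix of entity dummies, and $i_T$ a $T \times 1$ vector of ones.

```latex
M_D = I_{nT} - D(D'D)^{-1}D' = I_n \otimes \left(I_T - \tfrac{1}{T}\, i_T i_T'\right),
\qquad M_D\, y = \ddot{y}, \qquad M_D X = \ddot{X},
\qquad \ddot{X}'\ddot{X} = X' M_D' M_D X = X' M_D X .
```

The final step uses the fact that $M_D$ is symmetric and idempotent, so demeaning twice is the same as demeaning once.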
12/7/20 Lecture 23 – GR5411 by Seyhan Erden 2
Testing the significance of group effects
To test the differences across groups, we can test the hypothesis that the constant terms are all equal with an 𝐹 test. Under the null hypothesis, the efficient estimator is pooled least squares. The 𝐹 ratio is
$$F = \frac{(R^2_{LSDV} - R^2_{Pooled})/(n-1)}{(1 - R^2_{LSDV})/(nT-n-k)} \sim F_{n-1,\,nT-n-k}$$
where $k$ is the number of slope regressors, excluding the constant.
The Fixed Effects Model:
$$\hat{\beta}_{FE} = (X'M_D X)^{-1}X'M_D y$$
$$= (X'M_D X)^{-1}X'M_D(X\beta + D\alpha + \varepsilon)$$
$$= \beta + (X'M_D X)^{-1}X'M_D\varepsilon$$
since $M_D D = 0$. $\hat{\beta}_{FE}$ is free of potential bias from any entity-specific, time-invariant variables.
The fixed effects estimator is invariant to the actual values of the entity fixed effects. Thus its statistical properties do not rely on assumptions about $u_i$ (or $c_i$ or $\alpha_i$, i.e. any entity fixed effects).
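The invariance claim can be checked numerically. The following is a minimal sketch on made-up data with a single regressor; the names `make_y` and `within_slope` and all numbers are hypothetical and are not part of the lecture.

```python
# Toy balanced panel: 3 entities, 4 periods, one regressor (all numbers are made up).
x = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 1.0, 4.0, 3.0],
     [0.5, 1.5, 1.0, 3.0]]
eps = [[0.1, -0.2, 0.0, 0.1],
       [-0.1, 0.2, 0.1, -0.2],
       [0.0, 0.1, -0.1, 0.0]]
BETA = 2.0  # hypothetical true slope


def make_y(alpha):
    """Generate y_it = alpha_i + BETA * x_it + eps_it for the given entity effects."""
    return [[a + BETA * xv + ev for xv, ev in zip(xi, ei)]
            for a, xi, ei in zip(alpha, x, eps)]


def within_slope(x, y):
    """OLS slope of entity-demeaned y on entity-demeaned x (single regressor)."""
    num = den = 0.0
    for xi, yi in zip(x, y):
        xbar = sum(xi) / len(xi)
        ybar = sum(yi) / len(yi)
        for xv, yv in zip(xi, yi):
            num += (xv - xbar) * (yv - ybar)
            den += (xv - xbar) ** 2
    return num / den


# Same x and eps, very different entity effects.
b_small = within_slope(x, make_y([0.0, 0.0, 0.0]))
b_large = within_slope(x, make_y([10.0, -50.0, 300.0]))
print(b_small, b_large)  # the two slopes agree up to floating-point rounding
```

Because demeaning removes anything constant within an entity, changing the entity effects leaves the within slope unchanged.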
The Fixed Effects Covariance Matrix:
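$$Var_{FE} = \sigma_\varepsilon^2(\ddot{X}'\ddot{X})^{-1} \ge \sigma_\varepsilon^2(X'X)^{-1} = Var_{pool}$$
To estimate the variance of the fixed effects estimator under heteroskedasticity, we use the clustered estimator:
$$\widehat{Var}_{cluster,FE} = (\ddot{X}'\ddot{X})^{-1}\left(\sum_{i=1}^{n}\ddot{X}_i'\hat{\varepsilon}_i\hat{\varepsilon}_i'\ddot{X}_i\right)(\ddot{X}'\ddot{X})^{-1}$$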
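The clustered formula can be illustrated with a single regressor, where every matrix in the sandwich collapses to a scalar. This is a minimal sketch on made-up data; the function names are hypothetical, and no finite-sample adjustment is applied (statistical software typically applies one, so reported standard errors will differ somewhat).

```python
def within_fit(x, y):
    """Demean x and y by entity; return slope, demeaned x, residuals, and sum of squares of x."""
    xd, yd = [], []
    for xi, yi in zip(x, y):
        xbar, ybar = sum(xi) / len(xi), sum(yi) / len(yi)
        xd.append([v - xbar for v in xi])
        yd.append([v - ybar for v in yi])
    sxx = sum(v * v for row in xd for v in row)
    sxy = sum(a * b for rx, ry in zip(xd, yd) for a, b in zip(rx, ry))
    b = sxy / sxx
    resid = [[yv - b * xv for xv, yv in zip(rx, ry)] for rx, ry in zip(xd, yd)]
    return b, xd, resid, sxx


def cluster_var(xd, resid, sxx):
    """Sandwich variance: (x'x)^-1 * sum_i (x_i' e_i)^2 * (x'x)^-1 for a scalar regressor."""
    meat = sum(sum(xv * ev for xv, ev in zip(rx, re)) ** 2
               for rx, re in zip(xd, resid))
    return meat / sxx ** 2


# Hypothetical panel: 2 entities, 2 periods.
x = [[0.0, 2.0], [0.0, 2.0]]
y = [[0.0, 2.0], [0.0, 6.0]]
b, xd, resid, sxx = within_fit(x, y)
v_cluster = cluster_var(xd, resid, sxx)
print(b, v_cluster)  # 2.0 0.5
```

The inner sum adds the products of regressor and residual within an entity before squaring, which is what lets errors be correlated within a cluster.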
Example: Investment model
$$I_{it} = \beta_0 + \beta_1 F_{it} + \beta_2 C_{it} + \varepsilon_{it}, \qquad t = 1,\ldots,20, \quad i = 1,\ldots,10$$
$I_{it}$ = Real gross investment for firm $i$ in year $t$.
$F_{it}$ = Real value of the firm (shares outstanding).
$C_{it}$ = Real value of the capital stock.
Investment model: Pooled Regression
. xtset firm year
       panel variable:  firm (strongly balanced)
        time variable:  year, 1935 to 1954
                delta:  1 unit

. reg i f c

      Source |       SS           df       MS      Number of obs   =       200
-------------+----------------------------------   F(2, 197)       =    426.58
       Model |  7604093.48         2  3802046.74   Prob > F        =    0.0000
    Residual |  1755850.43       197  8912.94636   R-squared       =    0.8124
-------------+----------------------------------   Adj R-squared   =    0.8105
       Total |  9359943.92       199  47034.8941   Root MSE        =    94.408

------------------------------------------------------------------------------
           i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |   .1155622   .0058357    19.80   0.000     .1040537    .1270706
           c |   .2306785   .0254758     9.05   0.000     .1804382    .2809188
       _cons |  -42.71437   9.511676    -4.49   0.000    -61.47215   -23.95659
------------------------------------------------------------------------------
Investment model: Pooled with Cluster
. reg i f c, cluster(firm)

Linear regression                               Number of obs     =        200
                                                F(2, 9)           =      51.59
                                                Prob > F          =     0.0000
                                                R-squared         =     0.8124
                                                Root MSE          =     94.408

                                  (Std. Err. adjusted for 10 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
           i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |   .1155622   .0158943     7.27   0.000     .0796067    .1515176
           c |   .2306785   .0849671     2.71   0.024     .0384695    .4228874
       _cons |  -42.71437    20.4252    -2.09   0.066    -88.91939    3.490649
------------------------------------------------------------------------------
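The standard errors increase substantially relative to the pooled regression with default standard errors: from 0.0058 to 0.0159 for f and from 0.0255 to 0.0850 for c, roughly a threefold increase. This is at least suggestive that there is correlation across observations within the groups.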
Investment model: Fixed Effects
. xtreg i f c, fe

Fixed-effects (within) regression               Number of obs     =        200
Group variable: firm                            Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.7668                                         min =         20
     between = 0.8194                                         avg =       20.0
     overall = 0.8060                                         max =         20

                                                F(2,188)          =     309.01
corr(u_i, Xb)  = -0.1517                        Prob > F          =     0.0000

------------------------------------------------------------------------------
           i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |   .1101238   .0118567     9.29   0.000     .0867345    .1335131
           c |   .3100653   .0173545    17.87   0.000     .2758308    .3442999
       _cons |  -58.74393   12.45369    -4.72   0.000    -83.31086     -34.177
-------------+----------------------------------------------------------------
     sigma_u |  85.732501
     sigma_e |  52.767964
         rho |  .72525012   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 188) = 49.18                     Prob > F = 0.0000

. estimates store fe

The F statistic for testing the hypothesis that the constants for all 10 firms are the same is 49.18, so we reject the null.
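As a check, the F statistic of 49.18 can be recovered from numbers already reported in the Stata output in these notes. This is a sketch under one assumption: that sigma_e squared equals the fixed effects residual sum of squares divided by $nT - n - k = 188$, with $k = 2$ slope regressors (consistent with the F(9, 188) reported above).

```python
# Numbers copied from the Stata output in these notes.
ss_total = 9359943.92      # Total SS from the pooled regression
ssr_pooled = 1755850.43    # Residual SS from the pooled regression
sigma_e = 52.767964        # sigma_e from xtreg, fe
n, T, k = 10, 20, 2        # firms, years, slope regressors

df_fe = n * T - n - k               # denominator degrees of freedom (188)
ssr_fe = sigma_e ** 2 * df_fe       # implied residual SS of the fixed effects (LSDV) fit

r2_pooled = 1 - ssr_pooled / ss_total
r2_lsdv = 1 - ssr_fe / ss_total

# F ratio in R-squared form: LSDV against pooled least squares.
F = ((r2_lsdv - r2_pooled) / (n - 1)) / ((1 - r2_lsdv) / df_fe)
print(round(r2_pooled, 4), round(F, 2))  # compare with the reported 0.8124 and 49.18
```

Note that the LSDV R-squared used here is not the "within" R-squared of 0.7668 printed by xtreg; it is the R-squared of the regression that includes the firm dummies.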
Investment model: FE with HAC errors
. xtreg i f c, fe vce(cluster firm)

Fixed-effects (within) regression               Number of obs     =        200
Group variable: firm                            Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.7668                                         min =         20
     between = 0.8194                                         avg =       20.0
     overall = 0.8060                                         max =         20

                                                F(2,9)            =      28.31
corr(u_i, Xb)  = -0.1517                        Prob > F          =     0.0001

                                  (Std. Err. adjusted for 10 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
           i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |   .1101238   .0151945     7.25   0.000     .0757515    .1444961
           c |   .3100653   .0527518     5.88   0.000     .1907325    .4293981
       _cons |  -58.74393   27.60286    -2.13   0.062    -121.1859    3.698079
-------------+----------------------------------------------------------------
     sigma_u |  85.732501
     sigma_e |  52.767964
         rho |  .72525012   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Intercept in Fixed Effects Regression:
The fixed effects estimator does not apply to any regressor which is time-invariant for all entities. This includes the intercept. Yet some authors and packages (e.g., Amemiya (1971) and xtreg in Stata) report an intercept. To see how to construct an estimator of an intercept, take the components regression equation, adding an explicit intercept:
$$y_{it} = \alpha + x_{it}'\beta + u_i + \varepsilon_{it}$$
We estimate $\beta$ by $\hat{\beta}_{FE}$. Replacing $\beta$ in this equation with $\hat{\beta}_{FE}$ and then estimating $\alpha$ by least squares, we obtain
$$\hat{\alpha}_{FE} = \bar{y} - \bar{X}'\hat{\beta}_{FE}$$
where $\bar{y}$ and $\bar{X}$ are averages from the full sample. This is the estimator reported by xtreg.
It is unclear if $\hat{\alpha}_{FE}$ is particularly useful. It may be best to ignore the reported intercepts and focus on the slope coefficients.
Random Effects Model:
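$$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$$
Under $E[c_i|X_i] = \alpha$, the model can be written as
$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} + (c_i - E[c_i|X_i])$$
$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} + u_i$$
$$y_{it} = x_{it}'\beta + (\alpha + u_i) + \varepsilon_{it}$$
or sometimes written as
$$y_{it} = \alpha + x_{it}'\beta + \eta_{it}$$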
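The random effects estimator is the GLS estimator:
$$\hat{\beta}_{RE} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \left(\sum_{i=1}^{n} X_i'\Omega^{-1}X_i\right)^{-1}\left(\sum_{i=1}^{n} X_i'\Omega^{-1}y_i\right)$$
where
$$\Omega = I + ii'\frac{\sigma_u^2}{\sigma_\varepsilon^2} = I + T\frac{\sigma_u^2}{\sigma_\varepsilon^2}P = M + \rho^{-2}P$$
with $P = i(i'i)^{-1}i'$ and $M = I - P$, and where
$$\rho = \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}$$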
Random Effects Model:
If the entity-specific effects are strictly uncorrelated with the regressors, then it might be appropriate to model the entity-specific constant terms as randomly distributed across cross-sectional units.
Advantage: it greatly reduces the number of parameters to be estimated.
Disadvantage: if the “strictly uncorrelated” assumption is not correct, we get inconsistent estimators.
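$$y_{it} = x_{it}'\beta + (\alpha + u_i) + \varepsilon_{it}$$
Strict exogeneity is assumed:
$$E[\varepsilon_{it}|X_i] = E[u_i|X_i] = 0$$
$$E[\varepsilon_{it}^2|X_i] = \sigma_\varepsilon^2 \text{ (the variance is constant)}, \qquad E[\varepsilon_i\varepsilon_i'|X_i] = \sigma_\varepsilon^2 I$$
$$E[u_i^2|X_i] = \sigma_u^2 \text{ (the variance is constant)}$$
$$E[\varepsilon_{it}u_j|X_i] = 0 \text{ for all } i, t \text{ and } j$$
$$E[\varepsilon_{it}\varepsilon_{js}|X_i] = 0 \text{ for all } t \neq s \text{ or } i \neq j$$
$$E[u_i u_j|X_i] = 0 \text{ for } i \neq j$$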
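Recall that $\eta_{it} = \varepsilon_{it} + u_i$ and
$$\eta_i = (\eta_{i1}, \eta_{i2}, \ldots, \eta_{iT})'$$
In this form the model is referred to as the error components model.
$$E[\eta_{it}^2|X_i] = \sigma_\varepsilon^2 + \sigma_u^2$$
$$E[\eta_{it}\eta_{is}|X_i] = \sigma_u^2 \text{ for } t \neq s$$
$$E[\eta_{it}\eta_{js}|X_i] = 0 \text{ for all } t \text{ and } s \text{ if } i \neq j$$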
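For the $T$ observations for unit $i$, let
$$\Sigma = E[\eta_i\eta_i'] = \begin{bmatrix} \sigma_\varepsilon^2 + \sigma_u^2 & \cdots & \sigma_u^2 \\ \vdots & \ddots & \vdots \\ \sigma_u^2 & \cdots & \sigma_\varepsilon^2 + \sigma_u^2 \end{bmatrix} = \sigma_\varepsilon^2 I_T + \sigma_u^2 i_T i_T'$$
where $i_T$ is a $T \times 1$ column vector of ones.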
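To make the structure of $\Sigma$ concrete, here is a minimal sketch that builds it for a small $T$. The function name `re_sigma` and the variance values are hypothetical and chosen only for illustration.

```python
def re_sigma(T, var_eps, var_u):
    """T x T matrix with var_eps + var_u on the diagonal and var_u everywhere else."""
    return [[var_u + (var_eps if t == s else 0.0) for s in range(T)]
            for t in range(T)]


# Hypothetical variance components: sigma_eps^2 = 4, sigma_u^2 = 1.
Sigma = re_sigma(3, var_eps=4.0, var_u=1.0)
for row in Sigma:
    print(row)

# Implied correlation between two different periods of the same entity.
within_corr = Sigma[0][1] / Sigma[0][0]
print(within_corr)  # 0.2
```

Every pair of periods within an entity has the same covariance, so the within-entity correlation does not decay with the distance between periods.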
Because observations $i$ and $j$ are independent, the covariance matrix of the disturbances for the full $nT$ observations is
$$\Omega = \begin{bmatrix} \Sigma & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \Sigma \end{bmatrix} = I_n \otimes \Sigma$$
For the random effects model, as seen from the form of the model, OLS is consistent but not efficient; GLS, however, will be efficient.
Generalized Least Squares (Random Effects):
$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$$
$$\Omega^{-1/2} = [I_n \otimes \Sigma]^{-1/2} = I_n \otimes \Sigma^{-1/2}$$
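For the error components structure $\Sigma = \sigma_\varepsilon^2 I_T + \sigma_u^2 i_T i_T'$, the inverse square root has a well-known closed form. It is not stated on the slide; the symbol $\theta$ is introduced here only for illustration.

```latex
\Sigma^{-1/2} = \frac{1}{\sigma_\varepsilon}\left[I_T - \frac{\theta}{T}\, i_T i_T'\right],
\qquad
\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}},
\qquad
\sigma_\varepsilon\,\Sigma^{-1/2} y_i =
\begin{bmatrix} y_{i1} - \theta\bar{y}_i \\ \vdots \\ y_{iT} - \theta\bar{y}_i \end{bmatrix}.
```

So GLS amounts to OLS on partially demeaned data: $\theta = 0$ gives pooled OLS, and $\theta \to 1$ approaches the within (fixed effects) transformation.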
Investment model: Random Effects
. xtreg i f c, re

Random-effects GLS regression                   Number of obs     =        200
Group variable: firm                            Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.7668                                         min =         20
     between = 0.8196                                         avg =       20.0
     overall = 0.8061                                         max =         20

                                                Wald chi2(2)      =     657.67
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
           i |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |   .1097811   .0104927    10.46   0.000     .0892159    .1303464
           c |    .308113   .0171805    17.93   0.000     .2744399    .3417861
       _cons |  -57.83441   28.89893    -2.00   0.045    -114.4753   -1.193537
-------------+----------------------------------------------------------------
     sigma_u |   84.20095
     sigma_e |  52.767964
         rho |  .71800838   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. estimates store re
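The reported rho can be reproduced from sigma_u and sigma_e, since the output labels it as the fraction of variance due to u_i. The sketch below also computes the GLS quasi-demeaning weight $\theta = 1 - \sigma_\varepsilon / \sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}$, which is a standard result that these notes do not state; treat that part as an assumption.

```python
import math

sigma_u = 84.20095     # reported by xtreg, re
sigma_e = 52.767964    # reported by xtreg, re
T = 20                 # years per firm

# Fraction of the composite error variance due to the firm effect u_i.
rho = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)

# Share of the firm mean subtracted from each observation by the GLS transformation.
theta = 1 - sigma_e / math.sqrt(sigma_e ** 2 + T * sigma_u ** 2)

print(round(rho, 4), round(theta, 4))  # rho matches the reported .71800838 up to rounding
```

A weight $\theta$ close to 1 means the random effects transformation is close to full demeaning, which is one way to see why the RE and FE slope estimates in this example are so similar.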
Testing for Random Effects:
1. LM test
2. Hausman Test
LM Test:
$$H_0: \sigma_u^2 = 0 \qquad H_1: \sigma_u^2 > 0$$
the test statistic
$$LM = \frac{nT}{2(T-1)}\left[\frac{\sum_{i=1}^{n}\left(\sum_{t=1}^{T}\hat{\varepsilon}_{it}\right)^2}{\sum_{i=1}^{n}\sum_{t=1}^{T}\hat{\varepsilon}_{it}^2} - 1\right]^2$$
Under $H_0$, $LM$ is distributed chi-squared with one degree of freedom.
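The LM statistic is straightforward to compute once residuals are available. This is a minimal sketch for a balanced panel, assuming the residuals come from the pooled OLS regression (the restricted model under the null); the function name and the toy residuals are hypothetical.

```python
def breusch_pagan_lm(resid):
    """LM statistic for a balanced panel; resid[i][t] is the residual of entity i at time t."""
    n = len(resid)
    T = len(resid[0])
    sum_sq_of_sums = sum(sum(row) ** 2 for row in resid)    # sum_i (sum_t e_it)^2
    sum_of_sq = sum(e ** 2 for row in resid for e in row)   # sum_i sum_t e_it^2
    return n * T / (2 * (T - 1)) * (sum_sq_of_sums / sum_of_sq - 1) ** 2


# Hypothetical residuals, 2 entities x 2 periods, same sign within each entity.
resid = [[1.0, 1.0], [-1.0, -1.0]]
lm = breusch_pagan_lm(resid)
print(lm)  # 2.0
```

When residuals within an entity share a sign, the squared entity sums are large relative to the sum of squares, which pushes the ratio above 1 and the statistic away from zero.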
Testing for Random Effects:
Hausman Test:
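$$H_0: \beta_{FE} = \beta_{RE} \qquad H_1: \beta_{FE} \neq \beta_{RE}$$
the chi-squared test is based on the Wald criterion
$$W = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'\hat{\Psi}^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE}) \sim \chi^2_{k-1}$$
$$\hat{\Psi} = \widehat{var}(\hat{\beta}_{FE} - \hat{\beta}_{RE}) = \hat{V}_{FE} - \hat{V}_{RE}$$
Here $k$ counts the coefficients including the constant, so the degrees of freedom $k - 1$ equal the number of slope coefficients being compared.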
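Covariance matrix of the difference vector $\hat{\beta}_{FE} - \hat{\beta}_{RE}$:
$$Var(\hat{\beta}_{FE} - \hat{\beta}_{RE}) = Var(\hat{\beta}_{FE}) + Var(\hat{\beta}_{RE}) - 2\,Cov(\hat{\beta}_{FE}, \hat{\beta}_{RE})$$
Hausman’s essential result is that the covariance of an efficient estimator with its difference from an inefficient estimator is zero, which implies that
$$Cov[(\hat{\beta}_{FE} - \hat{\beta}_{RE}), \hat{\beta}_{RE}] = Cov(\hat{\beta}_{FE}, \hat{\beta}_{RE}) - Var(\hat{\beta}_{RE}) = 0$$
or that
$$Var(\hat{\beta}_{FE} - \hat{\beta}_{RE}) = Var(\hat{\beta}_{FE}) - Var(\hat{\beta}_{RE}) = \Psi$$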
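The Wald form of the Hausman statistic can be sketched for the two-coefficient case, where the matrix inverse is available in closed form. The function name `hausman_2` and all input values below are hypothetical; they are not the investment-model estimates.

```python
def hausman_2(b_fe, b_re, V_fe, V_re):
    """Wald statistic d' Psi^{-1} d for two coefficients, with Psi = V_fe - V_re."""
    d = [b_fe[0] - b_re[0], b_fe[1] - b_re[1]]
    psi = [[V_fe[i][j] - V_re[i][j] for j in range(2)] for i in range(2)]
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    inv = [[psi[1][1] / det, -psi[0][1] / det],
           [-psi[1][0] / det, psi[0][0] / det]]
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))


# Hypothetical estimates and covariance matrices.
b_fe, b_re = [2.0, 3.0], [1.0, 1.0]
V_fe = [[3.0, 1.5], [1.5, 3.0]]
V_re = [[1.0, 0.5], [0.5, 1.0]]
W = hausman_2(b_fe, b_re, V_fe, V_re)
print(W)  # approximately 2, up to floating-point rounding
```

The statistic uses the full difference of the two covariance matrices, including the off-diagonal terms, so it cannot be recovered from standard errors alone.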
Investment model: LM Test (RE vs. Pooled OLS)
. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        i[firm,t] = Xb + u[firm] + e[firm,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                       i |   47034.89       216.8753
                       e |   2784.458       52.76796
                       u |     7089.8       84.20095

        Test:   Var(u) = 0
                             chibar2(01) =   798.16
                          Prob > chibar2 =   0.0000   → Reject $H_0$

$H_0: \sigma_u^2 = 0$ means there is no variation across panels, so no panel effect. This test compares pooled OLS to random effects. If you reject $H_0$, then there is a panel effect, so you should use random effects.
Investment model: Hausman Test RE vs FE
. hausman fe re

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re         Difference          S.E.
-------------+----------------------------------------------------------------
           f |    .1101238     .1097811        .0003427        .0055213
           c |    .3100653      .308113        .0019524        .0024516
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        2.33
                Prob>chi2 =      0.3119   → do not reject $H_0$

The Hausman statistic is quite small and the p-value is large, which suggests that the random effects approach is consistent with the data. This test compares random effects and fixed effects.
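Two pieces of this output can be checked by hand from numbers reported in these notes: the S.E. column, which is the square root of the difference of the squared FE and RE standard errors, and the p-value, since the upper tail of a chi-squared distribution with 2 degrees of freedom is $e^{-x/2}$. The sketch below does only that; the chi2(2) statistic itself needs the off-diagonal covariance terms, which are not reported, so it is taken as given.

```python
import math

# Default (non-clustered) standard errors copied from the xtreg output in these notes.
se_fe = {"f": 0.0118567, "c": 0.0173545}   # xtreg i f c, fe
se_re = {"f": 0.0104927, "c": 0.0171805}   # xtreg i f c, re

# sqrt(diag(V_b - V_B)): only diagonal elements are needed for this column.
se_diff = {name: math.sqrt(se_fe[name] ** 2 - se_re[name] ** 2) for name in se_fe}
print(se_diff)  # compare with the reported .0055213 and .0024516

chi2_stat = 2.33                       # reported Hausman statistic
p_value = math.exp(-chi2_stat / 2)     # chi-squared(2) upper tail probability
print(round(p_value, 4))               # compare with the reported 0.3119
```

Small discrepancies in the last digits are expected because the inputs are rounded values from the printed output.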