Panel Data and Diff-in-Diff
Chris Hansman
Empirical Finance: Methods and Applications
January 24-25, 2022
Some Details
First assignment released this week
Posted on January 25th
Due on February 8th.
Last class: an introduction to causality
This class: estimating causal effects with panel data
1. An introduction to panel data
Multiple observations of the same unit over time
2. First difference and fixed effects estimators
Estimating causal effects with fixed omitted variables
3. Difference-in-difference estimators
A more robust method for estimating causal effects
Part 1: Introducing Panel Data
Three common types of data
1. Cross-sectional
2. Time-series
3. Panel
Estimating unit and time specific averages
Three common types of data:
(1) Cross-Sectional
A single observation for each unit $i \in \{1, 2, \dots, N\}$
e.g. test scores and study times for each individual in the class
(2) Time Series
Repeated observations from time $t = 1, \dots, T$ for a single unit
e.g. yearly GDP and unemployment in the UK
(3) Panel
Repeated observations over time for multiple units
e.g. monthly market cap and leverage for every firm in the S&P
Cross-sectional data: One observation per unit
Time series data: One unit over time
Panel data: Multiple units followed over time
Panel data: Notation
Panel data consists of observations of the same N units in T different periods
If the data contains variables x and y, we write them $(x_{it}, y_{it})$
for $i = 1, \dots, N$
i denotes the unit, e.g. Microsoft or Apple
and $t = 1, \dots, T$
t denotes the time period, e.g. September or October
Panel data: Allows Averaging Within Units
Because we see every unit multiple times, we can take unit specific averages:
$\overline{price}_i = \frac{1}{T} \sum_{t=1}^{T} price_{it}$
Because we see many units in the same time period, we can take time specific averages:
$\overline{price}_t = \frac{1}{N} \sum_{i=1}^{N} price_{it}$
The overall average is (of course):
$\overline{price} = \frac{1}{N \times T} \sum_{t=1}^{T} \sum_{i=1}^{N} price_{it}$
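The three averages above can be sketched in a few lines of NumPy. This is only an illustration: the price numbers are made up, and the layout (rows are units i, columns are periods t) is an assumption of the sketch.

```python
import numpy as np

# Hypothetical balanced panel of prices: rows are units i, columns are periods t.
price = np.array([
    [10.0, 11.0, 12.0],   # unit 1
    [20.0, 19.0, 21.0],   # unit 2
])
N, T = price.shape

unit_avg = price.mean(axis=1)        # average over t, one value per unit i
time_avg = price.mean(axis=0)        # average over i, one value per period t
overall_avg = price.sum() / (N * T)  # average over all N x T observations

print(unit_avg, time_avg, overall_avg)
```

With a balanced panel, the overall average is also the average of the unit averages (and of the time averages).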
Panel data: Unit Specific Averages
Panel data: Time Specific Averages
Calculating Unit Specific Averages With Regression
Recall that dummy variables let you calculate these means
Create dummy variables for each i (e.g. company), omitting one
Let’s call them $D_1, D_2, \dots, D_{N-1}$
And consider the following regression:
$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$
Recall that we can then estimate:
Average for the omitted unit: $\hat{\beta}_0$
Average for any other i: $\hat{\beta}_0 + \hat{\delta}_i$
Residualizing to Remove Differences in Means
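$y_{it} = \beta_0 + \sum_{i=1}^{N-1} \delta_i D_i + v_{it}$
After estimating this regression, we can also compute the residuals:
$\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \sum_{i=1}^{N-1} \hat{\delta}_i D_i$
For any given i, this translates to: $\hat{v}_{it} = y_{it} - \hat{\beta}_0 - \hat{\delta}_i$
This is just $y_{it} - \bar{y}_i$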
The price minus the unit specific average
Lets us compare changes over time
Putting aside level differences
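A minimal sketch of the dummy regression above, on simulated data (the sample sizes, seed, and data-generating numbers are assumptions made for illustration). It checks that the fitted coefficients reproduce the unit specific averages and that the residuals equal the outcome minus its unit specific average.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 6
unit = np.repeat(np.arange(N), T)          # unit index i for every observation
y = rng.normal(size=N * T) + 5.0 * unit    # simulated outcome with level differences across units

# Intercept plus dummies for the first N-1 units (the last unit is omitted).
D = (unit[:, None] == np.arange(N - 1)).astype(float)
X = np.column_stack([np.ones(N * T), D])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

unit_means = np.array([y[unit == i].mean() for i in range(N)])
# beta0_hat + delta_i_hat for the first N-1 units, beta0_hat for the omitted unit.
fitted_means = np.append(coef[0] + coef[1:], coef[0])

resid = y - X @ coef                 # residuals from the dummy regression
demeaned = y - unit_means[unit]      # y_it minus the unit specific average

print(np.allclose(fitted_means, unit_means), np.allclose(resid, demeaned))
```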
Residualizing Removes Group Specific Means
Calculating Time Specific Averages With Regression
Can similarly calculate the average for each time period with regression
Create dummy variables for each t (e.g. Dec. 15), omitting one
Let’s call them $D_1, D_2, \dots, D_{T-1}$
And consider the following regression:
$y_{it} = \beta_0 + \sum_{t=1}^{T-1} \tau_t D_t + v_{it}$
Recall that we can then estimate:
Average for the omitted period: $\hat{\beta}_0$
Average for any other t: $\hat{\beta}_0 + \hat{\tau}_t$
Part 2: Advantages of Panel Data for Causal Effects
A simple approach using a panel: event study
Two approaches to dealing with fixed omitted variables
First differences
Fixed effects
A simple panel approach: Before vs. after
Suppose we are interested in the causal effect of a particular event or policy
$y_{it} = \beta_0 + \beta_1 AfterEvent_{it} + v_{it}$
Example: Impact of Brexit on UK firms
Can we simply compare?
$E[y_{it} \mid AfterEvent_{it} = 1] - E[y_{it} \mid AfterEvent_{it} = 0]$
[Figure: Y by month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]
Before vs. after an event used frequently
This tactic underlies an approach called event study
Lots of different techniques/bells and whistles
Chapter 4 of The Econometrics of Financial Markets (Campbell, Lo and MacKinlay) if you want more detail
Entrance into the S&P (Shleifer, 1986; Harris and Gurel, 1986)
Source: Gompers, Greenwood, and Lerner’s Lecture Notes
When is an event study ineffective?
[Figure: Y by month (t), 2016m1 to 2017m1, with E[Y|Before] and E[Y|After] marked]
Panel Data and Omitted Variables
We will come back to this before vs. after strategy in a bit
Let’s reconsider our omitted variables problem:
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$
Suppose we see $x_{it}$ and $y_{it}$ but not $a_i$
Suppose $Corr(x_{it}, e_{it}) = 0$ but $Corr(a_i, x_{it}) \neq 0$
Note that we are assuming $a_i$ doesn’t depend on t
Panel Data and Omitted Variables
An example:
$Leverage_{it} = \beta_0 + \beta_1 Profit_{it} + \gamma a_i + e_{it}$
Some potential (fixed) omitted variables:
Manager skill or risk aversion
Cost of capital
Panel Data and Omitted Variables
Suppose we are unable to observe $a_i$:
$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$, where $v_{it} = \gamma a_i + e_{it}$
If we estimate this regression, will we recover $\beta_1^{OLS} = \beta_1$?
No! Because
$corr(x_{it}, a_i) \neq 0 \Rightarrow corr(x_{it}, v_{it}) \neq 0$
Aside: Regressions of this form are often called “pooled”
Because they “pool” data across individuals and time periods
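The bias in the pooled regression can be illustrated with a small simulation. Everything here is an assumption of the sketch: the parameter values, the way x is made to depend on a, and the helper name ols. With these choices the pooled slope should land near 3.5 rather than the true value of 2.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 500, 5
beta0, beta1, gamma = 1.0, 2.0, 3.0        # assumed "true" parameters

a = rng.normal(size=N)                     # fixed unit characteristic a_i
x = a[:, None] + rng.normal(size=(N, T))   # x_it is correlated with a_i
e = rng.normal(size=(N, T))                # e_it is unrelated to x_it
y = beta0 + beta1 * x + gamma * a[:, None] + e
a_full = np.repeat(a[:, None], T, axis=1)  # a_i copied to every period

def ols(columns, y):
    """OLS coefficients (intercept first) of y on the given regressors."""
    X = np.column_stack([np.ones(y.size)] + [c.ravel() for c in columns])
    return np.linalg.lstsq(X, y.ravel(), rcond=None)[0]

b_pooled = ols([x], y)[1]            # a_i omitted: slope picks up gamma * a_i
b_with_a = ols([x, a_full], y)[1]    # a_i observed and included

print(b_pooled, b_with_a)
```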
Panel Data and Omitted Variables
[Figure: pooled OLS fitted line $\beta_0^{OLS} + \beta_1^{OLS} X$]
Our first Mentis…
Load the data panel example.csv
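What is the coefficient $\hat{\beta}_1^{OLS}$ if we treat $a_i$ as unobserved?
$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$
What is the coefficient $\hat{\beta}_1^{OLS}$ if we observe and include $a_i$ in the regression?
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$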
First Difference Regression
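$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$
Suppose we see exactly two time periods $t \in \{1, 2\}$ for each i
We can write our two time periods as:
$y_{i,1} = \beta_0 + \beta_1 x_{i,1} + \gamma a_i + e_{i,1}$
$y_{i,2} = \beta_0 + \beta_1 x_{i,2} + \gamma a_i + e_{i,2}$
Then take the difference:
$y_{i,2} - y_{i,1} = \beta_1 (x_{i,2} - x_{i,1}) + (e_{i,2} - e_{i,1})$
$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$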
First Difference Regression
Instead of regressing $y_{it}$ on $x_{it}$, regress the change in $y_{it}$ on the change in $x_{it}$
Taking changes (differences) gets rid of fixed omitted variables:
$\Delta y_{i,2-1} = \beta_1 \Delta x_{i,2-1} + \Delta e_{i,2-1}$
As long as $\Delta e_{i,2-1}$ is mean independent of $\Delta x_{i,2-1}$:
$E[\Delta e_{i,2-1} \mid \Delta x_{i,2-1}] = E[\Delta e_{i,2-1}]$
Note that this is not the same as:
$E[e_{it} \mid x_{it}] = E[e_{it}]$
Menti: What is the coefficient $\hat{\beta}_1^{FD}$ from a first difference regression?
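A minimal sketch of the first difference idea with two periods, again on simulated data (parameter values and seed are assumptions, and this is not the course data set). Differencing removes $\gamma a_i$, so the slope of the change in y on the change in x should be close to the assumed true value of 2, while the pooled slope is not.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
beta0, beta1, gamma = 1.0, 2.0, 3.0        # assumed "true" parameters

a = rng.normal(size=N)                     # fixed omitted variable a_i
x = a[:, None] + rng.normal(size=(N, 2))   # two periods, x_it correlated with a_i
e = rng.normal(size=(N, 2))
y = beta0 + beta1 * x + gamma * a[:, None] + e

dx = x[:, 1] - x[:, 0]                     # change in x between the two periods
dy = y[:, 1] - y[:, 0]                     # change in y; gamma * a_i cancels

# The differenced equation has no constant, so regress dy on dx without an intercept.
b_fd = (dx @ dy) / (dx @ dx)
b_pooled = np.polyfit(x.ravel(), y.ravel(), 1)[0]   # pooled slope for comparison

print(b_fd, b_pooled)
```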
Fixed Effects Regression
$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma a_i + e_{it}$
An alternative approach:
Let’s define $\delta_i = \gamma a_i$ and rewrite:
$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
So $y_{it}$ is determined by
(i) The baseline intercept $\beta_0$
(ii) The effect of $x_{it}$
(iii) An individual specific change in the intercept: $\delta_i$
Intuition behind fixed effects: Let’s just estimate $\delta_i$
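What is $\delta_i$?
$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
$\delta_i$ is often referred to as i’s “fixed effect”
$E[y_{it} \mid x_{it} = 0] = \beta_0 + E[\beta_1 \cdot 0] + \delta_i + E[e_{it} \mid x_{it} = 0]$
So (taking $E[e_{it} \mid x_{it} = 0] = 0$) $\delta_i$ is just the change in individual i’s intercept: $\delta_i = E[y_{it} \mid x_{it} = 0] - \beta_0$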
Fixed Effects Regression: Estimating δi
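$y_{1t} = \beta_0 + \beta_1 x_{1t} + \delta_1 + e_{1t}$
$y_{2t} = \beta_0 + \beta_1 x_{2t} + \delta_2 + e_{2t}$
$\vdots$
$y_{Nt} = \beta_0 + \beta_1 x_{Nt} + \delta_N + e_{Nt}$
How do we estimate $\delta_1, \delta_2, \dots, \delta_N$?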
Fixed Effects Regression: Estimating δi
$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_i + e_{it}$
Simplest approach (to me): Dummy variables
Construct N−1 dummy variables $D_1, D_2, \dots, D_{N-1}$
$D_1 = 1$ when i = 1 and 0 otherwise
$D_2 = 1$ when i = 2 and 0 otherwise
$D_3 = 1$ when i = 3 and 0 otherwise
And so on…
$D_{N-1} = 1$ when i = N−1 and 0 otherwise
Fixed Effects Regression: Implementation
$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{i=1}^{N-1} \delta_i D_i + e_{it}$
Note that we’ve left out $D_N$
$\beta_0^{OLS}$ is interpreted as the intercept for individual N:
$\beta_0^{OLS} = E[y_{it} \mid x_{it} = 0, i = N]$
and for all other i (e.g. i = 2):
$\delta_2 = E[y_{it} \mid x_{it} = 0, i = 2] - \beta_0$
Menti: What is the coefficient $\hat{\beta}_1^{FE}$ from a fixed effects regression?
Fixed Effects Regression: Intuition
Any fixed characteristic of i is captured by the average of yit (for i)
By using dummy variables for i, we can just estimate (and hence
account for) those averages.
No longer have to worry about xit being correlated with a fixed component of eit
Why is This? Recall Regression Anatomy
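$\beta_1^{OLS} = \frac{Cov(y_{it}, \tilde{x}_{it})}{Var(\tilde{x}_{it})}$
Where $\tilde{x}_{it}$ is the residual from a regression of $x_{it}$ on the dummies $D_j$:
$x_{it} = \alpha_0 + \sum_{j=1}^{N-1} \alpha_j D_j + \varepsilon_{it}$
$\tilde{x}_{it} = x_{it} - (\alpha_0^{OLS} + \alpha_i^{OLS})$
Subtracting (partialling out) the average $x_{it}$ for each i
$\tilde{x}_{it}$ is no longer correlated with the fixed component of the error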
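The regression anatomy argument can be checked numerically. The sketch below (simulated data, assumed parameter values) estimates the coefficient on x in two ways: with N−1 unit dummies, and by subtracting the unit specific average from x and then taking Cov/Var. The two numbers should agree up to floating point error.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 50, 4
unit = np.repeat(np.arange(N), T)              # unit index i for every observation
a = rng.normal(size=N)                         # fixed omitted variable a_i
x = a[unit] + rng.normal(size=N * T)           # x_it correlated with a_i
y = 1.0 + 2.0 * x + 3.0 * a[unit] + rng.normal(size=N * T)

# (a) Fixed effects via dummies: intercept, x, and N-1 unit dummies.
D = (unit[:, None] == np.arange(N - 1)).astype(float)
X = np.column_stack([np.ones(N * T), x, D])
b_dummies = np.linalg.lstsq(X, y, rcond=None)[0][1]

# (b) Regression anatomy: residualize x on the dummies, i.e. subtract unit means.
x_bar = np.bincount(unit, weights=x) / T       # average x for each unit (balanced panel)
x_tilde = x - x_bar[unit]
b_anatomy = np.cov(y, x_tilde)[0, 1] / np.var(x_tilde, ddof=1)

print(b_dummies, b_anatomy)
```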
Fixed Effects Regression: Assumptions
There is one important difference in the assumptions necessary for OLS to capture the causal effect:
Before, we needed:
$E[e_{it} \mid x_{it}] = E[e_{it}]$
Now, we need:
$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$
When Will Fixed Effects Not Be Enough?
$E[e_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = E[e_{it}]$
But what if $e_{it}$ is growing over time?
E.g. interest rates rising each quarter, influencing profits and leverage
Time Fixed Effects
So far we have focused on controlling for entity (i) fixed effects
What if $x_{it}$ is correlated with something that changes over time but is fixed across individual units?
$Leverage_{it} = \beta_0 + \beta_1 Profits_{it} + \tau_t + v_{it}$
For example, many time-varying macro variables (e.g. monetary policy) might affect profits and leverage
If these are constant for all firms then they will be captured by $\tau_t$
Time Fixed Effects
$y_{it} = \beta_0 + \beta_1 x_{it} + \tau_t + e_{it}$
Exact same approach as with entity fixed effects
Construct T−1 dummy variables $D_1, D_2, \dots, D_{T-1}$
$D_1 = 1$ when t = 1 and 0 otherwise
$D_2 = 1$ when t = 2 and 0 otherwise
And so on…
And then, omitting one time period, we can estimate
$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{t=1}^{T-1} \tau_t D_t + e_{it}$
What is $\beta_0$? $\tau_t$?
Time Fixed Effects
Time fixed effects do not deal with fixed individual characteristics
What about combining both approaches?
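One way to combine both approaches is to include unit and time dummies together. The sketch below is an illustration under assumptions, not necessarily the route taken in the lecture: it simulates a balanced panel in which x is correlated with both a fixed unit characteristic and a common time shock, and uses the fact that, in a balanced panel, subtracting unit means and time means (and adding back the overall mean) gives the same slope as including both sets of dummies. The function name two_way_demean is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 200, 8
a = rng.normal(size=(N, 1))              # fixed unit characteristic
tau = rng.normal(size=(1, T))            # shock common to all units in period t
x = a + tau + rng.normal(size=(N, T))    # x_it correlated with both
y = 1.0 + 2.0 * x + 3.0 * a + 3.0 * tau + rng.normal(size=(N, T))

def two_way_demean(z):
    """Remove unit means and time means from a balanced N x T panel."""
    return z - z.mean(axis=1, keepdims=True) - z.mean(axis=0, keepdims=True) + z.mean()

x_dd, y_dd = two_way_demean(x), two_way_demean(y)
b_two_way = (x_dd * y_dd).sum() / (x_dd ** 2).sum()   # slope after removing both effects
b_pooled = np.polyfit(x.ravel(), y.ravel(), 1)[0]     # pooled slope for comparison

print(b_two_way, b_pooled)
```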
Part 3: Difference-in-Difference
An example: Bankruptcy Costs and Leverage
The difference-in-difference framework
Key assumption: Parallel Trends
Example: Bankruptcy Costs and Leverage
What is the effect of a decline in bankruptcy costs on leverage?
Theory: Lower expected bankruptcy costs should increase leverage
Ideal (impossible to conduct) experiment:
Randomly select a subset of firms
Reduce bankruptcy costs for these firms (e.g. streamline bankruptcy procedures)
Compare leverage between this subset and the remaining firms
Example: Bankruptcy Costs and Leverage
At the end of 1991 the state of Delaware passed a new law (“the reform”)
Significantly streamlined bankruptcy proceedings
Reduced costs and time of litigation
Can we use this to learn something about our question?
Suppose we call the causal effect of the reform $\beta_1$
How do we recover this parameter?
Approach 1: Before vs. After
Compare the average leverage of Delaware firms in 1991 vs. 1992
Let $After_t$ be a dummy equal to 1 after the reform
We would like to describe the relationship between the reform and leverage as:
$Leverage_{it} = \beta_0 + \beta_1 After_t + v_{it}$
Where $v_{it}$ contains all other time and firm specific factors that influence leverage
Approach 1: Before vs. After
Suppose we regress $Leverage_{it}$ on our $After_t$ dummy: What is $\beta_1^{OLS}$?
$\beta_1^{OLS} = E[Leverage_{it} \mid After_t = 1] - E[Leverage_{it} \mid After_t = 0]$
$= \beta_1 + E[v_{it} \mid After_t = 1] - E[v_{it} \mid After_t = 0]$
So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$E[v_{it} \mid After_t] = E[v_{it}]$
Why might that fail?
Before vs. After
[Figure: Y by month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]
When is Before vs. After Ineffective?
[Figure: Y by month (t), 1991m7 to 1992m7, with E[Y|After=0] and E[Y|After=1] marked]
Approach 1: Before vs. After
$\beta_1^{OLS}$ is just the difference in leverage for 1992 Delaware firms (“Treatment”) relative to 1991 Delaware firms (“Control”)
We require $E[v_{it} \mid After_t = 1] = E[v_{it} \mid After_t = 0]$ for this to identify the causal effect of the reform
Any time trend/other events in 1992 will cause $v_{it}$ for later observations to be different from $v_{it}$ for earlier observations
e.g. tight credit in 1992 may have reduced debt (and hence leverage)
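A small simulation of the tight credit example (all numbers are made up for illustration). The assumed causal effect of the reform is 0.5, but a separate 1992 shock of −0.3 hits every firm, so the before vs. after comparison recovers roughly 0.2 instead.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000                  # hypothetical number of Delaware firms
true_effect = 0.5         # assumed causal effect of the reform on leverage
credit_shock = -0.3       # assumed 1992 shock unrelated to the reform

lev_1991 = 1.0 + rng.normal(scale=0.5, size=n)
lev_1992 = 1.0 + true_effect + credit_shock + rng.normal(scale=0.5, size=n)

# Before vs. after: difference in average leverage, 1992 minus 1991.
before_after = lev_1992.mean() - lev_1991.mean()

print(before_after)   # mixes the reform effect with the 1992 shock
```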
Approach 2: Cross Sectional
Compare Delaware Firms (“Treatment”) vs. Non-Delaware firms (Control) in 1992
Don’t need to worry about time trends
Requires data from firms in surrounding states
Let $D_i$ be a dummy equal to 1 if firm i is registered in Delaware
We would like to describe the relationship between the reform and leverage as:
$Leverage_i = \beta_0 + \beta_1 D_i + v_i$
Where vi contains all other time and firm specific factors that influence leverage
Approach 2: Cross Sectional
Suppose we regress $Leverage_i$ on our $D_i$ dummy:
$\beta_1^{OLS} = E[Leverage_i \mid D_i = 1] - E[Leverage_i \mid D_i = 0]$
$= \beta_1 + E[v_i \mid D_i = 1] - E[v_i \mid D_i = 0]$
So $\beta_1^{OLS} = \beta_1$ (the causal effect of treatment) if
$E[v_i \mid D_i] = E[v_i]$
Do we expect everything else that impacts leverage to be the same in Delaware and other states?
When is Cross Sectional Approach Ineffective?
Do we expect everything else that impacts leverage to be the same in Delaware and other states?
What if firms in Delaware are more capital-intensive?
Typically capital intensity ⇒ more leverage
This is just an omitted variable:
$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$
So if we omit $CI_i$ and estimate
$Leverage_i = \beta_0 + \beta_1 D_i + v_i$
Will $\beta_1^{OLS}$ be larger or smaller than $\beta_1$?
When is Cross Sectional Approach Ineffective?
Of course, we could measure and control for capital intensity:
$Leverage_i = \beta_0 + \beta_1 D_i + \beta_2 CI_i + e_i$
Then our assumption for $\beta_1^{OLS} = \beta_1$ becomes:
$E[e_i \mid D_i, CI_i] = E[e_i \mid CI_i]$
Beyond capital intensity, do we expect everything else that impacts leverage to be the same in Delaware and other states?
Hard to control for everything
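The capital intensity example can be simulated as a sketch. The assumptions are all mine: Delaware firms have capital intensity that is higher by 1 on average, $\beta_1 = 0.5$ and $\beta_2 = 0.8$. Under these assumptions the cross-sectional comparison that omits $CI_i$ overstates the effect (about 0.5 + 0.8 × 1 = 1.3), while controlling for $CI_i$ brings the estimate back near 0.5.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
D = rng.integers(0, 2, size=n).astype(float)   # 1 if the firm is in Delaware
CI = 1.0 * D + rng.normal(size=n)              # assumed: Delaware firms are more capital-intensive
lev = 1.0 + 0.5 * D + 0.8 * CI + rng.normal(scale=0.5, size=n)

# Omitting CI: with a single dummy regressor, OLS is the difference in group means.
b_short = lev[D == 1].mean() - lev[D == 0].mean()

# Controlling for CI.
X = np.column_stack([np.ones(n), D, CI])
b_long = np.linalg.lstsq(X, lev, rcond=None)[0][1]

print(b_short, b_long)
```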
Difference-in-Difference Approach
Let’s combine the positive features of the cross-sectional and before/after approaches
Cross sectional avoided omitted trends
Before/after avoided omitted (fixed) characteristics
The difference-in-difference estimator does exactly this:
$Leverage_{it} = \beta_0 + \beta_1 D_i \times After_t + \beta_2 D_i + \beta_3 After_t + v_{it}$
Here $\beta_1$ is the causal effect of the reform in Delaware
Requires data on firms in/out of Delaware before/after the reform
What Does Data Look Like for Difference-in-Difference
State      Year   Leverage_it (D/E)   D_i   After_t   D_i × After_t
Delaware   1991   1.2                 1     0         0
Maryland   1991   3.1                 0     0         0
Virginia   1991   1.9                 0     0         0
Delaware   1991   0.9                 1     0         0
Maryland   1991   1.5                 0     0         0
Virginia   1991   1.1                 0     0         0
Delaware   1991   1.2                 1     0         0
...        1991   1.6                 0     0         0
...        1991   0.5                 0     0         0
...        ...    ...                 ...   ...       ...
Maryland   1992   0.8                 0     1         0
Delaware   1992   0.9                 1     1         1
Virginia   1992   1.6                 0     1         0
Delaware   1992   2.2                 1     1         1
Maryland   1992   1.4                 0     1         0
Delaware   1992   1.9                 1     1         1
What Do the Difference-in-Difference Estimates Capture?
Recall that when right-hand-side variables take discrete values, OLS perfectly captures the conditional expectation function:
$E[Leverage_{it} \mid D_i, After_t] = \beta_0^{OLS} + \beta_1^{OLS} D_i \times After_t + \beta_2^{OLS} D_i + \beta_3^{OLS} After_t$
There are four groups:
1. Non-Delaware Before: $\{D_i = 0, After_t = 0\}$
2. Delaware Before: $\{D_i = 1, After_t = 0\}$
3. Non-Delaware After: $\{D_i = 0, After_t = 1\}$
4. Delaware After: $\{D_i = 1, After_t = 1\}$
What Do the Difference-in-Difference estimates Capture?
Let’s calculate conditional expectations for these four groups:
1. $E[Leverage_{it} \mid D_i = 0, After_t = 0] = \beta_0^{OLS}$
2. $E[Leverage_{it} \mid D_i = 1, After_t = 0] = \beta_0^{OLS} + \beta_2^{OLS}$
3. $E[Leverage_{it} \mid D_i = 0, After_t = 1] = \beta_0^{OLS} + \beta_3^{OLS}$
4. $E[Leverage_{it} \mid D_i = 1, After_t = 1] = \beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$
Diff-in-Diff Solves Issues with Cross-Sectional Approach
Cross Sectional: Compare averages in Delaware vs. outside, after the reform
Cross-sectional Difference After:
$E[Leverage_{it} \mid D_i = 1, After_t = 1] - E[Leverage_{it} \mid D_i = 0, After_t = 1]$
$= (\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}) - (\beta_0^{OLS} + \beta_3^{OLS})$
$= \beta_1^{OLS} + \beta_2^{OLS}$
We worried about the possibility of some omitted difference between Delaware and other states ($\beta_2^{OLS} \neq 0$)
Solution: Use the pre-reform difference to account for any fixed differences
Cross-sectional Difference Before:
$E[Leverage_{it} \mid D_i = 1, After_t = 0] - E[Leverage_{it} \mid D_i = 0, After_t = 0]$
$= (\beta_0^{OLS} + \beta_2^{OLS}) - \beta_0^{OLS} = \beta_2^{OLS}$
Diff-in-Diff Solves Issues with Cross Sectional Approach
Difference in Difference = Difference After − Difference Before
$= (\beta_1^{OLS} + \beta_2^{OLS}) - \beta_2^{OLS} = \beta_1^{OLS}$
Diff-in-Diff Solves Issues with Before vs. After
Before vs After: Compare averages before vs. after within Delaware
Difference In Delaware:
$E[Leverage_{it} \mid D_i = 1, After_t = 1] - E[Leverage_{it} \mid D_i = 1, After_t = 0]$
$= (\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}) - (\beta_0^{OLS} + \beta_2^{OLS})$
$= \beta_1^{OLS} + \beta_3^{OLS}$
We worried about the possibility of some time trend
Solution: Use other states to account for time trends
Difference Out of Delaware:
$E[Leverage_{it} \mid D_i = 0, After_t = 1] - E[Leverage_{it} \mid D_i = 0, After_t = 0]$
$= (\beta_0^{OLS} + \beta_3^{OLS}) - \beta_0^{OLS} = \beta_3^{OLS}$
Diff-in-Diff Solves Issues with Before vs. After
Difference in Difference = Difference In Delaware − Difference Out of Delaware
$= (\beta_1^{OLS} + \beta_3^{OLS}) - \beta_3^{OLS} = \beta_1^{OLS}$
Difference in Difference Matrix
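Two ways to interpret the same estimator $\beta_1^{OLS}$:
                         Before                            After                                                             Difference
Delaware (Treatment)     $\beta_0^{OLS} + \beta_2^{OLS}$   $\beta_0^{OLS} + \beta_1^{OLS} + \beta_2^{OLS} + \beta_3^{OLS}$   $= \beta_1^{OLS} + \beta_3^{OLS}$
Other States (Control)   $\beta_0^{OLS}$                   $\beta_0^{OLS} + \beta_3^{OLS}$                                   $= \beta_3^{OLS}$
Difference               $= \beta_2^{OLS}$                 $= \beta_1^{OLS} + \beta_2^{OLS}$                                 $= \beta_1^{OLS}$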
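The matrix can be verified numerically. In the sketch below (simulated observations with assumed coefficients, each row treated as one firm-period), the difference-in-difference of the four group means equals the OLS coefficient on $D_i \times After_t$ exactly, because the regression has one parameter per group.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4000
D = rng.integers(0, 2, size=n)        # 1 for Delaware firms
After = rng.integers(0, 2, size=n)    # 1 for observations after the reform
# Assumed coefficients: beta0 = 1.0, beta1 = 0.5, beta2 = 0.4, beta3 = -0.3.
lev = 1.0 + 0.5 * D * After + 0.4 * D - 0.3 * After + rng.normal(scale=0.5, size=n)

def cell(d, a):
    """Average leverage in the group {D_i = d, After_t = a}."""
    return lev[(D == d) & (After == a)].mean()

# Difference After minus Difference Before.
did_means = (cell(1, 1) - cell(0, 1)) - (cell(1, 0) - cell(0, 0))

# Regression with columns: constant, D x After, D, After.
X = np.column_stack([np.ones(n), D * After, D, After])
b = np.linalg.lstsq(X, lev, rcond=None)[0]

print(did_means, b[1])
```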
Diff-in-Diff Graphically
[Figure: Treatment (Delaware) vs. Control (Non-Delaware)]
When Does Diff-in-Diff Identify A Causal Effect?
As usual, we need
$E[v_{it} \mid D_i, After_t] = E[v_{it}]$
What does this mean intuitively?
Parallel trends assumption: In the absence of any reform the average change in leverage would have been the same in the treatment and control groups
In other words: trends in both groups are similar
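To see why parallel trends matters, here is a hedged simulation sketch with made-up numbers. The assumed causal effect is 0.5. In the first case both groups share the same underlying change over time; in the second, the treated group would have grown by an extra 0.3 even without the reform, and the diff-in-diff estimate absorbs that extra trend. The function name did and its argument are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 4000
true_effect = 0.5     # assumed causal effect of the reform

def did(extra_trend_treated):
    """Diff-in-diff of group means when the treated group has an extra trend."""
    D = rng.integers(0, 2, size=n)
    After = rng.integers(0, 2, size=n)
    lev = (1.0 + 0.4 * D - 0.3 * After
           + (true_effect + extra_trend_treated) * D * After
           + rng.normal(scale=0.5, size=n))

    def mean(d, a):
        return lev[(D == d) & (After == a)].mean()

    return (mean(1, 1) - mean(1, 0)) - (mean(0, 1) - mean(0, 0))

did_parallel = did(0.0)   # parallel trends hold: estimate near 0.5
did_violated = did(0.3)   # parallel trends fail: estimate near 0.8

print(did_parallel, did_violated)
```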
Parallel Trends
[Figure: Treatment (Delaware) vs. Control (Non-Delaware)]