ECONOMETRICS I ECON GR5411
Lecture 5 – Linear Regression Model I by
Seyhan Erden
Columbia University
MA in Economics
Least Squares Regression:
OLS objective function: minimize sum of squared residuals
$$\min_b S(b) = (y - Xb)'(y - Xb)$$
$$S(b) = y'y - y'Xb - b'X'y + b'X'Xb = y'y - 2y'Xb + b'X'Xb$$
The first order conditions:
$$\frac{\partial S(b)}{\partial b} = -2X'y + 2X'Xb$$
Setting $\partial S(b)/\partial b = 0$ and solving for $b$ gives us the estimator that minimizes $S(b)$ (we must check the second order conditions to make sure we are minimizing and not maximizing).
Least Squares Regression:
$$\frac{\partial S(b)}{\partial b} = -2X'y + 2X'Xb = 0$$
$$2X'Xb = 2X'y \quad\Longrightarrow\quad X'Xb = X'y$$
These are known as the normal equations. Hence
$$b = (X'X)^{-1}X'y$$
as long as $X'X$ is non-singular:
Ø $X'X$ has full rank
Ø the inverse of $X'X$ exists
Ø the columns of $X'X$ are linearly independent
The solution that satisfies the FOC:
$$\hat\beta = (X'X)^{-1}X'y$$
Least Squares Regression:
Verifying SOC
$$\frac{\partial^2 S(b)}{\partial b\,\partial b'} = 2X'X$$
$X'X$ must be a positive definite matrix.
Least Squares Regression:
So the normal equations are solved uniquely for $b$ by pre-multiplying both sides by $(X'X)^{-1}$:
$$\hat\beta = (X'X)^{-1}X'y$$
Viewed as a function of the sample $(y, X)$, $b$ is called the (ordinary) least squares estimator. For a given sample $(y, X)$, the value of this function is the OLS estimate. The two terms are used almost interchangeably.
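A minimal numerical sketch (NumPy, with a simulated sample and made-up coefficients, not from the lecture) of solving the normal equations directly and checking the SOC:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sample: n observations, k regressors (first column is a constant).
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# Solve the normal equations X'X b = X'y (numerically preferred to forming the inverse).
b = np.linalg.solve(X.T @ X, X.T @ y)

# SOC check: 2X'X is symmetric, so it is positive definite exactly when the
# Cholesky factorization succeeds (it raises LinAlgError otherwise).
np.linalg.cholesky(2 * X.T @ X)

print(b)  # close to beta_true
```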
Vector and Matrix Notation Match:
Simple regression model:
Vector version: $y_i = \boldsymbol{x}_i'\beta + \varepsilon_i$ $\qquad$ Matrix version: $y = X\beta + \varepsilon$
Matching:
$$\sum_{i=1}^{n} \boldsymbol{x}_i \boldsymbol{x}_i' = X'X, \qquad \sum_{i=1}^{n} \boldsymbol{x}_i y_i = X'y$$
Least Squares Regression:
Since
$$(X'X)^{-1}X'y = \left(\frac{X'X}{n}\right)^{-1}\frac{X'y}{n}$$
the OLS estimator can be written as
$$\hat\beta = S_{xx}^{-1} s_{xy}$$
where the $k\times k$ matrix $S_{xx}$ is the sample average of $x_i x_i'$,
$$S_{xx} = \frac{1}{n}X'X = \frac{1}{n}\sum_{i=1}^{n} x_i x_i'$$
and the $k\times 1$ vector $s_{xy}$ is the sample average of $x_i y_i$,
$$s_{xy} = \frac{1}{n}X'y = \frac{1}{n}\sum_{i=1}^{n} x_i y_i$$
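A short sketch (again with hypothetical simulated data) verifying that the sample-average form gives the identical estimate, since the $1/n$ factors cancel:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
X = rng.normal(size=(n, k))
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# Direct formula.
b_direct = np.linalg.solve(X.T @ X, X.T @ y)

# Sample-average form: S_xx^{-1} s_xy.
S_xx = (X.T @ X) / n   # k x k sample average of x_i x_i'
s_xy = (X.T @ y) / n   # k x 1 sample average of x_i y_i
b_avg = np.linalg.solve(S_xx, s_xy)

assert np.allclose(b_direct, b_avg)  # the 1/n factors cancel
```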
Assumptions
Ø A1 − Linearity: $\partial y/\partial x_k$ does not depend on $x_k$.
Ø A2 − $X$ has full rank: $X$ has full column rank; the regressors are linearly independent.
Ø A3 − Exogeneity of regressors: $E[\varepsilon|X] = 0$.
Ø A4 − Spherical errors: homoskedasticity and no serial correlation.
Ø A5 − $x_i$ can be fixed or random.
Ø A6 − Normal distribution: the disturbances $\varepsilon_i$ are normally distributed, $\varepsilon_i \sim N(0, \sigma^2)$, so that $\varepsilon|X \sim N(0, \sigma^2 I)$.
A1: Is linearity restrictive?
Ø $y = AX^{\beta}e^{\varepsilon}$ implies
$$\ln y = \ln A + \beta \ln X + \varepsilon$$
This is known as the constant elasticity form. The elasticity of $y$ with respect to changes in $x_k$ is
$$\frac{\partial \ln y}{\partial \ln x_k} = \beta_k$$
where $x_k$ is the $k$th column of the $X$ matrix.
The linearity assumption (A1) can be written compactly as
$$\underset{(n\times 1)}{y} = \underset{(n\times k)}{X}\;\underset{(k\times 1)}{\beta} + \underset{(n\times 1)}{\varepsilon}$$
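A quick sketch of the constant elasticity form (the data-generating process and parameter values below are invented for illustration): simulate $y = Ax^{\beta}e^{\varepsilon}$, take logs, and recover the elasticity $\beta$ by OLS.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Hypothetical constant-elasticity DGP: y = A * x^beta * e^eps.
A, beta = 2.0, 0.7
x = rng.uniform(1.0, 10.0, size=n)
eps = rng.normal(scale=0.1, size=n)
y = A * x**beta * np.exp(eps)

# Taking logs gives the linear model: ln y = ln A + beta * ln x + eps.
Z = np.column_stack([np.ones(n), np.log(x)])
coef = np.linalg.solve(Z.T @ Z, Z.T @ np.log(y))
print(coef)  # coef[0] ~ ln A, coef[1] ~ beta (the elasticity)
```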
Is linearity restrictive?
Ø Semi-log model for growth rates:
$$\log y_t = x_t'\beta + \delta t + \varepsilon_t$$
In this model the autonomous growth rate (the growth rate over time that is not explained by the model) is
$$\frac{\partial \log y_t}{\partial t} = \delta$$
Is linearity restrictive?
Ø Other variations of the general form
$$f(y_t) = g(x_t'\beta + \varepsilon_t)$$
also fit the definition of a linear model.
A2: X has Full Column Rank
(Identification condition)
(No Perfect Multicollinearity)
Assumption:
$X$ is an $n\times k$ matrix with rank $k$ ($X$ has full column rank): the columns of $X$ are linearly independent. This assumption is known as the identification condition: none of the $k$ columns of the data matrix $X$ can be expressed as a linear combination of the other columns of $X$.
Identification condition:
Example: $y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + X_4\beta_4 + \epsilon$
Ø Identification problem when $X_4 = X_2 + X_3$.
Ø To see this:
$$y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + (X_2 + X_3)\beta_4 + \epsilon = X_1\beta_1 + X_2(\beta_2 + \beta_4) + X_3(\beta_3 + \beta_4) + \epsilon$$
We can only identify $\beta_2 + \beta_4$ and $\beta_3 + \beta_4$; we cannot identify each parameter separately.
Ø If $X$ does not have full rank, $X'X$ is not invertible.
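A small sketch of this failure (simulated columns, chosen only for illustration): with $X_4 = X_2 + X_3$, the rank of $X$ falls short of $k$ and $X'X$ is singular.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
X1 = np.ones(n)
X2 = rng.normal(size=n)
X3 = rng.normal(size=n)
X4 = X2 + X3                       # exact linear dependence

X = np.column_stack([X1, X2, X3, X4])
print(np.linalg.matrix_rank(X))    # 3, not 4: X lacks full column rank

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))  # also 3: X'X is singular
# np.linalg.inv(XtX) would raise LinAlgError ("Singular matrix"),
# so no unique OLS solution exists.
```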
A3: Exogeneity.
Conditional mean restriction:
$$E[\varepsilon|X] = \begin{pmatrix} E[\varepsilon_1|X] \\ E[\varepsilon_2|X] \\ \vdots \\ E[\varepsilon_n|X] \end{pmatrix} = \begin{pmatrix} E[\varepsilon_1|x_1, x_2, \dots, x_k] \\ E[\varepsilon_2|x_1, x_2, \dots, x_k] \\ \vdots \\ E[\varepsilon_n|x_1, x_2, \dots, x_k] \end{pmatrix} = 0$$
Implications:
Ø The unconditional mean of $\varepsilon$ is zero: $E[\varepsilon_i] = E\big[E[\varepsilon_i|X]\big] = E[0] = 0$
Ø $E[y|X] = X\beta$
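A small simulation (entirely hypothetical data-generating processes, not from the lecture) illustrating why A3 matters: when the regressor is correlated with the error, $E[\varepsilon|X] = 0$ fails and OLS no longer centers on the true $\beta$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, beta = 10_000, 1.0

# Exogenous case: x independent of eps, so E[eps | x] = 0.
x = rng.normal(size=n)
eps = rng.normal(size=n)
y = beta * x + eps
b_exog = (x @ y) / (x @ x)   # univariate OLS without intercept

# Endogenous case: x2 and eps2 share the common component u, violating A3.
u = rng.normal(size=n)
eps2 = 0.8 * u + rng.normal(size=n)
x2 = u
y2 = beta * x2 + eps2
b_endog = (x2 @ y2) / (x2 @ x2)

print(b_exog)   # close to 1.0
print(b_endog)  # biased: close to 1.8 = beta + cov(x2, eps2)/var(x2)
```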
Proof of Exogeneity.
The proof is a good illustration of the use of properties of conditional expectations.
Proof: Since $x_{jk}$ is an element of $X$, strict exogeneity implies
$$E[\varepsilon_i|x_{jk}] = E\big[E[\varepsilon_i|X]\,\big|\,x_{jk}\big] = 0$$
by the Law of Iterated Expectations from probability theory.