Nonlinear Econometrics for Finance Lecture 3
Recap: testing asset pricing models
Prices are discounted expectations of future cash flows:
\[ p_t = E_t[m_{t+1}(p_{t+1} + d_{t+1})]. \]
Dividing both sides by $p_t$, we can now rewrite the pricing equation in terms of returns:
\[ 1 = E_t\!\left[m_{t+1}\frac{p_{t+1} + d_{t+1}}{p_t}\right] \;\Rightarrow\; 1 = E_t[m_{t+1}(1 + R_{t+1})]. \]
We now have our pricing equation in terms of returns:
\[ E_t[m_{t+1}(1 + R_{t+1})] = 1. \]
Equivalently, by taking 1 to the left-hand side, we can write a "conditional expected pricing error":
\[ E_t[m_{t+1}(1 + R_{t+1}) - 1] = 0. \]
Taking unconditional expectations of both sides, by the law of iterated expectations, we now have an "unconditional expected pricing error", or a moment condition:
\[ E[m_{t+1}(1 + R_{t+1}) - 1] = 0. \]
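Spelled out, the law of iterated expectations step is:
\[ E\big[m_{t+1}(1+R_{t+1}) - 1\big] = E\Big[E_t\big[m_{t+1}(1+R_{t+1}) - 1\big]\Big] = E[0] = 0. \]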
Recap: testing asset pricing models
Consider, now, $N$ assets.
We can stack the moment conditions for all $N$ assets one on top of the other to obtain
\[ E\left[\, m_{t+1}\underbrace{\begin{pmatrix} 1+R^1_{t+1} \\ \vdots \\ 1+R^N_{t+1} \end{pmatrix}}_{N\text{-vector}} - \underbrace{\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}}_{N\text{-vector}} \right] = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]
Notice that $m_{t+1}$ depends on parameters. Different asset pricing models will, therefore, lead to a different stochastic discount factor $m_{t+1}$ and to different moment conditions. In the Consumption CAPM with CRRA utility:
\[ m_{t+1} = \beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}. \]
In this case, the moment conditions become
\[ E\left[\, \beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}\begin{pmatrix} 1+R^1_{t+1} \\ \vdots \\ 1+R^N_{t+1} \end{pmatrix} - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \right] = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]
There are two parameters to estimate: the subjective discount factor $\beta$ and the coefficient of relative risk aversion $\gamma$. We could write $m_{t+1}(\theta)$ with $\theta = (\beta, \gamma)$.
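As a concrete illustration, here is a minimal Matlab sketch of these stacked pricing errors for the CCAPM. The variable names and shapes (cg for consumption growth, R for net returns) are assumptions made for the example, not notation from the lecture:

```matlab
% Stacked CCAPM pricing errors g_t(theta) = m_{t+1}(theta)*(1+R_{t+1}) - 1.
% Assumed inputs: cg is a (T-1)x1 vector of consumption growth c_{t+1}/c_t;
% R is a (T-1)xN matrix of net returns on the N assets.
theta = [0.98; 3];                 % trial values for [beta; gamma]
beta = theta(1); gamma = theta(2);
m  = beta * cg.^(-gamma);          % SDF m_{t+1}(theta), (T-1)x1
g  = m .* (1 + R) - 1;             % pricing errors, (T-1)xN (implicit expansion, R2016b+)
gT = mean(g, 1)';                  % Nx1 vector of sample moment conditions
```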
Recap: testing asset pricing models
The moment conditions depend on an expectation. We do not know the expectation. We can, however, compute empirical means. Sample means converge to expectations by the law of large numbers.
Estimation: GMM estimates $\theta$ by setting the difference between the sample mean of $m_{t+1}(\theta)(1+R_{t+1})$ and 1 as close as possible to 0:
\[ \underbrace{\frac{1}{T}\sum_{t=1}^{T-1}\left[\, m_{t+1}(\theta)\begin{pmatrix} 1+R^1_{t+1} \\ \vdots \\ 1+R^N_{t+1} \end{pmatrix} - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \right]}_{N\text{-vector of sample pricing errors}} = \underbrace{g_T(\theta)}_{\text{more compact notation}} \approx \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]
Testing: Given an estimate for $\theta$, denoted by $\theta_T$, GMM evaluates the size of the pricing errors (Hansen, 1982): how close to 0 is the difference between the sample mean of $m_{t+1}(\theta_T)(1 + R_{t+1})$ and 1? The larger the pricing errors, the worse the pricing model.
GMM: The criterion
Estimation of $\theta$:
\[ \theta_T = \arg\min_\theta \; \underbrace{g_T(\theta)^\top}_{1\times N}\,\underbrace{W_T}_{N\times N}\,\underbrace{g_T(\theta)}_{N\times 1} = \arg\min_\theta Q_T(\theta). \]
Thus, we choose $\theta_T$ so that $\frac{\partial Q_T(\theta_T)}{\partial\theta} \approx 0$.
Assume the dimension of the vector $\theta$ is $d$, with $N \ge d$. (The number of parameters is not larger than the number of assets.)
Typically, we cannot estimate $\theta$ to make the pricing errors exactly zero. However, we want to make the pricing errors as small as possible.
In order to do so, we minimize a quadratic criterion: $\arg\min_\theta \underbrace{Q_T(\theta)}_{1\times 1}$.
Note: the weight matrix $W_T$ tells you how much emphasis you are putting on specific moments (i.e., on specific assets).
If $W_T = I_N$, i.e., the identity matrix, then you are effectively treating all assets in the same way. In this case, the criterion minimizes the sum of the squared pricing errors.
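Building on the pricing-error snippet above, a minimal Matlab sketch packages the criterion $Q_T(\theta)$ as a function (save it as gmm_criterion.m); the names and shapes are, again, assumptions for the illustration:

```matlab
function Q = gmm_criterion(theta, cg, R, W)
% GMM criterion Q_T(theta) = g_T(theta)' * W * g_T(theta).
% cg : (T-1)x1 consumption growth; R : (T-1)xN net returns; W : NxN weights.
beta = theta(1); gamma = theta(2);
m  = beta * cg.^(-gamma);        % SDF m_{t+1}(theta)
g  = m .* (1 + R) - 1;           % (T-1)xN pricing errors
gT = mean(g, 1)';                % Nx1 sample moment conditions
Q  = gT' * W * gT;               % scalar quadratic criterion
end
```

With $W_T = I_N$ this is exactly the sum of squared pricing errors, and a derivative-free optimizer delivers the estimate, e.g. thetaT = fminsearch(@(th) gmm_criterion(th, cg, R, eye(size(R,2))), [0.9; 2]).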
GMM: The criterion
Example with 2 assets
The model:
\[ E\begin{pmatrix} m_{t+1}(\theta)(1+R^1_{t+1}) - 1 \\ m_{t+1}(\theta)(1+R^2_{t+1}) - 1 \end{pmatrix} = E\begin{pmatrix} g^1(X_{t+1},\theta) \\ g^2(X_{t+1},\theta) \end{pmatrix} = E\big(g(X_{t+1},\theta)\big) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]
where 2 is the number of assets (1 moment condition per asset).
Empirically (after replacing "expectations" with "sample means"):
\[ \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1}\big(m_{t+1}(\theta)(1+R^1_{t+1}) - 1\big) \\ \frac{1}{T}\sum_{t=1}^{T-1}\big(m_{t+1}(\theta)(1+R^2_{t+1}) - 1\big) \end{pmatrix} = \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta) \\ \frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta) \end{pmatrix} = \underbrace{\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta)}_{2\times 1} = g_T(\theta) \approx 0. \]
Estimation criterion:
\[ \theta_T = \arg\min_\theta \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta) & \frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta) \end{pmatrix} W_T \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta) \\ \frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta) \end{pmatrix} = \arg\min_\theta \underbrace{g_T(\theta)^\top}_{1\times 2}\,\underbrace{W_T}_{2\times 2}\,\underbrace{g_T(\theta)}_{2\times 1} = \arg\min_\theta \underbrace{Q_T(\theta)}_{1\times 1}. \]
GMM: The criterion
Example with 2 assets
Recall the estimation criterion:
\[ \theta_T = \arg\min_\theta \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta) & \frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta) \end{pmatrix} W_T \begin{pmatrix} \frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta) \\ \frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta) \end{pmatrix}. \]
If $W_T = I_2$, then we minimize the sum of the squared pricing errors:
\[ \theta_T = \arg\min_\theta \left[ \left(\frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta)\right)^{\!2} + \left(\frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta)\right)^{\!2} \right]. \]
If $W_T$ is a generic symmetric matrix, say $W_T = \begin{pmatrix} w_1 & w_3 \\ w_3 & w_2 \end{pmatrix}$, then we minimize a "weighted" sum of the squared pricing errors:
\[ \theta_T = \arg\min_\theta \left[ w_1\left(\frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta)\right)^{\!2} + w_2\left(\frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta)\right)^{\!2} + 2 w_3\left(\frac{1}{T}\sum_{t=1}^{T-1} g^1(X_{t+1},\theta)\right)\left(\frac{1}{T}\sum_{t=1}^{T-1} g^2(X_{t+1},\theta)\right) \right]. \]
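A quick numerical sanity check of this expansion in Matlab (the numbers are made up for the illustration):

```matlab
% Verify that g'*W*g equals w1*g1^2 + w2*g2^2 + 2*w3*g1*g2 for symmetric W.
g   = [0.01; -0.02];               % hypothetical 2x1 vector of sample pricing errors
W   = [2.0, 0.5; 0.5, 1.0];        % w1 = 2, w2 = 1, w3 = 0.5
lhs = g' * W * g;                  % quadratic form
rhs = W(1,1)*g(1)^2 + W(2,2)*g(2)^2 + 2*W(1,2)*g(1)*g(2);
disp([lhs, rhs])                   % the two numbers coincide
```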
GMM: Some important ingredients
Recall the criterion function:
\[ \underbrace{Q_T(\theta)}_{1\times 1} = \underbrace{g_T(\theta)^\top}_{1\times N}\,\underbrace{W_T}_{N\times N}\,\underbrace{g_T(\theta)}_{N\times 1}. \]
Thus, the first derivative of the criterion function is
\[ \frac{\partial Q_T(\theta)}{\partial\theta} = \begin{pmatrix} \frac{\partial Q_T(\theta)}{\partial\theta_1} \\ \vdots \\ \frac{\partial Q_T(\theta)}{\partial\theta_d} \end{pmatrix}, \quad\text{where, for } m = 1,\dots,d, \quad \frac{\partial Q_T(\theta)}{\partial\theta_m} = 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_m}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta)\right), \]
and the second derivative of the criterion function is the $d\times d$ matrix
\[ \frac{\partial^2 Q_T(\theta)}{\partial\theta\,\partial\theta^\top} = \begin{pmatrix} \frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_1} & \frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_2} & \cdots & \frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_d} \\ \vdots & \ddots & & \vdots \\ \frac{\partial^2 Q_T(\theta)}{\partial\theta_d\partial\theta_1} & \cdots & \cdots & \frac{\partial^2 Q_T(\theta)}{\partial\theta_d\partial\theta_d} \end{pmatrix}, \quad\text{where, for } m, j = 1,\dots,d, \]
\[ \frac{\partial^2 Q_T(\theta)}{\partial\theta_m\partial\theta_j} = 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_m}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_j}\right) + 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta)}{\partial\theta_m\partial\theta_j}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta)\right). \]
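These analytic derivatives can be checked numerically. A minimal central-difference sketch in Matlab, reusing the assumed gmm_criterion, cg, R, and W from the earlier snippets:

```matlab
% Central finite-difference gradient of Q_T at a trial theta.
theta = [0.98; 3]; d = numel(theta); h = 1e-6;
grad = zeros(d, 1);
for m = 1:d
    e = zeros(d, 1); e(m) = h;
    grad(m) = (gmm_criterion(theta + e, cg, R, W) ...
             - gmm_criterion(theta - e, cg, R, W)) / (2*h);
end
disp(grad')   % should match the analytic formula 2*(dg_T/dtheta_m)'*W_T*g_T
```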
GMM: Assumptions
1. We will assume that the data are IID for now. We will consider dependent, stationary data in the future.
2. We will assume that the weight matrix $W_T$ is such that $W_T \stackrel{p}{\rightarrow} W$.
3. Because $W_T$ will be defined either as a fixed matrix (the identity matrix, for example) or as a data-driven sample average, this property will always hold.
4. For the sample average, it holds by the WLLN.
GMM: A useful Taylor’s expansion (around θ0)
By Taylor’s expansion, stopped at the first order, around the true parameter vector $\theta_0$:
\[ \frac{\partial Q_T(\theta_T)}{\partial\theta} - \frac{\partial Q_T(\theta_0)}{\partial\theta} = \underbrace{\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}}_{d\times d \text{ matrix}}\;\underbrace{\big(\theta_T - \theta_0\big)}_{d\times 1 \text{ vector}}. \]
Note: $\frac{\partial Q_T(\theta_T)}{\partial\theta} \approx 0$. In fact, we are minimizing $Q_T(\theta)$ with respect to $\theta$ and $\theta_T$ is the minimizer. It follows that
\[ \theta_T - \theta_0 = -\left(\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right)^{\!-1}\frac{\partial Q_T(\theta_0)}{\partial\theta}. \]
GMM: a useful Taylor’s expansion (around θ0)
\[ \theta_T - \theta_0 = -\left(\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right)^{\!-1}\frac{\partial Q_T(\theta_0)}{\partial\theta}. \]
Elements of the $d\times 1$ gradient vector $\frac{\partial Q_T(\theta_0)}{\partial\theta}$: for $m = 1,\dots,d$,
\[ \frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right). \]
Elements of the $d\times d$ Hessian matrix $\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}$: for $m, j = 1,\dots,d$,
\[ \frac{\partial^2 Q_T(\theta_0)}{\partial\theta_m\partial\theta_j} = 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_j}\right) + 2\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right)^{\!\top} W_T \left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right). \]
Consistency: the gradient
Elements of the $d\times 1$ gradient vector $\frac{\partial Q_T(\theta_0)}{\partial\theta}$:
\[ \frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right)^{\!\top}}_{\stackrel{p}{\rightarrow}\; E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^\top \;=\; \Gamma_{0,m}^\top} \;\underbrace{W_T}_{\stackrel{p}{\rightarrow}\; W}\; \underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right)}_{\stackrel{p}{\rightarrow}\; E[g(X_{t+1},\theta_0)] \;=\; 0} \;\stackrel{p}{\rightarrow}\; 2\,\Gamma_{0,m}^\top W\, 0 = 0. \]
Important: Why is $E[g(X_{t+1},\theta_0)] = 0$? Because this is what the moment conditions imply! See the first slide. All convergences in probability are due to the WLLN.
Thus, for the full gradient, we have:
\[ \frac{\partial Q_T(\theta_0)}{\partial\theta} \;\stackrel{p}{\rightarrow}\; 2\begin{pmatrix} \Gamma_{0,1} & \Gamma_{0,2} & \cdots & \Gamma_{0,d} \end{pmatrix}^{\!\top} W\, 0 = 2\,\Gamma_0^\top W\, 0 = 0. \]
Consistency: the Hessian
Elements of the $d\times d$ Hessian matrix $\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}$:
\[ \frac{\partial^2 Q_T(\theta_0)}{\partial\theta_m\partial\theta_j} = 2\underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right)^{\!\top}}_{\stackrel{p}{\rightarrow}\; \Gamma_{0,m}^\top} \;\underbrace{W_T}_{\stackrel{p}{\rightarrow}\; W}\; \underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_j}\right)}_{\stackrel{p}{\rightarrow}\; \Gamma_{0,j}} + 2\underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right)^{\!\top}}_{\stackrel{p}{\rightarrow}\; E\left[\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right]^\top} \;\underbrace{W_T}_{\stackrel{p}{\rightarrow}\; W}\; \underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right)}_{\stackrel{p}{\rightarrow}\; E[g(X_{t+1},\theta_0)] \;=\; 0} \;\stackrel{p}{\rightarrow}\; 2\,\Gamma_{0,m}^\top W\,\Gamma_{0,j}. \]
All convergences in probability are due to the WLLN.
Thus, for the full Hessian, we have:
\[ \frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top} \;\stackrel{p}{\rightarrow}\; 2\begin{pmatrix} \Gamma_{0,1} & \Gamma_{0,2} & \cdots & \Gamma_{0,d} \end{pmatrix}^{\!\top} W \begin{pmatrix} \Gamma_{0,1} & \Gamma_{0,2} & \cdots & \Gamma_{0,d} \end{pmatrix} = 2\,\Gamma_0^\top W\,\Gamma_0. \]
Consistency: putting gradient and Hessian together
\[ \theta_T - \theta_0 = -\underbrace{\left(\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right)^{\!-1}}_{\stackrel{p}{\rightarrow}\; \left(2\,\Gamma_0^\top W\,\Gamma_0\right)^{-1}}\;\underbrace{\frac{\partial Q_T(\theta_0)}{\partial\theta}}_{\stackrel{p}{\rightarrow}\; 2\,\Gamma_0^\top W\, 0 \;=\; 0} \;\stackrel{p}{\rightarrow}\; 0. \]
Conclude: The GMM estimator ($\theta_T$) is a consistent estimator for the true parameter vector ($\theta_0$).
In other words, it converges to $\theta_0$ in probability as $T \to \infty$.
Asymptotic normality: the standardized gradient
Elements of the $d\times 1$ standardized gradient vector $\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta}$: for $m = 1,\dots,d$,
\[ \sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\underbrace{\left(\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right)^{\!\top}}_{\stackrel{p}{\rightarrow}\; \Gamma_{0,m}^\top} \;\underbrace{W_T}_{\stackrel{p}{\rightarrow}\; W}\; \underbrace{\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right)}_{\stackrel{d}{\rightarrow}\; N(0,\Phi_0), \;\; \Phi_0 \;=\; E\left[g(X_{t+1},\theta_0)\,g(X_{t+1},\theta_0)^\top\right]} \;\stackrel{d}{\rightarrow}\; 2\,\Gamma_{0,m}^\top W\, N(0,\Phi_0). \]
The first two terms converge in probability (by the WLLN). The last term converges in distribution (by the CLT). The entire term converges by Slutsky’s theorem (in distribution). Thus, for the full standardized gradient, we have:
\[ \sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta} \;\stackrel{d}{\rightarrow}\; 2\begin{pmatrix} \Gamma_{0,1} & \Gamma_{0,2} & \cdots & \Gamma_{0,d} \end{pmatrix}^{\!\top} W\, N(0,\Phi_0) = 2\,\Gamma_0^\top W\, N(0,\Phi_0) = N\!\left(0,\; 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\right). \]
Asymptotic normality: putting standardized gradient and Hessian together
\[ \sqrt{T}\big(\theta_T - \theta_0\big) = -\underbrace{\left(\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right)^{\!-1}}_{\stackrel{p}{\rightarrow}\; \left(2\,\Gamma_0^\top W\,\Gamma_0\right)^{-1}}\;\underbrace{\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta}}_{\stackrel{d}{\rightarrow}\; N\left(0,\; 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\right)} \;\stackrel{d}{\rightarrow}\; N\!\left(0,\; \big(2\,\Gamma_0^\top W\,\Gamma_0\big)^{-1} 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0 \big(2\,\Gamma_0^\top W\,\Gamma_0\big)^{-1}\right) = N\!\left(0,\; \big(\Gamma_0^\top W\,\Gamma_0\big)^{-1}\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0 \big(\Gamma_0^\top W\,\Gamma_0\big)^{-1}\right). \]
Hence,
\[ V(\theta_T) = \frac{1}{T}\big(\Gamma_0^\top W\,\Gamma_0\big)^{-1}\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0 \big(\Gamma_0^\top W\,\Gamma_0\big)^{-1}, \]
where
\[ \underbrace{\Gamma_0}_{N\times d} = E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta^\top}\right] \qquad\text{and}\qquad \Phi_0 = E\left[g(X_{t+1},\theta_0)\,g(X_{t+1},\theta_0)^\top\right]. \]
Asymptotic normality: implications
Conclude: The GMM estimator is asymptotically normally distributed (as $T \to \infty$).
The asymptotic variance depends on three quantities: $\Gamma_0$, $\Phi_0$ and $W$.
$\Gamma_0$ and $\Phi_0$ are expectations. They can be estimated using sample means by the WLLN.
$W$ can just be replaced by the initial weight matrix $W_T$ (which is a choice variable). Thus,
\[ V(\theta_T) = \frac{1}{T}\big(\Gamma_0^\top W_T\,\Gamma_0\big)^{-1}\Gamma_0^\top W_T\,\Phi_0\, W_T\,\Gamma_0 \big(\Gamma_0^\top W_T\,\Gamma_0\big)^{-1}, \]
with
\[ \Gamma_0 = \frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_T)}{\partial\theta^\top} \qquad\text{and}\qquad \Phi_0 = \frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_T)\,g(X_{t+1},\theta_T)^\top. \]
Notice that, in order to compute $\Gamma_0$ and $\Phi_0$, we need to use $\theta_T$ (since $\theta_0$ is unknown).
Once we have $V(\theta_T)$, we can compute confidence intervals, test hypotheses, and so on. In other words, we can do statistical inference.
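A minimal Matlab sketch of these plug-in estimators (IID case), assuming thetaT, cg, R, and W from the earlier sketches; $\Gamma_0$ is computed here by central finite differences:

```matlab
% Plug-in asymptotic variance of the GMM estimator.
gT  = @(th) mean((th(1)*cg.^(-th(2))) .* (1 + R) - 1, 1)';  % Nx1 sample moments
Tn  = size(R, 1); d = numel(thetaT); h = 1e-5;
m   = thetaT(1) * cg.^(-thetaT(2));
g   = m .* (1 + R) - 1;                 % pricing errors at thetaT
Phi = (g' * g) / Tn;                    % estimate of Phi_0, NxN
G   = zeros(size(R, 2), d);             % estimate of Gamma_0, Nxd
for k = 1:d
    e = zeros(d, 1); e(k) = h;
    G(:, k) = (gT(thetaT + e) - gT(thetaT - e)) / (2*h);
end
A  = G' * W * G;
V  = (A \ (G' * W * Phi * W * G)) / A / Tn;   % sandwich variance V(thetaT)
se = sqrt(diag(V))                      % standard errors for beta and gamma
```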
Let us see GMM estimation in practice using Matlab …
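To give a flavor of such a demo, here is a self-contained simulated example (an illustration written for these notes, not the lecture's actual Matlab code). The data-generating process is an assumption chosen so that the CCAPM moment conditions hold exactly at $\theta_0 = (0.98, 3)$: consumption growth is lognormal and returns are built from powers of consumption growth, scaled by the closed-form lognormal moment so that $E[m_{t+1}(1+R^i_{t+1})] = 1$:

```matlab
% Simulated GMM estimation of (beta, gamma) in the CCAPM (illustrative).
rng(1);
T = 50000; N = 2;
beta0 = 0.98; gamma0 = 3;
mu = 0.02; sig = 0.05;                    % log consumption growth ~ N(mu, sig^2)
cg = exp(mu + sig*randn(T, 1));           % c_{t+1}/c_t
a  = [0.5, 2.0];                          % asset-specific exposures to cg
R  = zeros(T, N);
for i = 1:N
    % E[m0 * cg^a] in closed form for lognormal cg, so E[m0*(1+R_i)] = 1:
    Em = beta0 * exp((a(i) - gamma0)*mu + 0.5*((a(i) - gamma0)*sig)^2);
    R(:, i) = cg.^a(i) / Em - 1;          % net return on asset i
end
W = eye(N);                               % identity weight matrix
thetaT = fminsearch(@(th) gmm_criterion(th, cg, R, W), [0.9; 2]);
disp(thetaT')                             % close to [0.98 3] in large samples
```

Running the variance sketch above on this simulated sample then produces standard errors for $\beta$ and $\gamma$.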