STAD70 Statistics & Finance II
Assignment 1
Due date: Feb 11, 2021 (by 11:59pm). Late submissions will be penalized.
Include your code in your submission.
1. (Normal mixture)
(a) Recall that the moment generating function (mgf) of the standard normal distribution N(0,1) is
\[ E[e^{tX}] = e^{\frac{1}{2} t^2}, \]
where X ∼ N(0,1) and t ∈ R. Use this result to show that the kurtosis of $N(\mu, \sigma^2)$ is 3 for any μ ∈ R and σ > 0. Hint: You may differentiate (repeatedly) under the integral sign, e.g.
\[ \frac{d}{dt} E[e^{tX}] = E\left[\frac{d}{dt} e^{tX}\right]. \]
Then evaluate at t = 0.
(b) One way of modelling fat-tailed distributions is to use normal mixtures. Given two normal distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, a normal mixture (with two components) has density
\[ p(x) = (1 - \lambda)\,\varphi(x; \mu_1, \sigma_1^2) + \lambda\,\varphi(x; \mu_2, \sigma_2^2), \]
where $\varphi(x; \mu, \sigma^2)$ is the density of $N(\mu, \sigma^2)$ and $0 \le \lambda \le 1$.
Assume μ1 = μ2 = μ. Compute the kurtosis of the distribution in terms of the parameters.
(c) Let $\mu_1 = \mu_2 = 0$, $\sigma_1 = 1$, $\sigma_2 = 4$, and $\lambda = 0.05$. Simulate n = 50000 samples from the corresponding normal mixture. Plot the density estimate $\hat{p}(x)$ together with the theoretical density p(x). [In this simulation, do not use a package to simulate directly from the normal mixture. Use only functions in base R such as sample() and rnorm().]
(d) For the distribution p in (c), compute (numerically) the 5% quantile, and compare it with that of the normal distribution whose standard deviation matches that of p.
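(a) Differentiating repeatedly, we have
\begin{align*}
E[X e^{tX}] &= \frac{d}{dt} E[e^{tX}] = t\, e^{\frac{1}{2} t^2}, \\
E[X^2 e^{tX}] &= \frac{d^2}{dt^2} E[e^{tX}] = (t^2 + 1)\, e^{\frac{1}{2} t^2}, \\
E[X^3 e^{tX}] &= \frac{d^3}{dt^3} E[e^{tX}] = t(t^2 + 3)\, e^{\frac{1}{2} t^2}, \\
E[X^4 e^{tX}] &= \frac{d^4}{dt^4} E[e^{tX}] = (t^4 + 6t^2 + 3)\, e^{\frac{1}{2} t^2}.
\end{align*}
Letting t = 0, we have $E[X^4] = 3$.
Now let $X \sim N(\mu, \sigma^2)$. Then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$. The kurtosis of X is given by
\[ K_X = E\left[\left(\frac{X - \mu}{\sigma}\right)^4\right] = E[Z^4] = 3. \]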
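The result in (a) can be sanity-checked numerically. The following is a minimal sketch in base R, not part of the original solution; the values of `mu` and `sigma` and the helper `central_moment()` are illustrative assumptions.

```r
# Numerical check that the kurtosis of N(mu, sigma^2) is 3.
# mu and sigma are arbitrary illustrative values (assumptions).
mu <- 1.5
sigma <- 2

# k-th central moment of N(mu, sigma^2), computed by numerical integration
central_moment <- function(k) {
  integrate(function(x) (x - mu)^k * dnorm(x, mean = mu, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

# kurtosis = fourth central moment divided by the squared variance
kurt <- central_moment(4) / central_moment(2)^2
print(kurt)
```

Changing `mu` or `sigma` should leave the printed value (up to integration error) unchanged, in line with the argument above.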
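(b) Let X be distributed as the normal mixture with density p(x). Let $\varphi_i(x) = \varphi(x; \mu, \sigma_i^2)$, i = 1, 2. Clearly E[X] = μ. We have
\begin{align*}
\mathrm{Var}(X) = E[(X - \mu)^2] &= \int (x - \mu)^2 p(x)\,dx \\
&= (1 - \lambda) \int (x - \mu)^2 \varphi_1(x)\,dx + \lambda \int (x - \mu)^2 \varphi_2(x)\,dx \\
&= (1 - \lambda)\sigma_1^2 + \lambda\sigma_2^2.
\end{align*}
Using (a), we have
\begin{align*}
E[(X - \mu)^4] &= \int (x - \mu)^4 p(x)\,dx \\
&= (1 - \lambda) \int (x - \mu)^4 \varphi_1(x)\,dx + \lambda \int (x - \mu)^4 \varphi_2(x)\,dx \\
&= 3(1 - \lambda)\sigma_1^4 + 3\lambda\sigma_2^4.
\end{align*}
Hence, the kurtosis of X is
\[ K_X = \frac{1}{\mathrm{Var}(X)^2} E[(X - \mu)^4] = 3\,\frac{(1 - \lambda)\sigma_1^4 + \lambda\sigma_2^4}{\left((1 - \lambda)\sigma_1^2 + \lambda\sigma_2^2\right)^2}. \]
[By Jensen's inequality, we have
\[ (1 - \lambda)\sigma_1^4 + \lambda\sigma_2^4 \ge \left((1 - \lambda)\sigma_1^2 + \lambda\sigma_2^2\right)^2. \]
Thus the kurtosis of this normal mixture is always greater than or equal to 3. For the parameters given in (c), we have $K_X \approx 13.46939$.]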
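As a quick check of the closed-form kurtosis derived in (b), the formula can be evaluated in R. This is a small sketch; the function name `mixture_kurtosis()` is a hypothetical helper, not something taken from the course R file.

```r
# Kurtosis of the two-component normal mixture with common mean,
# using the closed-form expression from part (b).
mixture_kurtosis <- function(sigma1, sigma2, lambda) {
  m2 <- (1 - lambda) * sigma1^2 + lambda * sigma2^2        # variance
  m4 <- 3 * ((1 - lambda) * sigma1^4 + lambda * sigma2^4)  # fourth central moment
  m4 / m2^2
}

# parameters from part (c)
k_c <- mixture_kurtosis(sigma1 = 1, sigma2 = 4, lambda = 0.05)
print(k_c)
```

With the parameters of (c) this should reproduce the value 13.46939 quoted above; setting `lambda = 0` or `sigma1 = sigma2` recovers the normal value 3.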
(c) Refer to the R file for the implementation. The result is shown in Figure 1. With n = 50000 samples, we have excellent agreement between p and $\hat{p}$.
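One possible way to carry out the simulation in (c) with only sample() and rnorm() is sketched below. It is not the course R file: the seed, the variable names (`xs`, `component`, `p_theory`) and the plotting choices are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 50000
lambda <- 0.05
sigma <- c(1, 4)  # standard deviations of the two components

# Step 1: choose a component for each draw (1 w.p. 1 - lambda, 2 w.p. lambda)
component <- sample(1:2, size = n, replace = TRUE, prob = c(1 - lambda, lambda))
# Step 2: draw from the chosen normal component (common mean 0)
xs <- rnorm(n, mean = 0, sd = sigma[component])

# theoretical mixture density p(x)
p_theory <- function(x) {
  (1 - lambda) * dnorm(x, sd = sigma[1]) + lambda * dnorm(x, sd = sigma[2])
}

# kernel density estimate together with the theoretical density
plot(density(xs), xlim = c(-10, 10), main = "Density estimate")
curve(p_theory(x), add = TRUE, lty = 2, col = "red")
legend("topright", legend = c("density estimate", "theoretical density"),
       lty = c(1, 2), col = c("black", "red"))
```

The two-step construction (pick a component, then draw from it) is what makes the sample follow the mixture density rather than a sum of normals.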
(d) The cumulative distribution function (cdf) of p is
\[ F = (1 - \lambda)F_1 + \lambda F_2, \]
where $F_i$ is the cdf of $N(\mu_i, \sigma_i^2)$. To find the 5% quantile, we need to solve the equation F(x) − 0.05 = 0. In R, this can be done using e.g. uniroot() (see the R file for details). The solution is x∗ = −1.805703. The corresponding quantile of the normal distribution (with the same mean and variance) is −2.175937, so at the 5% level the quantile of the normal mixture is in fact closer to the centre. It can be checked that if we decrease the probability level (e.g. to the 0.5% quantile), the quantile of the normal mixture lies further to the left than that of the matching normal distribution. That is, the normal mixture has fatter tails.
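The root-finding step in (d) might look as follows in base R, using uniroot(). This is a sketch under assumptions: the search interval `c(-20, 0)`, the tolerance, and the helper names `cdf_mix()` and `q_mix()` are choices made here, not taken from the course R file.

```r
lambda <- 0.05
sigma1 <- 1
sigma2 <- 4

# cdf of the mixture (common mean 0)
cdf_mix <- function(x) {
  (1 - lambda) * pnorm(x, sd = sigma1) + lambda * pnorm(x, sd = sigma2)
}

# quantile of the mixture at level alpha, found by solving cdf_mix(x) - alpha = 0
q_mix <- function(alpha) {
  uniroot(function(x) cdf_mix(x) - alpha, interval = c(-20, 0), tol = 1e-10)$root
}

# normal distribution whose standard deviation matches that of the mixture
sd_match <- sqrt((1 - lambda) * sigma1^2 + lambda * sigma2^2)

print(c(mixture = q_mix(0.05), normal = qnorm(0.05, sd = sd_match)))
print(c(mixture = q_mix(0.005), normal = qnorm(0.005, sd = sd_match)))
```

The interval works for any level below 0.5 because the mixture cdf equals 0.5 at 0 and is essentially 0 at −20, so the function changes sign on it.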
[Plot "Density estimate": the kernel density estimate and the theoretical density, x-axis from −10 to 10; N = 50000, bandwidth = 0.1081.]
Figure 1: Density estimate and theoretical density.
2. (Relative entropy and excess growth rate) Given two positive probability vectors $p = (p_1, \ldots, p_n)$ and $q = (q_1, \ldots, q_n)$ (i.e., $p_i, q_i > 0$ and $\sum_i p_i = \sum_i q_i = 1$), define the relative entropy H(p||q) by
\[ H(p\|q) = \sum_{i=1}^n p_i \log \frac{p_i}{q_i}. \]
(a) Using Jensen's inequality, show that H(p||q) ≥ 0 and equality holds if and only if p = q. Hint: Think of $H(p\|q) = -E_p\left[\log \frac{q(X)}{p(X)}\right]$ where X ∼ p.
(b) Suppose there are n assets. At time t, t = 0, 1, …, let $X_i(t) > 0$ be the value of asset i. Consider a buy-and-hold portfolio whose value at time t is
\[ V_1(t) = \frac{X_1(t) + \cdots + X_n(t)}{X_1(0) + \cdots + X_n(0)}. \]
Let $V_2(t)$ be the rebalanced portfolio with weights $w = (w_1, \ldots, w_n)$, where w is a positive probability vector. That is, we have
\[ \frac{V_2(t)}{V_2(t-1)} = \sum_{i=1}^n w_i \frac{X_i(t)}{X_i(t-1)}. \]
Let $m_i(t) = \frac{X_i(t)}{X_1(t) + \cdots + X_n(t)}$ be the capitalization weight of asset i. Show that for each t we have
\[ \log \frac{V_2(t)}{V_1(t)} = H(w\|m(0)) - H(w\|m(t)) + \Gamma_w^*(t), \tag{0.1} \]
where $\Gamma_w^*(t)$ is the cumulative excess growth rate of the rebalanced portfolio up to time t.
(c) Consider the stocks Ford (F), JPMorgan (JPM), IBM (IBM) and Coca-Cola (COKE). Consider monthly stock prices from Jan 1, 1990 to Dec 31, 2021. Normalize the prices so that the beginning value $X_i(0)$ is 1 and $m(0) = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})$. Let w = (0.1, 0.2, 0.3, 0.4) (in the same order). Compute the performance of the two portfolios and illustrate the decomposition (0.1) with a figure (similar to Figure 2.6 in the notes). Comment on the results you get.
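(a) Since −log(·) is strictly convex, Jensen's inequality gives
\[ H(p\|q) = \sum_{i=1}^n p_i\,(-\log)\!\left(\frac{q_i}{p_i}\right) \ge (-\log)\!\left(\sum_{i=1}^n p_i \frac{q_i}{p_i}\right) = -\log(1) = 0. \]
Moreover, equality holds if and only if $\frac{q_1}{p_1} = \cdots = \frac{q_n}{p_n}$, i.e., when p = q.
(b) Consider
\begin{align*}
\log \frac{V_2(t)}{V_1(t)} - \log \frac{V_2(t-1)}{V_1(t-1)}
&= \log \frac{V_2(t)}{V_2(t-1)} - \log \frac{V_1(t)}{V_1(t-1)} \\
&= \log\left(\sum_{i=1}^n w_i \frac{X_i(t)}{X_i(t-1)}\right) - \log \frac{X_1(t) + \cdots + X_n(t)}{X_1(t-1) + \cdots + X_n(t-1)} \\
&= \log\left(\sum_{i=1}^n w_i \frac{m_i(t)}{m_i(t-1)}\right) \\
&= \sum_{i=1}^n w_i \log \frac{m_i(t)}{m_i(t-1)} + \left[\log\left(\sum_{i=1}^n w_i \frac{m_i(t)}{m_i(t-1)}\right) - \sum_{i=1}^n w_i \log \frac{m_i(t)}{m_i(t-1)}\right].
\end{align*}
We claim that the first term is equal to H(w||m(t−1)) − H(w||m(t)), and the second term (in [···]) is the excess growth rate $\gamma_w^*(t)$. Summing over time, the first terms telescope to H(w||m(0)) − H(w||m(t)) and the second terms add up to $\Gamma_w^*(t)$; since both portfolios start from the same value, this gives the assertion (0.1). It remains to show the claim.
First consider
\begin{align*}
\sum_{i=1}^n w_i \log \frac{m_i(t)}{m_i(t-1)}
&= \sum_{i=1}^n w_i \log\left(\frac{m_i(t)}{w_i} \cdot \frac{w_i}{m_i(t-1)}\right) \\
&= \sum_{i=1}^n w_i \log \frac{w_i}{m_i(t-1)} - \sum_{i=1}^n w_i \log \frac{w_i}{m_i(t)} \\
&= H(w\|m(t-1)) - H(w\|m(t)).
\end{align*}
The second claim amounts to saying that
\[ \log\left(\sum_{i=1}^n w_i \frac{m_i(t)}{m_i(t-1)}\right) - \sum_{i=1}^n w_i \log \frac{m_i(t)}{m_i(t-1)} = \log\left(\sum_{i=1}^n w_i \frac{X_i(t)}{X_i(t-1)}\right) - \sum_{i=1}^n w_i \log \frac{X_i(t)}{X_i(t-1)}. \]
We omit the verification (which is easy once conceived): the ratios $X_i(t)/X_i(t-1)$ and $m_i(t)/m_i(t-1)$ differ by a factor that does not depend on i, and since the weights sum to 1 its logarithm cancels between the two terms. [This amounts to saying that the excess growth rate is independent of the numeraire.]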
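The decomposition (0.1) can also be verified numerically. The sketch below uses synthetic prices (an assumption, since the stock data are not reproduced here); `rel_entropy()`, `ret`, `gamma_w` and the other names are hypothetical helpers introduced for illustration.

```r
set.seed(1)  # arbitrary seed; the prices below are synthetic, not real data
n_assets <- 4
n_periods <- 100
w <- c(0.1, 0.2, 0.3, 0.4)

# relative entropy H(p || q)
rel_entropy <- function(p, q) sum(p * log(p / q))

# synthetic gross returns X_i(t) / X_i(t-1), one row per period
ret <- matrix(exp(rnorm(n_periods * n_assets, mean = 0, sd = 0.05)),
              nrow = n_periods)
# prices with X_i(0) = 1; row 1 corresponds to t = 0
X <- rbind(rep(1, n_assets), apply(ret, 2, cumprod))

V1 <- rowSums(X) / sum(X[1, ])              # buy-and-hold portfolio
V2 <- c(1, cumprod(as.vector(ret %*% w)))   # rebalanced portfolio
m <- X / rowSums(X)                         # capitalization weights m(t)

# per-period excess growth rate: log(sum_i w_i R_i) - sum_i w_i log(R_i)
gamma_w <- log(as.vector(ret %*% w)) - as.vector(log(ret) %*% w)
Gamma_w <- c(0, cumsum(gamma_w))            # cumulative excess growth rate

entropy_term <- rel_entropy(w, m[1, ]) -
  apply(m, 1, function(mt) rel_entropy(w, mt))
lhs <- log(V2 / V1)
max_err <- max(abs(lhs - (entropy_term + Gamma_w)))
print(max_err)
```

The printed discrepancy should be at the level of floating-point rounding, and each entry of `gamma_w` is nonnegative by Jensen's inequality.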
(c) See the R file for the implementation. The decomposition is shown in Figure 2. We found that the constant-rebalanced portfolio outperforms the cap-weighted portfolio.
[Plot "decomposition": time series over 1990-01-01 / 2021-12-01.]
Figure 2: The decomposition (0.1). Black: $\log(V_2(t)/V_1(t))$. Red: H(w||m(0)) − H(w||m(t)). Green: $\Gamma_w^*(t)$.
3. Consider the 4 stocks in Problem 2(c). Now, consider the sample period Jan 1, 2000 to Dec 31, 2021.
(a) Compute the sample skewness and sample kurtosis for the log returns at daily, weekly and monthly frequencies (similar to Figure 1). For each asset-frequency pair, perform the Jarque-Bera test and report the p-values. Also compute the autocorrelation (lag 1) for the log return and absolute value of the log return (similar to Table 2). Comment on the results.
(b) Illustrate the aggregational Gaussianity property using one of the assets with (i) kernel density estimates (similar to Figure 3.4) and (ii) normal q-q plots. Comment on the results.
See the R file for implementation.
(a)
Skewness:
        daily         weekly       monthly
F        0.06367344   -0.3016361   -0.05689116
JPM      0.22250450   -0.1628902   -0.67398152
IBM     -0.29792696   -0.2773926   -0.08018066
COKE    -0.32089843   -0.1815877    0.06159216
Kurtosis (not excess):
        daily      weekly      monthly
F       17.05323   27.770750   15.558219
JPM     17.00857   15.599044    4.756286
IBM     11.56257    6.863664    5.826147
COKE    14.20171    8.934347    5.462434
JB p-value:
        daily   weekly   monthly
F       0       0        0.000000e+00
JPM     0       0        8.614220e-13
IBM     0       0        0.000000e+00
COKE    0       0        9.992007e-16
Observations:
The sample skewness values are mostly close to zero; the largest in magnitude is −0.67 (JPM, monthly).
The sample kurtosis values are all larger than 3.
The p-values of the JB test are practically zero for all asset-frequency pairs.
For JPM, IBM and COKE, the kurtosis decreases as the frequency decreases. For F, the weekly kurtosis (27.77) exceeds the daily one (17.05), although the monthly value (15.56) is the smallest of the three.
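For reference, the statistics reported in (a) can be computed with base R alone. The sketch below is an assumption about how one might do it, not the course R file: it uses moment estimators with denominator n, the asymptotic chi-squared(2) distribution for the Jarque-Bera statistic, and a synthetic fat-tailed series `r_toy` in place of real log returns.

```r
set.seed(1)  # arbitrary seed; r_toy is a synthetic stand-in for log returns
r_toy <- 0.01 * rt(2000, df = 4)  # fat-tailed toy returns (assumption)

# sample skewness, kurtosis (not excess) and Jarque-Bera test
sample_moments_jb <- function(r) {
  n <- length(r)
  z <- r - mean(r)
  m2 <- mean(z^2)
  skew <- mean(z^3) / m2^1.5
  kurt <- mean(z^4) / m2^2
  jb <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)  # JB statistic
  c(skewness = skew, kurtosis = kurt, jb = jb,
    p_value = pchisq(jb, df = 2, lower.tail = FALSE))
}

res <- sample_moments_jb(r_toy)
print(res)

# lag-1 autocorrelation of the returns and of their absolute values
acf_r <- acf(r_toy, lag.max = 1, plot = FALSE)$acf[2]
acf_abs <- acf(abs(r_toy), lag.max = 1, plot = FALSE)$acf[2]
print(c(returns = acf_r, abs_returns = acf_abs))
```

Applied to each asset at each frequency, a function of this kind would fill tables like the ones above; packaged implementations of the JB test may use slightly different conventions.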
(b) We plot the outputs here:
[Plots: three panels titled "Daily log return", "Weekly log return" and "Monthly log return", followed by three panels titled "Normal Q-Q Plot" (x-axis: Theoretical Quantiles; y-axis: Sample Quantiles).]
Observations:
The distribution becomes more similar to a normal distribution as the frequency decreases.
Still, even at the lowest frequency the distribution is significantly non-normal (see e.g. the JB test reported above).
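The aggregation step behind these plots can be sketched as follows. Log returns add up over time, so lower-frequency returns are block sums of daily ones. The data are synthetic, the block lengths 5 and 21 are the usual trading-day conventions assumed here, and `aggregate_returns()` and `kurtosis()` are hypothetical helpers.

```r
# sum consecutive blocks of k log returns (an incomplete final block is dropped)
aggregate_returns <- function(r, k) {
  n_blocks <- length(r) %/% k
  colSums(matrix(r[seq_len(n_blocks * k)], nrow = k))
}

# sample kurtosis (not excess)
kurtosis <- function(r) {
  z <- r - mean(r)
  mean(z^4) / mean(z^2)^2
}

set.seed(1)  # arbitrary seed; synthetic fat-tailed daily log returns
daily <- 0.01 * rt(5000, df = 5)
weekly <- aggregate_returns(daily, 5)    # about 5 trading days per week
monthly <- aggregate_returns(daily, 21)  # about 21 trading days per month

series <- list(daily = daily, weekly = weekly, monthly = monthly)
print(sapply(series, kurtosis))

# normal Q-Q plots at the three frequencies
op <- par(mfrow = c(1, 3))
for (nm in names(series)) {
  qqnorm(series[[nm]], main = paste("Normal Q-Q Plot:", nm))
  qqline(series[[nm]])
}
par(op)
```

With real data the daily series would come from differences of log prices; the synthetic series here is independent over time, so it only illustrates the mechanics of aggregation.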
4. Consider weekly log returns of the FTSE 100 index from Jan 1, 2000 to Dec 31, 2015. Leave the last 80 observations for testing; these observations are not used for fitting the model.
(a) Fit an ARMA model to the data. Explain carefully how you arrive at your chosen model. Estimate the parameters and examine the residuals.
(b) We examine the out-of-sample performance of the model. Suppose your fitted ARMA model in (a) has order (p∗, q∗). For each time t + 1 of the testing period, use the data up to time t to fit an ARMA(p∗, q∗) model (note that here the order is fixed), and let $\hat{r}_{t+1|t}$ be the forecast of the log return $r_{t+1}$ computed at time t. Compute the sum of squared errors
\[ \sum_t (\hat{r}_{t+1|t} - r_{t+1})^2 \]
and compare it with $\sum_t (r_{t+1} - \bar{r})^2$, where $\bar{r}$ is the sample mean in the training period. (The sum should have 80 terms.) Comment on the results.
(a) In Figure 3 we plot the sample ACF and PACF of the training data.
[Plots: two panels titled "Sample acf" and "Sample pacf", lags 0 to 25.]
Figure 3: Sample ACF and PACF of the training data.
We see significant values in both graphs. For completeness, we perform the Ljung-Box test (with 10 lags) for the ACF. The p-value is of order $10^{-5}$. Thus it makes sense to consider a nontrivial ARMA model.
We consider choosing a model using AIC. Since the mean of the return is close to 0, we use include.mean = FALSE in the estimation. For p, q ∈ {0, 1, . . . , 4}, we fit an ARMA(p, q) model (without an intercept) and obtain the following table of AIC values:
         q=0         q=1         q=2         q=3         q=4
p = 0    -3397.385   -3399.320   -3398.432   -3411.632   -3414.702
p = 1    -3399.648   -3401.531   -3399.833   -3411.976   -3414.654
p = 2    -3399.353   -3399.619   -3400.078   -3418.482   -3417.838
p = 3    -3407.667   -3408.341   -3416.187   -3417.591   -3415.838
p = 4    -3411.623   -3410.550   -3417.708   -3415.903   -3413.817
The chosen order is (p∗, q∗) = (2, 3), which has the smallest AIC in the table. Here is the fitted model (using the training data):
Call:
arima(x = data_training, order = model_order, include.mean = TRUE)
Coefficients:
ar1 ar2 ma1 ma2 ma3
0.0310 -0.4961 -0.1058 0.5795 -0.1912
s.e. 0.1689 0.1218 0.1686 0.1128 0.0434
sigma^2 estimated as 0.0006225: log likelihood = 1715.24, aic = -3416.48
Note that the coefficients ar1 and ma1 are insignificant (their estimates are smaller in magnitude than their standard errors). They might be removed from the model to reduce the estimation error. Doing this, we obtain the following model:
arima(x = data_training, order = model_order, include.mean = FALSE, fixed = c(0,
NA, 0, NA, NA))
Coefficients:
ar1 ar2 ma1 ma2 ma3
0 -0.5281 0 0.6156 -0.1363
s.e. 0 0.1061 0 0.0958 0.0311
sigma^2 estimated as 0.000626: log likelihood = 1713.14, aic = -3418.28
In Figure 4 we examine the residuals. Note that some of the autocorrelation coefficients are significant. The p-values from the Box-Ljung test (with different lags) are small. (They are still small even if we don't restrict the values of ar1 and ma1.) Also the residuals are fat-tailed (as expected). To avoid overfitting, we do not choose a model with more coefficients and keep the current one, where order = c(2, 0, 3) and fixed = c(0, NA, 0, NA, NA).
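The order selection and residual check described in (a) could be organised along the following lines. This is a sketch on synthetic data, not the course R file: `data_training` here is simulated with arima.sim(), the grid is restricted to 0:2 to keep the run short, and the variable names are assumptions.

```r
set.seed(1)  # arbitrary seed; synthetic stand-in for the training log returns
data_training <- 0.02 * as.numeric(arima.sim(list(ar = 0.3, ma = 0.2), n = 500))

orders <- 0:2  # the text uses 0:4; a smaller grid keeps this sketch fast
aic_table <- matrix(NA_real_, length(orders), length(orders),
                    dimnames = list(paste0("p=", orders), paste0("q=", orders)))
for (p in orders) {
  for (q in orders) {
    # some orders may fail to fit; those entries are left as NA
    fit_pq <- tryCatch(
      arima(data_training, order = c(p, 0, q), include.mean = FALSE),
      error = function(e) NULL
    )
    if (!is.null(fit_pq)) aic_table[p + 1, q + 1] <- AIC(fit_pq)
  }
}
print(aic_table)

# order with the smallest AIC
best <- which(aic_table == min(aic_table, na.rm = TRUE), arr.ind = TRUE)[1, ]
model_order <- c(orders[best["row"]], 0, orders[best["col"]])

# refit the chosen model and test the residuals for remaining autocorrelation
fit <- arima(data_training, order = model_order, include.mean = FALSE)
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box",
               fitdf = sum(model_order))
print(lb)
```

The `fitdf` argument subtracts the number of fitted ARMA coefficients from the degrees of freedom of the test, which is one common convention when testing residuals.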
(b) We obtain
\[ \sum_t (\hat{r}_{t+1|t} - r_{t+1})^2 = 0.03847438, \qquad \sum_t (r_{t+1} - \bar{r})^2 = 0.03661328. \]
Unfortunately, the out-of-sample performance of the model is still worse than the constant prediction.
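The rolling one-step-ahead comparison in (b) can be sketched as below. The series is synthetic, the test window is shortened to 20 observations for speed (the assignment uses 80), and `model_order` is a hypothetical fixed order; none of these come from the course R file.

```r
set.seed(1)  # arbitrary seed; synthetic stand-in for the weekly log returns
r_all <- 0.02 * as.numeric(arima.sim(list(ar = 0.3), n = 300))

n_test <- 20                 # the assignment uses 80; smaller here for speed
n_train <- length(r_all) - n_test
model_order <- c(1, 0, 0)    # hypothetical fixed order (p*, 0, q*)

forecasts <- numeric(n_test)
for (j in seq_len(n_test)) {
  t_now <- n_train + j - 1   # last observation available when forecasting
  fit_t <- arima(r_all[1:t_now], order = model_order, include.mean = FALSE)
  forecasts[j] <- predict(fit_t, n.ahead = 1)$pred[1]
}

actual <- r_all[(n_train + 1):length(r_all)]
sse_model <- sum((forecasts - actual)^2)
# benchmark: constant forecast equal to the training-period sample mean
sse_const <- sum((actual - mean(r_all[1:n_train]))^2)
print(c(model = sse_model, constant = sse_const))
```

Only the order is held fixed; the coefficients are re-estimated at every step using data up to time t, as the problem statement requires.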
[Plots: two panels titled "ACF of residuals" (x-axis: Lag, 0 to 25) and "Normal Q-Q Plot" (x-axis: Theoretical Quantiles; y-axis: Sample Quantiles).]
Figure 4: ACF and normal Q-Q plot of the residuals.