MGMTMFE 407: Empirical Methods in Finance
Homework 1
Prof. Lochstoer, TA:
January 11, 2022
Please use R to solve these problems. Each group can hand in one set of solutions listing the names of all contributing students. Use the electronic drop box to submit your answers. Submit the R file and the file with a short write-up of your answers separately.
[The quality of the write-up matters for your grade. Please imagine that you are writing a report for your boss at Goldman when drafting answers to these questions. Try to be clear and precise.]
Problem 1: Building a simple autocorrelation-based forecasting model
Fama and French (2015) propose a five-factor model for expected stock returns. One of the factors is based on cross-sectional sorts on firm profitability. In particular, the factor portfolio is long firms with high profitability (high earnings divided by book equity; high ROE) and short firms with low profitability (low earnings divided by book equity; low ROE). This factor is called RMW – Robust Minus Weak.
1. Go to Kenneth French's Data Library (google it) and download the Fama/French 5 Factors (2x3) in CSV format. Denote the time series of value-weighted monthly factor returns for the RMW factor from 1963:07 to 2021:10 as "rmw." Plot the time series, and give the annualized mean and standard deviation of this return series.
Suggested solution:
The series is plotted in Figure 1. Annualized mean: 3.23%; annualized standard deviation: 7.66%.
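The annualization convention for monthly returns is to scale the mean by 12 and the standard deviation by sqrt(12). A minimal sketch, using simulated monthly returns (in percent) as a stand-in for the downloaded RMW series:

```r
# Annualize a monthly return series: mean scales with 12, sd with sqrt(12).
# The simulated series is a placeholder calibrated to roughly match RMW.
set.seed(1)
rmw <- rnorm(700, mean = 0.27, sd = 2.21)  # stand-in monthly returns, in percent

annualized_mean <- 12 * mean(rmw)
annualized_sd   <- sqrt(12) * sd(rmw)

cat(sprintf("Annualized mean: %.2f%%, annualized sd: %.2f%%\n",
            annualized_mean, annualized_sd))
```

With the actual downloaded series in place of the simulated one, this reproduces the numbers reported above.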
2. Plot the 1st through 60th order autocorrelations of rmw. Also plot the cumulative sum of these autocorrelations (that is, the 5th observation is the sum of the first 5 autocorrelations, the 11th observation is the sum of the first 11 autocorrelations, etc.). Describe these plots. In particular, do the plots hint at predictability of the factor returns? What are the salient patterns, if any?
Suggested solution: The plots are shown in Figures 2 and 3 and hint at predictability of the factor returns: the return exhibits a momentum effect in the short term and a reversal effect at longer horizons.
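The cumulative-autocorrelation plot is just a running sum of the ACF with the lag-0 term (which is always 1) dropped. A sketch on a simulated AR(1) stand-in series:

```r
# Cumulative sum of autocorrelations at lags 1..60 (lag 0 is dropped).
# The AR(1) series is a stand-in exhibiting short-term persistence.
set.seed(2)
x <- arima.sim(model = list(ar = 0.2), n = 700)

ac     <- acf(x, lag.max = 60, plot = FALSE)$acf[-1]  # autocorrelations, lags 1..60
cum_ac <- cumsum(ac)                                   # k-th entry = sum of first k lags

plot(cum_ac, type = "l", xlab = "lag", ylab = "cumulative autocorrelation")
```

Applied to rmw, a cumulative sum that rises at short lags and then declines is the signature of short-term momentum followed by longer-term reversal.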
3. Perform a Ljung-Box test that the first 6 autocorrelations jointly are zero. Write out the form of the test and report the p-value. What do you conclude from this test?
Suggested solution:
(a) H0: ρ_i = 0 for i = 1, ..., 6 vs. H1: at least one ρ_i ≠ 0 for i = 1, ..., 6.
(b) Test statistic: Q(m) = T(T + 2) Σ_{i=1}^{m} ρ̂_i² / (T − i), which under H0 is asymptotically χ²_m distributed; here m = 6.
(c) Rejection rule: reject H0 if Q(m) > χ²_{m,α}, or equivalently if the associated p-value is less than α = 0.05.
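The Q(m) statistic can be computed directly from the sample autocorrelations and checked against R's built-in Box.test. A sketch on a simulated white-noise series (so H0 is true and should typically not be rejected):

```r
# Ljung-Box statistic by hand: Q(m) = T(T+2) * sum_{i=1}^{m} rho_i^2 / (T - i)
set.seed(42)
x <- rnorm(700)   # placeholder series; in the homework this is the rmw series
T <- length(x)
m <- 6

rho  <- acf(x, lag.max = m, plot = FALSE)$acf[2:(m + 1)]  # drop lag 0
Q    <- T * (T + 2) * sum(rho^2 / (T - 1:m))
pval <- pchisq(Q, df = m, lower.tail = FALSE)

# Matches the built-in implementation
bt <- Box.test(x, lag = m, type = "Ljung-Box")
```

Running the same test on rmw gives the p-value reported in part (d) below.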
Figure 2: Autocorrelation function of rmw.
Figure 3: Cumulative autocorrelation function of rmw.
(d) From R we get a p-value of 0.002481 < α = 0.05, so we reject H0 that the first six autocorrelations are jointly zero.

4. Based on your observations in (2) and (3), propose a parsimonious forecasting model for rmw. That is, for the prediction model

rmw_{t+1} = β′x_t + ε_{t+1},   (1)

where the first variable in x_t is a 1 for the intercept in the regression, choose the remaining variables in x_t – it could be only one more variable or a longer K × 1 vector. While this analysis is in-sample, I do want you to argue for your variables by attaching a "story" to your model that makes it more ex ante believable. (PS: This question is purposefully a little vague. There is not a single correct answer here, just grades of more to less reasonable, as in the real world.)

Suggested solution: The predictive variables are the one-month lagged rmw and the sum of rmw over the previous 40 months, capturing the short-term momentum and longer-term reversal patterns documented in (2).

5. Estimate the proposed model. Report robust (White) standard errors for β̂, as well as the regular OLS standard errors. In particular, from the lecture notes we have

Var^White(β̂) = (1/T) [ (1/T) Σ_{t=1}^{T} x_t x_t′ ]^{-1} [ (1/T) Σ_{t=1}^{T} x_t x_t′ ε̂_t² ] [ (1/T) Σ_{t=1}^{T} x_t x_t′ ]^{-1},   (2)

Var^OLS(β̂) = (1/T) [ (1/T) Σ_{t=1}^{T} x_t x_t′ ]^{-1} [ (1/T) Σ_{t=1}^{T} ε̂_t² ].   (3)

(In asymptotic standard errors we do not adjust for degrees of freedom, which is why we simply divide by T.)

Table 1: Regression results with OLS standard errors (dependent variable: RMW_t).

    RMW_{t-1}    0.154***
    sums40      -0.014**  (0.006)
    Constant     0.402*** (0.108)

    Note: *p<0.1; **p<0.05; ***p<0.01

Table 2: Regression results with robust (White) standard errors (dependent variable: RMW_t).

    RMW_{t-1}    0.154
    sums40      -0.014    (0.009)
    Constant     0.402**  (0.158)

    Note: *p<0.1; **p<0.05; ***p<0.01

Problem 2: Nonstationarity and regression models

1. Simulate T time-series observations of each of the following two return series, N times:

r_{1,t} = μ + σ ε_{1,t},   r_{2,t} = μ + σ ε_{2,t},   (4)

where μ = 0.6%, σ = 5%, and the residuals are uncorrelated standard normals. Let T = 600 and N = 10,000.
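The White covariance in equation (2) is algebraically identical to the familiar sandwich form (Σ x_t x_t′)^{-1} (Σ x_t x_t′ ε̂_t²) (Σ x_t x_t′)^{-1}, since the 1/T factors cancel. A sketch computing it by hand on simulated heteroskedastic data (the regressors are placeholders, not the actual RMW predictors):

```r
# White (HC0) covariance by hand: (X'X)^{-1} (sum_t e_t^2 x_t x_t') (X'X)^{-1}
set.seed(7)
T  <- 500
x1 <- rnorm(T)
x2 <- rnorm(T)
y  <- 0.4 + 0.15 * x1 - 0.01 * x2 + rnorm(T) * (1 + 0.5 * abs(x1))  # heteroskedastic

fit <- lm(y ~ x1 + x2)
X   <- model.matrix(fit)       # T x 3 matrix including the intercept column
e   <- residuals(fit)

XtX_inv <- solve(crossprod(X))
meat    <- crossprod(X * e)    # rows of X scaled by e_t, so this is sum_t e_t^2 x_t x_t'
V_white <- XtX_inv %*% meat %*% XtX_inv

se_white <- sqrt(diag(V_white))
```

This equals vcovHC(fit, type = "HC0") from the sandwich package, which is what the code at the end of this document uses.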
For each of the N time series, run the regression

r_{1,t} = α + β r_{2,t} + ε_t,   (5)

and save the slope coefficient as β(n), where n = 1, ..., N. Give the mean and standard deviation of β across the N samples and plot the histogram of the 10,000 β's. Does this correspond to the null hypothesis β = 0? Do the regression standard errors look ok?

2. Next, construct N price samples of length T from each return series using

p_{1,t} = p_{1,t-1} + r_{1,t},   p_{2,t} = p_{2,t-1} + r_{2,t},   (6)

with p_{1,0} = p_{2,0} = 0 as the initial condition. Now repeat the regression exercise using the regression

p_{1,t} = α + β p_{2,t} + ε_t.   (7)

Again report the mean and standard deviation of the N estimated β's and plot the histogram. Does this correspond to the null hypothesis β = 0? Do the regression standard errors look ok? Explain what is going on here that is different from the previous return-based regressions.

Suggested solution: Results for 2.1 are shown in Figures 4 and 5, and for 2.2 in Figures 6 and 7. In the first question, the mean of β is approximately 0 and the standard deviation is about 0.041. In the second question, the mean of β is 0.9786 and the standard deviation is about 0.5. The first case looks fine, and the null hypothesis β = 0 cannot be rejected. In question 2.2, however, the estimated betas can be far from zero and, strikingly, the associated t-statistics are way off. This indicates a failure of OLS inference: we are regressing one nonstationary (random-walk) variable on another, the classic spurious-regression problem.
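The full simulation appears in the code at the end of this document; a condensed sketch of the same experiment (with N reduced for speed, and without the HAC t-statistics) illustrates the key contrast in the dispersion of the slope estimates:

```r
# Spurious regression demo: regressing one random walk on another yields
# slope estimates far from zero even though the underlying shocks are independent.
set.seed(123)
T <- 600
N <- 200   # reduced from the homework's N = 10,000 for a quick illustration
beta_ret <- numeric(N)
beta_prc <- numeric(N)

for (n in 1:N) {
  r1 <- rnorm(T, mean = 0.006, sd = 0.05)
  r2 <- rnorm(T, mean = 0.006, sd = 0.05)
  beta_ret[n] <- coef(lm(r1 ~ r2))[2]   # stationary returns: well-behaved

  p1 <- cumsum(r1)                      # prices = cumulated returns, p_0 = 0
  p2 <- cumsum(r2)
  beta_prc[n] <- coef(lm(p1 ~ p2))[2]   # nonstationary prices: spurious
}

c(sd_returns = sd(beta_ret), sd_prices = sd(beta_prc))
```

The price-level betas are an order of magnitude more dispersed than the return betas, previewing the failure of standard inference discussed in the suggested solution.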
install.packages("portes")    # only needed once
install.packages("sandwich")
library(portes)
library(zoo)
library(lmtest)
library(sandwich)
require(graphics)
library(dplyr)

## Question 1
# F_F_Research_Data_5_Factors_2x3 is the data frame read in from the
# downloaded CSV, with a Date column and the five factor return columns.
RMW <- data.frame(F_F_Research_Data_5_Factors_2x3)
RMW$ym <- format(as.Date(RMW$Date), "%Y-%m")

plot(RMW$Date, RMW$RMW, type = "l")
annualized_mean <- 12 * mean(RMW$RMW)
annualized_std  <- sqrt(12) * sd(RMW$RMW)

autocorr <- acf(RMW$RMW, lag.max = 60, type = "correlation")
cumcorr  <- cumsum(autocorr$acf) - 1   # subtract the lag-0 autocorrelation of 1
plot(autocorr, type = "l")
plot(cumcorr, type = "l")

Box.test(RMW$RMW, lag = 6, type = "Ljung-Box")

# Regression: predict next month's RMW with lagged RMW and its 40-month rolling sum
RMW$sums40 <- rollsumr(RMW$RMW, k = 40, fill = NA)
RMW$lead   <- lead(RMW$RMW, 1)
a <- lm(RMW$lead ~ RMW$RMW + RMW$sums40)
coeftest(a, vcov = vcovHC(a, type = "HC0"))   # White (robust) standard errors
H <- vcovHC(a, type = "HC0")
plot(12 * predict(a), type = "l")

## Question 2
N <- 10000   # number of simulated samples
T <- 600     # sample length
M <- 0       # burn-in (not needed here since the draws are i.i.d.)

# Case 1: stationary return series
beta   <- rep(0, N)
t_stat <- rep(0, N)
for (n in 1:N) {
  r1 <- rnorm(M + T, mean = 0.006, sd = 0.05)   # mu = 0.6%, sigma = 5%
  r2 <- rnorm(M + T, mean = 0.006, sd = 0.05)
  r1 <- r1[(M + 1):(M + T)]
  r2 <- r2[(M + 1):(M + T)]
  output    <- lm(r1 ~ r2)
  beta[n]   <- output$coefficients[2]
  HAC_se    <- sqrt(NeweyWest(output, lag = 5)[2, 2])
  t_stat[n] <- beta[n] / HAC_se
}
hist(beta, breaks = 100, main = "beta-case1")
hist(t_stat, breaks = 100, main = "t_stat-case1")

# Case 2: nonstationary price series (cumulated returns)
beta   <- rep(0, N)
t_stat <- rep(0, N)
for (n in 1:N) {
  p1 <- rep(0, M + T)
  p2 <- rep(0, M + T)
  r1 <- rnorm(M + T - 1, mean = 0.006, sd = 0.05)
  r2 <- rnorm(M + T - 1, mean = 0.006, sd = 0.05)
  for (t in 1:(M + T - 1)) {
    p1[t + 1] <- p1[t] + r1[t]
    p2[t + 1] <- p2[t] + r2[t]
  }
  p1 <- p1[(M + 1):(M + T)]
  p2 <- p2[(M + 1):(M + T)]
  output    <- lm(p1 ~ p2)
  beta[n]   <- output$coefficients[2]
  HAC_se    <- sqrt(NeweyWest(output, lag = 5)[2, 2])
  t_stat[n] <- beta[n] / HAC_se
}
hist(beta, breaks = 100, main = "beta-case2")
hist(t_stat, breaks = 100, main = "t_stat-case2")