STAT 437 HW 3 (52 pts) (Due Date: Oct 15, 2021 by 17:00)

A corporation’s earnings in a given year is its income minus its expenses. Our data set for this
homework looks at the relationship between US stock prices, the earnings of the corporations,
and the returns on investment in stocks, with returns counting both changes in stock price and
dividends paid to stockholders. Specifically, our data contains the following variables:

• Date, with fractions of a year indicating months;

• Price of an index of US stocks (inflation-adjusted);

• Earnings per share (also inflation-adjusted);

• Earnings_10MA_back, a ten-year moving average of earnings, looking backwards from the
current date;

• Return_cumul, cumulative return of investing in the stock index, from the beginning;

• Return_10_fwd, the average rate of return over the next 10 years from the current date.

Note: in all following questions, “Returns” will refer to Return_10_fwd.

1. (9 pts) Linear models

(a) (2 pts) Run four linear regressions for the returns: on Price; on Earnings; on both Price
and Earnings; and on both variables and their interaction. Report coefficients and
standard errors.

(b) (2 pts) Find the in-sample R² for these four models. Can their R² values be meaningfully
compared? If so, which model is preferred by R²?

(c) (5 pts) Use five-fold cross-validation to estimate the generalization error of all four
models. Can these be meaningfully compared? If so, which model is preferred by cross-
validation?
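The workflow in Question 1 can be sketched as follows. This is a minimal illustration on simulated data, not a solution: the data frame `stock_sim`, its coefficients, and the helper `cv_mse()` are all hypothetical, and only the column names follow the variable list above. The point it shows is that one fold assignment is drawn once and reused for every model, so the cross-validated errors are computed on the same five subsets.

```r
# Simulated stand-in for the homework data (the real file is not reproduced here).
set.seed(437)
n <- 300
stock_sim <- data.frame(
  Price    = exp(rnorm(n, mean = 6, sd = 0.5)),
  Earnings = exp(rnorm(n, mean = 3, sd = 0.4))
)
stock_sim$Return_10_fwd <- 0.12 - 0.0001 * stock_sim$Price +
  0.001 * stock_sim$Earnings + rnorm(n, sd = 0.03)

# The four model formulas from part (a).
formulas <- list(
  price       = Return_10_fwd ~ Price,
  earnings    = Return_10_fwd ~ Earnings,
  both        = Return_10_fwd ~ Price + Earnings,
  interaction = Return_10_fwd ~ Price * Earnings
)

# Fit each model; extract coefficients with standard errors, and in-sample R^2.
fits <- lapply(formulas, lm, data = stock_sim)
coef_tables <- lapply(fits, function(f) {
  coef(summary(f))[, c("Estimate", "Std. Error")]
})
r_squared <- sapply(fits, function(f) summary(f)$r.squared)

# One shared fold assignment, so every model is scored on the same five subsets.
folds <- sample(rep(1:5, length.out = n))

# Hypothetical helper: five-fold cross-validated MSE of a linear model.
cv_mse <- function(formula, data, folds) {
  fold_mse <- sapply(sort(unique(folds)), function(k) {
    fit <- lm(formula, data = data[folds != k, ])
    held_out <- data[folds == k, ]
    response <- held_out[[all.vars(formula)[1]]]
    mean((response - predict(fit, newdata = held_out))^2)
  })
  mean(fold_mse)
}
cv_errors <- sapply(formulas, cv_mse, data = stock_sim, folds = folds)

coef_tables
r_squared
cv_errors
```

With the real data, rows where Return_10_fwd is missing would need to be dropped before the folds are assigned; the simulated data has no missing values, so that step is omitted here.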

2. (11 pts) Creating a variable

(a) (2 pts) Add a new column, MAPE, to the data frame, which is the ratio of Price to
Earnings_10MA_back. It should have the following summary statistics:

> summary(stock$MAPE)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  4.785  11.708  15.947  16.554  19.959  44.196     120

Why are there exactly 120 NAs?

(b) (2 pts) Linearly regress the returns on MAPE (and nothing else). What is the coefficient
and its standard error? Is it significant?

(c) (2 pts) Make a scatterplot of the returns against MAPE and add a line showing the
predictions from the model you fit in (b).

(d) (5 pts) What is the R² of this new model? What is its five-fold cross-validated (CV)
MSE? Are these better or worse than the models in the previous question? In order
for these CV MSEs to be comparable, make sure they are obtained using the same five
subsets.
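The mechanics of Question 2 (building a ratio column, regressing on it, and overlaying the fitted line) can be sketched on simulated monthly data. Everything below is hypothetical: the series are random walks, and the moving average is built as the mean of the 120 months strictly before the current one, which is an assumption made for illustration and not a statement about how the homework's Earnings_10MA_back column was constructed.

```r
# Simulated monthly series (the real data file is not reproduced here).
set.seed(437)
n <- 480
price    <- exp(6 + cumsum(rnorm(n, sd = 0.04)))
earnings <- exp(3 + cumsum(rnorm(n, sd = 0.02)))

# ASSUMED construction: mean of the 120 months strictly before the current one.
# stats::filter with sides = 1 averages the current and previous 119 values,
# so shifting the result by one month excludes the current month.
ma_incl <- as.numeric(stats::filter(earnings, rep(1 / 120, 120), sides = 1))
earnings_10ma_back <- c(NA, head(ma_incl, -1))

sim <- data.frame(Price = price, Earnings_10MA_back = earnings_10ma_back)

# The ratio is NA wherever the moving average is NA.
sim$MAPE <- sim$Price / sim$Earnings_10MA_back

# Simulated response with an arbitrary linear dependence on MAPE.
sim$Return_10_fwd <- 0.10 - 0.003 * sim$MAPE + rnorm(n, sd = 0.02)

summary(sim$MAPE)

# lm() drops rows with missing values by default (na.action = na.omit).
fit_mape <- lm(Return_10_fwd ~ MAPE, data = sim)
coef(summary(fit_mape))

# Scatterplot with the fitted regression line overlaid.
plot(sim$MAPE, sim$Return_10_fwd, xlab = "MAPE", ylab = "Return_10_fwd")
abline(fit_mape, col = "blue", lwd = 2)
```

Counting the missing values with `sum(is.na(sim$MAPE))` shows how missingness in the denominator carries over to the ratio.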

3. (8 pts) Inverting MAPE to obtain a new variable 1/MAPE

(a) (2 pts) Linearly regress the returns on 1/MAPE (and nothing else). What is the
coefficient and its standard error? Is it significant?

(b) (2 pts) Make a scatterplot of the returns against 1/MAPE and add a line showing the
predictions from the model you fit in (a).

(c) (4 pts) What are the R² and the CV MSE of this model? How do they compare to the
previous ones?
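One way to regress on a transformed predictor without adding a column is to wrap the transformation in `I()` inside the formula. The sketch below uses simulated data (the data frame `sim`, its coefficients, and the helper `cv_mse()` are hypothetical) and scores the MAPE and 1/MAPE models on one shared set of folds so that their CV MSEs are comparable.

```r
# Simulated data in which returns depend on 1/MAPE (an arbitrary choice).
set.seed(437)
n <- 360
sim <- data.frame(MAPE = runif(n, min = 5, max = 45))
sim$Return_10_fwd <- -0.01 + 1 / sim$MAPE + rnorm(n, sd = 0.02)

# I() makes R treat 1 / MAPE as arithmetic rather than formula syntax.
fit_inv <- lm(Return_10_fwd ~ I(1 / MAPE), data = sim)
coef(summary(fit_inv))
summary(fit_inv)$r.squared

# Shared fold assignment for both models.
folds <- sample(rep(1:5, length.out = n))

# Hypothetical helper: five-fold CV MSE on the shared folds.
cv_mse <- function(formula) {
  mean(sapply(1:5, function(k) {
    fit <- lm(formula, data = sim[folds != k, ])
    held_out <- sim[folds == k, ]
    mean((held_out$Return_10_fwd - predict(fit, newdata = held_out))^2)
  }))
}
cv_compare <- c(
  MAPE    = cv_mse(Return_10_fwd ~ MAPE),
  inverse = cv_mse(Return_10_fwd ~ I(1 / MAPE))
)
cv_compare

# With 1/MAPE on the horizontal axis the fitted model is a straight line.
plot(1 / sim$MAPE, sim$Return_10_fwd, xlab = "1/MAPE", ylab = "Return_10_fwd")
abline(fit_inv, col = "blue", lwd = 2)
```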

4. (14 pts) Bootstrapping a parametric model. In this problem, use the model you fit in
Problem 3.

(a) (2 pts) What are the conventional 90% confidence limits for the coefficient on 1/MAPE?
Hint: use confint() in R to directly extract the confidence intervals.

(b) (5 pts) Use resampling of residuals to get 90% confidence limits for that coefficient.
What are they?

(c) (5 pts) Use resampling of cases to get 90% confidence limits for that coefficient. What
are they?

(d) (2 pts) Are these compatible with each other? If so, why? If not, explain which seems
best.
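The two resampling schemes in Question 4 differ in what is held fixed. A minimal sketch on simulated data follows; the data frame `sim`, the column `inv_MAPE`, and the number of replicates `B` are hypothetical. It forms percentile intervals from the bootstrap slopes, which is one common choice; a pivotal (basic) interval, which reflects the quantiles around the point estimate, is an alternative.

```r
# Simulated data; inv_MAPE stands in for 1/MAPE.
set.seed(437)
n <- 360
sim <- data.frame(inv_MAPE = 1 / runif(n, min = 5, max = 45))
sim$Return_10_fwd <- -0.01 + sim$inv_MAPE + rnorm(n, sd = 0.02)

fit <- lm(Return_10_fwd ~ inv_MAPE, data = sim)

# Conventional interval from the t distribution.
conventional <- confint(fit, "inv_MAPE", level = 0.90)

B <- 1000  # number of bootstrap replicates (arbitrary)

# Residual resampling: keep the predictor fixed and rebuild the response
# from the fitted values plus residuals drawn with replacement.
resid_slopes <- replicate(B, {
  boot <- sim
  boot$Return_10_fwd <- fitted(fit) + sample(residuals(fit), replace = TRUE)
  coef(lm(Return_10_fwd ~ inv_MAPE, data = boot))[["inv_MAPE"]]
})

# Case resampling: draw whole rows (predictor and response together).
case_slopes <- replicate(B, {
  boot <- sim[sample(n, replace = TRUE), ]
  coef(lm(Return_10_fwd ~ inv_MAPE, data = boot))[["inv_MAPE"]]
})

# 90% percentile intervals.
ci_resid <- quantile(resid_slopes, c(0.05, 0.95))
ci_cases <- quantile(case_slopes, c(0.05, 0.95))

rbind(
  conventional = as.numeric(conventional),
  residuals    = as.numeric(ci_resid),
  cases        = as.numeric(ci_cases)
)
```

Residual resampling relies on the fitted model form being correct with exchangeable errors, while case resampling makes no such assumption; comparing the three rows is a quick check on how much those assumptions matter.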

5. (10 pts) Kernel regression of returns on MAPE:

(a) (3 pts) Use npreg to estimate a kernel regression of the returns on MAPE. What are
the bandwidth and the cross-validated MSE?

(b) (7 pts) Use resampling of residuals to get 90% confidence bands for the kernel
regression and add the confidence bands to the plot obtained in Question 2. Is it better than
fitting a linear regression of returns on MAPE?
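A rough sketch of Question 5 on simulated data is given below. It assumes the `np` package is installed; the data frame `sim`, the evaluation grid, and the replicate count `B` are hypothetical. Two simplifications are made to keep it fast and should be read as assumptions: the bandwidth is selected once and reused in every bootstrap refit (re-selecting it each time would better reflect bandwidth uncertainty), and the bands are pointwise percentile bands. The objective value stored in `bws$fval` is taken here to be the least-squares cross-validation score from the default bandwidth selector.

```r
library(np)                     # assumes the np package is installed
options(np.messages = FALSE)    # silence progress messages

# Simulated data (the real data file is not reproduced here).
set.seed(437)
n <- 200
sim <- data.frame(MAPE = runif(n, min = 5, max = 45))
sim$Return_10_fwd <- -0.01 + 1 / sim$MAPE + rnorm(n, sd = 0.02)

# Kernel regression; the bandwidth is chosen by cross-validation.
kfit <- npreg(Return_10_fwd ~ MAPE, data = sim)
bandwidth <- kfit$bws$bw    # selected bandwidth
cv_score  <- kfit$bws$fval  # objective value at the selected bandwidth

# Evaluation grid for the curve and the bands.
grid <- data.frame(MAPE = seq(5, 45, length.out = 50))
curve_hat <- predict(kfit, newdata = grid)

# Residuals computed directly from the fitted values.
fitted_vals <- fitted(kfit)
resids <- sim$Return_10_fwd - fitted_vals

# Residual bootstrap with the bandwidth held fixed (a simplification).
B <- 200
boot_curves <- matrix(NA_real_, nrow = nrow(grid), ncol = B)
for (b in seq_len(B)) {
  y_boot <- fitted_vals + sample(resids, replace = TRUE)
  refit <- npreg(bws = kfit$bws, txdat = sim["MAPE"], tydat = y_boot,
                 exdat = grid)
  boot_curves[, b] <- refit$mean
}

# Pointwise 90% percentile bands (rows: 5% and 95% quantiles).
bands <- apply(boot_curves, 1, quantile, probs = c(0.05, 0.95))

plot(sim$MAPE, sim$Return_10_fwd, xlab = "MAPE", ylab = "Return_10_fwd")
lines(grid$MAPE, curve_hat, col = "red", lwd = 2)
lines(grid$MAPE, bands[1, ], col = "red", lty = 2)
lines(grid$MAPE, bands[2, ], col = "red", lty = 2)
```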

Fall 2021