ECON 416/516 – Elias
Exam 1 – Dynamic Causal Effects (155 points total)
Please submit all R code along with your answers.
1. The file “traffic2.RData” is a dataset that contains 108 monthly observations on automobile accidents, traffic laws, and some other variables for California from January 1981 through December 1989. Use this dataset to answer these questions.
(a) (5 points) During what month and year did California’s seat belt law, which requires drivers to wear seat belts, take effect? When did the highway speed limit increase to 65 miles per hour?
(b) (15 points) Regress the variable ltotacc, which is the natural logarithm of the variable totacc (i.e., statewide total accidents), on a linear time trend and 11 monthly dummy variables (which are in the data set), using January as the base month. That is, run the following regression using ordinary least squares (OLS):
ltotacc_t = β_0 + β_1 t + β_2 feb + β_3 mar + β_4 apr + β_5 may + β_6 jun + β_7 jul + β_8 aug + β_9 sep + β_{10} oct + β_{11} nov + β_{12} dec
Interpret the coefficient estimate on the time trend. Also, test for seasonality in total accidents by using an F test on the monthly dummy variables.
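The mechanics of this kind of regression and joint test can be sketched on simulated data. The block below is a minimal illustration only: it does not use traffic2.RData, the series y and all numeric values are made up, and the monthly dummies come from a hypothetical factor rather than the dummy columns in the data set.

```r
# Simulated stand-in for a 108-month log series (NOT the traffic2 data).
set.seed(1)
n <- 108
trend <- 1:n
# Factor with Jan as the first level, so January is the base month.
month <- factor(rep(month.abb, length.out = n), levels = month.abb)
season <- 0.05 * sin(2 * pi * as.integer(month) / 12)   # made-up seasonal pattern
y <- 10 + 0.003 * trend + season + rnorm(n, sd = 0.03)  # made-up log series

unrestricted <- lm(y ~ trend + month)  # trend plus 11 monthly dummies
restricted   <- lm(y ~ trend)          # all monthly coefficients set to zero

# Joint F test of H0: the 11 monthly dummy coefficients are all zero.
f_test <- anova(restricted, unrestricted)
print(f_test)

# With a log dependent variable, 100 * (trend coefficient) is the
# approximate percentage change per month.
trend_pct <- 100 * unname(coef(unrestricted)["trend"])
```

The F test compares the restricted and unrestricted sums of squared residuals, so it has 11 numerator degrees of freedom here.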
(c) (10 points) Add to the regression from part (b) the variables wkends, unem, spdlaw, and beltlaw. That is, run the following regression using OLS:
ltotacc_t = β_0 + β_1 t + β_2 feb + β_3 mar + β_4 apr + β_5 may + β_6 jun + β_7 jul + β_8 aug + β_9 sep + β_{10} oct + β_{11} nov + β_{12} dec + β_{13} wkends + β_{14} unem + β_{15} spdlaw + β_{16} beltlaw
Interpret the coefficient on the unemployment variable (unem). Note that the unemployment variable is in percentage points. Do its sign and magnitude make sense to you?
(d) (10 points) In the regression from part (c), interpret the coefficients on spdlaw and beltlaw. Are the estimated effects what you expected? Explain.
(e) (5 points) The variable prcfat is the percentage of accidents resulting in at least one fatality. Note that this variable is in percentage points (i.e., not a proportion). What is the average value of prcfat over this time period? Does the magnitude seem about right?
(f) (15 points) Run the regression in part (c) but use prcfat as the dependent variable in place of ltotacc. That is, run the following regression using OLS:
prcfat_t = β_0 + β_1 t + β_2 feb + β_3 mar + β_4 apr + β_5 may + β_6 jun + β_7 jul + β_8 aug + β_9 sep + β_{10} oct + β_{11} nov + β_{12} dec + β_{13} wkends + β_{14} unem + β_{15} spdlaw + β_{16} beltlaw
Discuss the estimated effects and significance of the speed (spdlaw) and seat belt law (beltlaw) variables.
(g) (10 points) Compute the first-order autocorrelation coefficients for the variables prcfat and unem. Are you concerned that either of these variables contains a unit root?
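A first-order autocorrelation can be computed in more than one way. The sketch below uses a made-up persistent series x (not prcfat or unem) and shows two common base-R approaches, which generally give close but not identical numbers.

```r
# Made-up AR(1) series with high persistence (NOT from traffic2).
set.seed(2)
n <- 108
x <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))

# Option 1: sample autocorrelation at lag 1 from acf().
# The first element of $acf is lag 0, the second is lag 1.
rho_acf <- acf(x, lag.max = 1, plot = FALSE)$acf[2]

# Option 2: slope from regressing x_t on x_{t-1}.
rho_ols <- unname(coef(lm(x[-1] ~ x[-n]))[2])

print(c(acf = rho_acf, ols = rho_ols))
```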
(h) (15 points) Estimate a multiple regression model relating the first difference of prcfat (i.e., ∆prcfat) to the same variables in part (f), except you should first difference the unemployment rate, too. In other words, run an OLS regression with ∆prcfat as the dependent variable, and the following independent variables: an intercept, a linear time trend, all of the monthly dummy variables, wkends, ∆unem (i.e., the first difference of unem), spdlaw, and beltlaw. Do you find any interesting results? (Hint: convert the data to time series objects, use the R function “diff” to create the differenced variables, and use the R function “dynlm” to run the regression with the time series objects. Also, note that for monthly data, the “frequency” option in the “ts” function should be set equal to 12 (i.e., 12 months per year).)
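The time series mechanics mentioned in the hint can be sketched in base R. The example below uses two made-up series, y and x, and fits the differenced regression with lm after aligning dates. This is an assumption made to keep the sketch free of add-on packages; dynlm accepts the time series objects directly.

```r
# Made-up monthly series starting January 1981 (NOT from traffic2).
set.seed(3)
n <- 108
y <- ts(cumsum(rnorm(n)), start = c(1981, 1), frequency = 12)
x <- ts(cumsum(rnorm(n)), start = c(1981, 1), frequency = 12)

dy <- diff(y)  # first difference; begins in February 1981, length n - 1
dx <- diff(x)

# ts.intersect() keeps only the dates common to all supplied series.
dat <- as.data.frame(ts.intersect(dy, dx))
fit <- lm(dy ~ dx, data = dat)
print(coef(fit))
```

Differencing drops the first observation, so the regression uses 107 rather than 108 months.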
(i) (5 points) Comment on the following statement: “We should always first difference any time series we suspect of having a unit root before doing multiple regression because it is the safe strategy and should give results similar to using the levels”.
(j) (10 points) In the regression of part (f), test the errors for AR(1) serial correlation using a test that assumes strictly exogenous regressors (i.e., by running a regression of the residuals on the lagged residuals). Does it make sense to use the test that assumes strict exogeneity of the regressors? (Hint: Make sure the residuals are a time series object and then use the R functions “lag” and “dynlm”.)
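The residual-on-lagged-residual regression described here can be illustrated on simulated data. In this sketch y, x, and the AR(1) errors u are all made up, and plain vector indexing stands in for the “lag” and “dynlm” approach in the hint.

```r
# Made-up regression with AR(1) errors by construction (NOT from traffic2).
set.seed(4)
n <- 108
x <- rnorm(n)
u <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- 1 + 0.5 * x + u

uhat <- resid(lm(y ~ x))

# Regress u_hat_t on u_hat_{t-1}; the t statistic on the lagged residual
# tests H0: rho = 0 (no AR(1) serial correlation).
ar1 <- lm(uhat[-1] ~ uhat[-n])
rho_hat <- unname(coef(ar1)[2])
t_stat  <- unname(coef(summary(ar1))[2, "t value"])
print(c(rho_hat = rho_hat, t_stat = t_stat))
```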
(k) (10 points) In the regression for part (f), obtain serial correlation and heteroskedasticity robust (i.e., Newey-West) standard errors for the coefficients on spdlaw and beltlaw, using four lags in the Newey-West estimator. How does this affect the statistical significance of the two policy variables spdlaw and beltlaw compared to what was obtained in part (f)?
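To show what the Newey-West estimator computes, the sketch below builds it by hand in base R for a made-up regression, using Bartlett weights and four lags with no small-sample adjustment. The hand-rolled formula is an illustrative assumption; in practice a packaged routine such as NeweyWest in the sandwich package would normally be used, and its default settings may differ from this sketch.

```r
# Made-up regression with serially correlated errors (NOT from traffic2).
set.seed(5)
n <- 108
x <- rnorm(n)
u <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- 1 + 0.5 * x + u
fit <- lm(y ~ x)

X <- model.matrix(fit)
Z <- X * resid(fit)   # row t is x_t * u_hat_t
L <- 4                # number of lags in the Bartlett kernel

S <- crossprod(Z)     # lag-0 term: sum of u_hat_t^2 * x_t x_t'
for (l in 1:L) {
  w <- 1 - l / (L + 1)                           # Bartlett weight
  G <- crossprod(Z[(l + 1):n, , drop = FALSE],
                 Z[1:(n - l), , drop = FALSE])   # lag-l cross products
  S <- S + w * (G + t(G))
}

bread <- solve(crossprod(X))
V_nw  <- bread %*% S %*% bread   # sandwich form of the covariance matrix
se_nw  <- sqrt(diag(V_nw))
se_ols <- sqrt(diag(vcov(fit)))
print(cbind(se_ols, se_nw))
```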
(l) (10 points) Now, estimate the same model as in part (f) but using Prais-Winsten estimation and compare the estimates with the OLS estimates. Are there important changes in the policy variable coefficients (spdlaw and beltlaw) or their statistical significance?
(m) (10 points) Using the standard Dickey-Fuller regression (i.e., equation (13) in lecture 07), test whether ltotacc has a unit root. That is, estimate the following regression model:
∆ltotacc_t = α + θ ltotacc_{t−1} + e_t
Can you reject a unit root at the 2.5% level?
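The Dickey-Fuller regression itself is an ordinary OLS regression; only the critical values are nonstandard. A minimal sketch on a made-up random walk y (not ltotacc) follows.

```r
# Made-up random walk (NOT from traffic2).
set.seed(6)
n <- 108
y <- cumsum(rnorm(n))

dy   <- diff(y)  # change in y for t = 2..n
y_l1 <- y[-n]    # lagged level y_{t-1}, aligned with dy

df_reg <- lm(dy ~ y_l1)
theta_hat <- unname(coef(df_reg)["y_l1"])
df_stat   <- unname(coef(summary(df_reg))["y_l1", "t value"])
print(c(theta_hat = theta_hat, df_stat = df_stat))
```

The t statistic on the lagged level is compared with Dickey-Fuller critical values rather than the usual t or normal tables, because under the null of a unit root it does not have a standard distribution.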
(n) (10 points) Now, add two lagged changes of ltotacc to the test from part (m) and compute the augmented Dickey-Fuller test. That is, estimate the following regression model:
∆ltotacc_t = α + θ ltotacc_{t−1} + γ_1 ∆ltotacc_{t−1} + γ_2 ∆ltotacc_{t−2} + e_t
What do you conclude?
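Aligning the lagged changes is the step most prone to indexing mistakes. The sketch below builds an augmented regression with two lagged changes for a made-up random walk y (not ltotacc), using base R's embed to line up the columns.

```r
# Made-up random walk (NOT from traffic2).
set.seed(7)
n <- 108
y  <- cumsum(rnorm(n))
dy <- diff(y)             # dy[i] = y[i + 1] - y[i]

# embed() places the current change and its first two lags in columns 1 to 3.
lagged <- embed(dy, 3)
dy_t  <- lagged[, 1]      # change at t
dy_l1 <- lagged[, 2]      # change at t - 1
dy_l2 <- lagged[, 3]      # change at t - 2
y_l1  <- y[3:(n - 1)]     # level at t - 1 matching each row

adf <- lm(dy_t ~ y_l1 + dy_l1 + dy_l2)
adf_stat <- unname(coef(summary(adf))["y_l1", "t value"])
print(adf_stat)
```

Two lagged changes plus the initial difference cost three observations, so the regression uses n − 3 rows.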
(o) (10 points) Add a linear time trend to the augmented Dickey-Fuller regression from part (n). What do you conclude? (Hint: Use the R function “adfTest”.)
(p) (5 points) Given the findings from parts (m) through (o), what would you say is the best characterization of ltotacc: an I(1) process or an I(0) process about a linear time trend?