ECON0019: QUANTITATIVE ECONOMICS AND ECONOMETRICS EMPIRICAL PROJECT 2021
For questions 1–4 you will need to collapse the data to the current weekend level and focus on the sum of box offices for that weekend. To do so, use the following line:
collapse (sum) tickets, by(sat_date week year)
1. Plot the time series for ticket sales. Do you see any seasonal patterns or trends?
HINT: Though there are other ways of achieving this, you might want to declare that the data are a time series observed at a weekly frequency, so that you can use commands like tsline. This is how you do it: tsset sat_date, daily delta(7).
ANSWER: Yes, ticket sales tend to be higher around summer and mid- to late-November (i.e., Thanksgiving).
2. To examine the possibility of a trend, regress tickets on a constant and a set of year dummies. What are the F-statistic and its p-value? What does this suggest regarding the existence of a trend? (HINT: The commands tabulate year, generate(d_year) and tabulate week, generate(d_week) generate year and week dummy variables, respectively, to include as regressors. Alternatively, you can add i.year and i.week directly as regressors. These will automatically omit a base year or week. If you prefer to include all of them and omit the constant, use instead ibn.year and ibn.week and the option noconstant in the regression.)
ANSWER: A regression of tickets on a constant and year dummies (taking 2002 as the omitted dummy) produces an F-statistic of 1.18 (or 0.39 allowing for serial correlation in the residuals). The p-value is 0.3018 (or 0.9507 allowing for serial correlation in the residuals). One fails to reject the hypothesis that all year coefficients are zero, so there is no evidence of a trend.
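As a rough illustration of what the non-robust F-statistic measures, the following Python sketch computes it for made-up data. It relies on the fact that a regression on a constant and a full set of group dummies fits each group's mean. The function name dummy_f_stat and the toy numbers are assumptions for illustration only; they are not the course data, and the sketch does not cover the version that allows for serial correlation.

```python
from statistics import mean


def dummy_f_stat(groups):
    """F-statistic for H0: all group-dummy coefficients are zero.

    `groups` maps a label (for example a year) to the outcomes in that group.
    With a constant plus group dummies, fitted values are group means, so the
    unrestricted residuals are deviations from each group's mean.
    """
    all_y = [y for ys in groups.values() for y in ys]
    n, g = len(all_y), len(groups)
    grand_mean = mean(all_y)
    # Restricted model: constant only.
    ssr_restricted = sum((y - grand_mean) ** 2 for y in all_y)
    # Unrestricted model: constant plus dummies.
    ssr_unrestricted = sum(
        (y - mean(ys)) ** 2 for ys in groups.values() for y in ys
    )
    q = g - 1  # number of restrictions (dummies excluding the base group)
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - g))


# Made-up outcomes for three hypothetical years.
toy = {
    2002: [10.0, 12.0, 11.0],
    2003: [11.0, 13.0, 12.0],
    2004: [9.0, 12.0, 12.0],
}
f_stat = dummy_f_stat(toy)
print(round(f_stat, 3))
```

A small F-statistic, as in this toy case, means the group means explain little of the variation relative to the noise within groups.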
3. Now, to investigate any (within year) seasonality in ticket sales, regress tickets on week-of-the-year dummies. Is there evidence of seasonality? What is the p-value for the null hypothesis that all coefficients are equal?
ANSWER: A test for the hypothesis is the F-test for a regression of tickets on a constant and week-of-the-year dummies (with one omitted category), or the joint test of whether the coefficients on the week-of-the-year dummies (with one omitted category) are zero in a regression on a constant, week-of-the-year dummies and year dummies. In either case, whether or not one allows for serial correlation in the residuals, the p-value for this test is 0.0000. There is thus evidence of seasonality within the year.
4. Estimate an AR(1) model for the residuals obtained after removing weekly seasonal effects. In other words, estimate
tickets_residual_t = γ0 + γ1 · tickets_residual_{t−1} + ε_t.
Is there any evidence of higher viewership in one week correlating with higher attendance in the subsequent week? (Hint: The lag of the variable tickets_residual is obtained as L.tickets_residual in Stata.)
ANSWER: The estimate for γ1 is 0.3449. The t-statistic is 7.20 (or 6.05 allowing for serial correlation in the residuals), so the estimate is statistically significant at usual levels. There is thus evidence that higher viewership in one week is correlated with higher viewership in the subsequent week.
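The AR(1) regression is simply an OLS regression of the series on its own first lag. A minimal Python sketch on a made-up series, assuming nothing about the course data (the function name ar1_ols and the numbers are hypothetical):

```python
from statistics import mean


def ar1_ols(series):
    """Return (intercept, slope) from an OLS regression of x_t on x_{t-1}."""
    lagged, current = series[:-1], series[1:]
    m_lag, m_cur = mean(lagged), mean(current)
    s_xy = sum((x - m_lag) * (y - m_cur) for x, y in zip(lagged, current))
    s_xx = sum((x - m_lag) ** 2 for x in lagged)
    slope = s_xy / s_xx
    return m_cur - slope * m_lag, slope


# Made-up series that satisfies x_t = 0.5 * x_{t-1} exactly.
toy = [8.0, 4.0, 2.0, 1.0, 0.5]
gamma0, gamma1 = ar1_ols(toy)
print(gamma0, gamma1)
```

Note that the first observation has no lag, so the regression uses one observation fewer than the series contains.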
For questions 5–8 please use the original sample in ECON00192021.dta.
5. Does nice temperature (as measured by temperature75) predict the box office (ltickets) during the opening weekend? Check this while controlling for all holiday variables, day-of-week dummies, week-of-year dummies, and year dummies, and using robust standard errors in your regression.
ANSWER: Yes, the coefficient estimate is -1.016 (SE = 0.222), which is statistically significant at usual levels (t ≈ -4.6).
6. Does the opening weekend’s nice temperature predict the box office in the weekend that follows the opening one? Use the same controls, but additionally control for the current nice temperature. Why is this additional control a good idea?
ANSWER: Yes, the coefficient estimate is -1.116 (SE = 0.223), which is statistically significant at usual levels (t ≈ -5.0).
7. Estimate the elasticity of the second-weekend box office with respect to the opening weekend’s one, using the opening weekend’s nice temperature as an IV. Keep the same controls from question 6. Interpret the coefficient magnitude. Is your IV coefficient equal to the ratio of the coefficients in questions 6 and 5? Why or why not?
ANSWER: The coefficient estimate of 0.946 (SE = 0.087) is the elasticity of the second-weekend box office with respect to the opening-weekend box office. It is close to one, meaning that an opening-weekend box office that is twice as large (for exogenous reasons) makes the following weekend’s box office nearly twice as large, too. The coefficient does not equal the ratio -1.116/-1.016 because the controls in Q5 are different from the controls in the first stage of this IV. Current temperature is now controlled for, which is a good idea because weather may be autocorrelated, and current weather may have its own effect on the box office. (In addition, the holiday controls are also different: they correspond to the second weekend here, but to the opening weekend in Q5.)
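To see why the ratio would match only if the specifications were identical, recall that with a single instrument the IV estimate equals the reduced-form coefficient divided by the first-stage coefficient, provided both come from regressions with the same controls and sample. The Python sketch below checks this identity on made-up data with no controls; the variable names z, x, y and all values are hypothetical, and only the last line uses numbers from the answers above.

```python
from statistics import mean


def slope(regressor, outcome):
    """OLS slope from a regression of `outcome` on a constant and `regressor`."""
    m_r, m_o = mean(regressor), mean(outcome)
    s_ro = sum((r - m_r) * (o - m_o) for r, o in zip(regressor, outcome))
    s_rr = sum((r - m_r) ** 2 for r in regressor)
    return s_ro / s_rr


def iv_slope(z, x, y):
    """Just-identified IV estimate with one instrument z and no other controls."""
    m_z, m_x, m_y = mean(z), mean(x), mean(y)
    c_zy = sum((a - m_z) * (c - m_y) for a, c in zip(z, y))
    c_zx = sum((a - m_z) * (b - m_x) for a, b in zip(z, x))
    return c_zy / c_zx


# Made-up data: z is the instrument, x the endogenous regressor, y the outcome.
z = [0.1, 0.4, 0.2, 0.8, 0.5, 0.9]
x = [5.2, 4.6, 5.1, 4.0, 4.9, 4.1]
y = [5.0, 4.5, 4.7, 3.9, 4.4, 4.2]

first_stage = slope(z, x)    # x on z
reduced_form = slope(z, y)   # y on z
iv = iv_slope(z, x, y)
print(iv, reduced_form / first_stage)  # identical up to rounding error

# Ratio of the reported Q6 and Q5 coefficients, which use different controls.
naive_ratio = -1.116 / -1.016
print(round(naive_ratio, 3))
```

In the toy data the two numbers coincide because the first stage and reduced form share one specification. The ratio of the Q6 and Q5 coefficients is about 1.098 rather than 0.946 because Q5 is not the first stage of the Q7 regression.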
8. Estimate the elasticity from question 7 while using two instrumental variables from the opening weekend: nice temperature and rain. (Add a control for the current rain on top of all previous controls.) You should find a coefficient similar to that from question 7. What do you learn from this finding? (Hint: pay attention to the first stage.)
ANSWER: The coefficient estimate is 0.894 (SE = 0.085). It is similar to the Q7 estimate; indeed, the overidentification test (the Hansen J statistic reported by ivreg2) has a p-value of 0.128, so the null is not rejected. However, nothing can be learned from it. In the first stage, open_rain has a small coefficient of 0.121 and a t-statistic of 1.11, so it is not significantly different from zero and does not contribute much to the first-stage fitted value. Therefore, it is purely mechanical that the second stage produces a similar coefficient.
For questions 9–11 please use the sample ECON00192021.dta but keep the data for opening Saturdays only. You should have 557 observations.
9. Generate successful as a dummy equal to one if the box office is at least USD 3 million. Estimate the average partial effect of nice temperature on successful using the logit, probit, and linear probability models, and interpret their magnitudes. Control for the holiday variables, week-of-year dummies, and the year dummies.
ANSWER: The APE for logit is -0.655; for probit it is -0.675. For the LPM, the APE equals the coefficient on temperature75, which is -0.564. The APE of -0.655 means that on a day when the temperature is nice in an additional 10p.p. of the country, 6.55p.p. fewer releases are successful. Another way to interpret it is that an additional standard deviation of temperature75, which is 0.068, leads to 4.49p.p. fewer successful releases. It is a sizable effect, relative to the 38.8% of releases that are successful on average.
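For a continuous regressor in a logit model, the APE is the coefficient multiplied by the sample average of Λ(x'b)(1 − Λ(x'b)), where Λ is the logistic CDF and x'b is each observation's fitted index. The Python sketch below shows that calculation; the coefficient and index values are hypothetical and are not estimates from the course data.

```python
from math import exp
from statistics import mean


def logistic(v):
    """Logistic CDF."""
    return 1.0 / (1.0 + exp(-v))


def logit_ape(beta, index_values):
    """APE of a continuous regressor with logit coefficient `beta`.

    `index_values` holds each observation's fitted index x_i'b.
    """
    return beta * mean(logistic(v) * (1.0 - logistic(v)) for v in index_values)


# Hypothetical coefficient and fitted indices (made up for illustration).
ape = logit_ape(-3.0, [-1.0, 0.0, 1.0])
print(round(ape, 4))
```

Because Λ(1 − Λ) is at most 0.25, the logit APE is always smaller in absolute value than a quarter of the coefficient, which is why APEs rather than raw coefficients are compared with the LPM.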
10. For how many observations does each of these three models predict a probability of success which is outside the [0, 1] interval?
ANSWER: There are 67 such observations for LPM. This cannot happen with probit and logit.
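The reason is that the LPM's fitted probability is the linear index itself, which is unbounded, whereas logit and probit pass the index through a CDF that lies strictly between 0 and 1. A small Python sketch with made-up index values (the same hypothetical numbers are reused for both models purely for illustration):

```python
from math import exp


def count_outside_unit_interval(probabilities):
    """Number of fitted probabilities below 0 or above 1."""
    return sum(1 for p in probabilities if p < 0.0 or p > 1.0)


# Hypothetical fitted indices x_i'b (made up for illustration).
index_values = [-0.3, 0.1, 0.6, 1.2, 0.9]

lpm_fitted = index_values  # LPM: the fitted probability is the index itself
logit_fitted = [1.0 / (1.0 + exp(-v)) for v in index_values]  # logistic CDF

print(count_outside_unit_interval(lpm_fitted))
print(count_outside_unit_interval(logit_fitted))
```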
11. Compute the sample mean of successful. We say that each of the models makes a mistake if it predicts a probability of success above the sample mean while successful = 0, or the other way round. (Note that we use the sample mean, rather than 0.5, as a threshold.) Report the number of mistakes each of the three models makes.
ANSWER: LPM makes 142 mistakes. Some of you may have noticed that logit and probit dropped 116 observations, for which the outcome is perfectly predicted. These should be viewed as correct predictions. Among the remaining 441 observations, logit makes 159 mistakes while probit makes 157, so the two are more similar to each other than to LPM. If you didn’t pay attention to the missing observations, your answer should have been 243 and 241 mistakes. We will consider both answers correct but give bonus points to those who understood the issue with the missing predictions.
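The counting rule can be sketched in Python as follows. The data are made up, the function name count_mistakes is hypothetical, and two choices are assumptions: a missing prediction is represented by None and counted as correct (following the answer above), and a prediction exactly equal to the sample mean is treated as predicting failure.

```python
from statistics import mean


def count_mistakes(outcomes, predicted):
    """Count observations where the prediction and the outcome disagree.

    A prediction above the sample mean of `outcomes` counts as predicting
    success. Missing predictions (None) are skipped, i.e. treated as correct.
    """
    threshold = mean(outcomes)
    mistakes = 0
    for y, p in zip(outcomes, predicted):
        if p is None:  # observation dropped because it is perfectly predicted
            continue
        predicts_success = p > threshold
        if predicts_success != (y == 1):
            mistakes += 1
    return mistakes


# Made-up outcomes and fitted probabilities; the last prediction is missing.
outcomes = [1, 0, 0, 1, 0]
predicted = [0.7, 0.5, 0.2, 0.3, None]
n_mistakes = count_mistakes(outcomes, predicted)
print(n_mistakes)
```

In Stata, a missing numeric value compares as larger than any number, so a condition such as "prediction above the mean" is true for missing predictions unless they are excluded explicitly. That is one way the dropped observations can end up counted as mistakes.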