Midterm 2022 – EEP 118/ IAS 118 – Villas-Boas
Name _________________________ SID _______________________ GSI ___________________
Instructions: Please answer questions in the boxes and spaces provided. This is an open book midterm. To receive full credit, answers must include a correct answer, demonstrate all steps used to obtain the answer, and be uploaded to Gradescope within the allotted deadline, correctly indicating on which pages your answers to each question can be found. Link to Jupyter Notebook version. 75 Points Total
Question 1. (10 points) In January 2020 Harvard School of Public Health and Politico produced a report based on a nationally representative survey of 1004 randomly selected U.S. adults contacted via phone, focused on attitudes towards Build Back Better legislation. The responses to the question below posed to half the sample, N=502, will be used in your answer to the questions following.
Each table cell represents the percentage of respondents for a given category (total respondents, and Democrats, Republicans, or Independents separately) who agreed with the row’s statement.
a) (5 points) Please test at the 10% significance level whether the true proportion of all respondents in January 2020 (502 respondents) that believe Build Back Better (BBB) components will slow climate change (NET) (that is, respondents that answered a lot or a little) is equal to 50%. Use the five steps of hypothesis testing. Round to 4 decimal places.
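The mechanics of the test in part a) can be sketched in Python as follows. This is a minimal illustration and not the official solution: it assumes a two-sided z-test with the standard error computed under the null, while a course may instead use the sample-based standard error or a t critical value. The function name `proportion_z_test` and the value `p_hat = 0.55` are hypothetical; the actual sample proportion must be read from the survey table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, n, p0=0.5, alpha=0.10):
    """Two-sided z-test of H0: p = p0 against H1: p != p0."""
    # Standard error evaluated under the null hypothesis.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject H0 when the statistic falls outside +/- z_crit.
    return round(z, 4), round(z_crit, 4), abs(z) > z_crit


# p_hat = 0.55 is a placeholder, NOT the survey value; n = 502 is from the question.
z, z_crit, reject = proportion_z_test(p_hat=0.55, n=502)
print(z, z_crit, reject)
```

Replacing the placeholder with the NET share from the table gives the test statistic to compare against the critical value at the 10% level.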
b) (5 points) Within the subset of N=111 Republican respondents, construct a 99% confidence interval for the true proportion in this subpopulation that agree that BBB will not make much difference (denote this as p_Rep_not) to slow climate change.
99% Confidence interval p_Rep_not = [ _____________________ , ___________________] Round your final answer to 4 decimal places.
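A confidence interval of this kind can be sketched in Python as below. This is a rough illustration under the assumption that the normal-approximation (Wald) interval is intended. The function name `proportion_ci` and the value `p_hat = 0.40` are hypothetical; the real share comes from the Republican column of the survey table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.99):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    # Standard error based on the sample proportion.
    se = sqrt(p_hat * (1 - p_hat) / n)
    # Two-sided critical value for the requested confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return round(p_hat - z_crit * se, 4), round(p_hat + z_crit * se, 4)


# p_hat = 0.40 is a placeholder, NOT the survey value; n = 111 is from the question.
lower, upper = proportion_ci(p_hat=0.40, n=111)
print(lower, upper)
```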
Question 2 (45 points)
lm(y ~ upwind + downwind + age + educ + inc, mydata)
a – h) (5 points each) Please complete the missing elements in the Stata output below from a regression of
Y = b0 + b1 Upwind Pollution (PM2.5) + b2 Downwind Pollution + b3 age + b4 educ + b5 Inc + u
where Y is a measure of Mental Recall in Cognitive tests.
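[Stata regression output table. The cell layout was lost in extraction; the surviving labels and values are listed below in their extracted order.]
Header labels: Sum of Squares Explained (SSE); Sum of Squared Residuals (SSR); Sum of Squares Total (SST); Number Observations; R squared; b) Degrees of Freedom; Std. Error; Conf. Interval
Row labels and the value printed next to each:
Upwind PM2.5 due to Fires
Downwind PM2.5 due to Fires
Age -0.050
Education 0.065
Income 0.063
Constant 2.510
Remaining cell values:
0.315 -0.007
0.204 5.541
-0.194 0.157
0.079 1.990
0.024 2.680 0.010 f) 0.070 0.890 -0.079 1.508 1.660 0.102 -0.521
0.089 0.052 0.024
-0.418 -0.001 -0.094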
Answers to a) – h)
i) (5 points) If I were to run the same regression with a variable GPA (grade point average) as an additional explanatory variable, the regression of mental recall on GPA, age, education, income, downwind pollution, and upwind pollution has an R-squared of 0.99. Adding GPA to the regression model will not change the coefficients much compared to the estimates of the table you filled out, but the standard errors will increase a lot for the variables in the regression model.
Circle True or False or Depends: TRUE / FALSE / DEPENDS: concisely explain in 2 sentences. ___________________________________________________________________________________ ___________________________________________________________________________________ ___________________________________________________________________________________
j) (5 points) Using the R output below, test whether age and educ jointly influence the mental recall test outcome Y. Show your work using the five steps of hypothesis testing. Use a significance level of α=0.10.
reg1 <- lm(y ~ upwind + downwind + educ + age, mydata)
reg2 <- lm(y ~ upwind + downwind, mydata)
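The joint test compares the unrestricted model (reg1, with four regressors) against the restricted model (reg2, which drops educ and age, so there are q = 2 restrictions). A minimal Python sketch of the R-squared form of the F statistic follows. The function name `f_statistic` and every numeric input are hypothetical placeholders, since the R output itself is not reproduced here; the critical value would come from an F(q, n-k-1) table at the 10% level.

```python
def f_statistic(r2_unrestricted, r2_restricted, n, k_unrestricted, q):
    """F statistic for q exclusion restrictions, R-squared form."""
    # Denominator degrees of freedom of the unrestricted model.
    df_denominator = n - k_unrestricted - 1
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / df_denominator
    return numerator / denominator, df_denominator


# All numeric inputs are placeholders, NOT values from the exam's R output.
f_stat, df2 = f_statistic(
    r2_unrestricted=0.30, r2_restricted=0.20, n=100, k_unrestricted=4, q=2
)
print(round(f_stat, 4), df2)
```

The R-squared form is valid only when both regressions use the same dependent variable and the same sample; the equivalent form based on the sums of squared residuals gives the same statistic.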
Question 3 (18 points) A researcher wishes to understand the effect of recent events on biotech companies’ stock prices. They collected data on stock prices for the major vaccine manufacturer in their country from October 2020 through February 2022. The testable hypothesis is whether, after controlling for other factors, the vaccine company stock price (measured in dollars) was affected by the spread of COVID-19 in that country and also affected by the FDA approvals of vaccination for children.
The researcher ran the following regression, where Yd = the log of stock price of the major vaccine manufacturer on day d. The researcher has N=500 days of data. log(Number Cases)d is the log of daily COVID-19 cases in the country, and FDAd is a dummy variable equal to one if the FDA approves vaccinations for children under 12 on day d, and equal to zero otherwise. The estimated parameters are shown below, with standard errors in parentheses underneath.
Yd = 0.37   - 0.06 log(Number Cases)d + 0.05 FDAd
     (0.07)   (0.01)                    (0.001)
R2 = 0.22   N = 500
a) (6 points) Interpret the coefficient on log(Number Cases) (remember to discuss Sign, Size, and Significance).
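The size and significance pieces of the interpretation reduce to simple arithmetic on the reported estimate and standard error, sketched below in Python. The 5% significance level and the use of the standard normal critical value (as an approximation to the t distribution with N = 500) are assumptions, not part of the question.

```python
from statistics import NormalDist

# Estimate and standard error on log(Number Cases) from the regression above.
beta_cases, se_cases = -0.06, 0.01

# t statistic for H0: coefficient equals zero.
t_stat = beta_cases / se_cases

# Assumed 5% two-sided test; the normal critical value approximates
# the t critical value when the sample is this large.
crit_5pct = NormalDist().inv_cdf(0.975)
significant = abs(t_stat) > crit_5pct

# Log-log specification: the coefficient is an elasticity, so a 10% rise
# in cases is associated with roughly beta * 10 percent change in price.
pct_change_price = beta_cases * 10

print(t_stat, round(crit_5pct, 3), significant, pct_change_price)
```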
b) (6 points) What would be the predicted level of the vaccine company’s stock price (measured in dollars) on an FDA approval day when the number of COVID-19 cases equals 60000? Show your work/explain.
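The prediction can be sketched in Python by plugging the stated values into the estimated equation and then undoing the log. Two assumptions are made here: "log" is taken to be the natural logarithm, and the retransformation is the naive one that simply exponentiates the predicted log price.

```python
from math import exp, log

# Estimates from the regression in this question.
b0, b_cases, b_fda = 0.37, -0.06, 0.05
# Scenario from part b): FDA approval day with 60000 cases.
cases, fda = 60000, 1

# Predicted log stock price (natural log assumed).
log_price_hat = b0 + b_cases * log(cases) + b_fda * fda

# Naive retransformation to dollars. It ignores the adjustment for the
# error term (for example exp(sigma^2 / 2)), which needs data not given here.
price_hat = exp(log_price_hat)

print(round(log_price_hat, 4), round(price_hat, 4))
```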
c) (3 points) The researcher may be concerned that the economy as a whole is doing poorly on certain days, for example due to the Russia-Ukraine crisis, and that this general decline also affects stock prices. Therefore, the researcher decides to add a Crisis dummy variable (ForeignCrisisd) to the above regression and obtains the new sample regression:
Yd = 0.05   - 0.5 log(Number Cases)d + 0.05 FDAd    - 0.09 ForeignCrisisd
     (0.07)   (0.01)                   (0.001)        (0.001)
R2 = 0.42   N = 500
Compare the coefficients on FDAd in the two regressions (the regression immediately above versus the regression for parts (a) and (b)). How does the coefficient change? Based on the estimated regression in (c), what can you say about the correlation between FDA and the Crisis variable? Show your work/explain.
d) (3 points) TRUE or FALSE: “Suppose the correlation between the log number of COVID-19 cases and the Unemployment Rate is positive, and that unemployment is negatively correlated with stock prices of companies in general, holding all else constant. If you add the unemployment rate into the regression, this will generate a larger estimate of the coefficient on the log number of COVID-19 cases.”
Circle one: TRUE FALSE
Explain your answer below in a maximum of two sentences and use equations if helpful.
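One equation that may help here is the standard omitted variable bias relation, written below in LaTeX. The symbols are introduced for illustration and do not appear in the exam: \tilde{\beta}_1 is the coefficient on the log number of cases when unemployment is left out, \hat{\beta}_1 and \hat{\beta}_2 are the coefficients on log cases and unemployment when both are included, and \tilde{\delta}_1 is the coefficient on log cases from a regression of unemployment on the included regressors.

```latex
\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \, \tilde{\delta}_1
```

The sign of the product \hat{\beta}_2 \tilde{\delta}_1 determines the direction in which the estimate moves once unemployment is added; under the signs stated in the question (\hat{\beta}_2 < 0 and \tilde{\delta}_1 > 0) the product is negative.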
Question 4. (2 points)
The R-squared (model fit) of a linear model between Y=Mental Recall, a constant, and x=Upwind PM2.5 is going to be very good (close to 1) given the above scatter plot. Indicate TRUE / FALSE and explain briefly:
______________________________________________________________________ ______________________________________________________________________
END OF EXAM