Midterm_2022
Indicate the Following:
Name: _ SID: __ GSI: ___
Midterm 2022 – EEP/IAS 118 – Villas Boas
Instructions: Please answer questions in the boxes and spaces provided. This is an open book midterm. To receive full credit, answers must include a correct answer, demonstrate all steps used to obtain the answer, and be uploaded to Gradescope within the allotted deadline, correctly indicating on which pages your answers to each question can be found. This exam does not require R. R may be used as a calculator or to obtain critical values from statistical tables, but no credit will be given for the use of confidence interval or hypothesis test functions. If R is used to complete steps on a problem, make sure to include the utilized code and output. Final answers must be placed in the appropriate text/Markdown cell – text/comments in coding cells will not be graded.
75 Points Total
Question 1 (Total: 10 Points)
In January 2020 the Harvard School of Public Health and Politico produced a report based on a nationally representative survey of 1004 randomly selected U.S. adults contacted via phone, focused on attitudes towards Build Back Better legislation. The questions that follow use the responses to the survey question below, which was posed to half the sample (N=502).
Each table cell represents the percentage of respondents for a given category (total respondents, and Democrats, Republicans, or Independents separately) who agreed with the row’s statement.
(a) (5 points) Please test at the 10% significance level whether the true proportion of all respondents in January 2020 (502 respondents) who believe Build Back Better (BBB) components will slow climate change (NET) (that is, respondents who answered "a lot" or "a little") is equal to 50%. Use the five steps of hypothesis testing. Round to 4 decimal places.
➡️ Type your answer / steps for Q1-(a) here. (Markdown Cell)
# Include any code used for Q1-(a) here only. (Coding Cell) Final answers do not belong in this cell.
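A minimal sketch of how R can serve as a calculator for the test in (a), without calling any built-in test function. The survey table is not reproduced in this text, so `p_hat` is a hypothetical placeholder rather than the reported NET percentage. The sketch assumes a two-sided z-test with the standard error evaluated under the null; if lecture used `p_hat * (1 - p_hat)` in the standard error instead, follow that convention.

```r
# Sketch only: p_hat is a HYPOTHETICAL placeholder, not the survey value.
n     <- 502    # respondents asked this question (from the problem)
p0    <- 0.50   # proportion under the null hypothesis (from the problem)
alpha <- 0.10   # significance level (from the problem)
p_hat <- 0.55   # hypothetical sample proportion; replace with the NET share

se0    <- sqrt(p0 * (1 - p0) / n)  # standard error under H0 (assumed convention)
z_stat <- (p_hat - p0) / se0       # test statistic
z_crit <- qnorm(1 - alpha / 2)     # two-sided critical value from the standard normal

round(c(se0 = se0, z_stat = z_stat, z_crit = z_crit), 4)
abs(z_stat) > z_crit               # TRUE means reject H0 at level alpha
```

The final comparison is the decision rule only; the written answer still needs all five steps, including the hypotheses and the conclusion in words.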
(b) (5 points) Within the subset of N=111 Republican respondents, construct a 99% confidence interval for the true proportion in this subpopulation who agree that BBB will not make much difference (denote this as p_Rep_not) to slow climate change.
Round your final answer to 4 decimal places.
Write Solution Here: 99% Confidence interval p_Rep_not = [ , ]
➡️ Type your answer / steps for Q1-(b) here. (Markdown Cell)
# Include any code used for Q1-(b) here only. (Coding Cell) Final answers do not belong in this cell.
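A minimal sketch of the interval arithmetic for (b), again using R only as a calculator. The value of `p_hat` is a hypothetical placeholder because the survey table is not reproduced in this text, and the sketch assumes a normal critical value; use a t critical value instead if that is the convention from lecture.

```r
# Sketch only: p_hat is a HYPOTHETICAL placeholder, not the survey value.
n     <- 111    # Republican respondents (from the problem)
conf  <- 0.99   # confidence level (from the problem)
p_hat <- 0.40   # hypothetical share answering "not much difference"

se   <- sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
crit <- qnorm(1 - (1 - conf) / 2)      # critical value (normal approximation assumed)

ci <- c(lower = p_hat - crit * se, upper = p_hat + crit * se)
round(ci, 4)
```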
Question 2 (Total: 45 Points)
(a) – (g) (5 points each) Please complete the missing elements in the Stata output below from a regression of
$$Y = \beta_0 + \beta_1 \text{Upwind Pollution (PM2.5)} + \beta_2 \text{Downwind} + \beta_3 \text{age} + \beta_4 \text{educ} + \beta_5 \text{Inc} + u$$
where $Y$ is a measure of Mental Recall in Cognitive tests.
➡️ Type your answer / steps for Q2-(a) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(b) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(c) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(d) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(e) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(f) here. (Markdown Cell)
➡️ Type your answer / steps for Q2-(g) here. (Markdown Cell)
# Include any code used for Q2-(a)~(g) here. (Coding Cell) Final answers do not belong in this cell.
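The missing Stata entries are typically linked by a handful of standard identities. The sketch below lists them in R as calculator steps. The Stata table is not reproduced in this text, so every number here is a hypothetical placeholder, and only `k = 5` (the number of regressors) comes from the model above.

```r
# All numeric inputs are HYPOTHETICAL placeholders, not values from the exam's Stata output.
n        <- 200    # hypothetical number of observations
k        <- 5      # regressors in the model (upwind, downwind, age, educ, inc)
ss_model <- 120    # hypothetical model (explained) sum of squares
ss_resid <- 380    # hypothetical residual sum of squares
coef_hat <- 0.8    # hypothetical coefficient estimate
se_hat   <- 0.25   # hypothetical standard error

ss_total <- ss_model + ss_resid        # total sum of squares
df_model <- k                          # model degrees of freedom
df_resid <- n - k - 1                  # residual degrees of freedom
ms_model <- ss_model / df_model        # mean squares: SS divided by df
ms_resid <- ss_resid / df_resid
f_stat   <- ms_model / ms_resid        # overall F statistic
r2       <- ss_model / ss_total        # R-squared
adj_r2   <- 1 - (ss_resid / df_resid) / (ss_total / (n - 1))  # adjusted R-squared
root_mse <- sqrt(ms_resid)             # Root MSE

t_stat <- coef_hat / se_hat            # t statistic for H0: beta = 0
t_crit <- qt(0.975, df_resid)          # critical value for a 95% interval
ci95   <- coef_hat + c(-1, 1) * t_crit * se_hat

round(c(F = f_stat, R2 = r2, adjR2 = adj_r2, RootMSE = root_mse, t = t_stat), 4)
round(ci95, 4)
```

Each identity can be rearranged to recover whichever entry is blank, for example `se_hat <- coef_hat / t_stat` when the standard error is the missing piece.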
(h) (5 points) If I were to run the same regression with a variable GPA (grade point average) as an additional explanatory variable, the regression of mental recall on GPA, age, education, income, downwind, and upwind pollution has an R-squared of 0.99. Adding GPA to the regression model will not change the coefficients much compared to the estimates of the table you filled out, but the standard errors will increase a lot for the variables in the regression model.
Indicate: True or False or Depends: TRUE / FALSE / DEPENDS. And concisely explain in a maximum of 2 sentences.
➡️ Type your answer / steps for Q2-(h) here. (Markdown Cell)
(i) (5 points) Using the R output below, test whether age and educ jointly influence the mental recall test outcome Y. Show your work using the five steps of hypothesis testing. Use a significance level of $\alpha = 0.10$.
Left Regression: $$reg1 \text{ <- } lm(y \sim upwind + downwind + educ + age, ~mydata)$$
Right Regression: $$reg2 \text{ <- } lm(y \sim upwind + downwind, ~mydata)$$
➡️ Type your answer / steps for Q2-(i) here. (Markdown Cell)
# Include any code used for Q2-(i) here. (Coding Cell) Final answers do not belong in this cell.
Question 3 (Total: 18 Points)
A researcher wishes to understand the effect of recent events on biotech companies’ stock prices. They collected data on stock prices for the major vaccine manufacturer in their country from October 2020 through February 2022. The testable hypothesis is whether, after controlling for other factors, the vaccine company stock price (measured in dollars) was affected by the spread of COVID-19 in that country and also affected by the FDA approvals of vaccination for children.
The researcher ran the following regression, where $Y_d$ is the log of the stock price of the major vaccine manufacturer on day $d$. The researcher has N=500 days of data. $\log(Number~Cases)_d$ is the log of daily COVID-19 cases in the country, and $FDA_d$ is a dummy variable equal to one if the FDA approves vaccinations for children under 12, and equal to zero otherwise. The estimated parameters are shown with standard errors in parentheses underneath.
$$Y_d = \underset{(0.07)}{0.37} - \underset{(0.01)}{0.06} \log(Number~Cases)_d + \underset{(0.001)}{0.05} FDA_d \qquad R^2 = 0.22, \quad N = 500$$
(a) (6 points) Interpret the coefficient on log(Number Cases) (remember to discuss Sign, Size, and Significance).
➡️ Type your answer / steps for Q3-(a) here. (Markdown Cell)
# Include any code used for Q3-(a) here. (Coding Cell) Final answers do not belong in this cell.
(b) (6 points) What would be the predicted level of the vaccine company’s stock price (measured in dollars) on an FDA approval day when the number of COVID-19 cases equals 60000? Show your work/explain.
➡️ Type your answer / steps for Q3-(b) here. (Markdown Cell)
# Include any code used for Q3-(b) here. (Coding Cell) Final answers do not belong in this cell.
(c) (3 points) The researcher may be concerned that the economy as a whole is generally doing poorly on certain days, for example due to the Russia-Ukraine crisis, and that this general decline also affects stock prices. Therefore, the researcher decides to add a Crisis dummy variable ($ForeignCrisis_d$) to the above regression and obtains the new sample regression:
$$Y_d = \underset{(0.07)}{0.05} - \underset{(0.01)}{0.5} \log(Number~Cases)_d + \underset{(0.001)}{0.05} FDA_d - \underset{(0.001)}{0.09} ForeignCrisis_d \qquad R^2 = 0.42, \quad N = 500$$
Compare the coefficients on $FDA_d$ in the two regressions (the regression immediately above versus the regression for parts (a) and (b)). How does the coefficient change? Based on the estimated regression in (c), what can you say about the correlation between $FDA$ and the Crisis variable? Show your work/explain.
➡️ Type your answer / steps for Q3-(c) here. (Markdown Cell)
# Include any code used for Q3-(c) here. (Coding Cell) Final answers do not belong in this cell.
(d) (3 points) TRUE or FALSE: “Suppose the correlation between the log number of COVID-19 cases and the Unemployment Rate is positive, and that unemployment is negatively correlated with stock prices of companies in general, holding all else constant. If you add the unemployment rate into the regression, this will generate a larger estimate of the coefficient on the log number of COVID-19 cases.” State TRUE or FALSE below, and explain your answer in a maximum of two sentences and use equations if helpful.
➡️ Type your answer / steps for Q3-(d) here. (Markdown Cell)
Question 4 (Total: 2 Points)
TRUE or FALSE: The R-squared (model fit) of a linear model between Y=Mental Recall, a constant, and x=Upwind PM2.5 is going to be very good (close to 1) given the above scatter plot.
Indicate: True or False or Depends: TRUE / FALSE / DEPENDS. And concisely explain in a maximum of 2 sentences.
➡️ Type your answer / steps for Q4 here. (Markdown Cell)