Midterm Exam
Econometrics I
[Write Your Name and Student ID Number Here]
October 21, 2020
Instructions
• Type all your answers and code in an Rmarkdown file and submit both the Rmarkdown file and a pdf file on Avenue.
• Time: Wednesday, October 21, 10 am – Thursday, October 22, 10 am in EDT. (If you are currently in a different time zone, please double-check the deadline.)
• This exam is composed of two parts (Analytical Questions and Empirical Questions).
• This is an open-book exam. You are permitted to use any materials from the class and
the textbook. However, you cannot consult with any person regarding the exam.
• Please answer questions with clear justification and/or with a relevant R code chunk.
• Proper formatting of the output files accounts for 10% of the marks. Please check the
following in your files:
– Is any answer truncated?
– Are the question numbers printed out correctly?
– Are the R chunks printed correctly?
– Did I submit both an Rmarkdown file and a pdf file?
• Should you have any clarification questions, please send me an email at shiny11@mcmaster.ca.
Part A: Analytical Questions
1. A pair of random variables (X1,X2) follows a bivariate normal distribution with the following moments E(X1) = 0, E(X2) = 1, var(X1) = 1, var(X2) = 1, and cov(X1, X2) = 0. Are X1 and X2 i.i.d.?
[Write your answer here]
2. In any year the weather can inflict storm damage to a home. From year to year, the damage is random. Let Y denote the dollar value of damage in any given year. Suppose that in 95% of the years Y = 0 but in 5% of the years Y = 30,000.
a. What are the mean and standard deviation of the damage in any year?
[Write your answer here]
b. Consider an insurance pool of 100 people whose homes are sufficiently dispersed so that, in any year, the damage to different homes can be viewed as independently distributed random variables. Let Ȳ denote the average damage to these 100 homes in a year. What is the expected value of the average damage, Ȳ?
[Write your answer here]
c. What is the probability that Ȳ exceeds $3000?
[Write your answer here]
3. Let Y1, …, Yn be i.i.d. draws from a population with mean μ. A test of H0: μ = 5 vs. H1: μ ≠ 5 using the usual t-statistic yields a p-value of 0.065. Does the 95% confidence interval contain μ = 5? What about the 90% confidence interval? Explain.
[Write your answer here]
4. You study the relationship between wage and education. Consider the following population regression:
wage_i = β0 + β1 educ_i + u_i,   u_i = educ_i · e_i
where wage_i is an hourly wage and educ_i is education measured in years. The random
variable e_i is independent of educ_i with E(e_i) = 0 and var(e_i) = σ_e^2.
a. Show that the conditional mean zero assumption is satisfied, i.e. E(u_i | educ_i) = 0.
[Your answer here]
b. Show that the conditional variance of the error term u_i is var(u_i | educ_i) = educ_i^2 · σ_e^2. Is the error term homoskedastic or heteroskedastic?
[Your answer here]
c. Why do you think the variance of wage_i increases with educ_i?
[Your answer here]
5. Consider the OLS estimator of a linear regression model:
y_i = β0 + β1 x_i + u_i,   i = 1, 2, …, n.
We have learned that under some regularity conditions the OLS estimator β̂1 converges to a normal distribution. Why are we interested in deriving the (asymptotic) distribution of β̂1?
[Your answer here]
6. From a random sample, you have the following results:
Ŷ = 2.6 + 4.2 X,   R² = 0.03,   SER = 2.7
   (3.4)  (2.1)
a. Test H0: β1 = 0 vs. H1: β1 ≠ 0 at the 5% level.
[Your answer here]
b. Suppose that you learned that Yi and Xi were independent. How can you explain the discrepancy between the test result in (a) and the independence between Yi and Xi?
[Your answer here]
Part B: Empirical Questions
1. The new management at a bakery claims that workers are now more productive than they were under old management, which is why wages have “generally increased.” We will test this claim using the two wage data sets.
a. The data set midterm-data1-old.csv contains a sample of 152 wages under old management. Estimate the mean, standard deviation, and 95% confidence interval.
[Your answer here]
b. The data set midterm-data1-new.csv contains a sample of 146 wages under new management. Estimate the mean, standard deviation, and 95% confidence interval.
[Your answer here]
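As an illustration of the kind of R chunk parts (a) and (b) call for, the following is a minimal sketch of a mean, standard deviation, and large-sample 95% confidence interval. It runs on simulated data, not on the exam files: the vector `wage` and its parameters are hypothetical, and with the real data one would first load the file (for example with `read.csv`) and use whatever column name that file actually contains.

```r
# Simulated stand-in for a wage sample (hypothetical values, not the exam data).
set.seed(1)
wage <- rnorm(150, mean = 20, sd = 4)

n  <- length(wage)
m  <- mean(wage)        # sample mean
s  <- sd(wage)          # sample standard deviation
se <- s / sqrt(n)       # standard error of the sample mean

# Large-sample 95% confidence interval based on the normal approximation.
ci_95 <- m + c(-1, 1) * qnorm(0.975) * se

round(c(mean = m, sd = s, lower = ci_95[1], upper = ci_95[2]), 3)
```

The interval uses the normal critical value; replacing `qnorm(0.975)` with `qt(0.975, df = n - 1)` gives the Student-t version, which is nearly identical at this sample size.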
c. Let wage_o be a random variable for the wage under old management and wage_n be that under new management, with E(wage_o) = μ_o and E(wage_n) = μ_n. Formally state the null hypothesis that there has been no change in average wages.
[Your answer here]
d. Test the null hypothesis in (c) against the alternative at the 5% and 1% levels.
[Your answer here]
e. Calculate the p-value for the test in (d).
[Your answer here]
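For parts (c) to (e), a difference-in-means test can be sketched as follows. This is only an illustration on simulated samples: `wage_o`, `wage_n`, and their parameters are hypothetical, and the sketch assumes a two-sided alternative with the large-sample normal approximation, which may differ from the convention used in class.

```r
# Simulated stand-ins for the two wage samples (hypothetical values).
set.seed(2)
wage_o <- rnorm(150, mean = 20, sd = 4)   # "old management" sample
wage_n <- rnorm(140, mean = 21, sd = 4)   # "new management" sample

# Difference in sample means and its standard error (unequal variances).
d  <- mean(wage_n) - mean(wage_o)
se <- sqrt(var(wage_n) / length(wage_n) + var(wage_o) / length(wage_o))

t_stat <- d / se
p_val  <- 2 * pnorm(-abs(t_stat))         # two-sided p-value, normal approximation

# Reject H0 when |t| exceeds the critical value at the given level.
reject_5 <- abs(t_stat) > qnorm(0.975)
reject_1 <- abs(t_stat) > qnorm(0.995)

c(t = t_stat, p = p_val, reject_5 = reject_5, reject_1 = reject_1)
```

The built-in `t.test(wage_n, wage_o)` computes the same t-statistic but reports a p-value from a Student-t distribution with Welch degrees of freedom, so its p-value can differ slightly from `p_val`.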
2. Use the data set midterm-data2.csv and answer the following questions. In the data set, the variable salary is annual compensation, in thousands of dollars, and ceoten is the number of prior years as the company's CEO.
a. Find the average salary and the average tenure in the sample.
[Your answer here]
b. How many CEOs are in their first year as CEO (that is, ceoten=0)? What is the longest tenure as a CEO?
[Your answer here]
c. Estimate the following simple regression model and report the results.
log(salary) = β0 + β1 ceoten + u.
Note that the dependent variable is log(salary), not salary.
[Your answer here]
d. What is the (approximate) predicted percentage increase in salary given one more year as a CEO? Is the result significant at the 5% level?
[Your answer here]
e. What percentage of the variation in log(salary) is explained by ceoten?
[Your answer here]
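The log-linear regression in parts (c) to (e) can be sketched in R as below. The data are simulated, so the coefficients used to generate them are hypothetical and the output says nothing about midterm-data2.csv. Note also that `summary()` on an `lm` fit reports homoskedasticity-only standard errors; whether robust standard errors are expected here is an assumption to check against the course material.

```r
# Simulated stand-in for the CEO data (hypothetical parameter values).
set.seed(3)
ceoten <- rpois(200, lambda = 8)                            # years of tenure
salary <- exp(6.5 + 0.01 * ceoten + rnorm(200, sd = 0.6))   # salary in thousands

# Regress log(salary) on ceoten.
fit <- lm(log(salary) ~ ceoten)
s   <- summary(fit)

coef(s)                                  # estimates, standard errors, t-values, p-values

# Approximate percentage change in salary for one more year of tenure.
pct_per_year <- 100 * coef(fit)["ceoten"]

# Share of the variation in log(salary) explained by ceoten.
r2 <- s$r.squared

c(pct_per_year = unname(pct_per_year), r_squared = r2)
```

In a log-linear model the slope times 100 approximates the percentage change in the dependent variable for a one-unit change in the regressor, and the approximation is best when the slope is small.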