ECM308 Econometrics
COURSEWORK 2021/22
Answer BOTH PARTS of the assignment.
Part I: Regression Analysis
The data set HTV.dta contains information on wages, education, parents’ education,
and several other variables for 1,230 working men in 1991.
1. Undertake preliminary analysis of the data to answer the following questions.
(a) What is the range of the educ variable in the sample? How many different
values are taken on by educ in the sample? [1 mark]
(b) What percentage of men completed 12th grade but no higher grade? Do
the men or their parents have, on average, higher levels of education? [1 mark]
(c) Does educ have a continuous distribution? Plot a histogram of educ
with a normal distribution overlay. Does the distribution of educ appear
anything close to normal? Does this observation matter for the further
analysis? Explain your answer. [5 marks]
2. Consider the regression model
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + u. \qquad (1)$$
Write down this linear regression model in matrix notation. Estimate the
model and report the estimation output. Describe how the software computes
the regression coefficients. Carefully interpret the estimated coefficients.
Are there any surprises in the slope estimates? [8 marks]
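As a rough illustration of how least-squares coefficients can be computed, the sketch below solves the normal equations (X'X)b = X'y directly. It uses synthetic data, not HTV.dta; the helper names `solve_linear_system` and `ols`, the sample size, and the "true" coefficients are all assumptions made for the example. Statistical packages typically use a numerically more stable factorisation, so this shows the idea rather than any particular package's routine.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Return the coefficients b that solve the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data only: hypothetical coefficients, NOT estimates from HTV.dta.
random.seed(0)
true_beta = [6.0, 0.3, 0.2]
X, y = [], []
for _ in range(500):
    motheduc = random.randint(6, 18)
    fatheduc = random.randint(6, 18)
    u = random.gauss(0.0, 1.0)
    X.append([1.0, motheduc, fatheduc])  # first column is the intercept
    y.append(true_beta[0] + true_beta[1] * motheduc + true_beta[2] * fatheduc + u)

beta_hat = ols(X, y)
print(beta_hat)
```

Each row of X is one observation (1, motheduc, fatheduc), so the model reads y = Xb + u with b = (β0, β1, β2)'.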
3. Estimate a larger linear regression model that also includes the variable abil
(a measure of cognitive ability). Report the estimation results. Does “ability”
help to explain variations in education, even after controlling for parents’
education? Explain carefully. [5 marks]
4. Explain carefully how you would test the null hypothesis that β1 = β2 against
a two-sided alternative. What is the p-value of the test? With the help of a
graph, explain what a p-value means for this test. Finally, state whether you
reject the null hypothesis at the 1% significance level. [8 marks]
5. Now estimate an equation where abil appears in quadratic form:
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + \beta_3\,\mathit{abil} + \beta_4\,\mathit{abil}^2 + u. \qquad (2)$$
Report the estimation results.
(a) Explain carefully how you would test the null hypothesis that educ is
linearly related to abil against the alternative that the relationship is
quadratic. Undertake the test. [5 marks]
(b) Using THREE criteria, explain whether and why you prefer the specification
in (2) to the two smaller models you have previously estimated. [6 marks]
(c) Using the estimates β̂3 and β̂4, use calculus to find the value of abil, call
it abil∗, where educ is minimized. You could also verify that the second
derivative is positive so that you do indeed have a minimum. Argue that
only a small fraction of men in the sample have “ability” less than abil∗.
Why is this last observation important? [10 marks]
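The calculus step in part (c) can be sketched as follows: setting the derivative of β3·abil + β4·abil² with respect to abil to zero gives the turning point, and the sign of the second derivative, 2β4, says whether it is a minimum. The coefficient values and the small `abil` list below are hypothetical placeholders, not estimates or data from HTV.dta, and the function names are made up for this example.

```python
def quadratic_turning_point(b3, b4):
    """Return (abil_star, is_minimum) for the part b3*abil + b4*abil**2 of the model."""
    if b4 == 0:
        raise ValueError("b4 must be non-zero for a turning point to exist")
    abil_star = -b3 / (2.0 * b4)  # first-order condition: b3 + 2*b4*abil = 0
    is_minimum = 2.0 * b4 > 0     # the second derivative is 2*b4
    return abil_star, is_minimum


def share_below(values, threshold):
    """Fraction of observations strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


# Hypothetical coefficients and data, for illustration only.
b3_hat, b4_hat = 0.5, 0.0625
abil_star, is_min = quadratic_turning_point(b3_hat, b4_hat)
abil = [-5.0, -3.0, 0.0, 1.0]
print(abil_star, is_min, share_below(abil, abil_star))
```

With the actual estimates, `share_below` applied to the sample values of abil gives the fraction of men lying to the left of the turning point.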
6. Add the two college tuition variables to the regression specification (2).
(a) Explain carefully how you would test whether the tuition variables are
jointly statistically significant. Undertake the test. [5 marks]
(b) What is the correlation between tuit17 and tuit18? Explain why using
the average of the tuition over the two years might be preferred to adding
each separately. [8 marks]
(c) Estimate a linear regression model that, instead of tuit17 and tuit18,
includes the average tuition variable. Does the coefficient on the average
tuition variable make sense when interpreted causally? Explain your
answer. [9 marks]
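One common way to test joint significance is to compare the sum of squared residuals (SSR) of the restricted model, which omits the variables, with that of the unrestricted model, which includes them. The sketch below computes the resulting F statistic; the function name and the SSR values are hypothetical, and the F(q, n − k − 1) reference distribution assumes the classical linear model assumptions hold.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q exclusion restrictions.

    n is the sample size and k the number of slope coefficients in the
    unrestricted model, so the denominator degrees of freedom are n - k - 1.
    """
    df_denominator = n - k - 1
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_denominator
    return numerator / denominator


# Hypothetical SSR values, for illustration only (not computed from HTV.dta).
# With both tuition variables added to (2) there are k = 6 slopes and q = 2 restrictions.
print(f_statistic(ssr_restricted=4000.0, ssr_unrestricted=3990.0, q=2, n=1230, k=6))
```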
Part II: Time Series
7. Consider an ARMA(2,1) process
$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \varepsilon_t + \phi\,\varepsilon_{t-1},$$
where $\varepsilon_t$ is white noise.
(a) Under what conditions is this process covariance stationary? Explain your
answer. [5 marks]
(b) Assuming that the process is covariance stationary, derive the mean and
covariance function of this process. [10 marks]
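A numerical check can complement the analytical answer to part (a). Covariance stationarity of an ARMA process depends only on its autoregressive part, so the sketch below tests whether both roots of λ² − θ1λ − θ2 = 0 lie strictly inside the unit circle. The function names and parameter values are illustrative assumptions.

```python
import cmath


def ar2_inverse_roots(theta1, theta2):
    """Roots of lambda**2 - theta1*lambda - theta2 = 0 (possibly complex)."""
    disc = cmath.sqrt(theta1 ** 2 + 4.0 * theta2)
    return (theta1 + disc) / 2.0, (theta1 - disc) / 2.0


def is_covariance_stationary(theta1, theta2):
    """True when both roots have modulus strictly below one.

    The MA coefficient phi does not enter: a finite-order MA part
    does not affect covariance stationarity.
    """
    return all(abs(root) < 1.0 for root in ar2_inverse_roots(theta1, theta2))


# Illustrative parameter values only.
for theta1, theta2 in [(0.5, 0.3), (0.5, 0.6), (1.0, 0.0)]:
    print(theta1, theta2, is_covariance_stationary(theta1, theta2))
```

The root condition is equivalent to the three inequalities θ1 + θ2 < 1, θ2 − θ1 < 1 and |θ2| < 1, which may be easier to verify by hand.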
8. The Excel file FTSE_DAX contains the historical closing values of the FTSE100
Index and the DAX Index from 24 September 2020 to 22 October 2021.
(a) Plot the closing prices of both indices. Do the plots resemble a plot of
stationary data? Describe how to undertake a formal stationarity test.
According to the test you described, are the data stationary? [8 marks]
(b) Describe carefully how you would test whether the FTSE100 and DAX indices
are cointegrated. Undertake the test and interpret the results. [6 marks]
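One formal stationarity test is the Dickey-Fuller test. The sketch below is a minimal version of it, assuming no lagged differences and an intercept but no trend: it regresses Δy_t on a constant and y_{t−1} and returns the t statistic on the lagged level. It is run on simulated series rather than the FTSE_DAX data, and the function name is made up for this example. The statistic is compared with Dickey-Fuller critical values (about −2.86 at the 5% level asymptotically for this case) rather than with standard normal or t values.

```python
import math
import random


def dickey_fuller_t(y):
    """t statistic on rho in the regression dy_t = alpha + rho * y_{t-1} + e_t."""
    x = y[:-1]                                        # lagged levels
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first differences
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    rho = sxy / sxx
    alpha = dy_bar - rho * x_bar
    ssr = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho


# Simulated series only: white noise (stationary) and its cumulative sum (a random walk).
random.seed(1)
white_noise = [random.gauss(0.0, 1.0) for _ in range(300)]
random_walk = []
level = 0.0
for shock in white_noise:
    level += shock
    random_walk.append(level)

print(dickey_fuller_t(white_noise))  # strongly negative: unit root rejected
print(dickey_fuller_t(random_walk))
```

A residual-based (Engle-Granger) cointegration test applies the same kind of regression to the residuals from regressing one index on the other, but uses its own critical values. In practice an augmented version with lagged differences, taken from a statistical package, would normally be used.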