Econ 527 Assignment 10
The due date for this assignment is Friday, November 27.
1. Consider the regression model
Yi = βXi + Ui,
where β ∈ R is an unknown scalar parameter, and the following estimator of β:
βn = (∑_{i=1}^n Yi) / (∑_{i=1}^n Xi).
Assume that {(Yi, Xi) : i = 1, …, n} are iid.
(a) Specify the weakest conditions under which βn is a consistent estimator, and show its consistency.
(b) Specify the conditions for the asymptotic normality of βn. Show the asymptotic normality and find the asymptotic variance of βn.
(c) Suppose now that E(Ui² | Xi) = σ² > 0. Show directly that the asymptotic variance of the OLS estimator of β is smaller than that of βn derived in part (b).
2. Consider the following model:
Yi = g(Xi) + Vi,    (1)
E(Vi | Xi) = 0,
E(Vi² | Xi) = σ²,
where g : Rᵏ → R is some unknown and potentially nonlinear function.
(a) Assume that E XiXi′ is finite and positive definite, and that E(g(Xi))² < ∞. Define β = arg min_{b∈Rᵏ} E(g(Xi) − Xi′b)². We say that Xi′β is the best linear approximation of g(Xi) as it minimizes the average squared distance between the true conditional expectation function g(Xi) and all linear functions Xi′b, b ∈ Rᵏ. Show that β = (E XiXi′)⁻¹ E Xig(Xi).
(b) Define Ui = Yi − Xi′β. Using the results from part (a), show that E XiUi = 0.
(c) Is it true that E(Ui | Xi) = 0?
(d) Find E(Ui² | Xi). Is Ui homoskedastic?
(e) Suppose that the econometrician observes iid data {(Yi, Xi) : i = 1, …, n} generated from model (1), and let βˆn = (∑_{i=1}^n XiXi′)⁻¹ ∑_{i=1}^n XiYi. Show that the results in Lecture 8 imply that βˆn is a consistent and asymptotically normal estimator of β defined in part (a). Justify any additional assumptions you have to make. What is the asymptotic variance of βˆn?
3. Consider the same model as in Question 2 with k = 2,
Xi = (1, Xi,2)′,
where Xi,2 ∼ N(0,1), and for x = (x1, x2)′ ∈ R² the function g(x) is given by
g(x) = x2³.
(a) Find β = (β1,β2)′ for this model. Hint: Use the moment generating function of the normal distribution.
(b) Generate n = 2,000 iid observations as
Yi = g(Xi) + Vi,
where g(·) and Xi are as above, and Vi ∼ Uniform(−10,10) is independent of Xi. Hint:
• Use runif(n,-10,10) to simulate the Vi's; a short sketch of this step follows.
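For concreteness, here is a minimal sketch of the simulation step, assuming g(x) = x2³ as above; the object names n, x2, v, and y are illustrative:

    # simulate n = 2,000 iid observations from Yi = g(Xi) + Vi
    set.seed(527)            # any seed; fixes the simulated sample
    n  <- 2000
    x2 <- rnorm(n)           # Xi,2 ~ N(0,1)
    v  <- runif(n, -10, 10)  # Vi ~ Uniform(-10,10), independent of Xi
    y  <- x2^3 + v           # Yi = g(Xi) + Vi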
(c) Estimate the regression
Yi = β1 + β2Xi,2 + Ui,
i.e., compute the OLS estimators of β1 and β2 using the simulated data on the Y's and X's.
Report your estimates of β1 and β2. Are they close to the values obtained in part (a)?
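A minimal sketch of this step, assuming the objects y and x2 from part (b):

    # OLS regression of Y on an intercept and Xi,2
    m <- lm(y ~ x2)
    coef(m)     # estimates of beta1 (intercept) and beta2 (slope)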
(d) Compare the homoskedastic standard errors with the heteroskedasticity-robust standard
errors. Do they appear to be different? Hints:
• To quickly produce heteroskedasticity-robust standard errors after m=lm(y~x), you can use coeftest(m,vcov=hccm(m,type="hc0")).
• Do not forget to load the required package ("AER"). A sketch combining these hints follows.
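Putting the hints together, one possible sketch, assuming the regression object m from part (c):

    library(AER)                                # loads, among others, lmtest (coeftest) and car (hccm)
    summary(m)                                  # homoskedastic (default) standard errors
    coeftest(m, vcov = hccm(m, type = "hc0"))   # heteroskedasticity-robust standard errors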
(e) On the same graph, plot the true function g(·), its best linear approximation x′β, and the estimated version of the latter, x′βˆn, where βˆn denotes the OLS estimator of β. Does the linear approximation for the true function g(·) seem to be accurate for most of the observations? Hints:
• Construct a grid of values for the regressor between −4 and 4. For that purpose you can use x=seq(-4,4,0.05), where the third number is the step size.
• Given two arrays of numbers x and y, you can plot a line using plot(x,y,type="l",col="red",ylim=c(-70,70)). The option ylim=c(-70,70) sets the limits for the y-axis between −70 and 70. (You can use other numbers.)
• To add x- and y-labels, use the options xlab=" " and ylab=" " inside the plot() function.
• To add more lines to the same figure, use lines(x,y,col="blue").
• Use legend() to add a legend to your figure. A sketch combining these hints follows.
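A minimal sketch, assuming the objects from parts (b)-(c); the line for the best linear approximation x′β from part (a) can be added with one more lines() call:

    xg <- seq(-4, 4, 0.05)                    # grid for the regressor Xi,2
    plot(xg, xg^3, type = "l", col = "red",   # true function g
         ylim = c(-70, 70), xlab = "x2", ylab = "g(x) and linear approximations")
    lines(xg, coef(m)[1] + coef(m)[2] * xg,   # estimated linear approximation x'beta_hat
          col = "blue")
    # add the best linear approximation x'beta from part (a) with another lines() call,
    # then label the curves with legend()
    legend("topleft", legend = c("g(x)", "OLS fit"), col = c("red", "blue"), lty = 1)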
(f) Let Uˆi = Yi − Xi′βˆn be the OLS residuals. Plot the squared residuals Uˆi² against Xi,2. Do they appear to be homoskedastic or heteroskedastic? Use the results from Question 2(d) and the figure in part (e) of this question to explain the patterns you see. Hint:
• The lm() command saves the residuals. For example, define m=lm(y~x). You can access the residuals as m$residuals. A short sketch of the plot follows.
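A minimal sketch, again assuming m = lm(y ~ x2) and x2 from the earlier parts:

    u2 <- m$residuals^2                       # squared OLS residuals
    plot(x2, u2, xlab = "x2", ylab = "squared residuals")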
(g) Using 1000 Monte Carlo simulations, simulate the distribution of
T = (βˆn,2 − β2) / std.err,
where std.err is the heteroskedasticity-robust standard error of βˆn,2. In each Monte Carlo repetition, use the sample size n = 20. Plot the histogram of the simulated distribution against the graph of the standard normal PDF. Does it appear that the asymptotic normal approximation works well? Hints (a sketch of the loop follows the list):
• The OLS estimator and standard error must be re-computed for each Monte Carlo repetition after new data are simulated.
• After running m=lm(y~x), you can define the following object containing the robust standard errors: ct=coeftest(m,vcov=hccm(m,type="hc0")). To access the robust standard error for the slope parameter, use ct[2,2].
• You can access the estimate of the slope parameter using ct[2,1]. You can also access it using m$coefficients[2].
• You can use hist(T,breaks=seq(-low,high,step),freq=FALSE,ylim=c(0,0.4)) to produce the histogram. Here, T is an array of the simulated values of the test statistic, the histogram range runs from -low to high, and step is the width of a bin. (Supply specific numbers.) The option freq=FALSE produces a density plot instead of a frequency plot.
• Use lines() to add the density of the standard normal distribution to the same figure.
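One way to organize the loop is sketched below; it reuses the names n, x2, y, and m from the earlier parts, so run it in a fresh session or after those parts are complete. Here beta2.true is a placeholder for the value of β2 obtained in part (a) and must be filled in before the histogram is drawn:

    library(AER)                      # for coeftest() and hccm()
    R <- 1000                         # number of Monte Carlo repetitions
    n <- 20                           # sample size in each repetition
    beta2.true <- NA_real_            # replace with the value of beta2 from part (a)
    T <- numeric(R)
    for (r in 1:R) {
      x2 <- rnorm(n)                  # new data in every repetition
      y  <- x2^3 + runif(n, -10, 10)
      m  <- lm(y ~ x2)
      ct <- coeftest(m, vcov = hccm(m, type = "hc0"))
      T[r] <- (ct[2, 1] - beta2.true) / ct[2, 2]   # robust t-statistic for beta2
    }
    # draw the histogram of T with hist(..., freq=FALSE) and overlay dnorm()
    # with lines(), as described in the hints above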
(h) Based on the simulations in part (g), compute the simulated probabilities of the events T < zα and T > z1−α for α = 0.1, 0.05, 0.01. Are the simulated probabilities close to the corresponding values of α? Does it appear that the asymptotic normal approximation works well?
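Given the vector T of simulated statistics from part (g), the simulated probabilities can be computed with mean() and qnorm(), for example:

    for (a in c(0.10, 0.05, 0.01)) {
      cat("alpha =", a,
          "  P(T < z_alpha) =", mean(T < qnorm(a)),
          "  P(T > z_(1-alpha)) =", mean(T > qnorm(1 - a)), "\n")
    }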
(i) Repeat parts (g)-(h) with n = 500. Discuss the differences from the results in parts (g)-(h).