Advanced Microeconometrics Homework Assignment 1
1. Estimating Equations (15 marks)
(a) Show that the least squares estimating equations, sometimes also called "normal equations", for $\beta_1$ and $\beta_2$ in the following specification of the bivariate regression model
$$Y_i = \beta_1 + \beta_2 X_{2i} + u_i, \qquad i = 1, \ldots, N, \qquad N > 2,$$
are
$$\sum_{i=1}^{N} Y_i = N\beta_1 + \beta_2 \sum_{i=1}^{N} X_{2i}, \qquad (1)$$
$$\sum_{i=1}^{N} Y_i X_{2i} = \beta_1 \sum_{i=1}^{N} X_{2i} + \beta_2 \sum_{i=1}^{N} X_{2i}^2, \qquad (2)$$
whose solution is given by
$$\beta_2 = \frac{N \sum_{i=1}^{N} Y_i X_{2i} - \sum_{i=1}^{N} X_{2i} \sum_{i=1}^{N} Y_i}{N \sum_{i=1}^{N} X_{2i}^2 - \left( \sum_{i=1}^{N} X_{2i} \right)^2}, \qquad (3)$$
$$\beta_1 = \bar{Y} - \beta_2 \bar{X}_2, \qquad (4)$$
where $\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i$ and $\bar{X}_2 = \frac{1}{N} \sum_{i=1}^{N} X_{2i}$. (6 marks)
(b) Express equations (1)-(2) and the solution of these linear equations in matrix notation. (4 marks)
(c) Write the usual moment condition of the linear regression model in matrix notation. (2 marks)
(d) Show that moment conditions imply the least squares equations (1)-(2). (6 marks)
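As a sanity check on the algebra in Question 1, the closed-form solution can be plugged back into the two normal equations on a small data set. The sketch below is only illustrative: the five data points are made up, and the variable names (`sum_xy`, `resid_eq1`, and so on) are not part of the assignment.

```python
# Numerical check that the closed-form solution (3)-(4) satisfies the
# normal equations (1)-(2). The data values are made up for illustration.
X2 = [1.0, 2.0, 4.0, 7.0, 11.0]
Y = [2.1, 3.9, 8.2, 13.8, 22.1]
N = len(Y)

# Sums that appear in equations (1)-(3).
sum_y = sum(Y)
sum_x = sum(X2)
sum_xy = sum(x * y for x, y in zip(X2, Y))
sum_xx = sum(x * x for x in X2)

# Closed-form solution: equation (3) for the slope, equation (4) for the intercept.
beta2 = (N * sum_xy - sum_x * sum_y) / (N * sum_xx - sum_x ** 2)
beta1 = sum_y / N - beta2 * sum_x / N

# Left-hand side minus right-hand side of equations (1) and (2).
# Both differences should be zero up to floating-point rounding.
resid_eq1 = sum_y - (N * beta1 + beta2 * sum_x)
resid_eq2 = sum_xy - (beta1 * sum_x + beta2 * sum_xx)

print("beta1 =", beta1, "beta2 =", beta2)
print("residual of (1):", resid_eq1, "residual of (2):", resid_eq2)
```

A check like this does not replace the derivation asked for in part (a); it only confirms that a candidate solution is consistent with equations (1)-(2).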
2. Generalised Least Squares (15 marks)
Consider the linear regression model
$$y_i = x_i'\beta + u_i, \qquad E[u_i \mid x_i] = 0,$$
and suppose that the errors $u_i$ exhibit the following correlation structure:
$$E[u_i^2 \mid x_i] = \sigma^2, \qquad E[u_i u_j \mid x_i, x_j] = \begin{cases} \rho\sigma^2 & \text{if } |i - j| = 1, \\ 0 & \text{otherwise.} \end{cases}$$
This implies that the errors of immediately adjacent observations are correlated, whereas errors are otherwise uncorrelated. In matrix form we have
$$y = X\beta + u.$$
(a) Verify that $\Omega = E[uu']$ is a band matrix with non-zero terms only on the diagonal and the first off-diagonal, and give these non-zero terms (2 marks). (Hint: $E[u_i^2] = E[E[u_i^2 \mid x_i]]$.) (3 marks)
(b) Show that $V[\hat{\beta} \mid X] = (X'X)^{-1} X' \Omega X (X'X)^{-1}$, where $\hat{\beta}$ is the OLS estimator. (2 marks)
(c) Is the usual OLS estimate $s^2 (X'X)^{-1}$ a consistent estimator of $V[\hat{\beta} \mid X]$? Justify your answer. (2 marks)
(d) Is White's heteroskedasticity-robust estimator of $V[\hat{\beta} \mid X]$ consistent? Justify your answer. (2 marks)
(e) State how to obtain a consistent estimate of $V[\hat{\beta} \mid X]$ that does not depend on unknown parameters. (6 marks)
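To make the objects in Question 2 concrete, the sketch below builds the band matrix $\Omega$ for a small sample and evaluates the sandwich formula from part (b). It is an illustration under stated assumptions, not a solution: the values of `n`, `sigma2` and `rho` are made up and treated as known (which part (e) explicitly does not allow), the design matrix is simulated, and the function names are hypothetical.

```python
import numpy as np

def band_omega(n, sigma2, rho):
    """Return Omega with sigma2 on the diagonal and rho*sigma2 on the
    first off-diagonals, zero elsewhere."""
    first_off_diagonals = np.eye(n, k=1) + np.eye(n, k=-1)
    return sigma2 * (np.eye(n) + rho * first_off_diagonals)

def sandwich_variance(X, omega):
    """Evaluate (X'X)^{-1} X' Omega X (X'X)^{-1}."""
    xtx_inv = np.linalg.inv(X.T @ X)
    return xtx_inv @ X.T @ omega @ X @ xtx_inv

# Made-up illustrative values; rho = 0.4 keeps Omega positive definite here.
n, sigma2, rho = 6, 2.0, 0.4
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor

omega = band_omega(n, sigma2, rho)
V = sandwich_variance(X, omega)            # formula from part (b)
V_naive = sigma2 * np.linalg.inv(X.T @ X)  # sigma^2 (X'X)^{-1}, ignores rho

print(omega)
print(V)
print(V_naive)
```

Setting `rho = 0` makes the two variance matrices coincide, which is a quick way to check the implementation.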
3. Minimizing a Quadratic Form (20 marks)
Consider the linear regression model
y = Xβ + u
(a) Obtain the formula for $\hat{\beta}$ which maximizes the objective function
$$Q_N(\beta) = -u'Wu,$$
where $W$ has full rank. (9 marks)
(b) For which W does your answer to (a) equal the OLS estimator? (3 marks)
(c) For which W does your answer to (a) equal the GLS estimator? (4 marks)
(d) Use your answer to (c) to explain how you would obtain the Feasible GLS estimator if Ω = E[uu′] is that of question 2 above. (4 marks)
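A derived formula for Question 3 can be checked numerically before it is written up. The sketch below assumes, beyond what the question states, that $W$ is symmetric and positive definite, so that the first-order condition $X'W(y - X\beta) = 0$ characterises the maximum. The data are simulated and the function names are hypothetical.

```python
import numpy as np

def objective(b, X, y, W):
    """Q_N(b) = -u'Wu with u = y - Xb."""
    u = y - X @ b
    return -u @ W @ u

def quadratic_form_estimator(X, y, W):
    """Solve the first-order condition X'W(y - Xb) = 0 for b.
    Assumes W is symmetric and X'WX is invertible."""
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Simulated data with made-up coefficients.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

# With the identity weight matrix the result can be compared with least squares.
b_identity = quadratic_form_estimator(X, y, np.eye(n))
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# A symmetric positive definite W, built as A A' + I.
A = rng.normal(size=(n, n))
W = A @ A.T + np.eye(n)
b_w = quadratic_form_estimator(X, y, W)

# Moving away from b_w should lower the objective.
perturbed = b_w + np.array([0.01, -0.02])
print(objective(b_w, X, y, W), objective(perturbed, X, y, W))
```

If $W$ were not symmetric, the first-order condition would involve $W + W'$ instead, so the symmetry assumption matters for this sketch.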
4. Data Analysis (50 marks)
You may use STATA or any other statistical software to answer this question. The datafile is nerlove63.csv available on Blackboard. These very old data were used by Marc Nerlove in a 1963 classic paper, “Returns to Scale in Electricity Supply,” Chapter 7 in C.F. Christ, ed., Measurement in Economics: Studies in Honor of Yehuda Grunfeld. They are also used in a number of textbooks. You will use these data on 145 electricity generating plants to study the relationship between costs and output (i.e. a cost function) and to make inferences about returns to scale in electricity generation. The variables in this file are:
ORDER: The number of the observation, ascending in order from smallest in size to largest
COSTS: Total Production Costs in Millions of Dollars (dependent variable)
KWH: Kilowatt hours of output, in billions
PL: The wage rate per hour
PF: The price of fuels in cents per million BTU’s
PK: The rental price index of capital
The regression model for the cost function, obtained from the theory of cost minimization for a given level of output, is specified as
$$C = k\, y^{1/r} p_1^{\alpha_1/r} p_2^{\alpha_2/r} p_3^{\alpha_3/r} u,$$
where $C$ denotes COSTS, $k$ denotes a constant (an unknown parameter), $y$ denotes KWH, $(p_1, p_2, p_3)$ are the three input prices (PL, PF, PK), and $u$ is a multiplicative error term. The parameter $r$ is defined as $r = \alpha_1 + \alpha_2 + \alpha_3$, where $\alpha_1$, $\alpha_2$, and $\alpha_3$ (and also $A$) are parameters in the Cobb-Douglas production function
$$y = A x_1^{\alpha_1} x_2^{\alpha_2} x_3^{\alpha_3}.$$
(a) Apply the log transformation to all the variables. A variable in lower case with prefix l denotes the log-transformed value of the corresponding variable, e.g. lpl means ln(PL) etc. Obtain the correlation matrix of all variables except ORDER. Plot lcosts against lkwh to form a rough idea of the shape of the cost as a function of output. (5 marks)
(b) Using the cost function linearized by log transformation, run two regressions. First regress lcosts on lkwh and an intercept. Next regress lcosts on lkwh, lpl, lpk, lpf and an intercept. Compare the coefficient of lkwh in the two regressions. Explain why the two estimates are different. (8 marks)
(c) Using the $R^2$ measure of goodness of fit, would you say that the first regression in (b) provides a satisfactory fit to the data? Explain. (3 marks)
(d) Generate the fitted values of the dependent variable in the first regression of part (b). Provide a scatter plot of the fitted and observed values of lcosts. Comment on the goodness of fit of the model. (10 marks)
(e) Generate a scatter plot of observed values of lcosts and the fitted values of the second regression in part (b). Interpret the results. What information does the scatter plot provide regarding the fit of the model? (8 marks)
(f) In the conventional specification of this model it is a standard assumption that the error term $u$ has a log-normal distribution, i.e. $\ln(u) \sim N(0, \sigma^2)$. What advantages does this assumption have? (5 marks)
(g) Suppose we change the functional form of the cost function to
$$C = k\, y^{1/r} p_1^{\alpha_1/r} p_2^{\alpha_2/r} p_3^{\alpha_3/r} + u,$$
in which the error term enters additively and is assumed to have an $N(0, \sigma^2)$ distribution. Is ordinary least squares an appropriate estimator? Justify your answer. (3 marks)
(h) An investigator proposes the following alternative specification:
$$\frac{C}{y} = k\, p_1^{\alpha_1} p_2^{\alpha_2} p_3^{\alpha_3} u.$$
What advantages, if any, does the original (with multiplicative error) functional form have relative to this one? (4 marks)
(i) Estimate this alternative model and interpret the regression results. (4 marks)
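For those not using STATA, the mechanics of parts (a), (b) and (d) of Question 4 might look roughly like the Python sketch below. Everything numerical in it is an assumption: the arrays are simulated stand-ins for the log-transformed columns of nerlove63.csv, the parameter values are made up, and `fit_ols` is a hypothetical helper. With the real file, the columns would first be read in (for example with `pandas.read_csv`, assuming the file has a header row with the variable names listed above) and passed through `numpy.log`.

```python
import numpy as np

def fit_ols(y, regressors):
    """Regress y on an intercept plus the given columns.
    Returns (coefficients, fitted values, R-squared)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, fitted, 1.0 - ss_res / ss_tot

# Simulated stand-in for the 145 log-transformed observations.
# All parameter values below are made up, not estimates from the real data.
rng = np.random.default_rng(42)
n = 145
lkwh = rng.normal(6.5, 1.5, n)
lpl = rng.normal(0.7, 0.1, n)
lpf = rng.normal(3.2, 0.3, n)
lpk = rng.normal(5.1, 0.1, n)
lcosts = (-3.5 + 0.7 * lkwh + 0.4 * lpl + 0.4 * lpf + 0.2 * lpk
          + rng.normal(0.0, 0.3, n))

# Part (a): correlation matrix of the log variables (rows are variables).
corr = np.corrcoef(np.vstack([lcosts, lkwh, lpl, lpf, lpk]))

# Part (b): short regression, then the regression with input prices added.
coef_short, fitted_short, r2_short = fit_ols(lcosts, [lkwh])
coef_long, fitted_long, r2_long = fit_ols(lcosts, [lkwh, lpl, lpk, lpf])

print("lkwh coefficient, short regression:", coef_short[1])
print("lkwh coefficient, long regression:", coef_long[1])
print("R-squared:", r2_short, r2_long)
```

The fitted values returned by `fit_ols` are what parts (d) and (e) ask to be plotted against the observed lcosts; any plotting library can be used for that step.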