Instruction
ECON6300/7320: Advanced Microeconometrics
Problem Set 1
March 14, 2024

Answer all questions and clearly label your answers. For empirical questions, show your R script(s) and outputs (e.g., screenshots of commands, tables, and figures). You will lose 2 points each time you fail to provide R commands and outputs. When you are asked to explain or discuss something, keep your response brief. Upload your assignment (in PDF or Word format) via the “Turnitin” submission link (in the “Problem Set 1” folder under “Assessment”) by 11:59 AM on the due date, March 28, 2024. Do not hand in a hard copy. You may work on this assignment in groups; that is, you can discuss how to answer these questions with your group members. However, this is not a group assignment: you must answer all the questions in your own words and submit your work separately. The marking system will check for similarity, and UQ’s student integrity and misconduct policies on plagiarism apply.
OLS (30 points)
Use the cps09mar dataset described in Tutorial 1. Take the sub-sample of non-Hispanic women to estimate the following wage equation:
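$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{education} + \beta_2\,\text{experience} + \beta_3\,\text{experience}^2/100 + u, \tag{1}$$
where $\text{wage} = \text{earnings}/(\text{hours} \times \text{week})$ and $\text{experience} = \text{age} - \text{education} - 6$.
(a) Estimate equation (1) using the described sub-sample and compute the $R^2$ (5 points).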
(b) Include a set of dummy variables for regions and marital status in (1) and estimate the extended model. For regions, create dummy variables for Northeast, South, and West so that Midwest is the excluded group. For marital status, create variables for married (marital ≤ 3), widowed or divorced, and separated, so that single (never married) is the excluded group (6 points). Calculate standard errors using the HC0, HC1, and HC3 methods (3 points). Are they very different?
(c) In what follows, use the estimation results obtained from (a). Let $\theta$ be the ratio of the return to one year of education to the return to one year of experience for experience = 10. Write $\theta$ as a function of the regression coefficients and compute $\hat\theta$ from the estimated model (3 points). Suppose the OLS estimator for model (1) is consistent. Is $\hat\theta$ consistent (3 points)? Hint: Apply the continuous mapping theorem.
(d) Compute the regression function at education = 16 and experience = 10 (2 points). Compute a 95% confidence interval for the regression function at this point (3 points). Hint: You can use the glht() function in the multcomp package.

(e) Consider the same out-of-sample individual as in (d) (education = 16, experience = 10). Construct a 90% forecast interval for their wage (5 points). Hint: To obtain the forecast interval for the wage, apply the exponential function to both endpoints of the interval for log(wage).
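To make the three estimators named in (b) concrete, the following is a minimal base-R sketch of how HC0, HC1, and HC3 standard errors differ. It runs on simulated heteroskedastic data, not on cps09mar, and every variable name in it is hypothetical. In practice the `vcovHC()` function in the sandwich package (with `type = "HC0"`, `"HC1"`, or `"HC3"`) is the usual route; the hand-rolled version here is only meant to show which weights each variant uses.

```r
# Sketch under assumptions: simulated data, hypothetical names, base R only.
set.seed(42)
n <- 500
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.5 + 0.2 * x)  # error variance grows with x
fit <- lm(y ~ x)

X <- model.matrix(fit)            # n x k design matrix
u <- resid(fit)                   # OLS residuals
h <- hatvalues(fit)               # leverage values h_ii
k <- ncol(X)
XtX_inv <- solve(crossprod(X))    # (X'X)^{-1}

# Sandwich form (X'X)^{-1} X' diag(w) X (X'X)^{-1} for a weight vector w
sandwich_vcov <- function(w) XtX_inv %*% crossprod(X * w, X) %*% XtX_inv

V_hc0 <- sandwich_vcov(u^2)                 # HC0: squared residuals
V_hc1 <- V_hc0 * n / (n - k)                # HC1: degrees-of-freedom scaling
V_hc3 <- sandwich_vcov(u^2 / (1 - h)^2)     # HC3: leverage-adjusted residuals

se_table <- cbind(
  HC0 = sqrt(diag(V_hc0)),
  HC1 = sqrt(diag(V_hc1)),
  HC3 = sqrt(diag(V_hc3))
)
print(se_table)
```

With a few hundred observations and small leverage values the three columns are typically close; they tend to diverge more in small samples or when some observations have high leverage.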
MLE: Tobit Regression (25 points)
Consider the Tobin (1958) regression model:
$$Y^* = X^T\beta + e \quad \text{with } e \mid X \sim N(0, \sigma^2), \tag{2}$$
$$Y = \max\{Y^*, 0\}. \tag{3}$$
Tobin (1958) used this model to study household consumption $Y$ of durable goods. He observed that in survey data, $Y$ is zero for a positive fraction of households. He proposed treating the observed $Y$ as a censored realization of a latent continuous variable $Y^*$ (like “willingness to pay”); that is, $Y = Y^*$ when $Y^* > 0$ and $Y = 0$ when $Y^* \le 0$. Here the observed variable $Y$ is $Y^*$ censored from below at zero. After all, negative pay is infeasible. This model is known as Tobit regression, censored regression, or the Type I Tobit model.
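Since $e \mid X \sim N(0, \sigma^2)$ is assumed in (2), the Tobit model (2)–(3) is parametric and its unknown parameters $(\beta, \sigma)$ can be estimated using the maximum likelihood method. By definition, it is easy to write out the distribution function of $Y$ conditional on $X$:
$$
\begin{aligned}
P(Y \le y \mid X = x) &= P(Y^* \le 0 \mid X = x)^{1[y \le 0]} \cdot P(Y^* \le y \mid X = x)^{1[y > 0]} \\
&= P(x^T\beta + e \le 0)^{1[y \le 0]} \cdot P(x^T\beta + e \le y)^{1[y > 0]} \\
&= \Phi(-x^T\beta/\sigma)^{1[y \le 0]} \cdot \Phi((y - x^T\beta)/\sigma)^{1[y > 0]},
\end{aligned} \tag{4}
$$
where $1[\cdot]$ is the indicator function (see footnote 1) and $\Phi(\cdot)$ is the CDF of $N(0,1)$. Taking the derivative of (4) with respect to $y$ yields the likelihood function:
$$f_{Y|X}(y \mid x) = \Phi(-x^T\beta/\sigma)^{1[y \le 0]} \cdot \left[\sigma^{-1}\phi((y - x^T\beta)/\sigma)\right]^{1[y > 0]}, \tag{5}$$
where $\phi(\cdot)$ is the PDF of $N(0,1)$.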
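As an illustration of how density (5) turns into an estimator, here is a minimal base-R sketch that codes the Tobit log-likelihood and maximizes it with `optim()`. It uses simulated data, not CHJ2004, and the parameter values, variable names, and the choice to optimize over $\log\sigma$ (to keep $\sigma$ positive) are assumptions made for the example.

```r
# Sketch under assumptions: simulated data, hypothetical names, base R only.
set.seed(1)
n <- 2000
x <- rnorm(n)
X <- cbind(1, x)                       # design matrix with intercept
beta_true <- c(0.5, 1.0)
sigma_true <- 1.5
y_star <- drop(X %*% beta_true) + rnorm(n, sd = sigma_true)  # latent Y*
y <- pmax(y_star, 0)                   # observed Y, censored at zero

# Negative log-likelihood from (5); theta = (beta, log(sigma)).
tobit_negloglik <- function(theta, y, X) {
  k <- ncol(X)
  beta <- theta[1:k]
  sigma <- exp(theta[k + 1])
  xb <- drop(X %*% beta)
  censored <- y <= 0
  ll <- numeric(length(y))
  # Censored observations contribute log Phi(-x'beta / sigma)
  ll[censored] <- pnorm(-xb[censored] / sigma, log.p = TRUE)
  # Uncensored observations contribute log[ sigma^{-1} phi((y - x'beta) / sigma) ]
  ll[!censored] <- dnorm(y[!censored], mean = xb[!censored], sd = sigma, log = TRUE)
  -sum(ll)
}

# Start from OLS coefficients and the log of the sample standard deviation.
start <- c(coef(lm(y ~ x)), log(sd(y)))
fit <- optim(start, tobit_negloglik, y = y, X = X,
             method = "BFGS", hessian = TRUE)

beta_hat <- fit$par[1:2]
sigma_hat <- exp(fit$par[3])
# Standard errors for (beta, log sigma) from the inverse Hessian
se <- sqrt(diag(solve(fit$hessian)))
```

Because the optimizer works with $\log\sigma$, the third standard error refers to $\log\hat\sigma$ rather than $\hat\sigma$; a delta-method step would be needed to compare it with output that reports $\sigma$ directly.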
Use the CHJ2004 dataset (see the data description file). The variables tinkind and income are household transfers received in-kind and household income, respectively. Divide both variables by 1000 to standardize.
(a) Estimate a linear regression of tinkind on income and income$^2$ (5 points).
(b) Calculate the percentage of censored observations (tinkind = 0) (3 points). Estimate a linear regression of tinkind on income and income$^2$ by omitting the censored observations (5 points).
(c) Estimate a Tobit regression of tinkind on income and income$^2$ (6 points). Explain the differences between your results in (a)–(c) (6 points). Hint: You can use the tobit() function in the AER package. Alternatively, you can also use (5) to code up the maximum likelihood estimation and inference by yourself and check if you can replicate the results returned by tobit(). But this is optional.
Footnote 1: $1[A] = 1$ if event $A$ occurs and $1[A] = 0$ otherwise.

2SLS and GMM (25 points)
In an influential paper, Card (1995) suggested that if a potential student lives close to a college, this reduces the cost of attendance and raises the likelihood that the student will attend college. However, college proximity does not directly affect a student’s skills or abilities, so it should not affect their wage. These considerations suggest that college proximity can be a valid IV for education in a wage regression.
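Use the Card1995 dataset to replicate the baseline analysis conducted in Card (1995). Consider the following model:
$$
\begin{aligned}
\log(\text{wage}) = {} & \beta_0 + \beta_1\,\text{education} + \beta_2\,\text{experience} + \beta_3\,\text{experience}^2/100 \\
& + \beta_4\,\text{black} + \beta_5\,\text{south} + \beta_6\,\text{smsa} + u,
\end{aligned} \tag{6}
$$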
where education is years of schooling and experience = age − education − 6. For all the questions below, only use data for 1976. See the data description file for variable definitions.
(a) Estimate model (6) by OLS and 2SLS (using nearc4 as the instrument for education) (8 points).
(b) Estimate model (6) by 2SLS using instruments {nearc4a, nearc4b} (3 points). What is the impact on the structural estimate of model (6) (2 points)?
(c) Are the instruments used in (b) strong or weak? Test it (4 points).
(d) Use the 2SLS regression in (b) to test the exogeneity of education (4 points).
(e) Use the 2SLS regression in (b) to test the exogeneity of instruments (4 points).
(f) (Optional) Re-estimate model (6) using the same set of instruments as in (a) by efficient GMM. Do the results change meaningfully? Hint: Use the gmm() function in the gmm package. This question has no points; it is here just because we don’t have room for it in Tutorial 4.
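As a rough illustration of the mechanics behind the OLS versus 2SLS comparison, the sketch below runs both on simulated data with one endogenous regressor and one instrument. It does not use the Card1995 data, and all names and parameter values are hypothetical. The two-stage `lm()` construction reproduces the 2SLS point estimate but not valid standard errors; for actual inference, the `ivreg()` function in the AER package is the more appropriate tool.

```r
# Sketch under assumptions: simulated data, hypothetical names, base R only.
set.seed(7)
n <- 5000
z <- rbinom(n, 1, 0.5)              # binary instrument
v <- rnorm(n)                       # unobserved confounder
x <- 1 + 0.8 * z + v + rnorm(n)     # endogenous regressor (depends on v)
y <- 2 + 0.5 * x + v + rnorm(n)     # true coefficient on x is 0.5

# OLS is inconsistent here because x is correlated with the error (through v).
ols <- lm(y ~ x)

# Stage 1: project x on the instrument. Stage 2: regress y on the fitted values.
stage1 <- lm(x ~ z)
x_hat <- fitted(stage1)
stage2 <- lm(y ~ x_hat)             # point estimate only; these SEs are not valid

b_ols <- unname(coef(ols)["x"])
b_2sls <- unname(coef(stage2)["x_hat"])

# First-stage F statistic, a common rule-of-thumb check for instrument strength
first_stage_F <- unname(summary(stage1)$fstatistic["value"])

print(c(OLS = b_ols, TSLS = b_2sls, first_stage_F = first_stage_F))
```

In this just-identified design the 2SLS estimate coincides with the simple IV ratio `cov(z, y) / cov(z, x)`, which is a convenient way to check the two-stage computation.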
Theoretical Questions (20 points)
1. Prove that the regression errors of the linear probability model (LPM) must be heteroskedastic (5 points).
2. Prove that $E[Y \mid X] = \arg\min_{g} E[(Y - g(X))^2]$; i.e., $E[Y \mid X]$ has the smallest mean squared error (MSE) among all functions of $X$. Hint: Consider $(Y - g(X))^2 = [(Y - E[Y \mid X]) + (E[Y \mid X] - g(X))]^2$ and apply the law of iterated expectation (LIE) (5 points).
3. Consider the following simple linear regression model:
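$$y_i = \beta x_i + \varepsilon_i, \quad i = 1, \dots, n, \tag{7}$$
where $\{x_i, y_i\}_{i=1}^{n}$ are i.i.d., $E[\varepsilon_i \mid x_i] = 0$, and $V[\varepsilon_i \mid x_i] = \sigma^2$. Suppose $x_i > 1$ and $E[x_i^2] < \infty$ for all $i = 1, \dots, n$. We have the following three estimators of $\beta$:
$$\hat\beta_A = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}, \qquad \hat\beta_B = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}, \qquad \hat\beta_C = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{x_i}.$$
(a) Show that $\hat\beta_A$, $\hat\beta_B$ and $\hat\beta_C$ are all unbiased (6 points). Hint: Use (7) and LIE.
(b) Among $\hat\beta_A$, $\hat\beta_B$ and $\hat\beta_C$, which one has the smallest variance? Why? (2 points) Hint: Review the Gauss-Markov Theorem (see footnote 2).
(c) Show that $\hat\beta_C$ is consistent (2 points). Hint: Apply WLLN and LIE.
Footnote 2: When we say an estimator is linear, we mean the estimator is a linear function of $y$.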
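A small Monte Carlo experiment can serve as a sanity check on the analytical arguments for the three estimators in model (7). The sketch below is only illustrative and does not replace the proofs: the true $\beta$, the error variance, the sample size, and the design $x_i = 1 + \text{Exp}(1)$ (chosen so that $x_i > 1$) are all assumptions made for the example.

```r
# Sketch under assumptions: simulated data, hypothetical design, base R only.
set.seed(123)
beta <- 2
sigma <- 1
n <- 50
reps <- 5000

# Each replication draws a fresh sample and computes the three estimators.
est <- replicate(reps, {
  x <- 1 + rexp(n)                       # ensures x_i > 1
  y <- beta * x + rnorm(n, sd = sigma)   # homoskedastic errors, as in (7)
  c(A = sum(x * y) / sum(x^2),
    B = sum(y) / sum(x),
    C = mean(y / x))
})

mc_means <- rowMeans(est)        # Monte Carlo mean of each estimator
mc_vars <- apply(est, 1, var)    # Monte Carlo variance of each estimator

print(round(rbind(mean = mc_means, variance = mc_vars), 5))
```

Comparing the `mean` row with the true $\beta$ and the `variance` row across columns gives a numerical picture to set against the answers to (a) and (b).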
where {xi, yi}ni=1 are i.i.d., E[εi|xi] = 0, and V[εi|xi] = σ2. Suppose xi > 1 and E[x2i ] < ∞ for all i = 1, . . . , n. We have the following three estimators of β. βˆ A = X x i y i / X x 2i , i=1 i=1 nn βˆ B = X y i / X x i , i=1 i=1 βˆC=1Xn yi. n i=1 xi (a) Show that βˆA, βˆB and βˆC are all unbiased (6 points). Hint: Use (7) and LIE. (b) Among βˆA, βˆB and βˆC, which one has the smallest variance? Why? (2 points) Hint: Review the Gauss-Markov Theorem.2 (c) Show that βˆC is consistent (2 points). Hint: Apply WLLN and LIE. 2When we say an estimator is linear, we mean the estimator is a linear function of y. 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com