Econ6083: Assignment 3
Important:
To submit the assignment, upload your RMD (RMarkdown) file and its HTML output.
Use Knit HTML to convert RMD to HTML.
Group submissions are allowed, with a maximum of 3 names per submission. Include all names on the submission.
For this assignment, use the data set "Affairs" from the package "AER":
suppressMessages(library(AER))
data("Affairs")
Create the following variables:
happy : indicator that the self-rating of the marriage is 5 (very happy): rating==5
Cheater : indicator of regular affairs: affairs==12
age2 : squared age for possible nonlinear effects
yearsmarried2 : squared years of marriage for possible nonlinear effects
Affairs$happy=(Affairs$rating==5)
Affairs$Cheater=(Affairs$affairs==12)
Affairs$age2=Affairs$age^2
Affairs$yearsmarried2=Affairs$yearsmarried^2
Estimate by OLS (using the lm() function) the regression of Cheater against happy, gender, age, age2, children, as.factor(religiousness), education, and as.factor(occupation). This is a so-called linear probability model. We use as.factor() to create dummies out of numerical variables with values representing different categories. However, we treat the age and education variables as numeric.
Note that we assume that the marriage satisfaction variable happy is a sufficient statistic in the sense that, conditional on marriage satisfaction, yearsmarried has no effect on the probability of having regular affairs. Therefore, yearsmarried is not included among the right-hand side controls.
Report the estimated coefficients, their heteroskedasticity-robust standard errors, and significance levels. Use coeftest(NAME, vcov = vcovHC(NAME, type = "HC0")) to compute heteroskedasticity-robust standard errors and significance levels, where NAME is the name of a regression-type object created by lm() and other similar commands.
Apply confint(ROBUST, level = 0.90) to report the 90% confidence interval for the estimated coefficient of happy, where ROBUST is the name of the object created by the coeftest() command above. By default, confint() computes confidence intervals for all coefficients. However, you can use, for example, the matrix selection operator [2,] to select just the row corresponding to happy.
Do not use confint() on the object created by lm() as it would produce a confidence interval using homoskedastic standard errors.
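As a minimal sketch of the steps above, assuming the variables have been created as described earlier, the OLS fit, the robust coefficient table, and the robust confidence interval could look as follows. The object names ols and robust are illustrative. The row for happy is selected here by its name, happyTRUE (the name R gives to the coefficient of a logical regressor), instead of by position.

```r
# Sketch only: linear probability model by OLS with HC0-robust inference.
suppressMessages(library(AER))  # also attaches lmtest (coeftest) and sandwich (vcovHC)

data("Affairs")
Affairs$happy   <- (Affairs$rating == 5)
Affairs$Cheater <- (Affairs$affairs == 12)
Affairs$age2    <- Affairs$age^2

# 'ols' and 'robust' are illustrative names, not required by the assignment
ols <- lm(Cheater ~ happy + gender + age + age2 + children +
            as.factor(religiousness) + education + as.factor(occupation),
          data = Affairs)

# Coefficients with heteroskedasticity-robust (HC0) standard errors
robust <- coeftest(ols, vcov = vcovHC(ols, type = "HC0"))
print(robust)

# Robust 90% confidence interval for the coefficient of happy
ci_happy <- confint(robust, level = 0.90)["happyTRUE", ]
print(ci_happy)
```

Selecting by name gives the same row as [2,] here, but it keeps working if the order of the regressors changes.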
What is the estimated effect of happy on the probability of having regular affairs? Is the estimated effect significant?
Does the sign of the estimated effect make sense?
Comment on the magnitude of the effect: does it make sense?
You can use the stargazer() command from the package "stargazer" to produce nicer-looking tables for regression outputs.
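One possible way to combine stargazer() with robust standard errors is sketched below. It assumes the stargazer package is installed and passes the HC0 standard errors through the se argument, which takes a list with one numeric vector per model. The names ols and robust_se are illustrative. By default stargazer() should then recompute the test statistics and significance stars from the supplied standard errors.

```r
# Sketch only: regression table with HC0-robust standard errors via stargazer.
suppressMessages(library(AER))
suppressMessages(library(stargazer))

data("Affairs")
Affairs$happy   <- (Affairs$rating == 5)
Affairs$Cheater <- (Affairs$affairs == 12)
Affairs$age2    <- Affairs$age^2

ols <- lm(Cheater ~ happy + gender + age + age2 + children +
            as.factor(religiousness) + education + as.factor(occupation),
          data = Affairs)

# Robust standard errors: square roots of the diagonal of the HC0 covariance matrix
robust_se <- sqrt(diag(vcovHC(ols, type = "HC0")))

# type = "text" prints a plain-text table to the console
stargazer(ols, se = list(robust_se), type = "text")
```

In an R Markdown file knitted to HTML, type = "html" together with the chunk option results='asis' is the usual choice.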
The marriage satisfaction variable happy can be endogenous due to a reverse causality effect from the regular affairs variable Cheater to happy. To address the potential endogeneity of happy, we re-estimate the linear probability model from part A using IVs.
Since a marriage can be terminated when the partner is unhappy, it is plausible to assume that yearsmarried is related to happy. We continue to maintain the assumption from part A that, conditional on marriage satisfaction, yearsmarried has no direct effect on the probability of having regular affairs. Hence, we can use yearsmarried as an IV for happy.
To account for a potential non-linear relationship between happy and yearsmarried, we use yearsmarried and yearsmarried2 as IVs for happy.
To account for potential heterogeneity of the relationship between happy and the IVs, we generate additional IVs by interacting yearsmarried and yearsmarried2 with all the right-hand side controls in part A.
Using ivreg(), estimate the linear probability model from part A, using yearsmarried and yearsmarried2 as well as the interaction terms (yearsmarried+yearsmarried2)*(gender+age+age2+children+as.factor(religiousness)+education+as.factor(occupation)) as IVs for happy.
Report the IV estimate of the coefficient on happy together with its heteroskedasticity-robust standard error and significance level. Again, you can use coeftest() for that purpose as in part A.
Report the heteroskedasticity-robust IV-based 90% confidence interval for the coefficient of happy. You can again use the command confint().
Comment on the IV estimates of the magnitude of the effect.
Discuss the significance of IV estimates.
Compare with OLS estimates in part A.
Report the first stage estimates with their heteroskedasticity-robust standard errors and significance levels. Do the IVs appear to be related to happy ?
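The IV steps above can be sketched as follows, assuming the same variable definitions as before. In ivreg() the part of the formula after | lists all instruments, including the exogenous controls. Expanding the product (yearsmarried + yearsmarried2) * (controls) produces the controls, the two IVs, and all their interactions. The first stage is shown as a separate lm() regression of happy on the same instrument set. All object names are illustrative.

```r
# Sketch only: 2SLS via AER::ivreg with HC0-robust inference, plus the first stage.
suppressMessages(library(AER))

data("Affairs")
Affairs$happy         <- (Affairs$rating == 5)
Affairs$Cheater       <- (Affairs$affairs == 12)
Affairs$age2          <- Affairs$age^2
Affairs$yearsmarried2 <- Affairs$yearsmarried^2

# Structural equation | instruments (controls, IVs, and their interactions)
iv <- ivreg(Cheater ~ happy + gender + age + age2 + children +
              as.factor(religiousness) + education + as.factor(occupation) |
              (yearsmarried + yearsmarried2) *
              (gender + age + age2 + children + as.factor(religiousness) +
                 education + as.factor(occupation)),
            data = Affairs)

iv_robust <- coeftest(iv, vcov = vcovHC(iv, type = "HC0"))
print(iv_robust["happyTRUE", ])
print(confint(iv_robust, level = 0.90)["happyTRUE", ])

# First stage: happy on all instruments and controls
first_stage <- lm(happy ~ (yearsmarried + yearsmarried2) *
                    (gender + age + age2 + children + as.factor(religiousness) +
                       education + as.factor(occupation)),
                  data = Affairs)
fs_robust <- coeftest(first_stage, vcov = vcovHC(first_stage, type = "HC0"))

# Show only the rows that involve the excluded instruments
iv_rows <- grepl("yearsmarried", rownames(fs_robust))
print(fs_robust[iv_rows, ])
```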
Use rlassoIV() from the package "hdm" to re-estimate the IV model in part B. Apply Lasso selection to the IVs and controls.
Report the post-Lasso IV-based estimates for the effect of happy on the probability of regular affairs.
Report the 90% confidence interval for the estimated post-Lasso effect of happy.
Use rlasso() to check which controls are useful for predicting Cheater.
Use rlasso() to check which IVs and controls are useful for predicting happy (the first stage).
Note that rlasso commands by default produce heteroskedasticity-robust standard errors, t-statistics, and p-values.
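A rough sketch of the Lasso steps is given below, assuming the hdm package is installed. It uses the matrix interface of rlassoIV() with numeric versions of Cheater and happy, and builds the control matrix x and the instrument matrix z with model.matrix(). The names y, d, x, z, and zx are illustrative. Because rlassoIV() may not return a fitted object when Lasso selects no variables in the first stage, the sketch checks the class before printing results.

```r
# Sketch only: post-Lasso IV with selection on both instruments and controls.
suppressMessages(library(AER))
suppressMessages(library(hdm))

data("Affairs")
Affairs$happy         <- (Affairs$rating == 5)
Affairs$Cheater       <- (Affairs$affairs == 12)
Affairs$age2          <- Affairs$age^2
Affairs$yearsmarried2 <- Affairs$yearsmarried^2

y <- as.numeric(Affairs$Cheater)  # outcome
d <- as.numeric(Affairs$happy)    # endogenous regressor

# Controls (intercept column dropped)
x <- model.matrix(~ gender + age + age2 + children + as.factor(religiousness) +
                    education + as.factor(occupation), data = Affairs)[, -1]

# Controls, IVs, and interactions; keep only columns that are not controls
zx <- model.matrix(~ (yearsmarried + yearsmarried2) *
                     (gender + age + age2 + children + as.factor(religiousness) +
                        education + as.factor(occupation)), data = Affairs)[, -1]
z <- zx[, !(colnames(zx) %in% colnames(x))]

lasso_iv <- rlassoIV(x = x, d = d, y = y, z = z, select.X = TRUE, select.Z = TRUE)
if (inherits(lasso_iv, "rlassoIV")) {
  summary(lasso_iv)
  print(confint(lasso_iv, level = 0.90))
}

# Which controls help predict Cheater?
lasso_y <- rlasso(x, y)
print(colnames(x)[lasso_y$index])

# Which IVs and controls help predict happy (first stage)?
zx_all  <- cbind(z, x)
lasso_d <- rlasso(zx_all, d)
print(colnames(zx_all)[lasso_d$index])
```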
Collect the results for the effect of happy on Cheater from parts A-C in a single table.
Include in the table the estimates, their heteroskedasticity-robust standard errors, and the 90% confidence intervals.
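One way to assemble such a table is sketched below for the OLS and IV rows. The helper happy_row() is hypothetical and not part of any package. It extracts the estimate, the HC0 standard error, and the robust 90% confidence interval for happy from a fitted model. A third row for the post-Lasso estimate could be appended with rbind() from the corresponding rlassoIV() results.

```r
# Sketch only: collect the happy coefficient from several models in one table.
suppressMessages(library(AER))

data("Affairs")
Affairs$happy         <- (Affairs$rating == 5)
Affairs$Cheater       <- (Affairs$affairs == 12)
Affairs$age2          <- Affairs$age^2
Affairs$yearsmarried2 <- Affairs$yearsmarried^2

ols <- lm(Cheater ~ happy + gender + age + age2 + children +
            as.factor(religiousness) + education + as.factor(occupation),
          data = Affairs)

iv <- ivreg(Cheater ~ happy + gender + age + age2 + children +
              as.factor(religiousness) + education + as.factor(occupation) |
              (yearsmarried + yearsmarried2) *
              (gender + age + age2 + children + as.factor(religiousness) +
                 education + as.factor(occupation)),
            data = Affairs)

# Hypothetical helper: one table row for the coefficient of happy
happy_row <- function(fit, level = 0.90) {
  ct <- coeftest(fit, vcov = vcovHC(fit, type = "HC0"))
  ci <- confint(ct, level = level)["happyTRUE", ]
  c(Estimate    = ct["happyTRUE", "Estimate"],
    `Robust SE` = ct["happyTRUE", "Std. Error"],
    `CI lower`  = ci[[1]],
    `CI upper`  = ci[[2]])
}

results <- rbind(OLS = happy_row(ols), IV = happy_row(iv))
print(round(results, 4))
```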
Compare and discuss the three sets of results.
Which estimates make the most sense?
Are Lasso/post-Lasso methods useful here for improving estimation results?