ECON 322: Econometric Analysis 1 Assignment 7 (Suggested Solution)
Project 1
In this project, we want to test the prediction of the Solow growth model. This model predicts that poor countries (or countries with a low level of capital) grow faster than rich countries. As a result, we should see the standard of living converge across countries. In other words, every country should eventually have the same level of GDP (gross domestic product).
We can test the hypothesis by running the following regression:
G = β0 + β1 GDP0 + u
where G is the average GDP growth between year t0 and year t, and GDP0 is the GDP at t0. According to the Solow model, if the initial GDP is low, the growth should be high. The coefficient β1 should therefore be negative.
To answer the questions, use the dataset Growth.RData that you will find in the Tutorial folder of Learn. The data is taken from a research paper published in 1995 in the Journal of Applied Econometrics by S.N. Durlauf and P.A. Johnson. The title of the paper is “Multiple Regimes and Cross-Country Growth Behavior”. The variables are
- oil: Is the country an oil-producing country (1 if yes 0 otherwise)?
- inter: Does the country have better-quality data (1 if yes 0 otherwise)?
- oecd: Is the country a member of the OECD (1 if yes 0 otherwise)?
- gdp60: Per capita GDP in 1960
- gdp85: Per capita GDP in 1985
- gdpgrowth: Average growth rate of per capita GDP from 1960 to 1985 (in percent)
- popgrowth: Average growth rate of working-age population from 1960 to 1985 (in percent).
- invest: Average ratio of investment (including Government Investment) to GDP from 1960 to 1985 (in percent)
- school: Average fraction of working-age population enrolled in secondary school from 1960 to 1985 (in percent).
- literacy60: Fraction of the population over 15 years old that is able to read and write in 1960 (in percent).
Before starting, remove the observations with missing values by running the following code:
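The code appendix at the end of this document starts with a call to na.omit, which is presumably the call meant here. The sketch below only illustrates what that call does, on a made-up data frame (the name toy and its values are hypothetical, not the Growth data): every row that contains at least one NA is dropped.

```r
# Hypothetical toy data frame, NOT the Growth data.
toy <- data.frame(gdp60     = c(1000, NA, 3000, 4000),
                  gdpgrowth = c(2.5, 1.0, NA, 3.2))

# na.omit() keeps only the rows with no missing value.
clean <- na.omit(toy)
nrow(clean)   # rows 1 and 4 are the only complete rows
```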
1. Estimate the following model, plot the growth as a function of 1960 GDP and add the regression line. Discuss your results. Do we observe any convergence?
gdpgrowth = β0 + β1gdp60 + u
Econ 322 Assignment 7 Sol. Page 1 of 16
gdpgrowth = 4.07062216 − 0.00002544 gdp60
            (0.20125981)∗∗∗ (0.00002276)
n = 100, R2 = 0.01259, SSR = 322.1902, adjusted R2 = 0.00251
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
[Figure: "Convergence of GDP". Scatter plot of gdpgrowth (vertical axis) against gdp60 (horizontal axis, 0 to 80000) with the fitted regression line.]
For convergence, we need poor countries to grow faster. In other words, we need the coefficient of gdp60 to be negative and significant. According to the above result, only the intercept is significant. However, we need to do a one-sided test since we want to test whether the slope is significantly negative. Before doing it, we want to make sure we compute the right standard error. Judging from the above graph, the errors may look heteroscedastic. However, the BP test statistic is 1.5797 and its p-value is 0.2088. We therefore fail to reject the homoscedasticity assumption, and the above standard errors should be valid.
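To make the mechanics of the BP test concrete, here is a minimal sketch on simulated data (all variable names and values are hypothetical). It computes the statistic as n times the R2 of a regression of the squared residuals on the regressor. This is the studentized form of the test, which, as far as we know, is what bptest from the lmtest package reports by default; treat that equivalence as an assumption.

```r
# Simulated, homoscedastic data (hypothetical values).
set.seed(123)
n <- 100
x <- runif(n, 500, 20000)
y <- 4 - 0.00003 * x + rnorm(n, sd = 1.8)
fit <- lm(y ~ x)

# Auxiliary regression of the squared residuals on the regressor.
u2  <- residuals(fit)^2
aux <- lm(u2 ~ x)

# LM statistic n * R^2, chi-square with 1 df (one regressor) under H0.
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
```

A large p-value means that we fail to reject homoscedasticity, as in the text above.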
The t-ratio for the slope coefficient is -1.1177, and the 5% critical value for a test of H0 : β1 = 0 against H1 : β1 < 0 is -1.6606 if we use the Student-t distribution with 98 degrees of freedom, or -1.6449 if we use the N(0,1) approximation. In both cases the t-ratio is above the critical value, so we fail to reject H0, which suggests that we do not observe convergence.
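The numbers in this one-sided test can be reproduced from the reported estimates. The sketch below uses the rounded coefficient and standard error printed above, so the t-ratio matches only up to rounding; the variable names are ours.

```r
b1 <- -0.00002544   # reported slope of gdp60
se <-  0.00002276   # reported standard error

t_ratio <- b1 / se              # about -1.118
crit_t  <- qt(0.05, df = 98)    # one-sided 5% critical value, Student t(98)
crit_z  <- qnorm(0.05)          # one-sided 5% critical value, N(0,1)

# Reject H0 in favour of beta1 < 0 only if the t-ratio is below the critical value.
reject <- t_ratio < crit_t
```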
We may want to try dropping the outlier. As we can see on the above graph, one country is completely separated from the group. It is possible that the result is driven entirely by this observation. A close look at the data shows that it is observation 47 that lies far away from the rest of the sample.
We can run the regression without that observation and see what happens:
gdpgrowth = 4.15649902 − 0.00005426 gdp60
            (0.26956398)∗∗∗ (0.00006409)
n = 99, R2 = 0.00734, SSR = 321.4226, adjusted R2 = −0.0029
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
The t-ratio for the slope coefficient is -0.8466, which is still greater than the critical value. We therefore fail to reject H0 and there is no evidence of convergence. Removing observation 47 does, however, make the slope more negative, as we can see on the graph below.
[Figure: "Convergence of GDP". Scatter plot of gdpgrowth against gdp60 (0 to 80000) with two regression lines. Legend: "With obs 47", "Without obs 47".]
2. The above model is the simplified version. In a more realistic model, the Solow model predicts conditional convergence. It says that countries with the same population growth, the same investment rate, and the same level of education, should eventually converge to the same GDP. The following is the modified model that we need to use for testing conditional convergence:
gdpgrowth = β0 + β1 log(gdp60) + β2 log(invest/100) + β3 log(popgrowth/100) + β4 log(school/100) + u
Estimate the model, report the result and interpret it. Test whether poor countries are growing faster than rich countries. Do the same graph as for the previous question and add the regression line for the average log(invest/100), log(popgrowth/100) and log(school/100). Discuss.
In order to compare the regression line in both models, we will rerun the regression from the previous question using the log of gdp60:
gdpgrowth = 3.9781053 − 0.0006657 log(gdp60)
            (1.4560)∗∗∗ (0.1884)
n = 100, R2 = 0, SSR = 326.2971, adjusted R2 = −0.0102
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
Notice that it does not affect the significance of the slope. Now we add the new variables and plot both lines on the same graph.
gdpgrowth = 21.3413 − 0.8689 log(gdp60) + 1.8665 log(invest/100)
            (2.3112)∗∗∗ (0.2086)∗∗∗ (0.3608)∗∗∗
          + 1.1368 log(popgrowth/100) + 0.8625 log(school/100)
            (0.2487)∗∗∗ (0.2415)∗∗∗
n = 100, R2 = 0.45848, SSR = 176.6952, adjusted R2 = 0.43568
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
[Figure: Scatter plot of gdpgrowth against log(gdp60) (6 to 11) with the regression lines of the two models. Legend: "First", "Second".]
The t-ratio for the slope coefficient is now -4.1664, which is well below the critical value. The coefficient is therefore significantly negative: we observe convergence when we control for characteristics such as the investment rate, population growth and schooling. It is also clear from the graph that the slope is much steeper in the second model. This result makes sense. We cannot expect a poor country with no education and a very low investment rate to converge to a rich country with high education and investment. The result implies that countries with the same investment rate, the same population growth, and the same level of education will eventually converge to the same GDP.
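The second line on the graph holds the three control variables at their sample means, so their contribution is absorbed into the intercept. Writing hats for the estimates of the second model and bars for sample means (a notation assumed here, not used elsewhere in this solution), the plotted line is:

```latex
\[
\widehat{gdpgrowth}
  = \underbrace{\hat\beta_0
      + \hat\beta_2\,\overline{\log(invest/100)}
      + \hat\beta_3\,\overline{\log(popgrowth/100)}
      + \hat\beta_4\,\overline{\log(school/100)}}_{a_1}
    + \hat\beta_1 \log(gdp60)
\]
```

The intercept a_1 corresponds to the quantity a1 computed in the code appendix, and the slope is the estimated coefficient of log(gdp60), that is, −0.8689.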
3. The problem with the above model is that it does not take into account other factors that may prevent poor countries from growing. There are in fact poor countries that are in some kind of poverty trap and do not converge. First test whether the coefficients of poor and rich countries are the same, where poor means having a GDP less than 1800 in 1960 (this is a Chow test).
The regression results are:
gdpgrowth = 29.3974 − 1.8463 log(gdp60) + 1.5000 log(invest/100)
            (6.0236)∗∗∗ (0.6198)∗∗∗ (0.4673)∗∗∗
          + 1.7098 log(popgrowth/100) + 0.9244 log(school/100)
            (0.8392)∗∗ (0.2992)∗∗∗
n = 48, R2 = 0.47426, SSR = 83.2425, adjusted R2 = 0.42536
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
for the low GDP countries, and
gdpgrowth = 22.4413 − 1.1852 log(gdp60) + 1.9679 log(invest/100)
            (3.2027)∗∗∗ (0.2785)∗∗∗ (0.6162)∗∗∗
          + 1.1193 log(popgrowth/100) + 0.1462 log(school/100)
            (0.2673)∗∗∗ (0.5717)
n = 52, R2 = 0.51554, SSR = 72.97009, adjusted R2 = 0.47431
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
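for the high GDP countries. The Chow test compares the SSR of the pooled model from Question 2 (176.6952) with the sum of the SSRs of the two subsamples (83.2425 + 72.97009 = 156.2126). The statistic is F = [(176.6952 − 156.2126)/5] / [156.2126/90] ≈ 2.36, which exceeds the 5% critical value of the F(5, 90) distribution (about 2.32). The coefficients are therefore significantly different at the 5% level.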
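The mechanics of a Chow test can be sketched on simulated data. Everything in the block below (variable names, group definition, coefficients) is hypothetical and unrelated to the Growth data. The restricted model is the pooled regression; the unrestricted SSR is the sum of the SSRs of the two subsample regressions.

```r
# Simulated data with different coefficients in the two groups (hypothetical).
set.seed(1)
n <- 100
x <- rnorm(n)
g <- x < 0                                  # group indicator
y <- ifelse(g, 1 + 2 * x, 3 - x) + rnorm(n)
d <- data.frame(y, x, g)

pooled <- lm(y ~ x, data = d)               # restricted: same coefficients
low    <- lm(y ~ x, data = d, subset = g)   # unrestricted, group 1
high   <- lm(y ~ x, data = d, subset = !g)  # unrestricted, group 2

k      <- length(coef(pooled))              # number of restrictions
ssr_r  <- sum(residuals(pooled)^2)
ssr_ur <- sum(residuals(low)^2) + sum(residuals(high)^2)
df2    <- n - 2 * k                         # df of the unrestricted model

chow_F <- ((ssr_r - ssr_ur) / k) / (ssr_ur / df2)
chow_p <- pf(chow_F, k, df2, lower.tail = FALSE)
```

The same statistic can be obtained by comparing the pooled model with a fully interacted one, for example anova(pooled, lm(y ~ x * g, data = d)).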
4. Now do the same with group 1 being countries with gdp60 < 1800 and literacy60 < 50 and group 2 being countries with gdp60 ≥ 1800 and literacy60 ≥ 50.
There is a typo in this question. The second group should be the complement of the first, which means gdp60 ≥ 1800 and/or literacy60 ≥ 50. If we don’t do it this way, the sample size is 84 for the unrestricted model and 100 for the restricted one, which would make the Chow test invalid. It is fine if you did not see it. It is my mistake. To get the opposite condition in R, we can put a ! (not) in front of the condition.
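A small sketch of the negation, on four made-up countries (the values are hypothetical), shows that !(A & B) selects exactly the observations with gdp60 ≥ 1800 or literacy60 ≥ 50, so the two groups partition the sample.

```r
# Hypothetical values covering the four possible cases.
gdp60      <- c(1000, 1000, 2500, 2500)
literacy60 <- c(30, 80, 30, 80)

group1 <- gdp60 < 1800 & literacy60 < 50
group2 <- !(gdp60 < 1800 & literacy60 < 50)   # complement of group 1

# De Morgan: not (A and B) is the same as (not A) or (not B).
same <- identical(group2, gdp60 >= 1800 | literacy60 >= 50)
```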
The regression results are:
gdpgrowth = 27.3871 − 1.7802 log(gdp60) + 1.2839 log(invest/100)
            (6.2349)∗∗∗ (0.6637)∗∗ (0.4797)∗∗
          + 1.5664 log(popgrowth/100) + 0.8023 log(school/100)
            (0.8647)∗ (0.3324)∗∗
n = 42, R2 = 0.40997, SSR = 71.72388, adjusted R2 = 0.34618
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
for the low GDP countries, and
gdpgrowth = 22.6940 − 1.0422 log(gdp60) + 2.5491 log(invest/100)
            (2.8464)∗∗∗ (0.2354)∗∗∗ (0.5644)∗∗∗
          + 1.2614 log(popgrowth/100) + 0.1462 log(school/100)
            (0.2590)∗∗∗ (0.5212)
n = 58, R2 = 0.53057, SSR = 83.41504, adjusted R2 = 0.49515
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
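for the high GDP countries. With the pooled SSR from Question 2 (176.6952) and the sum of the two group SSRs (71.72388 + 83.41504 = 155.1389), the Chow statistic is F = [(176.6952 − 155.1389)/5] / [155.1389/90] ≈ 2.50, which also exceeds the 5% critical value of the F(5, 90) distribution (about 2.32). The coefficients are again significantly different at the 5% level.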
5. Discuss the results you obtained in the last three questions. What can you conclude about the prediction of the Solow growth model?
The following table shows the coefficient of log(gdp60), its t-ratio and its one-sided p-value for the above models:

          Coef of gdp60   t-ratio   one-sided p-value
Q2        -0.8689         -4.1664   0.0000
Q3low     -1.8463         -2.9787   0.0024
Q3high    -1.1852         -4.2551   0.0000
Q4low     -1.7802         -2.6821   0.0054
Q4high    -1.0422         -4.4265   0.0000

All coefficients are significantly negative, which implies that we do detect conditional convergence (conditional means that we condition on investment, population growth and schooling). We do, however, observe that it is less significant for poor countries. The Chow tests suggest that the way poor countries converge is different from the way rich countries do. The graph also shows that the model does not fit the data very well: a lot of variability remains unexplained, and many very poor countries also have low growth. For a better analysis, we would therefore need to control for more factors that may be determinants of high growth.
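The one-sided p-values can be recovered from the reported t-ratios. The sketch below uses the rounded t-ratios and takes the degrees of freedom as the sample size minus the five estimated coefficients, which matches the values used in the code appendix; the variable names are ours.

```r
t_ratio <- c(Q2 = -4.1664, Q3low = -2.9787, Q3high = -4.2551,
             Q4low = -2.6821, Q4high = -4.4265)

# Sample sizes of the five regressions minus 5 estimated coefficients.
df <- c(100, 48, 52, 42, 58) - 5

# Left-tail probability: P(T <= t) under H0, for H1: coefficient < 0.
p_one_sided <- pt(t_ratio, df)
round(p_one_sided, 4)
```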
Project 2
We want to test whether an instructor's looks, by themselves, have an impact on students' evaluation scores in university courses. For that, we use a dataset collected from 2000 to 2002 at the University of Texas at Austin. We have the evaluation scores for 463 courses taught by 94 instructors. We also have some instructor characteristics, and a measure of beauty which was determined by a panel of students. More specifically, the variables are:
- minority: 1 if the instructor belongs to a minority (non-Caucasian) and 0 otherwise.
- age: Professor’s age in years
- male: 1 if the instructor is a man and 0 otherwise.
- single: 1 if it is a single credit course and 0 otherwise.
- beauty: Rating of the instructor’s physical appearance. The scale is centered to have a mean of zero. The range goes from -1.45 to 1.97.
- eval: The course evaluation (1 to 5).
- lower: 1 if the course is a lower level (first or second year) and 0 otherwise.
- native: 1 if the instructor is a native English speaker and 0 otherwise.
- tenure: 1 if the instructor is on tenure track and 0 otherwise.
- students: The number of students that participated in the evaluation.
- allstudents: The number of students enrolled in the course.
- prof: This is an instructor identifier, needed because the same instructor may have taught more than one course. You do not really need that variable for this project.
Table 1 provides summary statistics of the variables.
Table 1: Summary statistics of the variables

Statistic     N     Mean      St. Dev.   Min      Max
minority      463   0.138     0.346      0        1
age           463   48.365    9.803      29       73
beauty        463   0.00000   0.789      −1.450   1.970
eval          463   3.998     0.555      2.100    5.000
native        463   0.940     0.239      0        1
tenure        463   0.780     0.415      0        1
students      463   36.624    45.018     5        380
allstudents   463   55.177    75.073     8        581
male          463   0.579     0.494      0        1
single        463   0.058     0.235      0        1
lower         463   0.339     0.474      0        1
Before building our model, we want to get a sense of how students’ evaluations differ across groups. This is not meant to measure some kind of discrimination, but simply to see whether those variables are related to the evaluation score. Table 2 shows the results of regressions on different dummy variables. We did not include robust standard errors because the BP test fails to reject homoscedasticity in all cases. We can see that males and native English speakers tend to get better evaluation scores, while instructors who belong to a minority or are on a tenure track get lower evaluation scores on average. All differences are significant at the 1% level, except for the minority dummy variable, which is only significant at 10%.
Table 2: Difference in evaluation scores between some specific groups

Dependent variable: eval
                                (1)         (2)         (3)         (4)
male                            0.168∗∗∗
                                (0.052)
minority                                    −0.123∗
                                            (0.075)
native                                                  0.329∗∗∗
                                                        (0.107)
tenure                                                              −0.173∗∗∗
                                                                    (0.062)
Constant                        3.901∗∗∗    4.015∗∗∗    3.689∗∗∗    4.133∗∗∗
                                (0.039)     (0.028)     (0.104)     (0.055)
Observations                    463         463         463         463
R2                              0.022       0.006       0.020       0.017
Adjusted R2                     0.020       0.004       0.018       0.015
Residual Std. Error (df = 461)  0.549       0.554       0.550       0.551
F Statistic (df = 1; 461)       10.562∗∗∗   2.725∗      9.410∗∗∗    7.866∗∗∗
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Can we conclude from that result that there is some kind of discrimination toward some instructors based on gender or ethnicity? Certainly not. It is possible that the instructors’ abilities are related to the qualitative variables, and that it is through those abilities that we obtain a significant relation. For example, native speakers are easier for English-speaking students to follow. It is therefore possible that students give a better evaluation to these instructors simply because it is easier to succeed when the instructor is easy to understand. For the “tenure” variable, it is well known that instructors on tenure track are much more concerned about producing good research papers than about being good teachers. It is therefore possible that the negative coefficient reflects the fact that instructors on tenure track are simply bad teachers.
How about male? Can a man be a better instructor than a woman? Everything else being equal, probably not. But suppose the number of male instructors is much higher and that the University looks for some kind of parity between men and women. It might then accept women who are not as good as men simply to reach parity. If that’s the case, male instructors would be on average better than female instructors.
All of this is pure speculation. We simply went through some possible interpretations to make our point that the results from Table 2 do not say anything about possible discrimination.
The objective is to test whether beauty by itself is a determinant of the evaluation score. In other words, we want to test whether two instructors with identical abilities but different levels of beauty have different evaluation scores on average. If that’s the case, it would constitute evidence of discrimination based on looks.
We therefore have to control for factors that may affect students’ evaluations. We certainly have to control for age since it is likely to be correlated with beauty. Also, age reflects experience, which may also have an impact on teaching ability. The above results show that the qualitative variables are related to evaluation. Since the measure of beauty is subjective, it may vary with gender and ethnicity. We therefore want to control for these characteristics. The other two variables (native and tenure) are less likely to be related to beauty. In fact, if we regress beauty on native or tenure, their coefficients are not significant. Omitting them is therefore unlikely to affect the coefficient of beauty. We can make the same argument with single and lower. Those variables characterize the type of course, which may affect evaluation, but are not likely to be related to beauty. The coefficient of single, however, is slightly significant when we regress beauty on it. It may be because instructors teaching one-credit courses are younger, which may have an impact on their looks.
Another factor to consider is the proportion of students who filled out the evaluation form. If the proportion is too small, the evaluation, as a measure of the instructor’s ability, may be biased if the choice of not participating is related to the quality of the course. We have therefore created a variable “filled” equal to students/allstudents. A regression of beauty on filled shows a significant positive relation between these two variables. The evaluations are filled out in class, so filled also measures the proportion of students attending class. It is therefore probably related to the instructors’ ability as well.
We want to verify our hypothesis that native, tenure and lower have little impact on the coefficient of beauty because they are uncorrelated with it. Table 3 compares the regressions with and without those variables. Model (2) is better than model (1) in terms of adjusted R2, but the coefficients of beauty are very similar. It may therefore be justifiable not to include them in the regression. Our goal is not to maximize the adjusted R2 but to get a good measure of the effect of beauty on evaluation scores. The standard errors are also very close in both models, which implies that adding those variables does not improve efficiency much. If we only add native, there is a slight improvement in the standard error of the coefficient of beauty. We will therefore keep it, not because we need to control for that variable, but because it improves the efficiency of our estimate.
Table 3: First few Models

Dependent variable: eval
                      (1)             (2)             (3)
beauty                0.1410∗∗∗       0.1381∗∗∗       0.1411∗∗∗
                      (0.0316)        (0.0316)        (0.0315)
age                   −0.0016         −0.0017         −0.0015
                      (0.0026)        (0.0026)        (0.0026)
male                  0.1921∗∗∗       0.1992∗∗∗       0.1959∗∗∗
                      (0.0500)        (0.0506)        (0.0498)
minority              −0.2560∗∗∗      −0.1990∗∗∗      −0.2013∗∗∗
                      (0.0713)        (0.0759)        (0.0746)
filled                0.6582∗∗∗       0.6592∗∗∗       0.6564∗∗∗
                      (0.1442)        (0.1446)        (0.1435)
single                0.6157∗∗∗       0.5418∗∗∗       0.5896∗∗∗
                      (0.1047)        (0.1150)        (0.1048)
native                                0.2257∗∗        0.2440∗∗
                                      (0.1051)        (0.1031)
tenure                                −0.0498
                                      (0.0612)
lower                                 0.0307
                                      (0.0539)
Constant              3.4764∗∗∗       3.2887∗∗∗       3.2346∗∗∗
                      (0.1684)        (0.2129)        (0.1962)
Observations          463             463             463
R2                    0.1828          0.1946          0.1928
Adjusted R2           0.1721          0.1786          0.1803
Residual Std. Error   0.5049          0.5029          0.5023
                      (df = 456)      (df = 453)      (df = 455)
F Statistic           17.0015∗∗∗      12.1629∗∗∗      15.5206∗∗∗
                      (df = 6; 456)   (df = 9; 453)   (df = 7; 455)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Model (3) indicates the variables we want to control for. However, we need to make sure our model is correctly specified. First, the p-values of the BP and White tests are 0.5868 and 0.8295, respectively. We therefore fail to reject the homoscedasticity assumption. However, the p-value of the RESET test is 0.0372. We therefore seem to be missing some nonlinearity in our model.
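The RESET test adds powers of the fitted values to the model and tests whether they are jointly significant. A minimal sketch on simulated data follows; the data and names are hypothetical, and the joint F-test is done here with anova rather than with linearHypothesis as in the code appendix.

```r
# Simulated data where the true relation is nonlinear (hypothetical).
set.seed(42)
n <- 200
x <- runif(n, 0, 4)
y <- 1 + x^2 + rnorm(n)

fit  <- lm(y ~ x)                 # misspecified linear model
yhat <- fitted(fit)

# Augmented model with the square and cube of the fitted values.
aug <- lm(y ~ x + I(yhat^2) + I(yhat^3))

# Joint F-test that the two added terms have zero coefficients.
reset_p <- anova(fit, aug)[2, "Pr(>F)"]
```

A small p-value, as here, signals a functional form problem in the linear model.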
We tried adding age^2, but the coefficients of age and age^2 were both non-significant. However, adding age^2 together with the interactions between beauty and both age and age^2 improves the model substantially. All new variables are significant. The result is shown below.
eval = 4.5723672 + 1.7557121 beauty − 0.0623476 age + 0.0006706 I(age^2)
       (0.5873970)∗∗∗ (0.6951570)∗∗ (0.0253534)∗∗ (0.0002646)∗∗
     + 0.1929777 male − 0.1819307 minority + 0.6263906 filled
       (0.0488775)∗∗∗ (0.0735505)∗∗ (0.1444312)∗∗∗
     + 0.5913058 single + 0.2739869 native − 0.0862384 beauty:age
       (0.1033013)∗∗∗ (0.1010256)∗∗∗ (0.0307029)∗∗∗
     + 0.0010582 beauty:I(age^2)
       (0.0003279)∗∗∗
n = 463, R2 = 0.23641, SSR = 108.6126, adjusted R2 = 0.21951
(standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
We tried adding beauty^2 but the effect was negligible. We may also think that the effect of beauty is a function of gender but, again, the coefficient of the interaction was not significant. In fact, no interactions between beauty and the other dummy variables were significant. We therefore keep the above model. The p-values of the BP and White tests are 0.0179 and 0.1064, respectively. We therefore reject the homoscedasticity assumption with the BP test, but fail to reject it with the White test. Since we want to make sure our inference is valid, we report the result below using robust standard errors. Finally, the p-value of the RESET test is 0.4611. We therefore fail to reject the hypothesis that our model is correctly specified.
eval = 4.5723672 + 1.7557121 beauty − 0.0623476 age + 0.0006706 I(age^2)
       (0.5712212)∗∗∗ (0.6633627)∗∗∗ (0.0252393)∗∗ (0.0002638)∗∗
     + 0.1929777 male − 0.1819307 minority + 0.6263906 filled
       (0.0502832)∗∗∗ (0.0708159)∗∗ (0.1441700)∗∗∗
     + 0.5913058 single + 0.2739869 native − 0.0862384 beauty:age
       (0.1027906)∗∗∗ (0.0922291)∗∗∗ (0.0297017)∗∗∗
     + 0.0010582 beauty:I(age^2)
       (0.0003196)∗∗∗
n = 463, R2 = 0.23641, SSR = 108.6126, adjusted R2 = 0.21951
(robust standard errors in parentheses; ∗pv < 0.1; ∗∗pv < 0.05; ∗∗∗pv < 0.01)
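The robust standard errors above come from vcovHC in the sandwich package (see the code appendix). To show what such an estimator computes, here is a sketch of the HC3 sandwich formula in base R on simulated heteroscedastic data. The data and names are hypothetical, and the claim that HC3 is the default type of vcovHC is our understanding rather than something stated in this solution.

```r
# Simulated data whose error variance grows with x (hypothetical).
set.seed(7)
n <- 150
x <- runif(n, 1, 5)
y <- 2 + 0.5 * x + rnorm(n, sd = x)
fit <- lm(y ~ x)

X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)                      # leverage of each observation

bread  <- solve(crossprod(X))            # (X'X)^{-1}
meat   <- crossprod(X * (e / (1 - h)))   # X' diag(e^2 / (1 - h)^2) X
vc_hc3 <- bread %*% meat %*% bread       # HC3 sandwich estimator

se_robust <- sqrt(diag(vc_hc3))
se_ols    <- sqrt(diag(vcov(fit)))       # usual OLS standard errors
```

Comparing se_robust with se_ols shows how much the inference depends on the homoscedasticity assumption.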
In our final model, beauty has an effect on evaluation, but the effect is a function of age. Given the complexity of the model, it is hard to tell when the effect is positive and when it is negative. The following graph shows the effect for different ages, holding the other regressors at their sample means. As we can see, the effect gets larger as age increases. In fact, the effect seems negligible when age is below 50.
[Figure: "beauty*age effect plot". Predicted eval (vertical axis, 3.5 to 6.0) against beauty (horizontal axis, −1.0 to 1.5), one panel for each of age = 30, 40, 50, 60 and 70.]
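The partial effect of beauty implied by the final model is the derivative of the fitted equation with respect to beauty, that is, 1.7557121 − 0.0862384 age + 0.0010582 age^2. The sketch below evaluates it at the ages shown in the effect plot, using the reported (rounded) coefficients; the function name beauty_effect is ours.

```r
# Partial effect of beauty on eval implied by the reported coefficients.
beauty_effect <- function(age) {
  1.7557121 - 0.0862384 * age + 0.0010582 * age^2
}

ages <- c(30, 40, 50, 60, 70)
eff  <- setNames(beauty_effect(ages), ages)
round(eff, 3)
```

With these coefficients, the effect is close to zero around age 40 and rises to roughly 0.9 at age 70, which is in line with the plot.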
We can conclude that there is a relation between beauty and evaluation score. However, we cannot claim that this relation is causal. Individuals with better looks develop confidence, better social skills, and a larger social network (at least according to some psychologists). Those qualities affect the ability to communicate in front of a class. They also help instructors be appreciated by the students. We did not control for those qualities, which may affect the instructors’ abilities and therefore the students’ evaluations. We do have a variable prof. Controlling for who is teaching would be a way of controlling for ability. There are 94 instructors, so we would need to add 93 dummy variables. If we do it, the coefficient of beauty becomes negative and not significant. However, its standard error increases to 2.16. We therefore need to address the problem differently.
Code
Project 1
# Drop the observations with missing values
data <- na.omit(data)

# Question 1: unconditional convergence
res <- lm(gdpgrowth ~ gdp60, data=data)
w <- which(data$gdp60>60000)   # locates the outlier (observation 47)
res2 <- lm(gdpgrowth ~ gdp60, data=data[-47,])
res3 <- lm(gdpgrowth ~ log(gdp60), data=data)

library(lmtest)
bp1 <- bptest(res)
plot(gdpgrowth ~ gdp60, data=data, main="Convergence of GDP")
abline(res, col=2, lwd=2)

plot(gdpgrowth ~ gdp60, data=data, main="Convergence of GDP")
abline(res, col=2, lwd=2)
abline(res2, col=3, lwd=2)
legend("topleft", c("With obs 47", "Without obs 47"), col=2:3, lwd=2, lty=1)

# Question 2: conditional convergence
form <- gdpgrowth ~ log(gdp60) + log(invest/100) + log(popgrowth/100) + log(school/100)
res4 <- lm(form, data=data)

plot(gdpgrowth ~ log(gdp60), data=data)
abline(res3, col=2, lwd=2)
b <- coef(res4)
# Intercept of the second line: controls held at their sample means
a1 <- b[1]+b[3]*mean(log(data$invest/100))+b[4]*mean(log(data$popgrowth/100))+
    b[5]*mean(log(data$school/100))
abline(a1, b[2], col=3, lwd=2)
legend("topright", c("First", "Second"), col=c(2,3), lwd=2, lty=1)

# Question 3: Chow test, low versus high GDP
reslow <- lm(form, data=data, subset=gdp60<1800)
reshigh <- lm(form, data=data, subset=gdp60>=1800)
ssrUR <- sum(residuals(reslow)^2)+sum(residuals(reshigh)^2)
ssrR <- sum(residuals(res4)^2)   # restricted model: pooled regression with the same regressors
Fstat <- (ssrR-ssrUR)/ssrUR*(nobs(reslow)+nobs(reshigh)-10)/5
pv <- 1-pf(Fstat, 5, nobs(reslow)+nobs(reshigh)-10)

# Question 4: Chow test, group 1 versus its complement
reslow2 <- lm(form, data=data, subset=gdp60<1800 & literacy60<50)
reshigh2 <- lm(form, data=data, subset=!(gdp60<1800 & literacy60<50))
ssrUR <- sum(residuals(reslow2)^2)+sum(residuals(reshigh2)^2)
ssrR <- sum(residuals(res4)^2)   # restricted model: pooled regression with the same regressors
Fstat <- (ssrR-ssrUR)/ssrUR*(nobs(reslow2)+nobs(reshigh2)-10)/5
pv <- 1-pf(Fstat, 5, nobs(reslow2)+nobs(reshigh2)-10)

# Question 5: t-ratios and one-sided p-values of the coefficient of log(gdp60)
t1 <- summary(res4)$coef[2,3]
t2 <- summary(reslow)$coef[2,3]
t3 <- summary(reshigh)$coef[2,3]
t4 <- summary(reslow2)$coef[2,3]
t5 <- summary(reshigh2)$coef[2,3]
ans <- c(t1,t2,t3,t4,t5)
pv <- pt(ans, c(95, 43, 47, 37, 53))
b <- c(coef(res4)[2], coef(reslow)[2], coef(reshigh)[2],
       coef(reslow2)[2], coef(reshigh2)[2])
ans <- cbind(b, ans, pv)
colnames(ans) <- c("Coef of gdp60", "t-ratio", "one-sided p-value")
rownames(ans) <- c("Q2", "Q3low","Q3high","Q4low","Q4high")
library(xtable)
xtable(ans, digits=4)
Project 2
library(lmtest)   # bptest
library(car)      # linearHypothesis

# Recode the qualitative variables as 0/1 dummies
data$male <- as.numeric(data$gender=="male")
data$gender <- NULL
data$single <- as.numeric(data$credits=="single")
data$credits <- NULL
data$lower <- as.numeric(data$division=="lower")
data$division <- NULL
data$native <- as.numeric(data$native=="yes")
data$minority <- as.numeric(data$minority=="yes")
data$tenure <- as.numeric(data$tenure=="yes")

res <- lm(eval~age+I(age^2)+beauty+male+single+lower+native+minority+tenure, data=data)

# Table 2: eval on each dummy
res1 <- lm(eval~male, data=data)
res2 <- lm(eval~minority, data=data)
res3 <- lm(eval~native, data=data)
res4 <- lm(eval~tenure, data=data)

# Is beauty related to the candidate control variables?
bnres <- lm(beauty~native, data=data)
btres <- lm(beauty~tenure, data=data)
bsres <- lm(beauty~single, data=data)
blres <- lm(beauty~lower, data=data)
data$filled <- data$students/data$allstudents
bfres <- lm(beauty~filled, data=data)

# Table 3
form1 <- eval~beauty+age+male+minority+filled+single
form2 <- eval~beauty+age+male+minority+filled+single+native+tenure+lower
form3 <- eval~beauty+age+male+minority+filled+single+native
reg1 <- lm(form1, data=data)
reg2 <- lm(form2, data=data)
reg3 <- lm(form3, data=data)

# BP, White (fitted-value form) and RESET tests for model (3)
yhat <- fitted(reg3)
u2 <- residuals(reg3)^2
bp1 <- bptest(reg3)
bp2 <- lm(u2~yhat+I(yhat^2))
bp2 <- linearHypothesis(bp2, names(coef(bp2))[-1])
yhat2 <- yhat^2
yhat3 <- yhat^3
form4 <- eval~beauty+age+male+minority+filled+single+native+yhat2+yhat3
reset <- lm(form4, data=data)
reset <- linearHypothesis(reset, c("yhat2","yhat3"))

# Final model: beauty interacted with age and age^2
form5 <- eval~beauty*(age+I(age^2))+male+minority+filled+single+native
reg5 <- lm(form5, data=data)

# Same tests for the final model
yhat <- fitted(reg5)
u2 <- residuals(reg5)^2
bp1 <- bptest(reg5)
bp2 <- lm(u2~yhat+I(yhat^2))
bp2 <- linearHypothesis(bp2, names(coef(bp2))[-1])
yhat2 <- yhat^2
yhat3 <- yhat^3
form6 <- eval~beauty*(age+I(age^2))+male+minority+filled+single+native+yhat2+yhat3
reset <- lm(form6, data=data)
reset <- linearHypothesis(reset, c("yhat2","yhat3"))

# Robust standard errors
library(sandwich)
se <- sqrt(diag(vcovHC(reg5)))

# Effect plot of beauty by age
library(effects)
plot(Effect(c("beauty","age"), reg5))