Assignment 10, Question 3
suppressMessages(library("AER"))  # AER also attaches lmtest (coeftest) and car (hccm)
Part (a)
By the result of Question 2(a), the true value of \(\beta\) is given by \(\beta=\left(EX_{i}X_{i}^{\prime}\right)^{-1}EX_{i}g\left(X_{i}\right)\). In this case, \(EX_{i}X_{i}^{\prime}=\begin{pmatrix} 1 & E X_{i,2} \\ E X_{i,2} & E X_{i,2}^2\end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\). Next, \(EX_{i}g\left(X_{i}\right)=\begin{pmatrix} E X_{i,2}^3 \\ EX_{i,2}^4 \end{pmatrix}\). By the symmetry of the standard normal distribution around zero, \(E X_{i,2}^3=0\). To compute \(EX_{i,2}^4\), we can use the MGF of the \(N(0,1)\) distribution: \(M(t)=\exp(t^2/2)\). The fourth derivative of the MGF at \(t=0\) equals \(3\), so \(EX_{i,2}^4=3\). We therefore have \(\beta=\begin{pmatrix} 0\\ 3\end{pmatrix}\).
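The fourth-moment value can be checked directly from the power-series expansion of the MGF:

\[
M(t)=e^{t^{2}/2}=\sum_{k=0}^{\infty}\frac{(t^{2}/2)^{k}}{k!}=1+\frac{t^{2}}{2}+\frac{t^{4}}{8}+\cdots,
\qquad
EX_{i,2}^{4}=M^{(4)}(0)=4!\cdot\frac{1}{8}=3.
\]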
Part (b)
Custom function to generate data:
data_sim <- function(n){
  x2 <- rnorm(n,0,1)      # regressor: standard normal
  v <- runif(n,-10,10)    # error term: uniform on (-10,10)
  y <- x2^3 + v           # dependent variable: g(x)=x^3 plus noise
  data <- list(Y=y, X=x2)
  return(data)
}

Generate data:

D = data_sim(2000)
y = D$Y
x2 = D$X

Part (c)
Run the OLS regression:

m = lm(y ~ x2)
m$coefficients

## (Intercept)          x2
## -0.08369941  3.05545370
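As a cross-check (not part of the original solution), the same estimates can be obtained from the sample analogue of the formula in part (a), \(\hat{\beta}=\left(\sum_i X_iX_i^{\prime}\right)^{-1}\sum_i X_iY_i\):

X = cbind(1, x2)                  # regressor matrix with an intercept column
solve(t(X) %*% X, t(X) %*% y)     # OLS via the normal equations; matches m$coefficients above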
Part (d)
Homoskedastic standard errors:

coeftest(m)

##
## t test of coefficients:
##
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.083699   0.138117  -0.606   0.5446
## x2           3.055454   0.138488  22.063   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Heteroskedasticity-robust standard errors:

coeftest(m, vcov=hccm(m, type="hc0"))

##
## t test of coefficients:
##
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.083699   0.137884  -0.607   0.5439
## x2           3.055454   0.179058  17.064   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

• The heteroskedasticity-robust standard error for the slope parameter is larger.

Part (e)
A grid of values for the regressor \(X_{i,2}\):

grid = seq(-4,4,0.05)

The corresponding values of \(g(X_i)\):

g = grid^3

The regression line with the true coefficients \(\beta_1=0\) and \(\beta_2=3\):

reg_line = 0 + 3*grid

The estimated regression line:

est_reg_line = summary(m)$coefficients[1,1] + summary(m)$coefficients[2,1]*grid

Plotting:

plot(grid, g, type="l", col="red", ylim=c(-70,70), xlab="regressor", ylab="dependent variable")
lines(grid, reg_line, col="blue")
lines(grid, est_reg_line, col="black")
legend(0, -20, legend=c("True function","True regression","Estimated regression"), col=c("red","blue","black"), lty=1)

• The linear approximation appears to work well for values of the regressor in the (-2,2) range.
• Since the regressor has a standard normal distribution, most of the observations fall within that range.

Part (f)
Plotting the squared residuals against the regressor:

plot(x2, (m$residuals)^2)

• The residuals appear heteroskedastic: the second moment of the residuals as a function of the regressor is higher for larger positive or negative values of the regressor.
• The residuals \(U_i\) include the approximation error \(g(X_i)-X_i^{\prime}\beta\). According to the results of Question 2(d), \(E(U_i^2\mid X_i)\) depends on \((g(X_i)-X_i^{\prime}\beta)^2\). From the graph in part (e), the magnitude of the approximation error is larger for larger positive/negative values of the regressor. This explains the larger \(\hat{U}_i^2\) for larger positive/negative values of the regressor.

Part (g)

R = 10^4                  # number of Monte Carlo replications
n = 20                    # sample size in each replication
T = rep(0,R)
for (r in 1:R){
  data = data_sim(n)
  m = lm(data$Y ~ data$X)
  ct = coeftest(m, vcov=hccm(m, type="hc0"))
  T[r] = (ct[2,1]-3)/ct[2,2]    # robust t-statistic for H0: beta_2 = 3
}

Plot the distribution:

low = min(T)
high = max(T)
B = max(-low,high) + 0.2
hist(T, breaks=seq(-B,B,0.2), xlab="T-statistic values", main="The simulated distribution of the T statistic", freq=FALSE, ylim=c(0,0.4))
x = seq(-6,6,0.01)
f = exp(-x^2/2)/sqrt(2*pi)    # standard normal density
lines(x, f, col="red")

• The simulated distribution of \(T\) has thicker tails than the standard normal distribution.
• Moreover, the distribution of \(T\) is also skewed to the left (quantified in the sketch below).
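The left skew can be quantified by the sample skewness of the simulated statistics (a minimal sketch using base R only; the original solution does not report this number):

skew = mean((T - mean(T))^3)/sd(T)^3   # sample skewness; a negative value indicates left skew
skew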
Part (h)

alpha = c(0.01,0.05,0.10)
P_right = rep(0,3)
P_left = rep(0,3)
for (j in 1:3){
  P_right[j] = sum(T > qnorm(1-alpha[j]))/R
  P_left[j] = sum(T < -qnorm(1-alpha[j]))/R
}

Simulated probabilities of the event \(T>z_{1-\alpha}\):
cbind(alpha,P_right)
## alpha P_right
## [1,] 0.01 0.0320
## [2,] 0.05 0.0771
## [3,] 0.10 0.1226
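The loop above also stores the left-tail frequencies in P_left; they can be displayed the same way (a minimal sketch; the corresponding output is not reproduced here):

cbind(alpha, P_left)    # simulated probabilities of the event T < -z_{1-alpha}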
• For both events, \(T< -z_{1-\alpha}\) and \(T>z_{1-\alpha}\), the simulated probabilities exceed \(\alpha\). The deviations are of much larger magnitude for \(T< -z_{1-\alpha}\), consistent with the left skew of the simulated distribution.

Part (i)
Repeat the simulation with a larger sample size:

R = 10^4
n = 2000
T = rep(0,R)
for (r in 1:R){
  data = data_sim(n)
  m = lm(data$Y ~ data$X)
  ct = coeftest(m, vcov=hccm(m, type="hc0"))
  T[r] = (ct[2,1]-3)/ct[2,2]
}
for (j in 1:3){
  P_right[j] = sum(T > qnorm(1-alpha[j]))/R
  P_left[j] = sum(T < -qnorm(1-alpha[j]))/R
}

Simulated probabilities of the event \(T>z_{1-\alpha}\):
cbind(alpha,P_right)
## alpha P_right
## [1,] 0.01 0.0040
## [2,] 0.05 0.0366
## [3,] 0.10 0.0796
• The simulated distribution of \(T\) is still somewhat skewed to the left, but to a much smaller extent.
• The simulated probabilities for the tail events are now much closer to the values of \(\alpha\).
• The normal approximation appears to be much more accurate.
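As a further check on this conclusion, the simulated statistics can be compared against normal quantiles (a minimal sketch, not part of the original solution):

qqnorm(T)               # sample quantiles of T against N(0,1) quantiles
qqline(T, col="red")    # near-linearity indicates a good normal approximation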