Assignment 10, Question 3
suppressMessages(library("AER"))  # AER also attaches lmtest (coeftest) and car (hccm)
Part (a)
By the result of Question 2(a), the true value of \(\beta\) is given by \(\beta=\left(EX_{i}X_{i}^{\prime}\right)^{-1}EX_{i}g\left(X_{i}\right)\). In this case, \(EX_{i}X_{i}^{\prime}=\begin{pmatrix} 1 & E X_{i,2} \\ E X_{i,2} & E X_{i,2}^2\end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\). Next, \(EX_{i}g\left(X_{i}\right)=\begin{pmatrix} E X_{i,2}^3 \\ EX_{i,2}^4 \end{pmatrix}\). By the symmetry of the standard normal distribution around zero, \(E X_{i,2}^3=0\). To compute \(EX_{i,2}^4\), we can use the MGF of the \(N(0,1)\) distribution: \(M(t)=\exp(t^2/2)\). The fourth derivative of the MGF at \(t=0\) equals \(3\), so \(EX_{i,2}^4=3\). We therefore have \(\beta=\begin{pmatrix} 0\\ 3\end{pmatrix}\).
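The fourth-moment value can be checked directly from the power-series expansion of the MGF:

\[
M(t)=e^{t^{2}/2}=\sum_{k=0}^{\infty}\frac{(t^{2}/2)^{k}}{k!}=1+\frac{t^{2}}{2}+\frac{t^{4}}{8}+\cdots,
\qquad
EX_{i,2}^{4}=M^{(4)}(0)=4!\cdot\frac{1}{8}=3.
\]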
Part (b)
Custom function to generate data:
data_sim <- function(n){
  x2 <- rnorm(n,0,1)      # regressor: standard normal
  v <- runif(n,-10,10)    # error term: uniform on (-10,10)
  y <- x2^3 + v           # dependent variable: g(x)=x^3 plus noise
  data <- list(Y=y, X=x2)
  return(data)
}

Generate data:

D = data_sim(2000)
y = D$Y
x2 = D$X

Part (c)
Run the OLS regression:

m = lm(y ~ x2)
m$coefficients

## (Intercept)          x2
## -0.08369941  3.05545370
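As a cross-check (not part of the original solution), the same estimates can be obtained from the sample analogue of the formula in part (a), \(\hat{\beta}=\left(\sum_i X_iX_i^{\prime}\right)^{-1}\sum_i X_iY_i\):

X = cbind(1, x2)                  # regressor matrix with an intercept column
solve(t(X) %*% X, t(X) %*% y)     # OLS via the normal equations; matches m$coefficients above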
Part (d)
Homoskedastic standard errors:

coeftest(m)

##
## t test of coefficients:
##
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.083699   0.138117  -0.606   0.5446
## x2           3.055454   0.138488  22.063   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Heteroskedasticity-robust standard errors:

coeftest(m, vcov=hccm(m, type="hc0"))

##
## t test of coefficients:
##
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.083699   0.137884  -0.607   0.5439
## x2           3.055454   0.179058  17.064   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

• The heteroskedasticity-robust standard error for the slope parameter is larger.

Part (e)
A grid of values for the regressor \(X_{i,2}\):

grid = seq(-4,4,0.05)

The corresponding values of \(g(X_i)\):

g = grid^3

The regression line with the true coefficients \(\beta_1=0\) and \(\beta_2=3\):

reg_line = 0 + 3*grid

The estimated regression line:

est_reg_line = summary(m)$coefficients[1,1] + summary(m)$coefficients[2,1]*grid

Plotting:

plot(grid, g, type="l", col="red", ylim=c(-70,70), xlab="regressor", ylab="dependent variable")
lines(grid, reg_line, col="blue")
lines(grid, est_reg_line, col="black")
legend(0, -20, legend=c("True function","True regression","Estimated regression"), col=c("red","blue","black"), lty=1)

• The linear approximation appears to work well for values of the regressor in the (-2,2) range.
• Since the regressor has a standard normal distribution, most of the observations fall within that range.

Part (f)
Plotting the squared residuals against the regressor:

plot(x2, (m$residuals)^2)

• The residuals appear heteroskedastic: the second moment of the residuals as a function of the regressor is higher for larger positive or negative values of the regressor.
• The residuals \(U_i\) include the approximation error \(g(X_i)-X_i^{\prime}\beta\). According to the results of Question 2(d), \(E(U_i^2\mid X_i)\) depends on \((g(X_i)-X_i^{\prime}\beta)^2\). From the graph in part (e), the magnitude of the approximation error is larger for larger positive/negative values of the regressor. This explains the larger \(\hat{U}_i^2\) for larger positive/negative values of the regressor.

Part (g)

R = 10^4                  # number of Monte Carlo replications
n = 20                    # sample size in each replication
T = rep(0,R)
for (r in 1:R){
  data = data_sim(n)
  m = lm(data$Y ~ data$X)
  ct = coeftest(m, vcov=hccm(m, type="hc0"))
  T[r] = (ct[2,1]-3)/ct[2,2]    # robust t-statistic for H0: beta_2 = 3
}

Plot the distribution:

low = min(T)
high = max(T)
B = max(-low,high) + 0.2
hist(T, breaks=seq(-B,B,0.2), xlab="T-statistic values", main="The simulated distribution of the T statistic", freq=FALSE, ylim=c(0,0.4))
x = seq(-6,6,0.01)
f = exp(-x^2/2)/sqrt(2*pi)    # standard normal density
lines(x, f, col="red")

• The simulated distribution of \(T\) has thicker tails than the standard normal distribution.
• Moreover, the distribution of \(T\) is also skewed to the left (quantified in the sketch below).
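The left skew can be quantified by the sample skewness of the simulated statistics (a minimal sketch using base R only; the original solution does not report this number):

skew = mean((T - mean(T))^3)/sd(T)^3   # sample skewness; a negative value indicates left skew
skew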
Part (h)

alpha = c(0.01,0.05,0.10)
P_right = rep(0,3)
P_left = rep(0,3)
for (j in 1:3){
  P_right[j] = sum(T > qnorm(1-alpha[j]))/R
  P_left[j] = sum(T < -qnorm(1-alpha[j]))/R
}

Simulated probabilities of the event \(T>z_{1-\alpha}\):
cbind(alpha,P_right)
## alpha P_right
## [1,] 0.01 0.0320
## [2,] 0.05 0.0771
## [3,] 0.10 0.1226
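The loop above also stores the left-tail frequencies in P_left; they can be displayed the same way (a minimal sketch; the corresponding output is not reproduced here):

cbind(alpha, P_left)    # simulated probabilities of the event T < -z_{1-alpha}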
• For both events, \(T< -z_{1-\alpha}\) and \(T>z_{1-\alpha}\), the simulated probabilities exceed \(\alpha\). The deviations are of much larger magnitude for \(T< -z_{1-\alpha}\), consistent with the left skew of the simulated distribution.

Part (i)
Repeat the simulation with a larger sample size:

R = 10^4
n = 2000
T = rep(0,R)
for (r in 1:R){
  data = data_sim(n)
  m = lm(data$Y ~ data$X)
  ct = coeftest(m, vcov=hccm(m, type="hc0"))
  T[r] = (ct[2,1]-3)/ct[2,2]
}
for (j in 1:3){
  P_right[j] = sum(T > qnorm(1-alpha[j]))/R
  P_left[j] = sum(T < -qnorm(1-alpha[j]))/R
}

Simulated probabilities of the event \(T>z_{1-\alpha}\):
cbind(alpha,P_right)
## alpha P_right
## [1,] 0.01 0.0040
## [2,] 0.05 0.0366
## [3,] 0.10 0.0796
• The simulated distribution of \(T\) is still somewhat skewed to the left, but to a much smaller extent.
• The simulated probabilities for the tail events are now much closer to the values of \(\alpha\).
• The normal approximation appears to be much more accurate.
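As a further check on this conclusion, the simulated statistics can be compared against normal quantiles (a minimal sketch, not part of the original solution):

qqnorm(T)               # sample quantiles of T against N(0,1) quantiles
qqline(T, col="red")    # near-linearity indicates a good normal approximation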