————————————————————————
STAT 341/641: Intro to EDA and Statistical Computing
LAB #5: Loops and the Jackknife
TEACHING ASSISTANT: “Fill in the name of your TA”
————————————————————————
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
DIRECTIONS: You will use the lab time to complete this assignment.
————————————————————————
TASK: ANALYZE THE GAPMINDER DATA
Install and load the gapminder data.
#install.packages("gapminder")
library(gapminder)
1: Compute the average life expectancy and GDP per capita by country using the data from 1987 to 2007. Plot these values in a scatterplot. Color the points by the continent to which they belong.
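As a hint, the filter-then-average-then-plot pattern can be sketched in base R on a small made-up data frame. The names `toy`, `unit`, `region`, `x`, and `y` are hypothetical stand-ins, not columns of the gapminder data, and `aggregate()` is only one of several reasonable ways to compute group means.

```r
# Toy data standing in for the real dataset (all names and values are made up).
toy <- data.frame(
  unit   = rep(c("A", "B", "C", "D"), each = 3),
  region = rep(c("North", "North", "South", "South"), each = 3),
  year   = rep(c(1997, 2002, 2007), times = 4),
  x      = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
  y      = c(50, 52, 54, 60, 61, 62, 70, 72, 74, 40, 45, 50)
)

# Keep only the years of interest, then average x and y within each unit.
recent <- subset(toy, year >= 2002)
unit_means <- aggregate(cbind(x, y) ~ unit + region, data = recent, FUN = mean)

# Scatterplot of the per-unit averages, colored by region.
region_f <- factor(unit_means$region)
plot(unit_means$x, unit_means$y, col = as.integer(region_f), pch = 19,
     xlab = "mean x", ylab = "mean y")
legend("topleft", legend = levels(region_f),
       col = seq_along(levels(region_f)), pch = 19)
```

Keeping the grouping variable `region` in the formula carries it through to the summary table, so it is still available for coloring the points.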
SOLUTION:
2: Use the Mahalanobis distance method to detect outliers in the output from question one. Which countries would you call outliers? Why?
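As a hint, one common way to apply the Mahalanobis distance is sketched below on simulated data. The matrix `pts` and the planted extreme point are made up for illustration, and the 97.5% chi-squared cutoff is an assumed convention rather than a rule set by this lab; you should justify whatever threshold you choose.

```r
set.seed(1)
# Simulated two-column data (hypothetical), plus one deliberately extreme point.
pts <- cbind(x = rnorm(50), y = rnorm(50))
pts <- rbind(pts, c(8, -8))

# mahalanobis() returns SQUARED distances from the supplied center.
d2 <- mahalanobis(pts, center = colMeans(pts), cov = cov(pts))

# Under approximate bivariate normality, d2 is roughly chi-squared with 2 df.
cutoff <- qchisq(0.975, df = ncol(pts))
flagged <- which(d2 > cutoff)
flagged
```

Note that the extreme point is itself included when estimating the center and covariance, which shrinks its distance somewhat; this is one reason the choice of cutoff deserves a sentence of explanation.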
SOLUTION:
3: Using the full dataset, implement the jackknife to identify which countries have a large influence on the $\beta_{\text{gdpPercap}}$ parameter in the regression equation
$$\text{lifeExp}_i = \alpha + \beta_{\text{gdpPercap}}\,\text{gdpPercap}_i + \beta_{\text{pop}}\,\text{pop}_i + \epsilon_i.$$
To do this, write a for-loop. In each iteration of the loop, drop one country and compute the OLS coefficients on the remaining data. Then plot the resulting estimates of $\beta_{\text{gdpPercap}}$ on the y-axis against the name of the dropped country on the x-axis.
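As a hint, the leave-one-group-out loop can be sketched on simulated data as follows. The data frame `sim`, the groups `g1` to `g5`, and the predictors `x1` and `x2` are hypothetical; group `g5` is given a different slope on purpose so that it shows up as influential.

```r
set.seed(2)
# Simulated grouped data (hypothetical names); group "g5" gets a different slope.
groups <- paste0("g", 1:5)
sim <- data.frame(
  group = rep(groups, each = 20),
  x1    = rnorm(100),
  x2    = rnorm(100)
)
slope <- ifelse(sim$group == "g5", 5, 1)
sim$y <- 2 + slope * sim$x1 + 0.5 * sim$x2 + rnorm(100, sd = 0.1)

# Fit on all of the data, for reference.
full_fit <- lm(y ~ x1 + x2, data = sim)

# Leave-one-group-out loop: refit the OLS model without each group in turn.
beta_x1 <- numeric(length(groups))
names(beta_x1) <- groups
for (g in groups) {
  fit <- lm(y ~ x1 + x2, data = sim[sim$group != g, ])
  beta_x1[g] <- coef(fit)["x1"]
}

# Plot the leave-one-out estimates with the dropped group's name on the x-axis.
plot(seq_along(beta_x1), beta_x1, xaxt = "n", pch = 19,
     xlab = "dropped group", ylab = "estimate of x1 coefficient")
axis(1, at = seq_along(beta_x1), labels = names(beta_x1), las = 2)
abline(h = coef(full_fit)["x1"], lty = 2)  # full-data estimate
```

A group is influential when dropping it moves the estimate far from the dashed full-data line. With many groups, rotated axis labels (`las = 2`) or labeling only the extreme points keeps the plot readable.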
SOLUTION:
4: Write a for-loop to sample N = 1,704 points with replacement from the data. Do this R = 250 times. In each iteration, compute the mean of the population variable (pop). Visualize the 250 means with a boxplot. Then compute the variance of the means and compare it to the standard estimate for the variance of the sample mean,
$$\frac{\hat{\sigma}^2}{N}.$$
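As a hint, the resampling loop and the comparison can be sketched on a simulated vector. Here `v` is a made-up skewed sample, and its length of 200 is arbitrary; in the lab, N and the variable come from the data.

```r
set.seed(3)
# Simulated skewed sample standing in for the real variable (hypothetical).
v <- rexp(200, rate = 1)
N <- length(v)
R <- 250

# Resample N values with replacement R times, storing each resample's mean.
boot_means <- numeric(R)
for (r in seq_len(R)) {
  boot_means[r] <- mean(sample(v, size = N, replace = TRUE))
}

boxplot(boot_means, ylab = "resampled mean")

# Variance of the resampled means versus the plug-in estimate var(v) / N.
c(resampled = var(boot_means), plug_in = var(v) / N)
```

The two numbers should be of similar size but not identical: the loop uses only R = 250 resamples, and `var()` divides by N - 1 rather than N.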
SOLUTION:
————————————————————————