STAT 318 Exam 2 – Take home portion
Enter all your answers on Canvas. To access the test, go to quizzes and click on Exam 2 Take Home. Be
sure that everything you enter (especially any R code or output that you have copied and pasted) is easy
to read. If I have a hard time reading what you have put in for an answer, I may deduct points on that
problem. Also be sure to read each question carefully, as I sometimes ask for more than one answer per
question. There is no time limit on the test, so you can save your answers and work on it later if you
want. Be aware, however, that the test will no longer be available after 12:00am on Sunday, April 23. If
you run into any problems with R, entering answers on Canvas, or understanding a question, please let
me know.
The file beetle.txt contains data on the number of eggs laid by a certain species of leaf beetle under
certain conditions. The beetles in the experiment were randomly assigned to cages. If TRT=I, then the
cage contained one female and one male. If TRT=G, then the cage contained 5 males and 5 females.
Each cage was assigned a temperature, and the response variable is the number of eggs laid per female.
Use these data to answer the following questions.
1. Fit a Poisson regression model with Eggs as the response and Temp and TRT as the explanatory
variables. Report the summary() output. (4 pts)
2. Predict the average number of eggs laid per female for the following two scenarios:
1 male and 1 female in a cage with temperature 21
5 males and 5 females in a cage with temperature 21
What does this tell you about how the number of eggs laid per female is related to how many
beetles there are in a cage when the temperature is at 21? (6 pts)
3. Find and interpret the estimated percentage change in the mean number of eggs laid per female
for a 2-unit increase in temperature. (5 pts)
4. Calculate and interpret the 95% confidence interval for the percentage change you calculated in
question 3. (5 pts)
5. Split the data into training and validation sets (use 80% for the training set and 20% for the
validation set). For this question, I want to see the R code you used. Do not show me the
contents of the two sets, but do show me the sample sizes for each of the sets. (4 pts)
6. Fit the model from question 1 twice, once using the training set and once using the validation
set. Report both sets of coefficients. What does comparing the coefficients tell you about the
model? (5 pts)
7. Evaluate the predictive capability of this model. Be sure to include all three steps discussed in
class (be sure to comment on how these relate to the predictive capability of the model). Note
that you will need to save your plot and upload it into question 7a on Canvas. You can enter the
other two parts into question 7b. (6 pts)
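For reference, the kind of R workflow that questions 1 through 6 call for can be sketched as follows. This is only an illustration under stated assumptions: the data frame sim is simulated (it is not beetle.txt), the coefficients used to generate it are made up, and the confidence interval shown is a Wald interval, which may differ from the method covered in class. The column names Eggs, Temp, and TRT come from the problem statement; every other name is hypothetical. The three evaluation steps for question 7 are not described in this handout, so they are not shown.

```r
# Simulated stand-in for beetle.txt (the real file is NOT used here).
set.seed(318)
n <- 100
sim <- data.frame(
  Temp = sample(c(21, 24, 27), n, replace = TRUE),
  TRT  = factor(sample(c("G", "I"), n, replace = TRUE))
)
# Made-up coefficients, chosen only so the simulated counts have some signal.
sim$Eggs <- rpois(n, lambda = exp(1 + 0.05 * sim$Temp + 0.4 * (sim$TRT == "I")))

# Poisson regression with a log link (the default link for family = poisson).
fit <- glm(Eggs ~ Temp + TRT, family = poisson, data = sim)

# Predicted mean response at Temp = 21 for each cage type.
new_cages <- data.frame(
  Temp = 21,
  TRT  = factor(c("I", "G"), levels = levels(sim$TRT))
)
pred <- predict(fit, newdata = new_cages, type = "response")

# Percentage change in the mean for a 2-unit increase in Temp, with a
# Wald 95% interval built on the log scale and then transformed.
b  <- coef(fit)["Temp"]
se <- sqrt(diag(vcov(fit)))["Temp"]
pct_change <- 100 * (exp(2 * b) - 1)
pct_ci     <- 100 * (exp(2 * (b + c(-1, 1) * qnorm(0.975) * se)) - 1)

# 80/20 split into training and validation rows.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- sim[train_idx, ]
valid <- sim[-train_idx, ]
c(train = nrow(train), valid = nrow(valid))
```

Calling summary(fit) prints the coefficient table, and the same glm() call can be refit with data = train or data = valid to compare coefficients across the two sets.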
The file Haircut.csv contains data on 49 individuals. We have information on gender and the price each
individual paid for his or her most recent haircut. Use these data to answer the following questions.
8. Using Cost as the response, construct two box plots, one for males and one for females.
Compare the plots with respect to shape, center, and variability. Is this what you would expect
to see for these data? Why or why not? (7 pts)
9. Using a permutation test, complete the last three steps for the following hypothesis test:
𝐻0: 𝜇𝐹 − 𝜇𝑀 = 0 vs. 𝐻𝑎: 𝜇𝐹 − 𝜇𝑀 > 0
𝛼 = 0.05
Note that 𝜇𝐹 and 𝜇𝑀 are the average haircut prices for females and males, respectively. Finish
the hypothesis test AND give me the code you used to calculate the approximate p-value. (8 pts)
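A permutation test for a one-sided difference in means can be sketched in R as below. This is a minimal illustration, not the required solution: the data frame hair is simulated (it is not Haircut.csv), the column name Gender and its codes "F" and "M" are assumptions, and the number of permutations is an arbitrary choice. Only the column name Cost comes from the problem statement.

```r
# Simulated stand-in for Haircut.csv; all values are invented.
set.seed(318)
hair <- data.frame(
  Gender = rep(c("F", "M"), times = c(25, 24)),
  Cost   = c(rgamma(25, shape = 4, scale = 12),
             rgamma(24, shape = 4, scale = 6))
)

# Difference in sample means, female minus male.
diff_means <- function(cost, gender) {
  mean(cost[gender == "F"]) - mean(cost[gender == "M"])
}
observed <- diff_means(hair$Cost, hair$Gender)

# Shuffle the group labels many times to approximate the distribution
# of the statistic when the null hypothesis is true.
n_perm <- 10000
perm_diffs <- replicate(n_perm, diff_means(hair$Cost, sample(hair$Gender)))

# Approximate one-sided p-value for Ha: mu_F - mu_M > 0.
p_value <- mean(perm_diffs >= observed)
p_value
```

The p-value is the proportion of shuffled differences at least as large as the observed one. Some texts add 1 to both the numerator and the denominator so that the estimate is never exactly zero.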