---
title: "STAT 2150 - Assignment 3"
geometry: margin=2.25cm
output:
  html_notebook: default
highlight: tango
theme: united
fontsize: 11pt
---

## Instructions
1. Assignments must be submitted to Crowdmark before 11:59 PM on Friday, March 20. Late assignments will be graded but will receive a score of 0.
2. You must submit one PDF file that includes your answers to the questions and the R code, written solely by you.
3. **How to prepare the pdf document:**
    a) Use the scaffolding provided here.
    b) Make sure each code chunk runs properly.
    c) Knit the document to a PDF file (click the down arrow beside "Preview" above and click "Knit to PDF"). If you have not yet been able to knit to PDF, come see Dr. Gerstein or your TA, or use the UMLearn Discussion board. You can knit to HTML and then save to PDF, but you are **strongly** encouraged to learn how to knit to PDF.
    d) The PDF will be saved in the same folder as this .Rmd file. Before you upload the assignment, please rename the PDF file "LabSection_LastName_StudentNumber_A3.pdf" (e.g., B01_Gerstein_01234456_A3.pdf).
4. **In your submissions, make sure that:**
    a) you comment on the results from R outputs whenever you are asked to do so.
    b) your plots and graphs are properly labeled.
5. **Submit on Crowdmark**
\newpage

*Note: use Preview or Knit to display the formulas below properly.*

## Question 1

Let $X_1,X_2,\ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$.
\bigskip
In the case where $\sigma^2$ is known, the $100(1-\alpha)\%$ confidence interval for $\mu$ is given by:
\begin{equation}
\label{zint}
\left(\overline{x}-z_{\alpha/2}\; \dfrac{\sigma}{\sqrt{n}},
\;\;
\overline{x}+z_{\alpha/2} \; \dfrac{\sigma}{\sqrt{n}} \right),
\end{equation}
where $z_{\alpha/2}$ is the $(1-\frac{\alpha}{2})^{\text{th}}$ quantile of the standard normal distribution.
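
As a small illustration of the formula above, the sketch below computes a $95\%$ Z interval in R for a made-up sample. The data vector `x`, the value `sigma = 2`, and all variable names are assumptions chosen for illustration only; they are not the values used in this assignment.

```r
# Hypothetical sample (illustration only, not assignment data)
x <- c(9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.6)
sigma <- 2      # assumed known population standard deviation
alpha <- 0.05   # gives a 95% confidence level
n <- length(x)

# z_{alpha/2}: the (1 - alpha/2) quantile of the standard normal distribution
z_crit <- qnorm(1 - alpha / 2)

# half-width of the interval: z_{alpha/2} * sigma / sqrt(n)
half_width <- z_crit * sigma / sqrt(n)

z_int <- c(lower = mean(x) - half_width, upper = mean(x) + half_width)
z_int
```

Because $\sigma$ is treated as known, the length of this interval depends only on $n$, $\sigma$ and $\alpha$, not on the observed data.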
\bigskip
In the case where $\sigma^2$ is unknown, the $100(1-\alpha)\%$ confidence interval for $\mu$ is obtained using:
\begin{equation}
\label{tint}
\left(\overline{x}-t_{n-1, \alpha/2} \; \dfrac{s}{\sqrt{n}},
\;\;
\overline{x}+t_{n-1 ,\alpha/2} \; \dfrac{s}{\sqrt{n}} \right),
\end{equation}
where $s = \sqrt{\dfrac{\sum_{i=1}^n (x_i - \overline{x})^2}{n-1}}$ is the sample standard deviation and $t_{n-1 ,\alpha/2}$ is the $(1-\frac{\alpha}{2})^{\text{th}}$ quantile of the Student t-distribution with $n-1$ degrees of freedom.
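
A similar sketch for the t interval is shown below, again on a made-up sample. The data vector `x` and the variable names are illustrative assumptions; the final line cross-checks the hand calculation against R's built-in `t.test()`, which is one possible way to verify the result.

```r
# Hypothetical sample (illustration only, not assignment data)
x <- c(9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.6)
alpha <- 0.05   # gives a 95% confidence level
n <- length(x)

# sample standard deviation; sd() divides by n - 1
s <- sd(x)

# t_{n-1, alpha/2}: the (1 - alpha/2) quantile of the t distribution with n - 1 df
t_crit <- qt(1 - alpha / 2, df = n - 1)

# half-width of the interval: t_{n-1, alpha/2} * s / sqrt(n)
half_width <- t_crit * s / sqrt(n)

t_int <- c(lower = mean(x) - half_width, upper = mean(x) + half_width)
t_int

# cross-check against the built-in one-sample t procedure
t.test(x, conf.level = 1 - alpha)$conf.int
```

Unlike the Z interval, the length of this interval changes from sample to sample because it depends on $s$.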
\bigskip
In this question, you are going to conduct a simulation study to compare the lengths and coverage probabilities of the Z and t intervals given in equations \eqref{zint} and \eqref{tint}, respectively.

a) Generate $1000$ random samples of size $n = 20$ from the $N(\mu=75, \sigma^2 = 25)$ distribution, using `set.seed(1875)` at the beginning of your R code.
For each sample, calculate the $95\%$ Z interval using equation \eqref{zint} and the $95\%$ t interval using equation \eqref{tint}. Save the lower and upper bounds of each interval in a matrix or data frame. Report the intervals you obtained for the first five samples.

```{r}
set.seed(1875)
# define the number of samples to generate
# set the size of the samples
# set alpha, mu and sigma
# create an empty data frame or matrix for zInt and another for tInt to store the interval bounds
# sample M times:
for(){
  # generate the sample
  # calculate and store the Z interval
  # calculate and store the t interval
}
# print the first five calculated intervals from Z and t
```

b) The length of a confidence interval is obtained by subtracting the lower bound from the upper bound. Calculate the lengths of the $95\%$ Z and t intervals you obtained in part (a).
Report the proportion of samples (out of $1000$) for which the t interval was shorter than the Z interval. Briefly comment on the result.
```{r}
# calculate the length of the Z interval
# calculate the length of the t interval
# find the proportion of samples where t < Z
# Comment on the result
```

c) The coverage probability is obtained by counting the number of intervals containing the true population parameter and dividing this number by the total number of intervals. Calculate and report the coverage probabilities of the $95\%$ Z and t intervals you obtained in part (a). Which one has a coverage probability closer to $0.95$?

```{r}
# calculate the Z coverage probability
# calculate the t coverage probability
# Which one has a coverage probability closer to 0.95?
```

## Question 2

Let $X_1, \ldots, X_n$ be a random sample from Normal$(\mu, \sigma^2)$ where $\sigma^2$ is known. We want to test
$$
H_0 : \mu \geq \mu_0 \qquad \text{versus} \qquad H_1 : \mu < \mu_0.
$$
The decision rule is to reject $H_0$ if $\bar{X} < c$. Suppose $\sigma^2 = 15$, $\mu_0 = 32$, $n = 20$ and $\alpha = 0.01$. Use R to answer the following questions:

a) What is the critical region of the test?

```{r}
# set n, mu0, sigmasq, alpha
# calculate the critical c value using qnorm
# what is the critical region?
```

b) When $\mu = 29$, what is the power of the test?

```{r}
# set mu1
# calculate beta using pnorm
# calculate power as 1 - beta
# what is the power of the test?
```

c) Find $n$ and $c$ such that $\alpha = 0.01$ and $\beta = 0.05$. Display your solution graphically.

```{r}
# set alpha and beta
# Write two functions, 'Acrit_func' and 'Bcrit_func'
# test critical n values from 1-40 and graphically display the solution
# Use the uniroot function to find the intersection of the two curves from the above plot
# Answer
```

## Question 3

A study collected data on $n = 400$ college students and classified them according to both frequency of marijuana use and parental use of alcohol and psychoactive drugs.

\begin{center}
\begin{tabular}{|l|ccc c|}
\hline
& \multicolumn{3}{c}{Student frequency of marijuana use} & \\
Parental drug and alcohol use & Never & Occasional & Regular & \\
\hline
Neither & 120 & 50 & 40 & \\
One & 70 & 45 & 30 & \\
Both & 15 & 10 & 20 & \\
\hline
\end{tabular}
\end{center}

\bigskip

Is there any relationship between students' frequency of marijuana use and their parents' use of alcohol and psychoactive drugs? Use R to test this hypothesis at $\alpha = 0.01$. Briefly state your conclusion.

```{r}
# define a matrix with the observations
# conduct a test
# what is your conclusion?
```
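
For reference, the mechanics of entering a contingency table in R and testing for association can be sketched on a made-up $2 \times 2$ table. This assumes a Pearson chi-square test of independence is the intended method; the counts, the labels, and the object names `counts` and `res` are hypothetical and unrelated to the study data above.

```r
# Hypothetical 2 x 2 table of counts (illustration only, not the study data)
counts <- matrix(c(30, 20,
                   10, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Group = c("A", "B"),
                                 Outcome = c("Yes", "No")))

# Pearson chi-square test of independence, without continuity correction
res <- chisq.test(counts, correct = FALSE)

res$expected    # expected counts under independence
res$statistic   # chi-square test statistic
res$parameter   # degrees of freedom: (rows - 1) * (columns - 1)
res$p.value     # compare with the chosen significance level
```

The null hypothesis of independence would be rejected when the p-value falls below the chosen $\alpha$.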