R Exercise: Inference for Population Proportion
Economics 3818
Due date: Wednesday, July 1, 2020 11:59 p.m.
For this exercise, work in an R script so you can save all of your code. In answering the questions below, please paste your R code and results into a Word file, save it as a PDF, and upload it to Canvas.
In this exercise, we will build up the analytical tools for conducting hypothesis tests and constructing confidence intervals for a single population proportion.
Hypothesis Testing
Let’s first think about our hypothesis testing framework.
$H_0: p = p_0$
$H_a: p \neq p_0$, $p < p_0$, or $p > p_0$
Test size $= \alpha$
Our test statistic:
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
Let’s begin with some specific numbers. As you build your code, you will put in these numbers and then replace them with different numbers later. We will start with a two-sided test.
$H_0: p = p_0 = 0.5$
$H_a: p \neq p_0$
$\alpha = 0.05$
$\hat{p} = 0.41$
$n = 80$
1. In your R script file, define objects that are assigned these numbers. For example, define p0<-0.5, then on the next line phat<-0.41, and on the line after that n<-80. We name these values so that when they change, the formulas we have created do not need to change. After the three lines that turn those values into objects, type each object name on a separate line. Then run those lines of your R script by highlighting them and hitting the Run button in the top right of the editing box. R should create your objects, and when it reaches the line p0 it will print that value; likewise phat and n will each print their values.
2. Now let’s build the z statistic. Begin by creating the standard error that is in the denominator of the z statistic. Do this a few lines below what you wrote for part 1: se<-sqrt((p0*(1-p0))/n). We write the formula in terms of the objects so that we can change the object values and reuse the procedure over and over. On the next line, define z<- followed by the formula for the z statistic: the difference between phat and p0, divided by se. Type each object name on a separate line below so you can see the output. Run that part of the code to check that it works.
Before you begin part 3, open the standard normal applet to visualize the tail probabilities associated with the p-values you will be finding. You should already have a numeric z value from part 2. Go to the applet link below and move the slider to that z value. It will give you a graphic depiction of the p-value you are after in part 3.
http://digitalfirst.bfwpub.com/stats_applet/stats_applet_7_norm.html
3. Next we need to calculate the p-value for our test statistic. Remember what the p-value is: the probability of seeing something as extreme as, or more extreme than, what we observed, $\hat{p} = 0.41$, when the true population parameter is 0.5. Picture the Z distribution.
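As a minimal sketch of parts 1 and 2 above, the objects and the z statistic might be set up as follows. The object names p0, phat, n, se, and z are the ones the exercise suggests; the spacing and comments are only one way to lay the script out.

```r
# Part 1: store the example values as named objects
p0 <- 0.5     # proportion under the null hypothesis
phat <- 0.41  # observed sample proportion
n <- 80       # sample size

# Typing an object name on its own line prints its value
p0
phat
n

# Part 2: standard error under H0, then the z statistic
se <- sqrt((p0 * (1 - p0)) / n)
z <- (phat - p0) / se

se
z  # negative here, because phat is below p0
```

Changing any of the first three assignments and rerunning the script updates se and z without editing the formulas.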
To calculate the tail probabilities, we need to think about whether z is positive or negative. In the example we are starting with, z will be negative since $\hat{p} < p_0$. But if we had a different $\hat{p}$, say 0.57, z would be positive and we would need the upper tail probability. For a two-sided test with z negative, our p-value will be $2 \cdot P(Z < z)$. To get R to do this, we use the pnorm function: 2*pnorm(z,0,1,lower.tail=TRUE). But if z were positive ($\hat{p} > p_0$), the p-value would be 2*pnorm(z,0,1,lower.tail=FALSE); we need the upper tail probability, which is why we set lower.tail=FALSE. We want to handle both cases in general. To this end we will use the ifelse function in R. ifelse has three parts you need to type in. The first is a test, in this case z<0. The second is what you want the p-value to be if z<0 is true, which is 2*pnorm(z,0,1,lower.tail=TRUE). The third is what you want the p-value to be if $z \geq 0$, which is 2*pnorm(z,0,1,lower.tail=FALSE).
pvalue<-ifelse(z<0, 2*pnorm(z,0,1,lower.tail=TRUE), 2*pnorm(z,0,1,lower.tail=FALSE))
On the next line write pvalue so that when you run the code it gives you the value. Interpret the p-value in terms of statistical evidence against $H_0: p = 0.5$.
4. Let’s change to a different problem, given below. Suppose we take a sample of 100 and observe 82 successes, so $\hat{p} = 0.82$. Using the code you created, conduct a hypothesis test with test size $\alpha = 0.05$. Make sure you provide all the values and your conclusion.
$H_0: p = 0.70$
$H_a: p \neq 0.70$
$\alpha = 0.05$
$\hat{p} = 0.82$
$n = 100$
Confidence Intervals
Now we turn to confidence intervals. Recall the form of the confidence interval:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
5. Create the code, similar to what you did for the hypothesis test, to find the upper and lower bounds of the confidence interval. Here you will need to make objects defining phat, $z^*$, and n, and then define the formula for the lower and upper bounds using these objects.
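The interval code described in part 5 could be laid out along these lines. This is only a sketch: it uses the problem 4 sample, assumes the usual 95% critical value of 1.96 for zstar, and the names zstar, se_hat, lower, and upper are illustrative choices rather than anything the exercise prescribes.

```r
# Inputs, taken from problem 4 (82 successes out of 100)
phat <- 0.82
n <- 100
zstar <- 1.96  # assumed critical value for a 95% interval

# Standard error based on the sample proportion (not p0)
se_hat <- sqrt((phat * (1 - phat)) / n)

# Lower and upper bounds: phat -/+ zstar * se_hat
lower <- phat - zstar * se_hat
upper <- phat + zstar * se_hat

lower
upper
```

Unlike the test statistic, the standard error here is built from phat, because a confidence interval involves no hypothesized p0.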
Begin with the same values as in problem 4 to create a 95% confidence interval: $z^* = 1.96$, $\hat{p} = 0.82$, $n = 100$ (82 successes and 18 failures). What are the lower and upper ends of your CI?
6. For our last problem, let’s construct the plus-4 confidence interval, building on the confidence interval we constructed in problem 5. Remember how the plus-4 confidence interval is constructed: we add 4 imaginary observations, assuming 2 successes and 2 failures, so that $\tilde{p} = (\text{number of successes in sample} + 2)/(n + 4)$. The plus-4 CI looks the same but with $\tilde{p}$ in place of $\hat{p}$ and $n + 4$ in place of $n$:
$$\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$
What are the lower and upper bounds of the CI, and how do they compare to the CI in problem 5?
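The plus-4 adjustment in part 6 can be sketched the same way. The names successes, n_plus4, ptilde, se_tilde, lower4, and upper4 are hypothetical, and zstar = 1.96 is carried over from problem 5.

```r
# Sample from problems 4 and 5
successes <- 82
n <- 100
zstar <- 1.96  # 95% critical value, as in problem 5

# Add 4 imaginary observations: 2 successes and 2 failures
n_plus4 <- n + 4
ptilde <- (successes + 2) / n_plus4

# Same interval formula, with ptilde and n + 4 in place of phat and n
se_tilde <- sqrt((ptilde * (1 - ptilde)) / n_plus4)
lower4 <- ptilde - zstar * se_tilde
upper4 <- ptilde + zstar * se_tilde

ptilde
lower4
upper4
```

Because the two imaginary successes and two imaginary failures pull ptilde toward 0.5, the plus-4 interval for this sample is centered slightly below phat = 0.82.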