Hypothesis Testing

• basic ingredients of a hypothesis test are

1. the null hypothesis, denoted Ho

2. the alternative hypothesis, denoted Ha

3. the test statistic

4. the data

5. the conclusion

• the hypotheses are usually statements about the values of one or more
unknown parameters, denoted θ here

• the null hypothesis is usually a more restrictive statement than the
alternative hypothesis, e.g. Ho : θ = θo, Ha : θ ≠ θo

• the burden of proof is on the alternative hypothesis

• we will continue to believe in the null hypothesis unless there is very strong
evidence in the data to refute it

• the test statistic measures agreement of the data with the null hypothesis

– a reasonable combination of the data and the hypothesized value of
the parameter

– gets bigger when the data agrees less with the null hypothesis

• when θ̂ is an estimator for θ with standard error s_θ̂, a common test
statistic has the form

$$ z = \frac{\hat{\theta} - \theta_o}{s_{\hat{\theta}}} $$

• when the data agrees perfectly with the null hypothesis, z = 0

• when the estimated and hypothesized values for θ become farther apart,
z increases in magnitude
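
The behaviour of z described above can be checked numerically. The following is a minimal sketch, assuming the estimator is a sample mean with standard error s/√n; the data values and the function name z_statistic are made up for illustration and do not come from these notes.

```python
import math
import statistics


def z_statistic(estimate, hypothesized, std_error):
    """Return z = (estimate - hypothesized) / std_error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error


# Hypothetical sample (illustrative values only).
sample = [5.1, 4.9, 5.6, 5.3, 4.7, 5.4, 5.2, 5.0]

# Assumption: theta-hat is the sample mean, with standard error s / sqrt(n).
theta_hat = statistics.mean(sample)
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothesized value equal to the estimate: perfect agreement, so z is 0.
z_exact = z_statistic(theta_hat, theta_hat, std_error)

# Hypothesized values progressively farther from the estimate.
z_near = z_statistic(theta_hat, 5.0, std_error)
z_far = z_statistic(theta_hat, 4.5, std_error)

print(z_exact, z_near, z_far)
```

As the hypothesized value moves away from the estimate, the printed z values grow in magnitude, while the standard error in the denominator stays the same.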

• there are two closely related approaches to testing

1. one weighs the evidence against Ho

2. the other ends in a decision to reject, or not to reject Ho.

• the first uses the significance probability or P-value

– the probability of obtaining a value of the test statistic as or more
extreme than the value actually observed, assuming that Ho is true

– this requires knowledge of the distribution of the test statistic under
the assumption that Ho is true, the null distribution


• for the two-sided alternative and test statistic mentioned above, the
P-value is

$$ P = 2\Pr(z \ge |z_{\text{observed}}|) = \Pr(|z| \ge |z_{\text{observed}}|) $$

• the factor 2 is required because a priori the sign of z_observed is not
known, and large (in magnitude) negative and positive values of z give
evidence against Ho
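
The two-sided P-value can be computed directly once a null distribution is fixed. This is a minimal sketch, assuming the null distribution of z is standard normal (an approximation that has to be justified for the problem at hand); the function name two_sided_p_value is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z_observed):
    """Twice the upper-tail probability beyond |z_observed| for Z ~ N(0, 1)."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z_observed))
    return 2.0 * upper_tail


# The sign of z_observed does not matter for the two-sided P-value.
for z_obs in (0.0, 1.0, -1.0, 1.96, 3.0):
    print(z_obs, round(two_sided_p_value(z_obs), 4))
```

Because only |z_observed| enters the calculation, positive and negative values of the same magnitude give the same P-value, which is the point of the factor 2.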

• occasionally we use a one-sided alternative, Ha : θ > θo or Ha : θ < θo

• in these cases P = Pr(z ≥ z_observed) and P = Pr(z ≤ z_observed)
respectively

• the strength of the evidence against Ho is determined by the size of the
P-value

– a smaller value for P gives stronger evidence

• the logic is that if Ho is true, extreme values for the test statistic are
unlikely, and therefore a possible indication that Ho is not true

• by convention we draw the following conclusions

P value        Strength of evidence against Ho
> .10          none
(.05, .10]     weak
(.01, .05]     strong
< .01          very strong

• when P < .01, for example, we could say that ‘the results are statistically
significant at the .01 level’

• the second approach to hypothesis testing requires a decision be made
whether or not to reject Ho

• one way to do this is to compare the P value to a small cut-off called the
significance level α and to reject Ho if P ≤ α

• another approach is to choose a rejection region and to reject Ho if the
test statistic falls in this region

• two types of error are possible with this approach

1. a type I error occurs if Ho is rejected when it is true

2. a type II error occurs if Ho is not rejected when it is false

• the type I error is considered to be much more important than the type II
error

• a common analogy is with a court of law

– in murder cases the presumption of innocence (Ho) is rejected only
when the jury is convinced “beyond a shadow of a doubt” by very strong
evidence (an extreme value for the test statistic)

– the type I error would be to convict and hang the accused (reject Ho)
when he is innocent (Ho is true)

– the type II error, considered less serious, would be to let a guilty man
go free (don’t reject Ho when it is false)

• recognizing the seriousness of the type I error, the rejection region is
chosen so that the probability of rejecting Ho when it is true is a small
value α

• for example, the test statistic z discussed above frequently has an
approximate normal distribution. For the two-sided alternative, with
α = .05, the rejection region consists of the values |z| ≥ z_{α/2} = 1.96.

• when the data is assumed to be normally distributed and the variance is
unknown and estimated by a sample variance, we use the t distribution

• finally, the data is collected and the test statistic is computed

• if the test statistic falls in the rejection region we reject Ho at level α.
• otherwise we do not reject Ho at level α

• remember that

– a rejected Ho may in fact be true

– an Ho which is not rejected is probably not true either (This is why I
never say ‘Ho is accepted’).

– a result which is statistically significant (i.e. we have rejected Ho) may
have no practical significance. With a very large sample size almost any
Ho will be rejected.
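
The two approaches described in these notes, weighing the evidence with a P-value and deciding whether to reject Ho at level α, can be put side by side in a short sketch. It assumes a standard normal null distribution for z; all function names are hypothetical. The table of conventions leaves P = .01 exactly unassigned, and the code below places it in "very strong" as an arbitrary choice.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null distribution of z


def p_value(z_observed, alternative="two-sided"):
    """P-value for the z statistic under a standard normal null distribution."""
    if alternative == "two-sided":   # Ha: theta != theta_o
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z_observed)))
    if alternative == "greater":     # Ha: theta > theta_o
        return 1.0 - STD_NORMAL.cdf(z_observed)
    if alternative == "less":        # Ha: theta < theta_o
        return STD_NORMAL.cdf(z_observed)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


def strength_of_evidence(p):
    """Map a P-value to the conventional description of evidence against Ho."""
    if p > 0.10:
        return "none"
    if p > 0.05:
        return "weak"
    if p > 0.01:
        return "strong"
    return "very strong"  # assumption: P = .01 exactly is placed here


def reject_null(p, alpha=0.05):
    """Decision rule: reject Ho when P <= alpha."""
    return p <= alpha


def two_sided_critical_value(alpha=0.05):
    """z_{alpha/2}: reject Ho when |z| is at least this value."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


z_obs = 2.5  # hypothetical observed test statistic
p = p_value(z_obs)
print("P-value:", round(p, 4), "evidence:", strength_of_evidence(p))
print("reject at .05:", reject_null(p, 0.05))
print("reject at .01:", reject_null(p, 0.01))
print("critical value at .05:", round(two_sided_critical_value(0.05), 2))
```

For the two-sided test, comparing P with α and comparing |z| with the critical value lead to the same decision, so either form of the rule can be used.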