Statistical Inference STAT 431
Lecture 5: Basic Concepts of Inference (II) Hypothesis Testing
An Overview of Statistical Inference Problems
• Making probabilistic statements about an unknown population parameter based on a random sample from the population
• Estimation (last lecture)
– Point Estimation: estimate the value of the population parameter
– Confidence Interval: estimate an interval in which the parameter lies
• Hypothesis Testing
– Make a decision (yes/no) on a hypothetical statement about the parameter
Example: Revenue Neutral Tax Bill
• Problem setup:
– A senator proposes a new tax bill to simplify the tax code
– The senator claims that the changes are revenue neutral:
On balance, tax revenues will not change.
– To evaluate the senator’s claim, a random sample of 100 tax returns is selected; the returns are recomputed using the proposed rule changes
– Results: average change in sample = -$219, sample SD = $725
• We want to test the data against the senator’s claim: the average change is zero!
• The parameter of interest here is μ = the mean change per tax return.
Adapted from Freedman, Pisani, Purves, and Adhikari, Statistics, 2nd Ed.
The Null and The Alternative Hypotheses
• We want to decide between two competing theories about a parameter
– In the tax example, the two theories are μ = 0 and μ ≠ 0
• One of the theories serves as a baseline: the null hypothesis H0
– Here, the null hypothesis is H0 : μ = 0
• The other theory is the alternative hypothesis H1
– Here, the alternative hypothesis is H1 : μ ≠ 0
• Question: Which theory should serve as the null hypothesis?
• Answer: The crucial idea behind hypothesis testing is to prove H1 by disproving H0! So, we should put the theory we doubt as the null.
– We will believe in H1, and hence reject the null hypothesis, when the data support H1 strongly!
Steps for Hypothesis Testing
Random sample: X1, …, Xn i.i.d. ~ F_θ
1. Formulate two hypotheses
– Need a null hypothesis H0 and an alternative hypothesis H1
– Both are statements on the population parameter θ
2. Calculate the value of a test statistic t = T(x1, …, xn)
– Test statistic “measures” the difference between the empirical distribution and the null hypothesis
3. Find the P-value
– How probable is your observed data if the null hypothesis is true?
4. Compare the P-value with a pre-specified significance level
⇒ Decide whether or not to reject the null hypothesis
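The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a two-sided test of a mean with a known population SD, and the function name `z_test_two_sided` and the small data set are made up for the example, not taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma, alpha=0.05):
    """Steps 2-4 for H0: mu = mu0 versus H1: mu != mu0 (sigma assumed known)."""
    n = len(sample)
    # Step 2: value of the test statistic on the observed data
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Step 3: two-sided P-value, computed under H0 where Z ~ N(0, 1)
    p_value = 2 * NormalDist().cdf(-abs(z))
    # Step 4: compare the P-value with the pre-specified significance level
    reject = p_value < alpha
    return z, p_value, reject

# Hypothetical observations, invented purely for illustration
sample = [1.2, -0.4, 0.8, 2.1, 0.3, 1.5, -0.2, 0.9]

# Step 1 is the choice of hypotheses: H0: mu = 0 versus H1: mu != 0
z, p_value, reject = z_test_two_sided(sample, mu0=0.0, sigma=1.0)
print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Note that the hypotheses and the significance level are fixed before looking at the data; only steps 2 to 4 depend on the sample.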
Test Statistic
• A test statistic T (X1, . . . , Xn) is a random variable
– With observed data, we obtain t = T(x1,…,xn)
– Its distribution is known under H0 [at least approximately]
• In the tax return example, suppose we know σ = s = 725
– Consider the statistic Z = X̄ / (σ/√n)
– Under the null hypothesis H0 : μ = 0, Z ~ N(0,1). So, Z is a valid test statistic!
– With observed data, we could compute the value z = x̄ / (σ/√n)
– Plug in x̄ = −219 ⇒ z = −3.02
– What does z = −3.02 imply? What does it have to say about H0?
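As a quick check of the arithmetic, the observed value of the statistic can be computed from the summary numbers in the tax example. The variable names are chosen for this sketch; as in the slide, the sample SD of 725 is treated as if it were the known population SD.

```python
from math import sqrt

n = 100          # number of sampled tax returns
x_bar = -219.0   # average change in the sample, in dollars
sigma = 725.0    # sample SD, treated as the known population SD
mu0 = 0.0        # value of the mean under H0

standard_error = sigma / sqrt(n)      # SD of the sample mean
z = (x_bar - mu0) / standard_error    # observed test statistic

print(f"standard error = {standard_error:.1f}, z = {z:.2f}")
# prints: standard error = 72.5, z = -3.02
```

The observed sample mean is about three standard errors below the value claimed by H0.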
P-Value
• Definition: the probability under H0 of obtaining a test statistic at least as
“extreme” as the observed value on the current dataset.
• Direction of extremeness: What is the right P-value for our tax return example?
– Candidate 1: P-value = P_{μ=0}(Z ≤ z) = Φ(−3.02) = 0.0013
– Candidate 2: P-value = P_{μ=0}(|Z| ≥ |z|) = 2 × Φ(−3.02) = 0.0026
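Both candidate P-values can be evaluated with the standard normal CDF Φ. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are assumptions of this example. Since the alternative in the tax example is two-sided (μ ≠ 0), "extreme" means far from zero in either direction, which corresponds to Candidate 2.

```python
from statistics import NormalDist

z = -3.02                      # observed test statistic from the tax example
phi = NormalDist().cdf         # standard normal CDF

p_lower = phi(z)               # Candidate 1: P(Z <= z) under H0
p_two_sided = 2 * phi(-abs(z)) # Candidate 2: P(|Z| >= |z|) under H0

print(f"one-sided (lower tail): {p_lower:.4f}")
print(f"two-sided:              {p_two_sided:.4f}")
```

Without intermediate rounding, the two-sided value comes out close to 0.0025; the 0.0026 on the slide is what one gets by doubling the already rounded 0.0013.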
Significance Level: From P-Value to Decision
• Significance level is usually pre-specified at some fixed value α
• Conventional significance levels: α = 0.01, 0.05, 0.1. α = 0.05 is the most common choice.
• We reject the null hypothesis at significance level α if P-value < α.
• In the tax return example, we reject H0 : μ = 0 at all conventional significance
levels
• Meaning of significance level: an identity
α = max_{θ ∈ H0} P_θ(Test rejects H0)
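The identity can be illustrated by simulation: if data are generated with H0 true, the rule "reject when P-value < α" should reject in roughly a fraction α of repeated samples. The sketch below is a rough check under assumptions that are not in the lecture: normally distributed changes, the tax-example values n = 100 and σ = 725, an arbitrary random seed, and 4000 repetitions. Because H0 here is the single point μ = 0, the maximum over H0 is just the rejection probability at μ = 0.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(431)                 # arbitrary seed, for reproducibility only
alpha, n, sigma = 0.05, 100, 725.0
trials = 4000                    # number of simulated samples (assumed)
phi = NormalDist().cdf

rejections = 0
for _ in range(trials):
    # Draw a sample with H0 true: mean change mu = 0
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    z = mean(sample) / (sigma / sqrt(n))
    p_value = 2 * phi(-abs(z))   # two-sided P-value
    if p_value < alpha:
        rejections += 1          # a Type I error, since H0 is true here

rate = rejections / trials
print(f"simulated rejection rate under H0: {rate:.3f} (alpha = {alpha})")
```

The simulated rate fluctuates around α from run to run; it is an estimate, not an exact value.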
Interpretation of Hypothesis Testing Results
• When the null hypothesis is rejected, it means that the data give conclusive
evidence favoring the alternative.
• When the null hypothesis is not rejected, it does not mean the data give
conclusive evidence favoring the null!
– H0 might actually be true
– H0 might be false, but the evidence in the current dataset is not yet strong enough to reject it
– Absence of evidence is not evidence of absence!
• If you hope to “discover” that something is true, put that something in H1, not H0
– H0 : drug has no effect; H1 : drug has an effect
– H0 : new website feature has no positive effect on sales; H1 : new feature improves sales
Errors in Testing and Power
• Two types of error in hypothesis testing
– Type I error: reject H0 when H0 is true [false positive]
– Type II error: do not reject H0 when H0 is false [false negative]
Decision            H0 True         H0 False
Do not reject H0    Correct         Type II Error
Reject H0           Type I Error    Correct
• Probabilities of two types of errors
– α = P(Type I error) = max_{θ ∈ H0} P_θ(Rejecting H0), i.e., significance level
– β = P(Type II error) = P_θ(Not rejecting H0)
– Power = 1 − β = P_θ(Rejecting H0)
– What properties should a good testing procedure have?
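For the two-sided z-test in the tax example, the Type II error probability and the power can be written in closed form, because Z ~ N(μ/(σ/√n), 1) when the true mean change is μ. The sketch below evaluates that formula; the function name `power_two_sided_z` and the grid of true means are assumptions made for illustration, and σ = 725 is again treated as known.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(mu, n, sigma, alpha=0.05):
    """P_mu(reject H0: mu = 0) for the two-sided z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # reject when |Z| >= z_crit
    shift = mu / (sigma / sqrt(n))              # mean of Z when the true mean is mu
    lower_tail = std_normal.cdf(-z_crit - shift)     # P(Z <= -z_crit)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # P(Z >= z_crit)
    return lower_tail + upper_tail

n, sigma = 100, 725.0
for mu in (0.0, -100.0, -200.0, -300.0):   # hypothetical true mean changes
    power = power_two_sided_z(mu, n, sigma)
    beta = 1 - power                        # Type II error probability when mu != 0
    print(f"mu = {mu:7.1f}: power = {power:.3f}, 1 - power = {beta:.3f}")
```

At μ = 0 the rejection probability equals α, which is the Type I error rate; as the true mean moves away from 0 the power increases and β shrinks. A good procedure keeps α at the chosen level while making the power as large as possible.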
Class Summary
• Key points of this class:
– Concepts in hypothesis testing
• Null and alternative hypotheses
• Test statistics
• P-value / significance level
• Type I and type II errors / power
– Steps for a hypothesis testing problem
– Philosophy behind hypothesis testing
• The asymmetric roles of the null and the alternative hypothesis
• Reading: Section 6.3 of the textbook
• Next class: Sampling Distributions (I) (Ch.5)