Statistical Inference STAT 431
Lecture 6: Inferences for Single Samples (I)
• Setup: $X_1, \ldots, X_n$ i.i.d. $\sim F$ with $E(X_i) = \mu$, $\mathrm{Var}(X_i) = \sigma^2$
• An intuitive estimator for $\mu$ is the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Sample Mean
• Using the mean and the variance of $F$, we have $E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \sigma^2/n$
  – So $\bar{X}$ is an unbiased estimator for $\mu$, with the variance shown above.
• A further question: What is the sampling distribution of $\bar{X}$?
Sample Mean: Exact Sampling Distributions
• From probability theory, we know the exact sampling distribution of the sample mean in the following two cases
1. Bernoulli population: $X_1, \ldots, X_n$ i.i.d. Bernoulli($p$)
   $$P\left(\bar{X} = \frac{x}{n}\right) = P\left(\sum_{i=1}^{n} X_i = x\right) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n.$$
2. Normal population: $X_1, \ldots, X_n$ i.i.d. $N(\mu, \sigma^2)$
   $$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
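The Bernoulli case is easy to check numerically; a minimal sketch using scipy's `binom`, with illustrative values of $n$ and $p$ (not taken from the slides):

```python
from scipy.stats import binom

# Bernoulli population: the sum of n i.i.d. Bernoulli(p) draws is Bin(n, p),
# so X-bar takes the values x/n with binomial probabilities.
n, p = 5, 0.3  # illustrative values

for x in range(n + 1):
    # P(X-bar = x/n) = C(n, x) p^x (1-p)^(n-x)
    print(f"P(X-bar = {x}/{n}) = {binom.pmf(x, n, p):.4f}")

total = sum(binom.pmf(x, n, p) for x in range(n + 1))  # should be 1
```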
Sample Mean: Approximate Sampling Distributions
• Central Limit Theorem: Let $X_1, \ldots, X_n$ be a random sample drawn from an arbitrary distribution with mean $\mu$ and variance $\sigma^2$. As $n \to \infty$,
  $$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \overset{d}{\to} N(0, 1)$$
• Intuitively: For large $n$,
  $$\bar{X} \overset{d}{\approx} N\left(\mu, \frac{\sigma^2}{n}\right)$$
• A practical question: How large an n do we need to apply the approximation?
– The shape of the distribution: skewness
– The desired accuracy
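The CLT approximation can be illustrated by simulation; a minimal sketch in Python, with an Exponential(1) population chosen purely for illustration:

```python
import numpy as np

# Simulate the sampling distribution of X-bar for a skewed population
# (Exponential(1): mean 1, variance 1) and standardize as in the CLT.
rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0        # mean and SD of the Exponential(1) population
n, reps = 50, 100_000       # sample size and number of replications

xbars = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = (xbars - mu) / (sigma / np.sqrt(n))   # standardized sample means

# For large n, these should be close to 0 and 1 (standard normal)
print(round(z.mean(), 2), round(z.std(), 2))
```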
Skewness and Normal Approximation
[Figure: population density, and histograms (Density vs. $\bar{x}$) for sample sizes 2, 5, and 25]
Skewness and Normal Approximation
[Figure: population density, and normal Q-Q plots (sample quantiles vs. theoretical quantiles) of $\bar{x}$ for sample sizes 2, 5, and 25]
Skewness and Normal Approximation (Cont’d)
[Figure: a second population density, and histograms (Density vs. $\bar{x}$) for sample sizes 2, 5, and 25]
Skewness and Normal Approximation (Cont’d)
[Figure: the second population density, and normal Q-Q plots of $\bar{x}$ for sample sizes 2, 5, and 25]
A Special Case: Bernoulli Population
• If $X_1, \ldots, X_n$ i.i.d. Bernoulli($p$), then $n\bar{X} \sim \mathrm{Bin}(n, p)$
• Rule of thumb: if both $np \geq 10$ and $n(1-p) \geq 10$, then
  $$\frac{\bar{X} - p}{\sqrt{p(1-p)/n}} = \frac{n\bar{X} - np}{\sqrt{np(1-p)}} \overset{d}{\approx} N(0, 1)$$
• Continuity correction. Example: $n = 20$, $p = 0.5$
  – Exact probability: $P(n\bar{X} \leq 8) = \sum_{i=0}^{8} \binom{20}{i} (0.5)^i (0.5)^{20-i} = 0.2517$
  – Without continuity correction:
    $$P(n\bar{X} \leq 8) \approx P\left(\frac{n\bar{X} - np}{\sqrt{np(1-p)}} \leq \frac{8 - 10}{\sqrt{20(0.5)(0.5)}}\right) = \Phi(-0.8944) = 0.1867$$
  – With continuity correction:
    $$P(n\bar{X} \leq 8) \approx P\left(\frac{n\bar{X} - np}{\sqrt{np(1-p)}} \leq \frac{8 - 10 + 0.5}{\sqrt{20(0.5)(0.5)}}\right) = \Phi(-0.6708) = 0.2514$$
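The three probabilities can be reproduced with scipy. Note that scipy evaluates $\Phi$ to full precision, so the two approximate values may differ from the slide's table-lookup figures in the third decimal:

```python
from math import sqrt
from scipy.stats import binom, norm

# Continuity-correction example from the slides: n = 20, p = 0.5
n, p = 20, 0.5
se = sqrt(n * p * (1 - p))                   # SD of n*X-bar ~ Bin(n, p)

exact = binom.cdf(8, n, p)                   # exact P(nX-bar <= 8)
no_cc = norm.cdf((8 - n * p) / se)           # without continuity correction
with_cc = norm.cdf((8 - n * p + 0.5) / se)   # with continuity correction

print(round(exact, 4), round(no_cc, 4), round(with_cc, 4))
```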
Large Sample Inferences for Mean
• $X_1, \ldots, X_n$ i.i.d. $\sim F$
• $F$ is not necessarily normal!
• Parameter of interest: the mean value $\mu$ of $F$
• We shall deal with the large sample case:
  – The sample size $n$ is large. Typically, $n \geq 30$.
  – The population SD $\sigma$ could be assumed known.
    If not, then the sample SD $S$ gives accurate estimation. [can think of it as $\sigma = S$]
• Point estimation of $\mu$: we always use $\bar{X}$.
• Interest: confidence interval and hypothesis testing
• Under our setup, the following pivotal r.v. for $\mu$ plays the key role:
  $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \overset{d}{\approx} N(0, 1)$$
Example: Chips Ahoy!
• In the mid-1990s, Nabisco advertised Chips Ahoy! as being “1000 chips delicious”
• In the late 1990s, a group of students at the US Air Force Academy conducted a study on the number of chips contained in each 18 oz bag by sampling 42 bags from 275 sent by the company
Example: Chips Ahoy! (Cont’d)
• Data: counts of chips per bag in 42 sampled bags
[Figure: histogram of chipcount, roughly 1000 to 1600 chips per bag]
• Sample mean & sample SD: $\bar{x} = 1261.57$, $s = 117.58$
• Interested in constructing a 95% confidence interval for the average number of chips $\mu$ per bag
• By pivotality of $Z$,
  $$P\left(-1.96 \leq Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \leq 1.96\right) \approx 0.95$$
• So, a 95% confidence interval for $\mu$ is
  $$\left[\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right]$$
• For this data, $n = 42$, $\bar{x} = 1261.57$, and $\sigma \approx s = 117.58$. So, we obtain the CI $[1226.01,\ 1297.13]$
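The interval can be reproduced numerically; a quick sketch using the summary statistics from the slides:

```python
from math import sqrt
from scipy.stats import norm

# 95% large-sample CI for mu from the Chips Ahoy! summary statistics
n, xbar, s = 42, 1261.57, 117.58   # values from the slides

z = norm.ppf(0.975)                # ~1.96
half_width = z * s / sqrt(n)
lo, hi = xbar - half_width, xbar + half_width
print(round(lo, 2), round(hi, 2))
```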
Confidence Interval with Pre-specified Width
• We want the confidence interval [for a given confidence level] to be narrow
• Sometimes, we might want it to be no wider than a pre-specified width
• Fix the confidence level at 95%; the half-width of the CI is $1.96\,\sigma/\sqrt{n}$
  – 1.96 is determined by the confidence level
  – $\sigma$ is a population parameter ⇒ its value cannot be changed
  – We can only increase the sample size to reach the pre-specified width
• Chips Ahoy! example: If we want the 95% CI to be no wider than 25, how large does $n$ need to be?
  – Note that $\sigma \approx s = 117.58$ ⇒ width $= 2 \times 1.96\,\sigma/\sqrt{n} = 460.91/\sqrt{n}$
  – Solve $460.91/\sqrt{n} \leq 25$ ⇒ $n \geq 339.9$
  – Therefore, we need a sample of size at least 340 to produce a 95% confidence interval no wider than 25
  – Note: the sample size has to be an integer!
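The arithmetic above can be checked directly:

```python
from math import ceil

# Sample size so the 95% CI is no wider than 25 (Chips Ahoy! example)
s, z, width = 117.58, 1.96, 25.0   # sigma approximated by s, as on the slide

# width of CI = 2 * z * s / sqrt(n) <= 25  =>  n >= (2 * z * s / 25)^2
n_min = (2 * z * s / width) ** 2
print(round(n_min, 1), ceil(n_min))
```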
General Two-Sided Confidence Intervals for the Mean
• We can construct a CI for the mean $\mu$ at any confidence level $100(1-\alpha)\%$:
  $$\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• The width of this CI is $2 \times z_{\alpha/2}\,\sigma/\sqrt{n}$
• If we want the CI to be no wider than a pre-specified width $2E$, solving $E \geq z_{\alpha/2}\,\sigma/\sqrt{n}$
  ⇒ we need a sample size $n \geq \left\lceil \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \right\rceil$
• The half-width $E$ is also called the margin of error of the CI
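The general sample-size formula translates into a short helper; a sketch (the function name `sample_size` is illustrative):

```python
from math import ceil
from scipy.stats import norm

def sample_size(sigma, E, conf=0.95):
    """Smallest n so a two-sided CI at level `conf` has half-width <= E."""
    z = norm.ppf(1 - (1 - conf) / 2)      # z_{alpha/2}
    return ceil((z * sigma / E) ** 2)

# Chips Ahoy! numbers: half-width E = 25/2 = 12.5 recovers the slide's n = 340
print(sample_size(117.58, 12.5))
```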
Example: Chips Ahoy! (Cont’d)
• Nabisco advertised Chips Ahoy! as being “1000 chips delicious”
• To guarantee that, the average number of chips per bag needs to be at least 1200
• Suppose you work for Nabisco, and would like to prove the above claim. How should you set up the null and the alternative hypotheses?
  1. $H_0: \mu \leq 1200$ vs. $H_1: \mu > 1200$
  2. $H_0: \mu \geq 1200$ vs. $H_1: \mu < 1200$
  3. $H_0: \mu = 1200$ vs. $H_1: \mu \neq 1200$
• What if you work for a competing company of Nabisco?
Example: Chips Ahoy! (Cont’d)
• Suppose we have decided to test
  $$H_0: \mu \leq 1200 \quad \text{vs.} \quad H_1: \mu > 1200$$
• To perform the test, we use the test statistic $Z = \frac{\bar{X} - 1200}{\sigma/\sqrt{n}}$
  – With the current sample, the observed value is $z = \frac{1261.57 - 1200}{117.58/\sqrt{42}} = 3.39$
• The next step is to compute the P-value. Here,
  $$\text{P-value} = \max_{\mu \leq 1200} P_\mu(Z \geq z) = \max_{\mu \leq 1200} P_\mu\left(\frac{\bar{X} - 1200}{\sigma/\sqrt{n}} \geq z\right)$$
  $$= \max_{\mu \leq 1200} P_\mu\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \geq z + \frac{1200 - \mu}{\sigma/\sqrt{n}}\right)$$
  $$= \max_{\mu \leq 1200} \left[1 - \Phi\left(z + \frac{1200 - \mu}{\sigma/\sqrt{n}}\right)\right] = 1 - \Phi(3.39) = 0.0003$$
• So, $H_0$ is rejected at significance level $\alpha = 0.05$
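The observed statistic and P-value can be reproduced with scipy:

```python
from math import sqrt
from scipy.stats import norm

# One-sided test H0: mu <= 1200 vs H1: mu > 1200 with the Chips Ahoy! data
n, xbar, s, mu0 = 42, 1261.57, 117.58, 1200.0

z = (xbar - mu0) / (s / sqrt(n))
p_value = 1 - norm.cdf(z)           # P-value = 1 - Phi(z)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```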
General Lower One-Sided Test
• Consider the general form of the problem:
  $$H_0: \mu \leq \mu_0 \quad \text{vs.} \quad H_1: \mu > \mu_0$$
• Test statistic $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ (observed value $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$)
• Computing the P-value:
  $$\text{P-value} = \max_{\mu \leq \mu_0} P_\mu(Z \geq z) = \max_{\mu \leq \mu_0} P_\mu\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \geq z\right)$$
  $$= \max_{\mu \leq \mu_0} P_\mu\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \geq z + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)$$
  $$= \max_{\mu \leq \mu_0} \left[1 - \Phi\left(z + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right)\right] = 1 - \Phi(z)$$
• Decision: for a test with significance level $\alpha$, reject $H_0$ when $\text{P-value} = 1 - \Phi(z) < \alpha$
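The whole procedure can be packaged as a small helper; a sketch (the function name `one_sided_test` is illustrative):

```python
from math import sqrt
from scipy.stats import norm

def one_sided_test(xbar, mu0, sigma, n, alpha=0.05):
    """Large-sample test of H0: mu <= mu0 vs H1: mu > mu0.

    Returns (z, p_value, reject), where P-value = 1 - Phi(z).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 1 - norm.cdf(z)
    return z, p_value, p_value < alpha

# Chips Ahoy! data reproduces the slide's conclusion: reject at alpha = 0.05
z, p, reject = one_sided_test(1261.57, 1200, 117.58, 42)
print(reject)
```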