
Statistical Inference STAT 431
Lecture 8: Inferences for Single Samples (III)

Inferences with Normal Population Distribution
1. Inferences for $\mu$ with small sample, with known $\sigma$
2. Inferences for $\mu$ with small sample, with unknown $\sigma$
3. Inferences for $\sigma^2$
• We are mainly interested in confidence interval and hypothesis testing problems
• For all the above problems, we assume that the population distribution is normal
$X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$

Chi-Square Distribution
• Connection to normal distribution:
If $Z_1, \dots, Z_n \overset{\text{i.i.d.}}{\sim} N(0,1)$, then $\sum_{i=1}^{n} Z_i^2 \sim \chi^2_n$.
• Shape of the Chi-square distributions
– Positively skewed
– The level of skewness decreases as
the degrees of freedom (d.f.) increase
– The curves shift to the right
as the d.f. increase
– For large d.f., the density curve has
an approximate bell shape
[Figure: Chi-square densities for n = 5, 10, 15, 25; x-axis: x, y-axis: Density]
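The connection to the normal distribution can be checked by simulation. The sketch below is illustrative only: the seed, the number of replications, and the choice n = 5 are arbitrary assumptions. It uses the known facts that a chi-square distribution with n d.f. has mean n and variance 2n.

```r
# Simulate sums of n squared standard normals and compare with chi-square(n).
set.seed(431)                       # arbitrary seed, for reproducibility
n    <- 5                           # degrees of freedom (illustrative choice)
reps <- 100000                      # number of simulated sums (illustrative)
z    <- matrix(rnorm(n * reps), nrow = reps, ncol = n)
u    <- rowSums(z^2)                # each row gives one draw of sum(Z_i^2)

# Chi-square(n) has mean n and variance 2n.
sim_mean <- mean(u)
sim_var  <- var(u)

# Empirical versus theoretical 0.95 quantile.
sim_q95  <- unname(quantile(u, 0.95))
theo_q95 <- qchisq(0.95, df = n)

c(sim_mean = sim_mean, sim_var = sim_var, sim_q95 = sim_q95, theo_q95 = theo_q95)
```

A histogram of `u` (for example with `hist(u, freq = FALSE)`) should show the positive skew described above.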

Student’s t-Distribution
• Connection to normal and chi-square distributions:
If $Z \sim N(0,1)$ is independent of $U \sim \chi^2_n$, then $T = \dfrac{Z}{\sqrt{U/n}} \sim t_n$.
• Shape of the t distributions
– Symmetric around 0
– Have heavier tails than N(0, 1)
– Larger d.f. → lighter tails
– As d.f. tends to infinity, the density curve converges to that of N(0, 1)
[Figure: t-distribution densities for n = 1, 2, 5 and N(0,1); x-axis from −4 to 4, y-axis: Density]
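The tail behaviour can be seen numerically by comparing critical points and tail probabilities of the t distribution with those of N(0, 1). This is a small sketch; the particular d.f. values and the cut-off 2 are arbitrary choices for illustration.

```r
# Upper 2.5% critical points and two-sided tail probabilities for several d.f.
df     <- c(1, 2, 5, 28, 100, 1000)
t_crit <- qt(0.975, df = df)        # t_{df, 0.025}
z_crit <- qnorm(0.975)              # z_{0.025}

t_tail <- 2 * pt(-2, df = df)       # P(|T| > 2), using symmetry around 0
z_tail <- 2 * pnorm(-2)             # P(|Z| > 2)

data.frame(df, t_crit, t_tail)
c(z_crit = z_crit, z_tail = z_tail)
```

Both columns decrease toward the normal values as the d.f. grow, which is the convergence stated in the last bullet.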

Key Properties of Sample Mean and Sample Variance
Assumption: $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$
• Sample mean satisfies $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
• Sample variance $S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ satisfies $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
• They are mutually independent.
• The assumption of a normal population distribution is critical.
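A simulation can make these properties concrete. The sketch below assumes arbitrary illustrative values (n = 10, μ = 5, σ = 2, 20000 replications). A sample correlation near zero is consistent with independence but does not prove it.

```r
# Repeatedly draw normal samples and record the sample mean and variance.
set.seed(431)                        # arbitrary seed
n <- 10; mu <- 5; sigma <- 2         # illustrative parameter values
reps <- 20000

samples <- matrix(rnorm(n * reps, mean = mu, sd = sigma), nrow = reps)
xbar    <- rowMeans(samples)         # one sample mean per row
s2      <- apply(samples, 1, var)    # one sample variance per row
pivot   <- (n - 1) * s2 / sigma^2    # should behave like chi-square(n - 1)

c(mean_xbar = mean(xbar), var_xbar = var(xbar))  # compare with mu and sigma^2 / n
mean(pivot)                                      # compare with n - 1
cor(xbar, s2)                                    # close to 0 for normal data
```

Replacing `rnorm` with a skewed generator such as `rexp` typically produces a clearly non-zero correlation, which illustrates why the normality assumption matters.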

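Small Sample Inferences for Mean (Unknown $\sigma$)
$X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$
• The new pivotal r.v.:
$T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
• By its pivotality, we have
$P_\mu\left( -t_{n-1,\alpha/2} \le T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{n-1,\alpha/2} \right) = 1 - \alpha$
• So a two-sided $100(1-\alpha)\%$ confidence interval for $\mu$ is
$\left[ \bar{X} - t_{n-1,\alpha/2}\dfrac{S}{\sqrt{n}},\ \bar{X} + t_{n-1,\alpha/2}\dfrac{S}{\sqrt{n}} \right]$
(two-sided t-interval)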
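The interval formula translates directly into R. In this sketch, `t_interval` is a hypothetical helper name and the data are simulated purely for illustration. The result is compared with the interval reported by the built-in `t.test`.

```r
# Two-sided t-interval for the mean, computed from raw data.
t_interval <- function(x, conf_level = 0.95) {
  n     <- length(x)
  alpha <- 1 - conf_level
  half  <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)  # t_{n-1, alpha/2} * S / sqrt(n)
  c(lower = mean(x) - half, upper = mean(x) + half)
}

set.seed(431)                          # arbitrary seed
x  <- rnorm(12, mean = 10, sd = 3)     # simulated data, not from the lecture
ci <- t_interval(x)

builtin <- t.test(x, conf.level = 0.95)$conf.int
ci
builtin
```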

Example: Earth Density
• Dataset: 29 measurements of earth density made by Henry Cavendish in 1798 (Textbook Example 7.7)
• Sample statistics: $\bar{x} = 5.454$, $s = 0.230$
[Figure: Normal Q-Q plot of the measurements; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
• 95% two-sided confidence interval for earth density $\mu$: with $t_{28,0.025} = 2.048$, the interval is [5.367, 5.542]
• Compare z-interval $\bar{X} \pm z_{0.025}\,\sigma/\sqrt{n}$ with t-interval $\bar{X} \pm t_{n-1,0.025}\,S/\sqrt{n}$:
• Note that $t_{28,0.025} = 2.048 > z_{0.025} = 1.960$
→ We need a larger multiplier to accommodate the extra uncertainty in S
• The difference between $t_{n-1,\alpha/2}$ and $z_{\alpha/2}$ becomes smaller as the sample size increases
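The interval in this example can be recomputed from the summary statistics alone. This is a sketch that uses the rounded values n = 29, x̄ = 5.454 and s = 0.230 quoted on the slide, so the endpoints can differ from the slide's in the last digit. The second interval plugs s into the z formula only to compare the two multipliers.

```r
# 95% t-interval for the earth density data from summary statistics.
n    <- 29
xbar <- 5.454
s    <- 0.230

t_mult <- qt(0.975, df = n - 1)     # t_{28, 0.025}
z_mult <- qnorm(0.975)              # z_{0.025}

t_ci    <- xbar + c(-1, 1) * t_mult * s / sqrt(n)
z_style <- xbar + c(-1, 1) * z_mult * s / sqrt(n)  # same formula with the z multiplier

round(c(t_mult = t_mult, z_mult = z_mult), 3)
round(t_ci, 3)
diff(t_ci) - diff(z_style)          # extra width from using the t multiplier
```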

Normal distribution vs. t-distribution
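P(-2 …

Test statistic: $T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$, with observed value $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
• t-test (level $\alpha$):
– $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$. Rejection region: $t > t_{n-1,\alpha} \iff \bar{x} > \mu_0 + t_{n-1,\alpha}\dfrac{s}{\sqrt{n}}$. P-value: $P(T_{n-1} \ge t)$
– $H_0: \mu \ge \mu_0$ vs. $H_1: \mu < \mu_0$. Rejection region: $t < -t_{n-1,\alpha} \iff \bar{x} < \mu_0 - t_{n-1,\alpha}\dfrac{s}{\sqrt{n}}$. P-value: $P(T_{n-1} \le t)$
– $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$. Rejection region: $|t| > t_{n-1,\alpha/2} \iff |\bar{x} - \mu_0| > t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}}$. P-value: $2P(T_{n-1} \ge |t|)$
• Compare with the case of known SD, i.e. z-test (level $\alpha$), with $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$:
– $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$. Rejection region: $z > z_{\alpha} \iff \bar{x} > \mu_0 + z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$. P-value: $1 - \Phi(z)$
– $H_0: \mu \ge \mu_0$ vs. $H_1: \mu < \mu_0$. Rejection region: $z < -z_{\alpha} \iff \bar{x} < \mu_0 - z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$. P-value: $\Phi(z)$
– $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$. Rejection region: $|z| > z_{\alpha/2} \iff |\bar{x} - \mu_0| > z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$. P-value: $2[1 - \Phi(|z|)]$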
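The t-test rules above can be sketched as a small function that works from summary statistics. `t_test_summary` is a hypothetical helper name, and the data used to check it are simulated. The built-in `t.test` serves as the reference.

```r
# One-sample t-test from summary statistics, for the three alternatives.
t_test_summary <- function(xbar, s, n, mu0,
                           alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  t_obs <- (xbar - mu0) / (s / sqrt(n))   # observed test statistic
  df    <- n - 1
  p_value <- switch(alternative,
    greater   = pt(t_obs, df, lower.tail = FALSE),           # P(T >= t)
    less      = pt(t_obs, df),                               # P(T <= t)
    two.sided = 2 * pt(abs(t_obs), df, lower.tail = FALSE))  # 2 P(T >= |t|)
  c(t = t_obs, df = df, p_value = p_value)
}

set.seed(431)                          # arbitrary seed
x    <- rnorm(15, mean = 1, sd = 2)    # simulated data, not from the lecture
mine <- t_test_summary(mean(x), sd(x), length(x), mu0 = 0, alternative = "greater")
ref  <- t.test(x, mu = 0, alternative = "greater")

mine
c(t = unname(ref$statistic), p_value = ref$p.value)
```

A level α test rejects H0 exactly when the P-value is below α, which is equivalent to the rejection regions listed above.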

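Inference for Variance / SD
$X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$
• Pivotal random variable $\chi^2 = \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
$P\left( \chi^2_{n-1,1-\alpha/2} \le \dfrac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,\alpha/2} \right) = 1 - \alpha$
• $100(1-\alpha)\%$ two-sided CI for $\sigma^2$:
$\left[ \dfrac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}},\ \dfrac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \right]$
• $100(1-\alpha)\%$ two-sided CI for $\sigma$:
$\left[ S\sqrt{\dfrac{n-1}{\chi^2_{n-1,\alpha/2}}},\ S\sqrt{\dfrac{n-1}{\chi^2_{n-1,1-\alpha/2}}} \right]$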
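The chi-square intervals can be computed as follows. `var_interval` is a hypothetical helper name. Note the notation: $\chi^2_{n-1,\alpha/2}$ is an upper critical point, so it corresponds to `qchisq(1 - alpha/2, ...)` in R. The example call reuses s = 0.230 and n = 29 from the earth density data.

```r
# Two-sided confidence intervals for the variance and the SD of a normal population.
var_interval <- function(s, n, conf_level = 0.95) {
  alpha      <- 1 - conf_level
  upper_crit <- qchisq(1 - alpha / 2, df = n - 1)  # chi^2_{n-1, alpha/2}
  lower_crit <- qchisq(alpha / 2, df = n - 1)      # chi^2_{n-1, 1 - alpha/2}
  var_ci <- c(lower = (n - 1) * s^2 / upper_crit,
              upper = (n - 1) * s^2 / lower_crit)
  list(variance = var_ci, sd = sqrt(var_ci))
}

ci <- var_interval(s = 0.230, n = 29)
ci$variance
ci$sd
```

The interval is not symmetric around the point estimate, because the chi-square distribution is skewed.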

Hypothesis Testing for Variance
• Consider test hypotheses $H_0: \sigma^2 = \sigma_0^2$ vs. $H_1: \sigma^2 \ne \sigma_0^2$
• Test statistic $\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$
• Under $H_0$, $\chi^2 \sim \chi^2_{n-1}$
• So, a level $\alpha$ test rejects $H_0$ if
$\chi^2 > \chi^2_{n-1,\alpha/2}$ or $\chi^2 < \chi^2_{n-1,1-\alpha/2}$
• Similarly, we can derive level $\alpha$ tests for one-sided hypotheses about $\sigma^2$ (For details, see Table 7.6 of textbook)

Example: Earth Density
• 29 measurements of earth density
• Test measurement accuracy: $H_0: \sigma = 0.2$ vs. $H_1: \sigma \ne 0.2$
• Sample SD s = 0.230
• Test statistic $\chi^2 = \dfrac{(29-1) \times 0.23^2}{0.2^2} = 37.03$
• For $\alpha = 0.05$, $\chi^2_{n-1,1-\alpha/2} = 15.31$ and $\chi^2_{n-1,\alpha/2} = 44.46$
• Decision: do not reject $H_0$

Class Summary
• Key points for this class: for random sample from normal distribution
– Inferences for mean, small sample, known SD
pivotal r.v. $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
– Inferences for mean, small sample, unknown SD
pivotal r.v. $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
– Inferences for variance
pivotal r.v. $\chi^2 = \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
• Reading: Sections 7.2–7.3
• Next class: Inferences for two samples (I) (Sec. 8.1 & 8.3)

Supplement: Probability Distributions in R
• Standard distributions used in STAT 431 are all built into R
– normal / binomial / Chi-square / Student’s t / uniform / exponential ...
• For each distribution, four fundamental operations are needed
1. Evaluation of probability density / mass function
2. Evaluation of cumulative distribution / probability function
3. Evaluation of quantiles
4. (Pseudo-)random number generation
• In R, for each distribution, there is a routine for each of the above four operations
– For normal distribution, the routines are dnorm (density), pnorm (distribution), qnorm (quantile), rnorm (random numbers)

Evaluation of Density / Probability Mass
• Example 1: Let f be the density function of a $N(2, 3^2)$ distribution. What is the value of f(6)?
In R, we type dnorm(6, mean=2, sd=3), and the output is:
> dnorm(6, mean=2, sd=3)
[1] 0.05467002
• Example 2: Let $X \sim \mathrm{Bin}(100, 1/3)$; evaluate $P(X = 10)$.
In R, we type dbinom(10, size=100, prob=1/3), and the output is:
> dbinom(10, size=100, prob=1/3)
[1] 4.157947e-08
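To see what the `d` routines compute, the two examples can be checked against the textbook formulas for the normal density and the binomial probability mass function. The variable names below are illustrative.

```r
# Normal density at x = 6 for N(2, 3^2), by formula and by dnorm.
x <- 6; mu <- 2; sigma <- 3
manual_density  <- exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
builtin_density <- dnorm(x, mean = mu, sd = sigma)

# Binomial probability P(X = 10) for Bin(100, 1/3), by formula and by dbinom.
k <- 10; size <- 100; p <- 1 / 3
manual_pmf  <- choose(size, k) * p^k * (1 - p)^(size - k)
builtin_pmf <- dbinom(k, size = size, prob = p)

all.equal(manual_density, builtin_density)
all.equal(manual_pmf, builtin_pmf)
```

Note that `dnorm` takes the standard deviation (`sd = 3`), not the variance.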

Cumulative Distribution Function
• Example 1: Let $X \sim t_5$; compute $P(X \le 2)$.
In R, we type pt(2, df=5), and the output is:
> pt(2, df=5)
[1] 0.9490303
• Example 2: Let $X \sim \chi^2_{10}$; evaluate $P(X \in [1.2, 3.5])$.
In R, we type pchisq(3.5, df=10) - pchisq(1.2, df=10), and the output is:
> pchisq(3.5, df=10) - pchisq(1.2, df=10)
[1] 0.03250713
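Upper-tail probabilities, which are what P-values need, come from the same `p` routines. The sketch below shows two equivalent ways to get $P(X > 2)$ for $X \sim t_5$, and cross-checks the chi-square interval probability by numerically integrating the density with `integrate`.

```r
# Upper tail of t_5 in two equivalent ways.
upper_tail <- pt(2, df = 5, lower.tail = FALSE)  # P(X > 2)
complement <- 1 - pt(2, df = 5)                  # same value via the complement
two_sided  <- 2 * upper_tail                     # P(|X| > 2), by symmetry around 0

# Interval probability for chi-square(10): CDF difference vs. numerical integration.
interval_cdf <- pchisq(3.5, df = 10) - pchisq(1.2, df = 10)
interval_int <- integrate(dchisq, lower = 1.2, upper = 3.5, df = 10)$value

c(upper_tail = upper_tail, two_sided = two_sided)
c(interval_cdf = interval_cdf, interval_int = interval_int)
```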

Quantiles
• Example 1: the 0.95 quantile of the standard normal distribution
In R, we type qnorm(0.95) [Why do we not need to specify mean and sd?], and the output is:
> qnorm(0.95)
[1] 1.644854
• Example 2: the 0.95 quantile of the t distribution with 5 degrees of freedom
> qt(0.95, df = 5)
[1] 2.015048
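A few relationships are worth checking here. `qnorm` defaults to `mean = 0` and `sd = 1`, the quantile function inverts the CDF, and the upper critical points used in the lecture ($z_{\alpha/2}$, $t_{n-1,\alpha/2}$) are quantiles at level $1 - \alpha/2$. The values mean = 2 and sd = 3 below are illustrative.

```r
p <- 0.95
q    <- qnorm(p)                    # defaults: mean = 0, sd = 1
back <- pnorm(q)                    # applying the CDF recovers p

# Quantiles of a general normal follow from the standard normal ones.
q_general <- qnorm(p, mean = 2, sd = 3)
q_scaled  <- 2 + 3 * qnorm(p)

# Upper critical points: z_{alpha/2} and t_{28, alpha/2} for alpha = 0.05.
alpha        <- 0.05
z_half_alpha <- qnorm(1 - alpha / 2)
t_half_alpha <- qt(1 - alpha / 2, df = 28)

c(back = back, q_general = q_general, q_scaled = q_scaled)
round(c(z_half_alpha = z_half_alpha, t_half_alpha = t_half_alpha), 3)
```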

(Pseudo-)Random Number Generation
• Example 1: Generate a random sample of size 10 from the N(0, 1) distribution
In R, we type rnorm(10), and the output is:
> rnorm(10)
[1] -0.979888281 -0.003701416 -1.573639923 -1.618217104 -0.664689207
[6] -0.477482135 -1.054771777 -1.133756694 0.035470016 -2.233821436
Typing rnorm(10) again, we get a different set of 10 numbers:
> rnorm(10)
[1] 1.0300531 -2.0553873 0.1419764 -0.3631803 1.2523628 0.7639625
[7] 0.1044117 -1.0025112 -0.3160302 0.5591930
• These are pseudo-random numbers. If we fix the seed, then the “random” numbers are generated from a deterministic sequence!

• For example, we can set the seed value to be 13.
> set.seed(13)
> rnorm(10)
[1] 0.5543269 -0.2802719 1.7751634 0.1873201 1.1425261 0.4155261
[7] 1.2295066 0.2366797 -0.3653828 1.1051443
• Typing these two commands again, we get the same set of 10 numbers.
> set.seed(13)
> rnorm(10)
[1] 0.5543269 -0.2802719 1.7751634 0.1873201 1.1425261 0.4155261
[7] 1.2295066 0.2366797 -0.3653828 1.1051443
• It is suggested that you set the seed each time you generate random numbers, so that the results you obtain are reproducible.
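The reproducibility claim can be checked programmatically with `identical`. In this sketch, the third draw is taken without resetting the seed, so it continues the pseudo-random stream and differs from the first two.

```r
set.seed(13)
first <- rnorm(10)

set.seed(13)
second <- rnorm(10)   # same seed, so the same 10 numbers

third <- rnorm(10)    # no reset: the next 10 numbers in the stream

identical(first, second)
identical(first, third)
```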

Routines for Probability Distributions in R
Distribution    Routines (density / CDF / quantile / random number)
Normal          d{p,q,r}norm
Binomial        d{p,q,r}binom
Chi-square      d{p,q,r}chisq
Student’s t     d{p,q,r}t
Uniform         d{p,q,r}unif
Exponential     d{p,q,r}exp
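The same four prefixes work for every distribution in the table. As a brief sketch, here they are for the exponential distribution, checked against its closed-form density, CDF, and median. The rate value 2 and the seed are arbitrary choices for illustration.

```r
rate <- 2                             # illustrative rate parameter

d_val <- dexp(1, rate = rate)         # density at 1:  rate * exp(-rate * 1)
p_val <- pexp(1, rate = rate)         # CDF at 1:      1 - exp(-rate * 1)
q_val <- qexp(0.5, rate = rate)       # median:        log(2) / rate

set.seed(431)                         # arbitrary seed
r_val <- rexp(5, rate = rate)         # five pseudo-random draws

c(d_val = d_val, p_val = p_val, q_val = q_val)
r_val
```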