
Chapter 4
Some Elementary Statistical Inference
4.1 Sampling and Statistics
1/53
Boxiang Wang
Chapter 4 STAT 4101 Spring 2021

Outline
1 Two basic sampling methods. Slide 5 – 8.
2 Two general statistical models. Slide 10 – 12.
3 Definition and properties of statistic. Slide 14 – 16.
2/53

3/53

Basic Sampling Methods
4/53

Two Basic Sampling Methods
Goal: obtain a sample from the distribution that we are interested in.
• Sampling with replacement.
• Sampling without replacement.
5/53

Sampling with replacement
• An urn contains m balls, labeled from 1 to m.
• At the first trial, we take one ball out of the urn at random and let X1 denote its number. Because we cannot know the number in advance, X1 is a random variable.
• What is the distribution of X1?
• Then we put the first ball back in the urn, mix the balls, take another ball out, and let X2 denote its number.
• What is the distribution of X2?
• We can repeat this procedure n times and get n random variables X1, X2, · · · , Xn.
• X1, · · · , Xn are independent.
• Each Xi has the same distribution.
6/53

Definition (4.1.1) of random sample
If the random variables X1, X2, . . . , Xn are independent and identically distributed (iid), then these random variables constitute a random sample of size n from the common distribution.
Example:
Suppose X1, . . . , Xn constitute a random sample from N(θ, 1). What is the joint pdf of X1,…,Xn?
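Since the Xi are iid, the joint pdf is the product of the marginal pdfs; for the N(θ, 1) example this gives

```latex
f(x_1,\dots,x_n;\theta)
  = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}
    \exp\!\left(-\frac{(x_i-\theta)^2}{2}\right)
  = (2\pi)^{-n/2}
    \exp\!\left(-\frac{1}{2}\sum_{i=1}^{n}(x_i-\theta)^2\right).
```

The factorization into a product is exactly what independence buys us; the common N(θ, 1) form of each factor is what "identically distributed" buys us.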
7/53

Sampling without replacement
• At the first trial, we take one ball out of the urn and let X1 denote its number.
• Then, without putting the first ball back, we take the second ball out of the urn; its number is denoted by X2.
• What is the distribution of X2?
• Are X1 and X2 independent?
• We can repeat this sampling method n times and get n random variables X1, · · · , Xn.
• Typically, X1, · · · , Xn are NOT independent.
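The two sampling schemes can be sketched in Python; the urn size m and sample size n below are illustrative values, not from the slides:

```python
import random

random.seed(0)
m, n = 10, 5  # illustrative urn size and sample size

# Sampling WITH replacement: X1, ..., Xn are iid Uniform{1, ..., m}
with_repl = [random.randint(1, m) for _ in range(n)]

# Sampling WITHOUT replacement: each draw is still Uniform{1, ..., m}
# marginally, but the draws are dependent (no label can appear twice)
without_repl = random.sample(range(1, m + 1), n)

print(with_repl, without_repl)
```

The dependence in the second scheme is visible in the code itself: `random.sample` never repeats a label, so knowing X1 changes the possible values of X2.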
8/53

Two General Statistical Models
9/53

• In general, statistics problems assume some structure (called a model) on the data.
• Every model assumes that the data are random variables following some distribution / density.
• This distribution or density is unknown and needs to be estimated from the data.
• Two kinds of models are studied in this class: non-parametric and parametric models.
10/53

Non-parametric and parametric models
Non-parametric model:
The pdf f(x) or pmf p(x) of the random variable we are interested in is completely unknown.
Parametric model:
• The form of f(x) or p(x) is known up to a parameter θ (or a vector θ).
• Usually the pdf (or pmf) is denoted by f(x; θ) (or p(x; θ)).
• Here θ ∈ Ω for a specified set Ω; Ω is called the parameter space.
We will focus on the parametric model in STAT 4101.
11/53

Examples of parameter space
• Normal distribution N(μ, 1): μ ≡ θ, μ ∈ R ≡ Ω.
• Exponential distribution exp(β): β ≡ θ, β ∈ R+ ≡ Ω.
• Binomial distribution Binom(n, p): p ≡ θ, p ∈ (0, 1) ≡ Ω.
• Normal distribution N(μ, σ2): θ = (μ, σ2), Ω = R × R+, where R+ is the positive half of the real line.
12/53

Statistic
13/53

Definition (4.1.2) of statistic
Suppose now that we want to use a random sample to gain information about an unknown parameter θ.
Definition: Let X1, . . . , Xn denote a sample on a random variable X. Let T = T(X1,X2,…,Xn) be a function of the sample. Then T is called a statistic.
Example:
1 Sample mean: T(X1, X2, . . . , Xn) = X̄ = (X1 + · · · + Xn)/n.
2 Sample variance: T(X1, X2, . . . , Xn) = S2 = Σi=1..n (Xi − X̄)2 / (n − 1).
Estimator or estimate:
Once the sample x1, . . . , xn is drawn, then t = T(x1, . . . , xn) is called the realization of T. The random variable T is called an estimator of θ and t an estimate of θ.
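As a quick numerical check, both statistics can be computed directly; the data below are made up for illustration, and the standard library's `statistics.variance` uses the same n − 1 divisor as S2:

```python
import statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative data, n = 8
n = len(x)

xbar = sum(x) / n                                 # sample mean X-bar
s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # sample variance S^2

print(xbar, s2)  # 5.0 and 32/7 ≈ 4.571
```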
14/53

Properties of statistic
• Given a parameter θ of interest, every statistic can be called an estimator of θ.
• We need to choose “good” estimators according to some criteria.
• For example: unbiased estimators, consistent estimators, estimators with minimum MSE, etc.
15/53

Definition (4.1.3) of unbiasedness
Let X1, . . . , Xn denote a random sample on a random variable X with pdf f(x;θ), θ ∈ Ω. Let T = T(X1,X2,…,Xn) be a statistic. We say that T is an unbiased estimator of θ if E(T ) = θ.
Example
Let X1, . . . , Xn denote a random sample on a random variable X with mean μ and variance σ2.
1 X̄ is an unbiased estimator of μ; find its variance.
2 (1/3)X1 + (2/3)X2 is an unbiased estimator of μ; find its variance.
3 S2 = Σi=1..n (Xi − X̄)2 / (n − 1) is an unbiased estimator of σ2.
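A small Monte Carlo check of items 1 and 2 (parameter values are arbitrary): both estimators center at μ, but X̄ has variance σ2/n = 0.8 while (1/3)X1 + (2/3)X2 has (1/9 + 4/9)σ2 = (5/9)σ2 ≈ 2.22, so unbiasedness alone does not make an estimator good:

```python
import random

random.seed(1)
mu, sigma, n, B = 10.0, 2.0, 5, 100_000  # illustrative parameters

means, combos = [], []
for _ in range(B):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(x) / n)                # X-bar
    combos.append(x[0] / 3 + 2 * x[1] / 3)  # (1/3)X1 + (2/3)X2

def mc_mean(v):
    return sum(v) / len(v)

def mc_var(v):
    m = mc_mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

# Both Monte Carlo averages are close to mu = 10 (unbiasedness),
# but the variances differ: about 0.8 versus about 2.22
print(mc_mean(means), mc_mean(combos))
print(mc_var(means), mc_var(combos))
```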
16/53

Chapter 4
Some Elementary Statistical Inference
4.2 Confidence Intervals
17/53

Outline
1 Motivation and definition of confidence intervals.
2 Examples of CI in different situations.
A detailed table of contents is given on Slide 52.
18/53

Motivation
• Suppose the parameter θ is estimated by a statistic θ̂ = θ̂(X1, . . . , Xn).
• When the sample is drawn, it is unlikely that the value of θ̂ is exactly the true value of θ.
• We need an estimate of the error of the estimator θ̂, i.e., by how much did θ̂ miss θ?
• Confidence intervals address this.
19/53

Definition (4.2.1) of confidence interval
Let X1, . . . , Xn be a sample from a distribution that involves a parameter θ whose value is unknown. Let L = L(X1, . . . , Xn) and U = U(X1, . . . , Xn) be two statistics. We say that the interval
(L, U ) is a (1 − α)100% confidence interval for θ if
1 − α = Pθ[θ ∈ (L, U)] for all θ ∈ Ω.
The probability that the interval includes θ is 1 − α, which is called the confidence coefficient (or confidence level) of the confidence interval.
20/53

CI for μ under normality (σ known)
Suppose that X1, . . . , Xn is a random sample from the N(μ, σ2) distribution. Assume that σ2 is known and μ is unknown. A (1 − α)100% confidence interval for μ is given by
(X̄ − zα/2 σ/√n, X̄ + zα/2 σ/√n).
Remark:
Pivot random variable: X̄ ∼ N(μ, σ2/n) ⇒ (X̄ − μ)/(σ/√n) ∼ N(0, 1).
Critical point: P(Z > zα/2) = α/2, Z ∼ N(0, 1).
Confidence level:
P(X̄ − zα/2 σ/√n < μ < X̄ + zα/2 σ/√n) = 1 − α.
Exercise: with Z ∼ N(0, 1), find
1 P(Z > 2),
2 P(−2 ≤ Z ≤ 2),
3 P(|Z| > 1.645).
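The exercise probabilities and the interval itself can be computed with only the standard library, using Φ(z) = (1 + erf(z/√2))/2; the data summary (x̄ = 5.2, σ = 2, n = 100) is made up for illustration:

```python
from math import erf, sqrt

def phi(z):
    # standard normal cdf, via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# The three exercise probabilities
p1 = 1.0 - phi(2.0)            # P(Z > 2)        ~ 0.0228
p2 = phi(2.0) - phi(-2.0)      # P(-2 <= Z <= 2) ~ 0.9545
p3 = 2.0 * (1.0 - phi(1.645))  # P(|Z| > 1.645)  ~ 0.10
print(p1, p2, p3)

# A 95% z-interval with illustrative numbers
xbar, sigma, n = 5.2, 2.0, 100
ci = (xbar - 1.96 * sigma / sqrt(n), xbar + 1.96 * sigma / sqrt(n))
print(ci)  # roughly (4.808, 5.592)
```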
25/53

CI for μ under normality (σ known) (cont’d)
• When α = 0.05, zα/2 = 1.96.
• The 95% CI
(X̄ − 1.96 σ/√n, X̄ + 1.96 σ/√n)
is a random interval.
• Once the sample is drawn, denote the observed value of the statistic X̄ by x̄; then the value of the CI is
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n).
26/53

• Interpretation: if we draw B samples independently from the underlying distribution and construct B confidence intervals for μ, we would expect about B(1 − α) of those confidence intervals to successfully trap μ. Thus we have (1 − α)100% confidence that θ is within the confidence interval.
• A measure of efficiency: the expected length. Suppose (L1, U1) and (L2, U2) are two confidence intervals for θ at the same confidence level. Then we say (L1, U1) is more efficient than (L2, U2) if
Eθ(U1 − L1) ≤ Eθ(U2 − L2), for all θ ∈ Ω.
• Choosing a higher confidence level (1 − α) leads to a wider confidence interval.
• Choosing a larger sample size n leads to a narrower confidence interval.
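A simulation sketch of this interpretation, with all parameter values illustrative: draw B = 10,000 samples, build a 95% z-interval from each, and count how many trap μ:

```python
import random
from math import sqrt

random.seed(2)
mu, sigma, n, B = 0.0, 1.0, 25, 10_000  # illustrative setup
half = 1.96 * sigma / sqrt(n)           # half-width of the 95% z-interval

hits = 0
for _ in range(B):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    hits += (xbar - half < mu < xbar + half)

coverage = hits / B
print(coverage)  # close to 0.95: about B(1 - alpha) intervals trap mu
```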
27/53

Example (4.2.1) CI for μ under normality (σ unknown)
Suppose that X1, . . . , Xn is a random sample from the N(μ, σ2) distribution. Assume that both σ2 and μ are unknown. A (1 − α)100% confidence interval for μ is given by
(X̄ − tα/2,n−1 S/√n, X̄ + tα/2,n−1 S/√n).
Remark:
Pivot random variable (Student’s theorem, Thm 3.6.1): (X̄ − μ)/(S/√n) ∼ tn−1 (requires the normality of X).
Critical point: P(T > tα/2,n−1) = α/2, T ∼ tn−1.
Confidence level:
P(X̄ − tα/2,n−1 S/√n < μ < X̄ + tα/2,n−1 S/√n) = 1 − α.
Moments of the t-distribution with r degrees of freedom: E(T) = 0 if r > 1, and Var(T) = E(T2) = r/(r − 2) if r > 2.
See Example 3.6.1.
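A sketch of the t-interval with a made-up sample of size 10; the critical value t0.025,9 = 2.262 is looked up from a t table rather than computed, since the Python standard library has no t quantile function:

```python
from math import sqrt
import statistics

x = [4.1, 5.2, 6.3, 4.8, 5.9, 5.5, 4.4, 6.1, 5.0, 5.7]  # illustrative sample
n = len(x)
xbar = statistics.mean(x)
s = statistics.stdev(x)  # uses the n-1 divisor, matching S

t_crit = 2.262           # t_{0.025, 9}, from a t table
half = t_crit * s / sqrt(n)
ci = (xbar - half, xbar + half)
print(ci)  # roughly (4.778, 5.822)
```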
30/53

Properties of t-Distribution
• The density function of the t-distribution is symmetric, bell-shaped, and centered at 0.
• The variance of the t-distribution is larger than that of the standard normal distribution.
• The tails of the t-distribution are heavier (larger kurtosis).
31/53

Student’s Theorem (Thm 3.6.1)
Suppose X1, · · · , Xn are iid N(μ, σ2) random variables. Define the random variables,
X̄ = (1/n) Σi=1..n Xi and S2 = (1/(n − 1)) Σi=1..n (Xi − X̄)2.
Then
1 X̄ ∼ N(μ, σ2/n);
2 X̄ and S2 are independent;
3 (n − 1)S2/σ2 ∼ χ2(n − 1);
4 The random variable
T = (X̄ − μ)/(S/√n)
has a t-distribution with n − 1 degrees of freedom.
32/53

CI for σ2
Suppose that X1, . . . , Xn is a random sample from the N(μ, σ2) distribution. Assume that both μ and σ2 are unknown. For 0 < α < 1, define χ2r,α/2 as the upper α/2 critical point of a χ2(r) distribution. Since the pivot random variable satisfies
(n − 1)S2/σ2 ∼ χ2(n − 1),
we have
P(χ2n−1,1−α/2 < (n − 1)S2/σ2 < χ2n−1,α/2) = 1 − α,
which inverts to a (1 − α)100% confidence interval for σ2:
P((n − 1)S2/χ2n−1,α/2 < σ2 < (n − 1)S2/χ2n−1,1−α/2) = 1 − α.
33/53

Central Limit Theorem
34/53

Theorem (4.2.1): Central limit theorem
Let X1, . . . , Xn denote the observations of a random sample from a distribution that has mean μ and finite variance σ2. Then the distribution function of the random variable Zn = (X̄ − μ)/(σ/√n) converges to Φ, the distribution function of the N(0, 1) distribution, as n → ∞ (n > 30 in practice).
35/53
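The χ2 interval for σ2 can be sketched the same way; the sample below is made up, and the critical values χ29,0.025 = 19.023 and χ29,0.975 = 2.700 are taken from a standard χ2 table, since the standard library has no χ2 quantile function:

```python
import statistics

x = [4.1, 5.2, 6.3, 4.8, 5.9, 5.5, 4.4, 6.1, 5.0, 5.7]  # illustrative sample, n = 10
n = len(x)
s2 = statistics.variance(x)  # S^2, with the n-1 divisor

# Upper and lower critical points for r = n - 1 = 9 degrees of freedom
chi2_upper = 19.023  # chi^2_{9, 0.025}, from a chi-square table
chi2_lower = 2.700   # chi^2_{9, 0.975}

ci = ((n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower)
print(ci)  # roughly (0.252, 1.778)
```

Note the inversion: the larger χ2 critical point produces the lower endpoint of the interval for σ2.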

When σ is known:
• The pivot random variable Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1) when X is normally distributed.
• With the CLT, Z = (X̄ − μ)/(σ/√n) is approximately N(0, 1), regardless of the distribution of X.
When σ is unknown:
• By Student’s theorem, the pivot random variable T = (X̄ − μ)/(S/√n) follows tn−1 when X is normally distributed.
• With the CLT and Slutsky’s theorem (to be discussed in Chapter 5), T = (X̄ − μ)/(S/√n) is approximately N(0, 1), regardless of the distribution of X.
36/53

Large-sample CI for μ with known σ
Suppose X1, X2, . . . , Xn is a random sample on a random variable X with mean μ and known variance σ2, but the distribution of X is not normal. Since
(X̄ − μ)/(σ/√n)
is approximately N(0, 1),
P(X̄ − zα/2 σ/√n < μ < X̄ + zα/2 σ/√n) ≈ 1 − α.
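A simulation sketch of this large-sample claim, with made-up settings: Exponential(1) data (mean μ = 1, σ = 1) are far from normal, yet the z-interval's coverage is still close to 95% once n is moderately large:

```python
import random
from math import sqrt

random.seed(3)
mu, sigma, n, B = 1.0, 1.0, 50, 10_000  # Exponential(1): mean 1, sd 1
half = 1.96 * sigma / sqrt(n)           # half-width of the 95% z-interval

hits = 0
for _ in range(B):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    hits += (xbar - half < mu < xbar + half)

coverage = hits / B
print(coverage)  # near 0.95 even though the data are not normal
```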