Chapter 5
Consistency and Limiting Distributions
5.1 Convergence in Probability
1/24
Boxiang Wang
Chapter 5 STAT 4101 Spring 2021
Motivation
In calculus, the limit of a sequence is defined: $x_n \to x$ as $n \to \infty$.
What if we have a sequence of random variables X1, . . . , Xn? How can we say something like Xn → X?
For the random variables, let us define a type of limit: convergence in probability.
Intuitively, with probability that is arbitrarily close to one, the value of Xn is arbitrarily close to the value of X.
Definition: Convergence in Probability
Let $\{X_n\}$ be a sequence of random variables and let X be a random variable defined on the same sample space. We say $X_n$ converges in probability to X if, for all ε > 0,
$$\lim_{n\to\infty} P(|X_n - X| \ge \varepsilon) = 0;$$
equivalently,
$$\lim_{n\to\infty} P(|X_n - X| < \varepsilon) = 1.$$
We write $X_n \xrightarrow{P} X$.
Example: let $X \sim \mathrm{Unif}(0,1)$ and $X_n = X + \frac{1}{n} I_{(0,1)}(X)$. Since $|X_n - X| \le 1/n \to 0$, we see $X_n \xrightarrow{P} X$.
A very common case: the limiting random variable X is a constant a, say $X_n \xrightarrow{P} a$.
Weak Law of Large Numbers
Useful Tool: Markov Inequality
Let X be a nonnegative random variable with EX < ∞. Then for every positive real number a, we have
$$P(X > a) \le \frac{EX}{a}.$$
Sketch of proof: Note that
$$Y = X - a\,I(X > a) \ge 0.$$
Thus $EY = EX - a\,P(X > a) \ge 0$.
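As a quick numerical sanity check (a sketch, not part of the original slides; the exponential distribution and the cutoff a = 5 are arbitrary choices), we can compare the empirical tail probability with the Markov bound EX/a:

```python
import math
import random

random.seed(0)

# Simulate a nonnegative random variable: Exponential with mean 2.
n = 100_000
samples = [random.expovariate(1 / 2) for _ in range(n)]  # EX = 2

a = 5.0
empirical = sum(x > a for x in samples) / n  # estimates P(X > a)
markov_bound = 2 / a                         # EX / a = 0.4
exact = math.exp(-a / 2)                     # true P(X > a) for Exp(mean 2)

# Markov's inequality only guarantees P(X > a) <= EX / a;
# the bound is typically quite loose.
assert empirical <= markov_bound
print(f"P(X > {a}): empirical={empirical:.3f}, exact={exact:.3f}, bound={markov_bound}")
```

The bound 0.4 is far above the true tail probability of about 0.082, which illustrates that Markov's inequality trades sharpness for generality.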
Useful Tool: Chebyshev Inequality
Let X be a random variable with mean μ and variance σ². Then for any a > 0, we have
$$P(|X - \mu| > a) \le \frac{\mathrm{Var}(X)}{a^2}.$$
Sketch of proof: Apply Markov's inequality to the nonnegative random variable $(X - \mu)^2$:
$$P\big((X - \mu)^2 > a^2\big) \le \frac{E(X - \mu)^2}{a^2} = \frac{\sigma^2}{a^2}.$$
Remarks on Chebyshev Inequality
Chebyshev's inequality is a distribution-free result: it holds for any random variable with finite mean and variance.
Let a = kσ; we have
$$P(|X - \mu| > k\sigma) \le \frac{\mathrm{Var}(X)}{k^2\sigma^2} = \frac{1}{k^2}.$$
For k = 2, 3,
$$P(|X - \mu| > 2\sigma) \le \frac{1}{2^2} = 0.25, \qquad P(|X - \mu| > 3\sigma) \le \frac{1}{3^2} \approx 0.11.$$
For any random variable, the probability of falling between two values symmetric about the mean is thus bounded in terms of the standard deviation:
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}.$$
Applied to the sample mean, Chebyshev's inequality with $\mathrm{Var}(\bar X_n) = \sigma^2/n$ gives $P(|\bar X_n - \mu| > a) \le \sigma^2/(na^2)$. Let us assume σ = 1 and n = 20. Plugging in a = 1, we have
$$P(|\bar X_n - \mu| > 1) \le 0.05,$$
which says that $\bar X_n \pm 1$ is an at-least-95% confidence interval for μ.
Note that the confidence interval from Chebyshev's inequality is usually very conservative, but it is distribution-free.
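To see how conservative the interval is, the following sketch (assuming a normal sample for illustration, which Chebyshev's inequality does not require) compares the guaranteed coverage with the empirical coverage of $\bar X_n \pm 1$ when σ = 1 and n = 20:

```python
import random
import statistics

random.seed(1)

n, sigma, mu, a = 20, 1.0, 0.0, 1.0
reps = 20_000

covered = 0
for _ in range(reps):
    xbar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    if abs(xbar - mu) <= a:
        covered += 1

empirical_coverage = covered / reps
chebyshev_guarantee = 1 - sigma**2 / (n * a**2)  # = 0.95

# Chebyshev guarantees at least 95% coverage; for normal data the
# actual coverage of xbar +/- 1 is essentially 100%, since
# sd(xbar) = 1/sqrt(20) ~ 0.22 and the interval is ~4.5 sd wide.
assert empirical_coverage >= chebyshev_guarantee
print(empirical_coverage, chebyshev_guarantee)
```

The simulated coverage is essentially 1, far above the guaranteed 0.95, confirming how loose the distribution-free bound is.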
Weak Law of Large Numbers (WLLN): $\bar X_n \xrightarrow{P} \mu$
Let X1, . . . , Xn be a random sample from a distribution with mean μ and finite variance σ². Then
$$\bar X_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{P} \mu.$$
Sketch of proof: From Chebyshev's inequality,
$$P(|\bar X_n - \mu| > \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2} \to 0.$$
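A small simulation (a sketch, not from the slides; the die-roll distribution, ε = 0.2, and the sample sizes are arbitrary choices) illustrates the WLLN: the probability that the sample mean deviates from μ by more than ε shrinks as n grows.

```python
import random
import statistics

random.seed(2)

mu = 3.5   # mean of a fair six-sided die
eps = 0.2
reps = 5_000

def deviation_prob(n: int) -> float:
    """Estimate P(|Xbar_n - mu| > eps) by Monte Carlo for die rolls."""
    count = 0
    for _ in range(reps):
        xbar = statistics.fmean(random.randint(1, 6) for _ in range(n))
        if abs(xbar - mu) > eps:
            count += 1
    return count / reps

# Deviation probabilities for increasing n; they should fall toward 0.
probs = [deviation_prob(n) for n in (10, 100, 1000)]
print(probs)
assert probs[0] > probs[2]
```

Chebyshev's bound σ²/(nε²) with σ² = 35/12 predicts the same qualitative decay, though the true probabilities are much smaller than the bound.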
Strong Law of Large Numbers: $\bar X_n \xrightarrow{a.s.} \mu$
Let X1, . . . , Xn be a random sample from a distribution with mean μ and E|Xi| < ∞. Then
$$P\Big(\lim_{n\to\infty} \bar X_n = \mu\Big) = 1.$$
Remarks:
The convergence shown above is called almost sure convergence, which implies convergence in probability (thus a.s. convergence is stronger).
The weak law stated here has stronger conditions (finite variance) than those of the strong law, but its conclusion is weaker.
The strong law is optional in this class.
Consistency
Definition of Consistency
Let X be a random variable with pdf f(x; θ) or pmf p(x; θ), where θ ∈ Θ. Let X1, . . . , Xn be a random sample from the distribution of X and let Tn denote a statistic.
Then Tn is a consistent estimator of θ if, for all θ ∈ Θ,
$$T_n \xrightarrow{P} \theta.$$
If Tn is not consistent, then Tn is an inconsistent estimator of θ.
Examples of Consistent Estimators
Each of the following is an application of the WLLN.
If X1, . . . , Xn iid ∼ Bern(p), then EX = p. Thus $\bar X$ is a consistent estimator of p.
If X1, . . . , Xn iid ∼ N(μ, σ²), then EX = μ. Thus $\bar X$ is a consistent estimator of μ.
If X1, . . . , Xn iid ∼ Pois(λ), then EX = λ. Thus $\bar X$ is a consistent estimator of λ.
If X1, . . . , Xn iid ∼ Gamma(√θ, √θ), then EX = θ. Thus $\bar X$ is a consistent estimator of θ.
If X1, . . . , Xn iid ∼ Gamma(θ, θ), then EX = θ², and $\bar X$ is a consistent estimator of θ². Can we say $\sqrt{\bar X}$ is a consistent estimator of θ?
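The Bernoulli case can be illustrated by simulation (a sketch; p = 0.3 and the sample sizes are arbitrary choices): the sample proportion concentrates around p as n grows.

```python
import random
import statistics

random.seed(3)

p = 0.3
for n in (100, 10_000, 1_000_000):
    # Sample proportion of successes in n Bernoulli(p) trials.
    xbar = statistics.fmean(1 if random.random() < p else 0 for _ in range(n))
    print(n, xbar)

# With n = 1,000,000, sd(xbar) = sqrt(p(1-p)/n) ~ 0.00046,
# so xbar should land well within 0.005 of p.
assert abs(xbar - p) < 0.005
```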
Useful Results on Convergence in Prob.
Theorem 5.1.2
Suppose $X_n \xrightarrow{P} X$ and $Y_n \xrightarrow{P} Y$. Then $X_n + Y_n \xrightarrow{P} X + Y$.
Proof: From the triangle inequality, |a| + |b| ≥ |a + b|,
$$|X_n - X| + |Y_n - Y| \ge |(X_n + Y_n) - (X + Y)|,$$
and therefore
$$P(|(X_n + Y_n) - (X + Y)| > \varepsilon) \le P(|X_n - X| + |Y_n - Y| > \varepsilon).$$
But $|X_n - X| + |Y_n - Y| > \varepsilon$ implies that at least one of
$$|X_n - X| > \frac{\varepsilon}{2} \quad \text{and} \quad |Y_n - Y| > \frac{\varepsilon}{2}$$
must be true. Therefore
$$P(|(X_n + Y_n) - (X + Y)| > \varepsilon) \le P\Big(|X_n - X| > \frac{\varepsilon}{2}\Big) + P\Big(|Y_n - Y| > \frac{\varepsilon}{2}\Big).$$
Since $X_n \xrightarrow{P} X$ and $Y_n \xrightarrow{P} Y$, we have
$$\lim_{n\to\infty} P\Big(|X_n - X| > \frac{\varepsilon}{2}\Big) = 0, \qquad \lim_{n\to\infty} P\Big(|Y_n - Y| > \frac{\varepsilon}{2}\Big) = 0,$$
and the result follows.
Theorem 5.1.3
Suppose that $X_n \xrightarrow{P} X$ and a is a constant. Then $aX_n \xrightarrow{P} aX$.
Proof: The result is trivial for a = 0. Suppose that a ≠ 0. Then
$$P(|aX_n - aX| \ge \varepsilon) = P(|a|\,|X_n - X| \ge \varepsilon) = P\Big(|X_n - X| \ge \frac{\varepsilon}{|a|}\Big).$$
By hypothesis, the limit of the term on the right is zero.
Theorem 5.1.4
Suppose $X_n \xrightarrow{P} a$ and the real function g is continuous at a. Then $g(X_n) \xrightarrow{P} g(a)$.
Proof:
Let ε > 0. Since g is continuous at a, there exists δ > 0 such that |x − a| < δ implies |g(x) − g(a)| < ε. Taking the contrapositive,
$$|g(x) - g(a)| \ge \varepsilon \ \text{implies}\ |x - a| \ge \delta.$$
Substituting Xn for x in the above implication, we obtain
$$P(|g(X_n) - g(a)| \ge \varepsilon) \le P(|X_n - a| \ge \delta) \to 0.$$
Applications of Theorem 5.1.4
Theorem 5.1.4 gives us many useful results. For instance, if Xn →P a, then
$X_n^2 \xrightarrow{P} a^2$;
$1/X_n \xrightarrow{P} 1/a$, provided a ≠ 0;
$\sqrt{X_n} \xrightarrow{P} \sqrt{a}$, provided a ≥ 0.
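These consequences can be seen numerically (a sketch with arbitrary choices: an exponential sample with mean μ = 4 and n = 200,000): each continuous transform of the consistent estimator $\bar X_n$ lands near the same transform of μ.

```python
import math
import random
import statistics

random.seed(4)

mu = 4.0      # mean of Exponential with rate 1/4
n = 200_000
xbar = statistics.fmean(random.expovariate(1 / mu) for _ in range(n))

# Continuous functions of a consistent estimator are consistent
# for the same functions of the target (Theorem 5.1.4).
print(xbar**2, 1 / xbar, math.sqrt(xbar))
assert abs(xbar**2 - mu**2) < 0.5
assert abs(1 / xbar - 1 / mu) < 0.01
assert abs(math.sqrt(xbar) - math.sqrt(mu)) < 0.05
```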
Theorem 5.1.5
If $X_n \xrightarrow{P} X$ and $Y_n \xrightarrow{P} Y$, then $X_n Y_n \xrightarrow{P} XY$.
Proof:
Using Theorems 5.1.2, 5.1.3, and 5.1.4, we have
$$X_n Y_n = \frac{1}{2}X_n^2 + \frac{1}{2}Y_n^2 - \frac{1}{2}(X_n - Y_n)^2 \xrightarrow{P} \frac{1}{2}X^2 + \frac{1}{2}Y^2 - \frac{1}{2}(X - Y)^2 = XY.$$
More Examples of Consistent Estimators
Applying Theorem 5.1.4, we have the following.
If X1, . . . , Xn iid ∼ Bern(p), then $\bar X$ is a consistent estimator of p. So $\bar X(1 - \bar X)$ is a consistent estimator of p(1 − p), which is Var(X1).
If X1, . . . , Xn iid ∼ Pois(λ), then $\bar X$ is a consistent estimator of λ. So $\exp(-\bar X)$ is a consistent estimator of $e^{-\lambda}$, which is P(X1 = 0).
If X1, . . . , Xn iid ∼ Gamma(θ, θ), then $\bar X$ is a consistent estimator of θ². So $\sqrt{\bar X}$ is a consistent estimator of θ.
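The Poisson example can be checked by simulation (a sketch; λ = 2 and the sample size are arbitrary, and the inversion sampler below is a hypothetical helper since the stdlib has no Poisson generator): exp(−X̄) approaches $e^{-\lambda} = P(X_1 = 0)$, as does the empirical frequency of zeros.

```python
import math
import random

random.seed(5)

def poisson(lam: float) -> int:
    """Draw one Poisson(lam) variate by CDF inversion."""
    u, k, p = random.random(), 0, math.exp(-lam)
    cdf = p
    while u > cdf:
        k += 1
        p *= lam / k
        cdf += p
    return k

lam = 2.0
n = 200_000
xs = [poisson(lam) for _ in range(n)]

xbar = sum(xs) / n
est = math.exp(-xbar)         # plug-in estimator of P(X1 = 0)
freq_zero = xs.count(0) / n   # direct empirical frequency of zeros
truth = math.exp(-lam)        # e^{-2} ~ 0.1353

print(est, freq_zero, truth)
assert abs(est - truth) < 0.005
assert abs(freq_zero - truth) < 0.005
```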
Consistent Estimator of Variance
Let X1, . . . , Xn be a random sample from a distribution with mean μ and finite variance σ². We further assume $EX_1^4 < \infty$.
1. The sample variance
$$S_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X_n)^2$$
is a consistent estimator of σ². It is also unbiased.
Note:
$$S_n^2 = \frac{n}{n-1}\left[\frac{1}{n}\sum_{i=1}^n X_i^2 - \bar X_n^2\right] \xrightarrow{P} 1 \cdot (EX_1^2 - \mu^2) = \sigma^2.$$
Why?
Let $Y_i = X_i^2$. By the WLLN (using $EX_1^4 < \infty$), $\bar Y_n \xrightarrow{P} EY_1$, that is,
$$\frac{1}{n}\sum_{i=1}^n X_i^2 \xrightarrow{P} EX_1^2.$$
Also by the WLLN, $\bar X_n \xrightarrow{P} \mu$, and Theorem 5.1.4 implies $\bar X_n^2 \xrightarrow{P} \mu^2$.
Finally, Theorem 5.1.2 shows $\frac{1}{n}\sum_{i=1}^n X_i^2 - \bar X_n^2 \xrightarrow{P} EX_1^2 - \mu^2 = \sigma^2$, while $n/(n-1) \to 1$.
2. The statistic
$$T_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X_n)^2$$
is also a consistent estimator of σ². It is biased:
$$E(T_n^2) = \frac{n-1}{n}\sigma^2,$$
and the bias of $T_n^2$ is $-\sigma^2/n$, which vanishes as n → ∞.
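Both estimators can be checked numerically (a sketch; the N(0, 2²) distribution and sample sizes are arbitrary choices): for large n both are close to σ², while for small n the average of $T_n^2$ sits below σ² by roughly σ²/n.

```python
import random
import statistics

random.seed(6)

sigma2 = 4.0  # variance of N(0, 2^2)

# Consistency: for large n both S_n^2 and T_n^2 are close to sigma^2.
n = 100_000
xs = [random.gauss(0, 2) for _ in range(n)]
s2 = statistics.variance(xs)   # divides by n - 1 (unbiased version)
t2 = statistics.pvariance(xs)  # divides by n (biased version)
print(s2, t2)
assert abs(s2 - sigma2) < 0.1 and abs(t2 - sigma2) < 0.1

# Bias: for small n, the average of T_n^2 is about (n-1)/n * sigma^2.
n_small, reps = 5, 50_000
avg_t2 = statistics.fmean(
    statistics.pvariance([random.gauss(0, 2) for _ in range(n_small)])
    for _ in range(reps)
)
print(avg_t2, (n_small - 1) / n_small * sigma2)  # expect about 3.2
```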
Example (5.1.2): Max from a Uniform Distribution
Suppose that X1, . . . , Xn is iid from Uniform(0, θ). Then Yn = max{X1, . . . , Xn} is a consistent estimator for θ.
We see that, for 0 < ε < θ,
$$P(|Y_n - \theta| > \varepsilon) = P(Y_n < \theta - \varepsilon) = P(X_i < \theta - \varepsilon,\ i = 1, \dots, n) = \left(\frac{\theta - \varepsilon}{\theta}\right)^n \to 0, \quad \text{as } n \to \infty.$$
Is Yn an unbiased estimator?
The pdf of Yn is $f_{Y_n}(t) = \frac{n t^{n-1}}{\theta^n}$ for 0 < t < θ, so
$$EY_n = \int_0^\theta t \cdot \frac{n t^{n-1}}{\theta^n}\, dt = \frac{n}{n+1}\theta < \theta.$$
Thus Yn is biased, although the bias, $-\theta/(n+1)$, vanishes as n → ∞.
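This example can be sketched by simulation (θ = 5 and the sample sizes are arbitrary choices): the mean of the sample maximum is close to nθ/(n + 1), and for large n the maximum is essentially at θ.

```python
import random
import statistics

random.seed(7)

theta = 5.0
n, reps = 50, 20_000

# Repeatedly draw n uniforms on (0, theta) and record the maximum.
maxima = [max(random.uniform(0, theta) for _ in range(n)) for _ in range(reps)]

avg_max = statistics.fmean(maxima)
expected = n / (n + 1) * theta  # E Y_n = n*theta/(n+1), slightly below theta
print(avg_max, expected)
assert abs(avg_max - expected) < 0.02

# Consistency: with a much larger n the max is essentially at theta.
big_max = max(random.uniform(0, theta) for _ in range(100_000))
print(big_max)
assert theta - big_max < 0.001
```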