Statistical Inference STAT 431
Lecture 4: Basic Concepts of Inference (I) Point Estimation & Confidence Intervals

An Overview of Statistical Inference Problems
• Making probabilistic statements about an unknown population parameter based
on a random sample from the population
• Estimation
– Point estimation: estimate the value of the unknown parameter
– Confidence interval: estimate an interval in which the parameter lies
• Hypothesis Testing
– Make decision (yes/no) on a hypothetical statement about the parameter

Point Estimation
• Basic setup: a random sample from a population with unknown parameter \(\theta\):
\(X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F_\theta\)
• An estimator \(\hat{\theta} = \hat{\theta}(X_1, \dots, X_n)\)
– A statistic computed from the data
– Examples: \(\theta = \mu\), \(\hat{\theta} = \bar{X}\); \(\theta = \sigma^2\), \(\hat{\theta} = S^2\)
• Estimator vs. Estimate
– Estimator: a random variable, a rule to compute the desired value from data
– Estimate: the specific value obtained using the rule on the observed data
– Example: \(\bar{X}\) vs. \(\bar{x}\)
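To make the estimator/estimate distinction concrete, here is a minimal sketch. The function name `sample_mean`, the seed, and the N(5, 2²) population are illustrative assumptions, not part of the lecture. The function is the rule (the estimator); applying it to a particular sample yields a number (the estimate), and a different sample generally yields a different number.

```python
import random

def sample_mean(xs):
    """Estimator: a rule that maps any sample to a number."""
    return sum(xs) / len(xs)

rng = random.Random(0)  # arbitrary seed, only for reproducibility

# Two hypothetical samples of size 10 from the same N(5, 2^2) population.
sample_a = [rng.gauss(5.0, 2.0) for _ in range(10)]
sample_b = [rng.gauss(5.0, 2.0) for _ in range(10)]

# Estimates: the specific values the rule returns on each observed sample.
estimate_a = sample_mean(sample_a)
estimate_b = sample_mean(sample_b)
print(estimate_a, estimate_b)
```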

Quality of A Point Estimator
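• Bias
– Definition: \(\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta\), measuring average accuracy
– An estimator is unbiased if \(\mathrm{Bias}(\hat{\theta}) = 0\), i.e., \(E(\hat{\theta}) = \theta\)
• Variance
– Definition: \(\mathrm{Var}(\hat{\theta}) = E[\hat{\theta} - E(\hat{\theta})]^2\), measuring precision
• Mean squared error (MSE)
– Definition: \(\mathrm{MSE}(\hat{\theta}) = E[\hat{\theta} - \theta]^2\), i.e., the expected squared error loss
– Basic identity: \(\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + [\mathrm{Bias}(\hat{\theta})]^2\)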
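The bias, variance, and MSE definitions can be checked numerically. The sketch below is a simulation under assumed settings (normal data with σ² = 4, n = 10, 20000 replications; all helper names are hypothetical). It compares the usual S² (divide by n − 1) with the divide-by-n variance estimator and confirms that the empirical MSE equals the empirical variance plus squared bias.

```python
import random

def var_divide_by_n_minus_1(xs):
    """Sample variance S^2 (unbiased for sigma^2)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def var_divide_by_n(xs):
    """Variance estimator that divides by n (biased downward)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def simulate_estimates(estimator, n, reps, mu, sigma, seed):
    """Apply the estimator to `reps` independent N(mu, sigma^2) samples."""
    rng = random.Random(seed)
    return [estimator([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(reps)]

def bias_var_mse(estimates, theta):
    """Empirical bias, variance, and MSE of simulated estimates."""
    r = len(estimates)
    mean_est = sum(estimates) / r
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in estimates) / r
    mse = sum((e - theta) ** 2 for e in estimates) / r
    return bias, var, mse

sigma2 = 4.0  # true parameter in this simulation (an assumption)
results = {}
for name, est in [("divide by n-1", var_divide_by_n_minus_1),
                  ("divide by n", var_divide_by_n)]:
    ests = simulate_estimates(est, n=10, reps=20000, mu=0.0, sigma=2.0, seed=1)
    results[name] = bias_var_mse(ests, sigma2)
    bias, var, mse = results[name]
    print(f"{name}: bias={bias:.3f} var={var:.3f} mse={mse:.3f} "
          f"var+bias^2={var + bias ** 2:.3f}")
```

In this setup the divide-by-n estimator has a theoretical bias of −σ²/n = −0.4, so the simulation should show a clearly negative bias for it and a bias near zero for S².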

From Point to Interval
• Point estimation: from the sample \(X_1, \dots, X_n\) to a random number \(\hat{\theta}\) [Shoot a lion (parameter) with a spear]
• Confidence interval: from the sample \(X_1, \dots, X_n\) to a random interval \([L, U]\) [Catch a lion with a cage]
• Both \(L\) and \(U\) are random, i.e., \(L = L(X_1, \dots, X_n)\), \(U = U(X_1, \dots, X_n)\). They are called the confidence limits.
• A confidence interval \([L, U]\) is a \(100(1 - \alpha)\%\) CI if \(P_\theta(L \le \theta \le U) \ge 1 - \alpha\), where \(1 - \alpha\) is the confidence level
• After a sample is collected, we observe \(x_1, \dots, x_n\). The confidence limits \(l = L(x_1, \dots, x_n)\), \(u = U(x_1, \dots, x_n)\) can be computed as numbers.

95% CI for Normal Mean (Known Variance)
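• A random sample \(X_1, \dots, X_n\) from a \(N(\mu, \sigma^2)\) distribution
– \(\sigma^2\) is known; \(\mu\) is unknown
– Goal: to construct a 95% confidence interval for \(\mu\)
• Note that \(\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)\), and \(P(-1.96 \le N(0,1) \le 1.96) = 0.95\)
• So, \(P_\mu\!\left(-1.96 \le \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95\)
• Equivalently, \(P_\mu\!\left(\bar{X} - 1.96\,\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right) = 0.95\), with \(L = \bar{X} - 1.96\,\sigma/\sqrt{n}\) and \(U = \bar{X} + 1.96\,\sigma/\sqrt{n}\)
• Given data, we obtain two numbers \(l = \bar{x} - 1.96\,\sigma/\sqrt{n}\), \(u = \bar{x} + 1.96\,\sigma/\sqrt{n}\)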
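The step from the probability statement about the standardized mean to the interval for μ is algebra on the two inequalities. A sketch of the intermediate steps, using only the symbols already defined on this slide:

```latex
\begin{align*}
-1.96 \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le 1.96
 &\iff -1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X}-\mu \le 1.96\,\frac{\sigma}{\sqrt{n}}
   && \text{(multiply by } \sigma/\sqrt{n} > 0\text{)} \\
 &\iff -\bar{X}-1.96\,\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X}+1.96\,\frac{\sigma}{\sqrt{n}}
   && \text{(subtract } \bar{X}\text{)} \\
 &\iff \bar{X}-1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}+1.96\,\frac{\sigma}{\sqrt{n}}
   && \text{(multiply by } -1\text{, which reverses the inequalities)}
\end{align*}
```

Because the three events are identical, they have the same probability, 0.95.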

Example: Profitability of e-Grocery
• Companies that sell groceries over the Internet are called e-grocers. Customers enter their orders, pay by credit card, and receive delivery by truck.
• In order for an e-grocer to make a profit, the average order size (in USD) has to be reasonably large.
• To determine potential profitability, a local grocer in Chicago wants to
understand whether the average order size of potential customers is large enough.

Example: e-Grocery
• A random sample of 85 orders has been collected

Customer    Order (USD)
1           83.30
2           91.13
3           78.65
…           …
84          84.07
85          97.74

• Normal assumption seems reasonable
[Figure: Normal Q-Q plot of the sample; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]
• Suppose we know that \(\sigma\) = $17.50
• From the sample, we calculate \(\bar{x}\) = $89.27
• So,
\(l = \bar{x} - 1.96\,\sigma/\sqrt{n}\) = $85.55
\(u = \bar{x} + 1.96\,\sigma/\sqrt{n}\) = $92.99
• The 95% confidence interval for the mean order size is then [$85.55, $92.99].
• What does it MEAN?
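The arithmetic behind the two limits can be reproduced in a few lines. This is a minimal sketch using the numbers given in the example (x̄ = 89.27, σ = 17.50, n = 85); the function name `z_interval_95` is a hypothetical helper, not something from the course materials.

```python
import math

def z_interval_95(xbar, sigma, n):
    """95% CI for a normal mean with known sigma: xbar +/- 1.96 * sigma / sqrt(n)."""
    half_width = 1.96 * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lower, upper = z_interval_95(xbar=89.27, sigma=17.50, n=85)
print(f"[{lower:.2f}, {upper:.2f}]")  # prints [85.55, 92.99]
```

The half-width is 1.96 × 17.50 / √85 ≈ 3.72, which is where the limits $85.55 and $92.99 come from.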

Interpretation of The 95% Confidence Interval
Which of the following interpretations is correct?
1. With 95% chance, the mean order size μ lies inside the interval [$85.55, $92.99]
2. Given the observed data, with 95% chance, the mean order size μ lies inside the interval [$85.55, $92.99]
3. 95% of the values of μ lie inside the interval [$85.55, $92.99]
4. The interval [$85.55, $92.99] captures the true value μ 95% of the time

Interpretation of The 95% Confidence Interval (Cont’d)
• The correct interpretation:
“In an infinitely long series of trials in which repeated samples of size n are drawn from the same distribution and 95% CIs for μ are calculated using the same method, the proportion of intervals that actually include μ will be 95%.”
• The 95% confidence level is about the method, not about any interval obtained by applying the method to the observed data!
• For any particular CI obtained from the observed data, we do not know whether or not it contains μ!
• An illustration applet: http://www.rossmanchance.com/applets/NewConfsim/Confsim.html
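The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a true mean of 90 (an arbitrary choice, since the real μ is unknown), reuses σ = 17.5 and n = 85 from the e-grocery example, and counts how often the 95% interval covers the true mean. The function name `coverage` and the seed are hypothetical.

```python
import math
import random

def coverage(mu, sigma, n, reps, seed):
    """Fraction of simulated 95% z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw a fresh sample and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Each replication yields a different interval; mu itself never changes.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

rate = coverage(mu=90.0, sigma=17.5, n=85, reps=5000, seed=431)
print(f"empirical coverage: {rate:.3f}")
```

The printed proportion should be close to 0.95. Any single interval in the loop either covers μ or it does not; the 95% describes the long-run behavior of the method.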

Confidence Interval & Pivotal Random Variable i.i.d.
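• \(X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)\), with \(\sigma\) known. To find a \(100(1 - \alpha)\%\) CI for \(\mu\)
• From the sampling distribution of \(\bar{X}\), we obtain \(Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)\)
– Z is a function of the unknown parameter μ, but involves no other unknown parameters
– The distribution of Z is free of unknown parameters
– Hence Z is a pivotal random variable for μ
• This leads to \(P_\mu\!\left(-z_{\alpha/2} \le Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha\)
• So, \(P_\mu\!\left(\bar{X} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha\), with \(L = \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}\) and \(U = \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}\)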

Critical values corresponding to an area \(1 - \alpha\) under the standard normal curve
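For a general confidence level, the critical value is the upper α/2 quantile of the standard normal distribution. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` for the quantile; the helper name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% CI for a normal mean with known sigma."""
    # z_{alpha/2}: the point with area alpha/2 to its right under N(0, 1).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Critical values for a few common confidence levels.
for alpha in (0.10, 0.05, 0.01):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    print(f"alpha={alpha:.2f}  z_(alpha/2)={z:.3f}")

# e-grocery numbers from the earlier example, at 95% and 99%.
ci_95 = z_interval(89.27, 17.50, 85, alpha=0.05)
ci_99 = z_interval(89.27, 17.50, 85, alpha=0.01)
print(ci_95, ci_99)
```

A higher confidence level gives a larger critical value and therefore a wider interval for the same data.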

The Role of Sampling Distribution
• We have considered the problem of estimating and constructing confidence intervals for the mean parameter \(\mu\) based on a random sample from a \(N(\mu, \sigma^2)\) distribution with known \(\sigma^2\)
• The sampling distribution of \(\bar{X}\) is \(\bar{X} \sim N(\mu, \sigma^2/n)\)
• For estimation:
– Mean of the sampling distribution → bias of the estimator
– Variance of the sampling distribution → variance of the estimator
– Together → MSE of the estimator
• For confidence interval:
– Manipulation of the sampling distribution → the pivotal random variable
• The sampling distribution of \(\bar{X}\) plays the KEY role in the inference of \(\mu\)!
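The claim that the sample mean of normal data has mean μ and variance σ²/n can be checked by simulation. The sketch below uses assumed values μ = 10, σ = 3, n = 25 (so σ²/n = 0.36); the function name `simulate_sample_means` and the seed are hypothetical.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed):
    """Return `reps` sample means, each from n i.i.d. N(mu, sigma^2) draws."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(reps)]

means = simulate_sample_means(mu=10.0, sigma=3.0, n=25, reps=10000, seed=7)

mean_of_means = statistics.fmean(means)    # should be near mu = 10
var_of_means = statistics.pvariance(means) # should be near sigma^2 / n = 0.36
print(f"mean of sample means: {mean_of_means:.3f}")
print(f"variance of sample means: {var_of_means:.3f}")
```

The first number reflects the unbiasedness of the sample mean, and the second is both its variance and, because the bias is zero, its MSE.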

Class Summary
• Key points of this class (Sec. 6.1-6.2)
– Point estimation
  • Bias / variance / MSE
– Confidence intervals
  • Confidence level / interpretation
  • Two-sided CI for a normal mean
  • Pivotal random variable
  • One-sided CI (read after class)
• Reading: Sections 6.1-6.2 of the textbook
• Next class: Basic Concepts of Inference (II) – Hypothesis Testing (Sec. 6.3)