
Interval estimation: Part 1
(Module 3)
Statistics (MAST20005) & Elements of Statistics (MAST90058)
School of Mathematics and Statistics University of Melbourne


Semester 2, 2022

Aims of this module
• Introduce the idea of quantifying uncertainty and describe some methods for doing so
• Explain interval estimation, particularly confidence intervals, which are the most common type of interval estimate
• Describe some important probability distributions that appear in many statistical procedures
• Work through some common, simple inference scenarios

The need to quantify uncertainty
Standard error
Confidence intervals
◦ Introduction
◦ Definition
◦ Important distributions
◦ Pivots
Common scenarios

Statistics: the big picture

How useful are point estimates?
Example: surveying Melbourne residents as part of a disability study. The results will be used to set a budget for disability support.
Estimate from survey: 5% of residents are disabled
What can we conclude?
Estimate from a second survey: 2% of residents are disabled
What can we now conclude?
What other information would be useful to know?

Going beyond point estimates
• Point estimates are usually only a starting point
• Insufficient to conclusively answer real questions of interest
• Perpetual lurking questions:
◦ How confident are you in the estimate?
◦ How accurate is it?
• We need ways to quantify and communicate the uncertainty in our estimates.

The need to quantify uncertainty
Standard error
Confidence intervals
◦ Introduction
◦ Definition
◦ Important distributions
◦ Pivots
Common scenarios

Report sd(Θ̂)?
Previously, we calculated the variance of our estimators.
Reminder: sd(Θ̂) = √var(Θ̂)
This tells us a typical amount by which the estimate will vary from one sample to another, and thus (for an unbiased estimator) how close to the true parameter value it is likely to be.
Can we just report that? (Alongside our estimate, θ̂)
Problem: this is usually an expression that depends on the parameter
values, which we don’t know and are trying to estimate.

Estimate sd(Θ̂)!
We know how to deal with parameter values. . . we estimate them! Let's estimate the standard deviation of our estimator.
A common approach: substitute point estimates into the expression for the variance.
Consider the sample proportion, p̂ = X/n.
We know that var(p̂) = p(1 − p)/n.
Therefore, an estimate is v̂ar(p̂) = p̂(1 − p̂)/n.

If we take a sample of size n = 100 and observe x = 30, we get p̂ = 30/100 = 0.3 and
√(p̂(1 − p̂)/n) = √((0.3 × 0.7)/100) = 0.046
We refer to this estimate as the standard error and write: se(p̂) = 0.046
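As an illustration (my addition, not from the original slides), a minimal R sketch of this calculation; the variable names are arbitrary:

n <- 100
x <- 30
p_hat <- x / n                         # point estimate: 0.3
se_p <- sqrt(p_hat * (1 - p_hat) / n)  # standard error: approx 0.046
c(estimate = p_hat, se = se_p)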

Standard error
The standard error of an estimate is the estimated standard deviation of the estimator.
• Parameter: θ
• Estimator: Θ̂
• Estimate: θ̂
• Standard deviation of the estimator: sd(Θ̂)
• Standard error of the estimate: se(θ̂)
Note: some people also refer to the standard deviation of the estimator as the standard error. This is potentially confusing, so it is best to avoid doing so.

Reporting the standard error
There are many ways that people do this.
Suppose that p̂ = 0.3 and se(p̂) = 0.046.
Here are some examples:
• 0.3 (0.046)
• 0.3 ± 0.046
• 0.3 ± 0.092 [= 2 × se(p̂)]
This now gives us some useful information about the (estimated) accuracy of our estimate.

Back to the disability example
More info:
• First survey: 5% ± 4%
• Second survey: 2% ± 0.1%
What would we now conclude?
What result should we use for setting the disability support budget?

The need to quantify uncertainty
Standard error
Confidence intervals
◦ Introduction
◦ Definition
◦ Important distributions
◦ Pivots
Common scenarios

Interval estimates
Let’s go one step further. . .
The form est ± error can be expressed as an interval, (est − error, est + error).
This is an example of an interval estimate.
More general and more useful than just reporting a standard error. For example, it can cope with skewed (asymmetric) sampling distributions.
How can we calculate interval estimates?

Random sample (iid): X1, . . . , Xn ∼ N(μ, 1)
The sampling distribution of the sample mean is X̄ ∼ N(μ, 1/n).
Since we know that Φ⁻¹(0.025) = −1.96, we can write:
Pr(−1.96 < (X̄ − μ)/(σ/√n) < 1.96) = 1 − 2 × 0.025 = 0.95
or, equivalently (with σ = 1 here),
Pr(μ − 1.96/√n < X̄ < μ + 1.96/√n) = 0.95
Rearranging to put μ in the middle gives
Pr(X̄ − 1.96/√n < μ < X̄ + 1.96/√n) = 0.95
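As an illustration (my addition, not from the original slides), a minimal R sketch of this interval, using simulated data with known σ = 1 and an arbitrary true mean:

set.seed(1)                       # arbitrary seed, for reproducibility
n <- 25
x <- rnorm(n, mean = 10, sd = 1)  # assumed example values
xbar <- mean(x)
c(lower = xbar - 1.96 / sqrt(n),  # 95% CI: xbar +/- 1.96/sqrt(n)
  upper = xbar + 1.96 / sqrt(n))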
Chi-squared distribution
• Single parameter: k > 0, known as the degrees of freedom
• Notation: T ∼ χ²_k or T ∼ χ²(k)
• The pdf is:
f(t) = t^(k/2 − 1) e^(−t/2) / (2^(k/2) Γ(k/2)), t ≥ 0
• Mean and variance:
E(T) = k
var(T) = 2k
• The distribution is bounded below by zero and is right-skewed

• Arises as the sum of squares of iid standard normal rvs:
Z_i ∼ N(0, 1) ⇒ T = Z₁² + ··· + Z_k² ∼ χ²(k)
• When sampling from a normal distribution, the scaled sample variance follows a χ²-distribution (a simulation check appears below):
(n − 1)S²/σ² ∼ χ²(n − 1)
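A quick simulation check of this result (my addition, not from the slides), with arbitrary choices of n and σ:

set.seed(1)
n <- 10; sigma <- 2
u <- replicate(10000, (n - 1) * var(rnorm(n, 0, sigma)) / sigma^2)
c(mean = mean(u), var = var(u))  # should be close to k = 9 and 2k = 18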

Student’s t-distribution
• Also known as simply the t-distribution
• Single parameter: k > 0, the degrees of freedom (same as for χ²)
• Notation: T ∼ t_k or T ∼ t(k)
• The pdf is:
f(t) = [Γ((k + 1)/2) / (√(kπ) Γ(k/2))] (1 + t²/k)^(−(k+1)/2), −∞ < t < ∞
• Mean and variance:
E(T) = 0, if k > 1
var(T) = k/(k − 2), if k > 2

• The t-distribution is similar to a standard normal but with 'wider' tails
• As k → ∞, t(k) → N(0, 1)
• If Z ∼ N(0, 1) and U ∼ χ²(r), and they are independent, then
T = Z / √(U/r) ∼ t(r)
• This arises when considering the sampling distributions of statistics from a normal distribution, in particular (see the simulation sketch below):
T = [(X̄ − μ)/(σ/√n)] / √{[(n − 1)S²/σ²] / (n − 1)} = (X̄ − μ)/(S/√n) ∼ t(n − 1)
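A simulation sketch of this pivot (my addition, not from the slides); the sample size and distribution parameters are arbitrary:

set.seed(1)
n <- 8; mu <- 5
t_sim <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = 3)
  (mean(x) - mu) / (sd(x) / sqrt(n))  # the pivot T
})
quantile(t_sim, 0.975)  # should be close to the t quantile below
qt(0.975, df = n - 1)   # about 2.365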

F -distribution
• Also known as the Fisher-Snedecor distribution
• Parameters: m, n > 0, the degrees of freedom (same as before)
• Notation: W ∼ F_{m,n} or W ∼ F(m, n)
• If U ∼ χ²(m) and V ∼ χ²(n) are independent then
F = (U/m) / (V/n) ∼ F(m, n)
• This arises when comparing sample variances (see later); a quick simulation check follows
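A simulation sketch of this construction (my addition, not from the slides), with arbitrary degrees of freedom:

set.seed(1)
m <- 5; n <- 12
f_sim <- (rchisq(10000, df = m) / m) / (rchisq(10000, df = n) / n)
quantile(f_sim, 0.95)  # should be close to the F quantile below
qf(0.95, m, n)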

Recall our general technique that starts with a probability interval using a statistic with a known sampling distribution:
Pr(a(θ) < T < b(θ)) = 0.95

The easiest way to make this technique work is by finding a function of the data and the parameters, Q(X1, . . . , Xn; θ), whose distribution does not depend on the parameters. In other words, it is a random variable that has the same distribution regardless of the value of θ.
The quantity Q(X1, . . . , Xn; θ) is called a pivot or a pivotal quantity.

Remarks about pivots
• The value of the pivot can depend on the parameters, but its distribution cannot.
• Since pivots are a function of the parameters as well as the data, they are usually not statistics.
• If a pivot is also a statistic, then it is called an ancillary statistic.

Examples of pivots
• We have already seen the following result for sampling from a normal distribution with known variance:
Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1).
Therefore, Z is a pivot in this case.
• If we know the distribution of the pivot, we can use it to write a probability interval, and start deriving a confidence interval.
• For example, in the normal case with known variance,
Pr(a < (X̄ − μ)/(σ/√n) < b) = 1 − α

> butterfat
[1] 481 537 513 583 453 510 570 500 457 555 618 327
[13] 350 643 499 421 505 637 599 392
> t.test(butterfat, conf.level = 0.9)
One Sample t-test
data: butterfat
t = 25.2879, df = 19, p-value = 4.311e-16
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
472.7982 542.2018
sample estimates:
mean of x
507.5
> sd(butterfat)
[1] 89.75082
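To connect the output with the formula, here is a hand computation of the same 90% CI (my addition, not from the slides); it assumes the butterfat vector defined above:

n <- length(butterfat)
xbar <- mean(butterfat)            # 507.5
se <- sd(butterfat) / sqrt(n)      # standard error of the mean
cc <- qt(0.95, df = n - 1)         # quantile for a 90% CI
c(xbar - cc * se, xbar + cc * se)  # matches (472.7982, 542.2018)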

> qqnorm(butterfat, main = "")
> qqline(butterfat, probs = c(0.25, 0.75))
This gives us the following QQ plot. . .

[Figure: normal QQ plot of the butterfat data, plotting Sample Quantiles (roughly 350 to 650) against Theoretical Quantiles (−2 to 2)]

• CIs based on a t-distribution (or a normal distribution) are of the form:
estimate ± c × standard error
for an appropriate quantile, c, which depends on the sample size (n) and the confidence level (1 − α). (Example quantiles are computed after this list.)
• The t-distribution is appropriate if the sample is from a normally distributed population.
• Can check using a QQ plot (in this example, looks adequate).
• If not normal but n is large, can construct approximate CIs using the normal distribution (as we did in a previous example). This is usually okay if the distribution is continuous, symmetric and unimodal (i.e. has a single 'mode', or maximum value).
• If not normal and n is small, distribution-free methods can be used. We will cover these later in the semester.
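For example (my addition, not from the slides), the quantiles c for a 95% CI with n = 20:

n <- 20
qt(0.975, df = n - 1)  # t quantile: about 2.093
qnorm(0.975)           # normal quantile: 1.96 (large-sample approximation)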

Normal, two means, known σ
Suppose we have two populations, with means μX and μY , and want
to know how much they differ.
Random samples (iid) from each population:
X1, . . . , Xn ∼ N(μ_X, σ_X²) and Y1, . . . , Ym ∼ N(μ_Y, σ_Y²)
The two samples must be independent of each other.
Assume σX2 and σY2 are known.
Then we have the following pivot (why?):
(X̄ − Ȳ − (μ_X − μ_Y)) / √(σ_X²/n + σ_Y²/m) ∼ N(0, 1)

Defining c as in previous examples, we then write,

Pr(−c < (X̄ − Ȳ − (μ_X − μ_Y)) / √(σ_X²/n + σ_Y²/m) < c) = 1 − α

Rearranging as usual gives the 100 · (1 − α)% confidence interval for μ_X − μ_Y as

x̄ − ȳ ± c √(σ_X²/n + σ_Y²/m)

. . . but it is rare to know the population variances!

Normal, two means, unknown σ, many samples
What if we don't know σ_X² and σ_Y²?
If n and m are large, we can just replace σ_X and σ_Y by estimates, e.g. the sample standard deviations S_X and S_Y. Rationale: these will be good estimates when the sample size is large.
The (approximate) pivot is then:

(X̄ − Ȳ − (μ_X − μ_Y)) / √(S_X²/n + S_Y²/m) ≈ N(0, 1)

This gives the following (approximate) CI for μ_X − μ_Y:

x̄ − ȳ ± c √(s_X²/n + s_Y²/m)

Normal, two means, unknown σ, common variance
But what if the sample sizes are small?
If we assume a common variance, σ_X² = σ_Y² = σ², we can find a pivot, as follows.

Z = (X̄ − Ȳ − (μ_X − μ_Y)) / (σ √(1/n + 1/m)) ∼ N(0, 1)

Also, since the samples are independent,

U = (n − 1)S_X²/σ² + (m − 1)S_Y²/σ² ∼ χ²(n + m − 2)

because U is the sum of independent χ² random variables.
Moreover, U and Z are independent. So we can write,

T = Z / √(U/(n + m − 2)) ∼ t(n + m − 2)

Substituting and rearranging gives,

T = (X̄ − Ȳ − (μ_X − μ_Y)) / (S_P √(1/n + 1/m))

where

S_P² = [(n − 1)S_X² + (m − 1)S_Y²] / (n + m − 2)

is the pooled estimate of the common variance. Note that the unknown σ has disappeared (cancelled out), therefore making T a pivot (why?). We can now find the quantile c so that Pr(−c < T < c) = 1 − α.