• Introduce the idea of quantifying uncertainty and describe some methods for doing so
• Explain interval estimation, particularly confidence intervals, which are the most common type of interval
• Describe some important probability distributions that appear in many statistical procedures
• Work through some common, simple inference scenarios
The need to quantify uncertainty
Statistics: the big picture
Interval estimation: Part 1
(Module 3)
Statistics (MAST20005) & Elements of Statistics (MAST90058) Semester 2, 2022
1 The need to quantify uncertainty
2 Standard error
3 Confidence intervals
3.1 Introduction
3.2 Definition
3.3 Important distributions
3.4 Pivots
3.5 Common scenarios
Aims of this module
We have learnt how to do basic inference, using point estimates. What’s next?
How useful are point estimates?
Example: surveying Melbourne residents as part of a disability study. The results will be used to set a budget for disability support.
Estimate from survey: 5% of residents are disabled
What can we conclude?
Estimate from a second survey: 2% of residents are disabled
What can we now conclude?
What other information would be useful to know?
Going beyond point estimates
• Point estimates are usually only a starting point
• Insufficient to conclusively answer real questions of interest
• Perpetual lurking questions:
– How confident are you in the estimate?
– How accurate is it?
• We need ways to quantify and communicate the uncertainty in our estimates.
Standard error
Report sd(Θˆ)?
Previously, we calculated the variance of our estimators.
Reminder: sd(Θ̂) = √var(Θ̂)
This tells us a typical amount by which the estimate will vary from one sample to another, and thus (for an unbiased
estimator) how close to the true parameter value it is likely to be. Can we just report that? (Alongside our estimate, θˆ)
Problem: this is usually an expression that depends on the parameter values, which we don’t know and are trying to estimate.
Estimate sd(Θˆ)!
We know how to deal with parameter values. . . we estimate them!
Let’s estimate the standard deviation of our estimator.
A common approach: substitute point estimates into the expression for the variance.
Consider the sample proportion, p̂ = X/n. We know that var(p̂) = p(1 − p)/n. Therefore, an estimate is v̂ar(p̂) = p̂(1 − p̂)/n.
If we take a sample of size n = 100 and observe x = 30, we get p̂ = 30/100 = 0.3, and
√(p̂(1 − p̂)/n) = √(0.3 × 0.7 / 100) ≈ 0.046
We refer to this estimate as the standard error and write:
se(p̂) ≈ 0.046
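The notes' worked examples use R, but the arithmetic is easy to check in any language. Here is a minimal Python sketch of the standard-error calculation above (the helper name `se_proportion` is ours, not from the notes):

```python
import math

def se_proportion(x, n):
    """Standard error of a sample proportion p_hat = x/n:
    sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

# n = 100 trials, x = 30 successes, as in the example above
se = se_proportion(30, 100)
print(round(se, 3))  # 0.046
```

Substituting the point estimate p̂ for p inside the variance formula is exactly the "plug-in" approach described above.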
Standard error
The standard error of an estimate is the estimated standard deviation of the estimator. Notation:
• Parameter: θ
• Estimator: Θˆ
• Estimate: θ̂
• Standard deviation of the estimator: sd(Θ̂)
• Standard error of the estimate: se(θ̂)
Note: some people also refer to the standard deviation of the estimator as the standard error. This is potentially confusing, so it is best avoided.
Reporting the standard error
There are many ways that people do this. Suppose that pˆ = 0.3 and se(pˆ) = 0.046. Here are some examples:
• 0.3 (0.046)
• 0.3±0.046
• 0.3 ± 0.092 [= 2 × se(pˆ)]
This now gives us some useful information about the (estimated) accuracy of our estimate.
Back to the disability example
More info:
• First survey: 5% ± 4%
• Second survey: 2% ± 0.1%
What would we now conclude?
What result should we use for setting the disability support budget?
3 Confidence intervals
3.1 Introduction
Interval estimates
Let’s go one step further. . .
The form est ± error can be expressed as an interval, (est − error, est + error).
This is an example of an interval estimate.
More general and more useful than just reporting a standard error. For example, it can cope with skewed (asymmetric) sampling distributions.
How can we calculate interval estimates?
Random sample (iid): X1, . . . , Xn ∼ N(μ, 1)
The sampling distribution of the sample mean is X̄ ∼ N(μ, 1/n). Since we know that Φ⁻¹(0.025) = −1.96, we can write
Pr(−1.96 < (X̄ − μ)/(1/√n) < 1.96) = 1 − 2 × 0.025 = 0.95
or, equivalently,
Pr(μ − 1.96/√n < X̄ < μ + 1.96/√n) = 0.95
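Rearranging the probability statement above gives an interval centred on the sample mean. A Python sketch of that calculation (the helper `z_interval` and the illustrative values are ours; it hard-codes the 1.96 quantile used in the text):

```python
import math

def z_interval(xbar, n, sigma=1.0):
    """95% CI for mu when the population sd is known:
    xbar +/- 1.96 * sigma / sqrt(n)."""
    half_width = 1.96 * sigma / math.sqrt(n)
    return (xbar - half_width, xbar + half_width)

# Example: observed xbar = 10 from n = 25 observations, sigma = 1
lo, hi = z_interval(10.0, 25)
print(lo, hi)  # approximately (9.608, 10.392)
```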
χ²-distribution
• The pdf is:
• Mean and variance:
f(t) = t^{k/2 − 1} e^{−t/2} / (2^{k/2} Γ(k/2)),  t ≥ 0
E(T) = k,  var(T) = 2k
• The distribution is bounded below by zero and is right-skewed
• Arises as the sum of squares of iid standard normal rvs:
Z_i ∼ N(0, 1) ⇒ T = Z₁² + · · · + Z_k² ∼ χ²_k
• When sampling from a normal distribution, the scaled sample variance follows a χ²-distribution:
(n − 1)S²/σ² ∼ χ²_{n−1}
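This distributional fact can be checked by simulation. A small Monte Carlo sketch in Python (the sample size, σ and replication count are arbitrary choices of ours):

```python
import random
import statistics

random.seed(1)
n, sigma = 10, 2.0
draws = []
for _ in range(20000):
    sample = [random.gauss(0, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)        # sample variance S^2
    draws.append((n - 1) * s2 / sigma**2)   # should be chi^2 with n-1 df

# chi^2_{n-1} has mean n-1 = 9 and variance 2(n-1) = 18;
# the simulated moments should be close to these values
print(statistics.mean(draws), statistics.variance(draws))
```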
Student’s t-distribution
• Also known as simply the t-distribution
• Single parameter: k > 0, the degrees of freedom (same as for χ²)
• Notation: T ∼ t_k or T ∼ t(k)
• The pdf is:
f(t) = Γ((k + 1)/2) / (√(kπ) Γ(k/2)) × (1 + t²/k)^{−(k+1)/2},  −∞ < t < ∞
• Mean and variance:
E(T) = 0 (for k > 1),  var(T) = k/(k − 2) (for k > 2)
• The t-distribution is similar to a standard normal but with heavier ('wider') tails
• As k → ∞, t_k → N(0, 1)
• If Z ∼ N(0, 1) and U ∼ χ²(r), and they are independent, then
T = Z / √(U/r) ∼ t_r
• This arises when considering the sampling distributions of statistics from a normal distribution, in particular:
T = [(X̄ − μ)/(σ/√n)] / √{[(n − 1)S²/σ²] / (n − 1)} = (X̄ − μ)/(S/√n) ∼ t_{n−1}
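The practical consequence of the heavier tails is that, for small n, the statistic (X̄ − μ)/(S/√n) exceeds ±1.96 more often than a standard normal would. A Monte Carlo sketch in Python (sample size and replication count are our choices):

```python
import math
import random
import statistics

random.seed(2)
n = 5
reps = 40000
exceed = 0
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    # studentised mean: (xbar - mu) / (s / sqrt(n)) with mu = 0
    t = statistics.mean(x) / (statistics.stdev(x) / math.sqrt(n))
    if abs(t) > 1.96:
        exceed += 1

# For n = 5 this statistic is t with 4 df: Pr(|T| > 1.96) is roughly 0.12,
# well above the N(0, 1) value of 0.05 -- the t tails are heavier
print(exceed / reps)
```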
F -distribution
• Also known as the Fisher-Snedecor distribution
• Parameters: m, n > 0, the degrees of freedom (same as before)
• Notation: W ∼ F_{m,n} or W ∼ F(m, n)
• If U ∼ χ²_m and V ∼ χ²_n are independent then
F = (U/m) / (V/n) ∼ F_{m,n}
• This arises when comparing sample variances (see later)
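This construction can also be sanity-checked by simulation; a Python sketch (the degrees of freedom and replication count are our choices):

```python
import random
import statistics

random.seed(3)
m, n = 5, 12
vals = []
for _ in range(30000):
    u = sum(random.gauss(0, 1)**2 for _ in range(m))  # chi^2 with m df
    v = sum(random.gauss(0, 1)**2 for _ in range(n))  # chi^2 with n df
    vals.append((u / m) / (v / n))                    # F with (m, n) df

# E(F_{m,n}) = n/(n-2) = 1.2 for n = 12; the simulated mean should be close
print(statistics.mean(vals))
```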
3.4 Pivots
Recall our general technique that starts with a probability interval using a statistic with a known sampling distribution: Pr (a(θ) < T < b(θ)) = 0.95
The easiest way to make this technique work is by finding a function of the data and the parameters, Q(X1, . . . , Xn; θ), whose distribution does not depend on the parameters. In other words, it is a random variable that has the same distribution regardless of the value of θ.
The quantity Q(X1, . . . , Xn; θ) is called a pivot or a pivotal quantity.
Remarks about pivots
• The value of the pivot can depend on the parameters, but its distribution cannot.
• Since pivots are a function of the parameters as well as the data, they are usually not statistics.
• If a pivot is also a statistic, then it is called an ancillary statistic.
Examples of pivots
• We have already seen the following result for sampling from a normal distribution with known variance:
Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1).
Therefore, Z is a pivot in this case.
• If we know the distribution of the pivot, we can use it to write a probability interval, and start deriving a
confidence interval.
• For example, in the normal case with known variance,
Pr(a < (X̄ − μ)/(σ/√n) < b) = 1 − α

3.5 Common scenarios

> butterfat
[1] 481 537 513 583 453 510 570 500 457 555 618 327
[13] 350 643 499 421 505 637 599 392
> t.test(butterfat, conf.level = 0.9)
One Sample t-test
data: butterfat
t = 25.2879, df = 19, p-value = 4.311e-16
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
472.7982 542.2018
sample estimates:
mean of x
    507.5
> sd(butterfat)
[1] 89.75082
> qqnorm(butterfat, main = “”)
> qqline(butterfat, probs = c(0.25, 0.75))
This gives us the following QQ plot. . .
[QQ plot: butterfat sample quantiles against theoretical normal quantiles]
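The interval that R's t.test reports above can be reproduced by hand. A Python sketch (the critical value 1.7291 is the upper 5% point of the t-distribution with 19 df, taken from standard tables):

```python
import math
import statistics

butterfat = [481, 537, 513, 583, 453, 510, 570, 500, 457, 555,
             618, 327, 350, 643, 499, 421, 505, 637, 599, 392]

n = len(butterfat)
xbar = statistics.mean(butterfat)   # 507.5
s = statistics.stdev(butterfat)     # about 89.75, matching sd(butterfat) in R
t_crit = 1.7291                     # t quantile: 0.95, 19 df (90% two-sided CI)
half_width = t_crit * s / math.sqrt(n)

# Matches R's 90% interval (472.7982, 542.2018) to rounding
print(xbar - half_width, xbar + half_width)
```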
• CIs based on a t-distribution (or a normal distribution) are of the form: estimate ± c × standard error
for an appropriate quantile, c, which depends on the sample size (n) and the confidence level (1 − α).
• The t-distribution is appropriate if the sample is from a normally distributed population.
• Can check using a QQ plot (in this example, looks adequate).
• If not normal but n is large, can construct approximate CIs using the normal distribution (as we did in a previous example). This is usually okay if the distribution is continuous, symmetric and unimodal (i.e. has a single ‘mode’, or maximum value).
• If not normal and n small, distribution-free methods can be used. We will cover these later in the semester.

Normal, two means, known σ
Suppose we have two populations, with means μX and μY, and we want to know how much they differ.
Random samples (iid) from each population: X1, . . . , Xn ∼ N(μX, σX²) and Y1, . . . , Ym ∼ N(μY, σY²).
The two samples must be independent of each other.
Assume σX² and σY² are known. Then we have the following pivot (why?):
(X̄ − Ȳ − (μX − μY)) / √(σX²/n + σY²/m) ∼ N(0, 1)
Defining c as in previous examples, we then write
Pr(−c < (X̄ − Ȳ − (μX − μY)) / √(σX²/n + σY²/m) < c) = 1 − α
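Rearranging the inequality gives the interval (x̄ − ȳ) ± c √(σX²/n + σY²/m). A Python sketch (the helper and the illustrative numbers are ours; z = 1.96 gives a 95% interval as in the earlier examples):

```python
import math

def two_mean_z_interval(xbar, ybar, var_x, var_y, n, m, z=1.96):
    """CI for mu_X - mu_Y when both population variances are known:
    (xbar - ybar) +/- z * sqrt(var_x/n + var_y/m)."""
    half_width = z * math.sqrt(var_x / n + var_y / m)
    diff = xbar - ybar
    return (diff - half_width, diff + half_width)

# Illustrative values: xbar = 12, ybar = 10, var_x = 4, var_y = 9, n = 25, m = 36
lo, hi = two_mean_z_interval(12.0, 10.0, 4.0, 9.0, 25, 36)
print(lo, hi)  # approximately (0.745, 3.255)
```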
> t.test(X, Y,
+        conf.level = 0.95)
t = 18.8003
df = 33.086
95% CI: 11.23214 13.95786
Pooled variance:
> t.test(X, Y,
+ conf.level = 0.95,
+ var.equal = TRUE)
t = 18.8003
95% CI: 11.23879 13.95121
• From box plots: the two groups look like they have very different population means and possibly different variances