
Economics 430
Statistical Inference

Today’s Class
• Introduction to Inference
• Inference using Probability
• Statistical Models
• Data Collection
• Basic Inferences


Introduction to Inference
Statistical Inference
Def: Statistical inference is the process of using data analysis to deduce properties of an
underlying probability distribution.
– Inference requires assumptions about the data, where different sets of assumptions represent the different models we would consider.
Types of Models:
a) Fully Parametric: A PDF is assumed for the DGP
b) Non-Parametric: Very limited assumptions made about the DGP
c) Semi-Parametric: These models fall between a) and b)


Inference using Probability (Fully Parametric)
• Example (1)*: Should I purchase an extended warranty for my new washing machine? Assume you want to keep it for 5–6 years.
– The washing machine costs $650 and the extended warranty costs $125
– The manufacturer’s warranty is good for 3 years
– The extended warranty covers an added 5 years
• Q: How can we make an informed decision given the amount of uncertainty we face?
*Note: See problem 5.2.1 from Evans & Rosenthal

Inference using Probability (Fully Parametric)
• Solution: Assume the lifelength X (in years) of the washing machine follows an exp(𝜆 = 1) distribution.
• According to our model, the expected lifelength for a new machine would be 𝐸(𝑋) = 1 yr.
• The smallest interval containing 95% of the probability for X is (0, c), where c satisfies
$0.95 = \int_0^c e^{-x}\,dx = 1 - e^{-c} \;\Rightarrow\; c \approx 3$
so the interval is equal to [0, 3].
• How likely is it that it will still work after 5 yrs?
$P(X \ge 5) = \int_5^{\infty} e^{-x}\,dx = e^{-5} \approx 0.0067$
[Figure: density f(x) of the exp(1) distribution, plotted for x from 0 to 5]
Better get that warranty! 😬
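The two numbers on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the exp(𝜆 = 1) model stated above; the function names `exp_cdf` and `exp_survival` are illustrative and do not come from the course materials.

```python
import math

rate = 1.0  # lambda = 1, as assumed in the slide's model


def exp_cdf(x, rate):
    """P(X <= x) for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x)


def exp_survival(x, rate):
    """P(X >= x) for an exponential distribution with the given rate."""
    return math.exp(-rate * x)


# Solve 0.95 = 1 - exp(-rate * c) for c.
c = -math.log(1 - 0.95) / rate

# Probability the machine is still working after 5 years.
p_after_5 = exp_survival(5, rate)

print(f"c = {c:.4f}")
print(f"P(X >= 5) = {p_after_5:.4f}")
```

Under this model, c comes out just under 3 years and the survival probability at 5 years is below 1%, which is what motivates the slide's conclusion.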

Inference using Probability (Fully Parametric)
• Example (1): Suppose you didn’t get the extended warranty, and a year has passed, but so far it’s working fine. Would your previous results change?
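One way to explore this question is to compute the conditional probability P(X ≥ 5 | X ≥ 1) directly. The sketch below assumes the same exp(𝜆 = 1) model still applies after the first year; the variable names are illustrative.

```python
import math

rate = 1.0  # same exp(lambda = 1) model as before (an assumption)


def survival(x):
    """P(X >= x) under the exponential model."""
    return math.exp(-rate * x)


p_unconditional = survival(5)              # P(X >= 5), computed when the machine was new
p_conditional = survival(5) / survival(1)  # P(X >= 5 | X >= 1)
p_four_more = survival(4)                  # P(X >= 4) for a brand-new machine

print(f"P(X >= 5)          = {p_unconditional:.4f}")
print(f"P(X >= 5 | X >= 1) = {p_conditional:.4f}")
print(f"P(X >= 4)          = {p_four_more:.4f}")
```

The conditional probability equals P(X ≥ 4) for a new machine. This is the memoryless property of the exponential distribution: under this model, a machine that has survived one year behaves like a new one.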

Statistical Models
• Example: Suppose that you work for an insurance company, and historically they have found that the number of claims they receive can be described by a Poisson distribution, but the parameter value (i.e., 𝜃) is unknown.
– Recall that: $P(X = x \mid \theta) = \dfrac{\theta^{x} e^{-\theta}}{x!}, \quad x = 0, 1, 2, \ldots$
• They assign you the task of figuring out the probability that 6 claims will be made tomorrow given that today 4 were received.
• Q: How should you proceed?

Statistical Models
• Solution:
1. Try a ‘brute-force approach’ by considering all the values 𝜃 can take, e.g., 𝜃=1, 𝜃=2,…, and for each one, computing the respective PMF.
2. From the family of PMFs, identify the one(s) most ‘consistent’ with the data.
3. Based on your PMF(s) from Step 2, compute the desired probability 𝑃(𝑋 = 6 | 𝜃).

Statistical Models
• Step 1:
Try a ‘brute-force approach’ by considering all the values 𝜃 can take, e.g., 𝜃=1, 𝜃=2,…, and for each one, computing the respective PMF.
[Figure: six panels showing the PMFs of Pois(1) through Pois(6), each plotted for x from 0 to 8]
• Step 2:
Data: {X = 4}
→ P(X=4|𝜃=4) ≈ 0.2

Statistical Models
• Step 3:
Based on your PMF(s) from Step 2, compute the desired probability 𝑃(𝑋 = 6 | 𝜃).
From the figures, we can try finding P(X=6) using the parameter value 𝜃=4, i.e., P(X=6|𝜃=4):
→ P(X=6|𝜃=4) ≈ 0.10.
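The three steps can be sketched in Python as follows. This assumes the candidate values 𝜃 = 1, …, 6 shown in the figures and that tomorrow's count follows the same Poisson distribution as today's; the helper name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(x, theta):
    """P(X = x | theta) for a Poisson distribution with mean theta."""
    return theta**x * math.exp(-theta) / math.factorial(x)


observed_today = 4
candidate_thetas = [1, 2, 3, 4, 5, 6]  # the values plotted on the slide

# Steps 1 and 2: evaluate each candidate PMF at the observed data
# and keep the theta that makes the observation most probable.
likelihoods = {theta: poisson_pmf(observed_today, theta) for theta in candidate_thetas}
best_theta = max(likelihoods, key=likelihoods.get)

# Step 3: use the selected theta to compute the desired probability.
p_six_tomorrow = poisson_pmf(6, best_theta)

for theta, lik in likelihoods.items():
    print(f"theta = {theta}: P(X = 4 | theta) = {lik:.4f}")
print(f"best theta = {best_theta}, P(X = 6 | theta) = {p_six_tomorrow:.4f}")
```

Only integer candidates are searched here, mirroring the slide's brute-force approach. Treating 𝜃 as continuous would call for maximizing the same function over all 𝜃 > 0.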

Statistical Models
Definitions
• Parameter Space: 𝛺 = {𝜃1, 𝜃2, …, 𝜃n} is the set of all possible values of the parameter 𝜃.
• Data: Observations (= 𝑥) obtained from a random mechanism (= 𝑃 = ‘probability measure’) assumed to have generated them.
• Statistical Model: Represents our choice of probability measures {𝑃_𝜃 : 𝜃 ∈ Ω} from the many plausible ones, such that for some 𝜃 ∈ Ω, 𝑃_𝜃 is the true probability measure.
Goal: We observe the data but not P, yet we want to make inferences about P.

Statistical Models
• In practice, we often have more than just one observation, and the distribution may depend on more than one parameter.
• The previous formalism can be easily generalized to a vector of parameters 𝜃⃗ and a set of observations (𝑥1, …, 𝑥n).

Statistical Models
• Example (see 5.3.2): Suppose there are two manufacturing plants for machines. It is known that machines built by the first plant have lifelengths distributed Exponential(1), while machines manufactured by the second plant have lifelengths distributed Exponential(2).
• You have purchased five of these machines knowing that all five came from the same plant, but you do not know which plant.
[Figure: densities f(x) of exp(1) and exp(2), plotted for x from 0 to 7]
Statistical Model: {P1, P2}
Parameter Space: Ω = {1, 2}
If we observe (x1, …, x5) =
(5.0, 3.5, 3.3, 4.1, 2.8) → 𝜃 = 2
(2.0, 2.5, 3.0, 3.1, 1.8) → 𝜃 = 1

Data Collection
• If the sample is large enough and representative of the degree of variation of the population, you do not need to use the entire population.
• There are 4* common forms of ‘Probability Sampling’ from a population:
a) Simple Random Sampling
b) Stratified Sampling (R package = splitstackshape)
c) Cluster & Multistage Sampling (R library = survey)
d) Systematic Sampling
*Note: See De Veaux, Velleman & Bock, Intro Stats (Ch 12) for more details

Data Collection
(A) Simple Random Sampling (SRS)
• We draw samples because we can’t work with the entire population.
– We need to be sure that the statistics we compute from the sample reflect the corresponding parameters accurately.
– A sample that does this is said to be representative.
• Example: The IRS may audit a random sample of 500 tax returns from a small county with 20,000 people.
– This is financially and practically more feasible than inspecting all 20,000 returns.
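A simple random sample like the one in the IRS example can be sketched with Python's standard library. The return IDs and the seed below are hypothetical, chosen only so the sketch is reproducible.

```python
import random

rng = random.Random(430)  # arbitrary fixed seed for reproducibility

# Hypothetical population: return IDs 1..20000, one per tax return.
population = list(range(1, 20_001))

# Simple random sample without replacement: every subset of
# 500 returns is equally likely to be chosen.
sample = rng.sample(population, k=500)

print(sample[:10])
```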

Data Collection
(B) Stratified Sampling
• Simple random sampling is not the only fair way to sample.
• More complicated designs may save time or money or help avoid sampling problems.
• All statistical sampling designs have in common the idea that chance, rather than human choice, is used to select the sample.
• Designs used to sample from large populations are often more complicated than simple random samples.

Data Collection
(B) Stratified Sampling
• Sometimes the population is first sliced into homogeneous groups, called strata, before the sample is selected.
• Then simple random sampling is used within each stratum before the results are combined.
• This common sampling design is called stratified random sampling.
• Example: Suppose Boeing wants to determine the level of compliance with its no-drug-consumption policy. Would an SRS of 50 employees out of 2000 be appropriate?
→ No! Some departments are much larger than others, and therefore more likely to be included in the sample. Instead, use Stratified Sampling!
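A stratified version of this example might look like the sketch below. The department names and sizes are invented for illustration (they sum to 2000), and proportional allocation is one common choice that the slide does not specify.

```python
import random

rng = random.Random(430)  # arbitrary fixed seed for reproducibility

# Hypothetical strata: department name -> number of employees (total 2000).
department_sizes = {"Assembly": 1200, "Engineering": 480, "Sales": 200, "HR": 120}
total_sample = 50
population_size = sum(department_sizes.values())

stratified_sample = {}
for dept, size in department_sizes.items():
    employees = [f"{dept}-{i}" for i in range(size)]
    # Proportional allocation: each stratum contributes in proportion to its size.
    n_dept = total_sample * size // population_size
    # Simple random sample within the stratum.
    stratified_sample[dept] = rng.sample(employees, n_dept)

for dept, chosen in stratified_sample.items():
    print(dept, len(chosen))
```

With these sizes the allocation divides evenly. In general, the per-stratum counts need rounding so that they still add up to the intended total.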

Data Collection
(C) Cluster and Multistage Sampling
• Sometimes stratifying is not practical and simple random sampling is difficult (e.g., too expensive to include everyone).
• Splitting the population into similar parts or clusters can make sampling more practical.
– Then we could select one or a few clusters at random and perform a census within each of them.
– This sampling design is called cluster sampling.
– If each cluster fairly represents the full population, cluster sampling will give us an unbiased sample.
• Example: In the US, are auto shops charging women more than men for the same car repairs?
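Cluster sampling for the auto shop example could be sketched as follows. The cities and shop names are hypothetical: each city is treated as a cluster, a few clusters are drawn at random, and every shop in a chosen cluster is included.

```python
import random

rng = random.Random(430)  # arbitrary fixed seed for reproducibility

# Hypothetical clusters: city -> auto shops located there.
clusters = {
    "City A": ["A-shop-1", "A-shop-2", "A-shop-3"],
    "City B": ["B-shop-1", "B-shop-2"],
    "City C": ["C-shop-1", "C-shop-2", "C-shop-3", "C-shop-4"],
    "City D": ["D-shop-1", "D-shop-2", "D-shop-3"],
}

# Stage 1: select a few clusters at random.
chosen_cities = rng.sample(sorted(clusters), k=2)

# Stage 2: perform a census within each selected cluster.
cluster_sample = [shop for city in chosen_cities for shop in clusters[city]]

print(chosen_cities)
print(cluster_sample)
```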

Data Collection
(C) Cluster and Multistage Sampling
• Sometimes we use a variety of sampling methods together.
• Sampling schemes that combine several methods are called multistage samples.
• Most surveys conducted by professional polling organizations use some combination of stratified and cluster sampling as well as simple random sampling.

Data Collection
(D) Systematic Sampling
• Sometimes we draw a sample by selecting individuals systematically.
– For example, you might survey every 10th person on an alphabetical list of students.
• To make it random, you must still start the systematic selection from a randomly selected individual.
• When there is no reason to believe that the order of the list could be associated in any way with the responses sought, systematic sampling can give a representative sample.
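The every-10th-person scheme with a random start can be sketched like this. The student list is hypothetical and its length (200) is arbitrary.

```python
import random

rng = random.Random(430)  # arbitrary fixed seed for reproducibility

# Hypothetical alphabetical list of 200 students.
students = sorted(f"student_{i:03d}" for i in range(200))

step = 10
# Random start: pick the first individual at random among the first `step` entries.
start = rng.randrange(step)

# Then take every 10th person from that starting point.
systematic_sample = students[start::step]

print(start, systematic_sample[:3])
```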

Basic Inferences
Descriptive Statistics
• Def: pth quantile (or 100pth percentile) = 𝜋p
$p = \int_{-\infty}^{\pi_p} f(x)\,dx = F(\pi_p)$
where 𝐹 = CDF and 𝑓(𝑥) = PDF.
• Five Number Summary
– Consists of the min, Q1, Q2 (median), Q3 and max, where Q1 = 𝜋0.25, median = 𝜋0.50, Q3 = 𝜋0.75, and IQR = Q3 – Q1

Basic Inferences
Descriptive Statistics
• Example: Given $f(x) = 2e^{-2x}$, 𝑥 > 0, find 𝜋0.5.
• Solve for 𝜋0.5 from:
$0.5 = \int_0^{\pi_{0.5}} 2e^{-2x}\,dx = F(\pi_{0.5}) = 1 - e^{-2\pi_{0.5}} \;\Rightarrow\; \pi_{0.5} = \tfrac{1}{2}\ln(2)$
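The result can be checked numerically. The sketch below uses the closed-form CDF of this density, F(x) = 1 − e^(−2x), and inverts it; the function names are illustrative.

```python
import math

rate = 2.0  # f(x) = 2 * exp(-2x), x > 0


def cdf(x):
    """F(x) = 1 - exp(-rate * x), the integral of f from 0 to x."""
    return 1.0 - math.exp(-rate * x)


def quantile(p):
    """Invert F: solve p = 1 - exp(-rate * x) for x."""
    return -math.log(1.0 - p) / rate


median = quantile(0.5)

print(f"median = {median:.5f}")
print(f"ln(2)/2 = {math.log(2) / 2:.5f}")
print(f"F(median) = {cdf(median):.3f}")
```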

Basic Inferences
Descriptive Statistics
• Boxplot: Graphical representation of the 5 number summary.
• Useful for comparing groups
• It consists of:
– Body: Q1, Q2, Q3
– Fences: Lower = Q1 – 1.5 IQR, Upper = Q3 + 1.5 IQR
– Whiskers: dashed lines
[Figure: boxplot with the fences, whiskers, and body labeled]
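The quantities behind a boxplot can be computed with Python's `statistics` module. The data below are made up for illustration, and the `"inclusive"` quantile method is one of several conventions; other software may report slightly different quartiles.

```python
import statistics

# Hypothetical data with one unusually large value.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 30]

# Quartiles (cut points dividing the data into four parts).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

five_number_summary = (min(data), q1, median, q3, max(data))

# Fences: points beyond them are usually flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("five-number summary:", five_number_summary)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```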

Basic Inferences
Descriptive Statistics
• Histogram: Graphical representation of a quantitative variable’s density distribution.
• Q: How many bins should be used?
• A: For a dataset with 𝑛 observations, a nearly optimal number is 𝑘 = 1 + log2(𝑛)
[Figure: three histograms of z-scores (Frequency vs. z-score, from −3 to 3): k = 1000 (too noisy), k = 1 + log2(n) = 11 (optimal), and k = 4 (too smooth)]
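The bin-count rule 𝑘 = 1 + log2(𝑛), commonly known as Sturges' rule, can be sketched as below. The sample of 1000 simulated z-scores is an assumption, chosen because it is consistent with the k = 11 shown in the figure. Rounding up to the next integer is also an assumption, since the slide does not say how to round.

```python
import math
import random


def sturges_bins(n):
    """Number of histogram bins from k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))


rng = random.Random(430)  # arbitrary fixed seed for reproducibility
z_scores = [rng.gauss(0, 1) for _ in range(1000)]  # hypothetical data

k = sturges_bins(len(z_scores))

# Count observations in k equal-width bins spanning the data range.
lo, hi = min(z_scores), max(z_scores)
width = (hi - lo) / k
counts = [0] * k
for z in z_scores:
    idx = min(int((z - lo) / width), k - 1)  # put the maximum in the last bin
    counts[idx] += 1

print("k =", k)
print("counts:", counts)
```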