Columbia University MA in Economics
GR 5411 Econometrics I Seyhan Erden
Solutions to Problem Set 1
due on Sept. 28th at 10am through Gradescope
__________________________________________________________________________________________
1. (5p) Let $z_n$ be a sequence of random variables such that $E|z_n| \to 0$ as $n \to \infty$. Show that $z_n \to_p 0$.
Solutions:
By Markov's inequality,
$$\Pr(|z_n| \ge a) \le \frac{E|z_n|}{a} \quad \text{for } a > 0.$$
Therefore, taking the limit w.r.t. $n$ on both sides we get
$$\lim_{n \to \infty} \Pr(|z_n| \ge a) \le \lim_{n \to \infty} \frac{E|z_n|}{a} \quad \text{for } a > 0.$$
The limit on the right-hand side is zero because $E|z_n| \to 0$. The limit on the left-hand side is non-negative, so it must be equal to zero for the above to be satisfied; hence
$$\lim_{n \to \infty} \Pr(|z_n| \ge a) = 0 \quad \text{for every } a > 0,$$
and therefore by definition $z_n \to_p 0$.
2. (5p) Suppose we do an experiment that consists of tossing a coin until a head appears. Let $p$ = probability of a head on any given toss and define a random variable $X$ = number of tosses required to get a head. Find the cumulative distribution function (cdf) of $X$.
Solution: If the first head appears on the $x$-th toss, it must be that the first $(x - 1)$ tosses give tails and the last one gives heads. The probability of this sequence is
$$\Pr(X = x) = (1 - p)^{x-1} p.$$
Therefore,
$$\Pr(X \le x) = \sum_{i=1}^{x} \Pr(X = i) = \sum_{i=1}^{x} (1 - p)^{i-1} p.$$
Now recall the formula for the sum of the geometric series:
$$\sum_{k=1}^{n} t^{k-1} = \frac{1 - t^n}{1 - t} \quad \text{for } t \ne 1.$$
Therefore,
$$F_X(x) = \Pr(X \le x) = \frac{1 - (1 - p)^x}{1 - (1 - p)}\, p = 1 - (1 - p)^x.$$
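As a quick numerical check of the cdf in Problem 2, the sketch below sums the probabilities $\Pr(X = i)$ directly and compares the result with $1 - (1 - p)^x$. The value $p = 0.3$ and the function names are illustrative assumptions, not part of the problem set.

```python
def geometric_pmf(x, p):
    """Pr(X = x): the first x - 1 tosses are tails, the x-th toss is a head."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf_by_sum(x, p):
    """Pr(X <= x) obtained by summing the pmf over i = 1, ..., x."""
    return sum(geometric_pmf(i, p) for i in range(1, x + 1))


def geometric_cdf_closed_form(x, p):
    """Pr(X <= x) from the geometric-series formula."""
    return 1 - (1 - p) ** x


p = 0.3  # assumed probability of a head, chosen only for illustration
for x in (1, 2, 5, 10):
    print(x, geometric_cdf_by_sum(x, p), geometric_cdf_closed_form(x, p))
```

The two columns should agree up to floating-point rounding for any $0 < p < 1$.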
3. (30p) The following table gives the joint probability distribution between employment status and college graduation among those either employed or looking for work (unemployed) in the working-age U.S. population for September 2018.

                            Unemployed   Employed
                            (Y = 0)      (Y = 1)     Total
Non-college grads (X = 0)   0.026        0.576       0.602
College graduate (X = 1)    0.009        0.389       0.398
Total                       0.035        0.965       1.000

(a) (2p) Compute $E(Y)$
(b) (2p) The unemployment rate is the fraction of the labor force that is unemployed. Show that the unemployment rate is given by $1 - E(Y)$.
(c) (4p) Calculate $E(Y|X = 1)$ and $E(Y|X = 0)$
(d) (2p) Calculate the unemployment rate for (i) college graduates and (ii) non-college graduates.
(e) (4p) A randomly selected member of this population reports being unemployed. What is the probability that this worker is (i) a college graduate? (ii) a non-college graduate?
(f) (2p) Are educational achievement and employment status independent? Explain.
(g) (4p) Find $E(X^2)$ and $E(Y^2)$
(h) (4p) Find $Var(X)$ and $Var(Y)$
(i) (2p) Find $E(X|Y = 1)$
(j) (2p) Find $Var(X|Y = 1)$
(k) (2p) Show the Law of Iterated Expectations works for this question.
Solutions:
(a)
$$E(Y) = \mu_Y = 0 \times \Pr(Y = 0) + 1 \times \Pr(Y = 1) = 0 \times 0.035 + 1 \times 0.965 = 0.965$$
(b)
$$\text{Unemployment rate} = \frac{\#(\text{unemployed})}{\#(\text{labor force})} = \Pr(Y = 0) = 1 - \Pr(Y = 1) = 1 - E(Y) = 1 - 0.965 = 0.035$$
(c) Calculate the conditional probabilities first:
$$\Pr(Y = 0|X = 0) = \frac{\Pr(Y = 0, X = 0)}{\Pr(X = 0)} = \frac{0.026}{0.602} = 0.043$$
$$\Pr(Y = 1|X = 0) = \frac{\Pr(Y = 1, X = 0)}{\Pr(X = 0)} = \frac{0.576}{0.602} = 0.957$$
$$\Pr(Y = 0|X = 1) = \frac{\Pr(Y = 0, X = 1)}{\Pr(X = 1)} = \frac{0.009}{0.398} = 0.023$$
$$\Pr(Y = 1|X = 1) = \frac{\Pr(Y = 1, X = 1)}{\Pr(X = 1)} = \frac{0.389}{0.398} = 0.977$$
The conditional expectations are
$$E(Y|X = 1) = 0 \times \Pr(Y = 0|X = 1) + 1 \times \Pr(Y = 1|X = 1) = 0 \times 0.023 + 1 \times 0.977 = 0.977$$
$$E(Y|X = 0) = 0 \times \Pr(Y = 0|X = 0) + 1 \times \Pr(Y = 1|X = 0) = 0 \times 0.043 + 1 \times 0.957 = 0.957$$
(d) Use the solution to part (b):
Unemployment rate for college graduates = $1 - E(Y|X = 1) = 1 - 0.977 = 0.023$.
Unemployment rate for non-college graduates = $1 - E(Y|X = 0) = 1 - 0.957 = 0.043$.
(e) The probability that a randomly selected worker who reports being unemployed is a college graduate is
$$\Pr(X = 1|Y = 0) = \frac{\Pr(X = 1, Y = 0)}{\Pr(Y = 0)} = \frac{0.009}{0.035} = 0.257$$
The probability that this worker is a non-college graduate is
$$\Pr(X = 0|Y = 0) = 1 - \Pr(X = 1|Y = 0) = 1 - 0.257 = 0.743$$
(f) Educational achievement and employment status are not independent, because independence requires that, for all values of $x$ and $y$,
$$\Pr(X = x|Y = y) = \Pr(X = x),$$
and this condition fails here. For example, from part (e) $\Pr(X = 0|Y = 0) = 0.743$, while from the table $\Pr(X = 0) = 0.602$.
(g)
$$E(X^2) = 0^2 \times \Pr(X = 0) + 1^2 \times \Pr(X = 1) = 0 \times 0.602 + 1 \times 0.398 = 0.398$$
$$E(Y^2) = 0^2 \times \Pr(Y = 0) + 1^2 \times \Pr(Y = 1) = 0 \times 0.035 + 1 \times 0.965 = 0.965$$
(h)
$$Var(X) = E(X^2) - [E(X)]^2 = 0.398 - (0 \times 0.602 + 1 \times 0.398)^2 = 0.398 - 0.398^2 = 0.2396$$
$$Var(Y) = E(Y^2) - [E(Y)]^2 = 0.965 - 0.965^2 = 0.0338$$
(i) Calculate the conditional probabilities first:
$$\Pr(X = 0|Y = 1) = \frac{\Pr(X = 0, Y = 1)}{\Pr(Y = 1)} = \frac{0.576}{0.965} = 0.5969$$
$$\Pr(X = 1|Y = 1) = \frac{\Pr(X = 1, Y = 1)}{\Pr(Y = 1)} = \frac{0.389}{0.965} = 0.4031$$
The conditional expectation is
$$E(X|Y = 1) = 0 \times \Pr(X = 0|Y = 1) + 1 \times \Pr(X = 1|Y = 1) = 0.4031$$
(j)
$$Var(X|Y = 1) = E(X^2|Y = 1) - [E(X|Y = 1)]^2$$
As $E(X^2|Y = 1) = 0^2 \times \Pr(X^2 = 0|Y = 1) + 1^2 \times \Pr(X^2 = 1|Y = 1) = 0.4031$ (because $X^2 = 1$ only if $X = 1$ and, hence, $\Pr(X^2 = 1|Y = 1) = \Pr(X = 1|Y = 1)$),
$$Var(X|Y = 1) = 0.4031 - 0.4031^2 = 0.2406$$
(k) Want to show LIE: $E[E(X|Y)] = E(X)$.
$$E(X|Y) = 0 \times \Pr(X = 0|Y) + 1 \times \Pr(X = 1|Y) = \Pr(X = 1|Y)$$
Taking the expectation over $Y$,
$$E[E(X|Y)] = \Pr(X = 1|Y = 0)\Pr(Y = 0) + \Pr(X = 1|Y = 1)\Pr(Y = 1) = 0.257 \times 0.035 + 0.4031 \times 0.965 = 0.398$$
Note that $E(X) = 0.398$ from question (h). So, LIE holds.
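The arithmetic in Problem 3 can be reproduced from the four joint probabilities alone. The following is a minimal sketch; the dictionary layout and helper names are assumptions made for illustration. Because $X$ and $Y$ are binary, $E(X) = \Pr(X = 1)$ and $E(X^2) = E(X)$, which the code uses for the variances.

```python
# Joint probabilities Pr(X = x, Y = y) from the table, keyed by (x, y).
joint = {(0, 0): 0.026, (0, 1): 0.576, (1, 0): 0.009, (1, 1): 0.389}


def marginal_x(x):
    """Pr(X = x), summing the joint probabilities over y."""
    return sum(p for (xx, _), p in joint.items() if xx == x)


def marginal_y(y):
    """Pr(Y = y), summing the joint probabilities over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)


def cond_mean_y_given_x(x):
    """E(Y | X = x) = Pr(Y = 1 | X = x) for binary Y."""
    return joint[(x, 1)] / marginal_x(x)


def cond_mean_x_given_y(y):
    """E(X | Y = y) = Pr(X = 1 | Y = y) for binary X."""
    return joint[(1, y)] / marginal_y(y)


e_x = marginal_x(1)
e_y = marginal_y(1)
var_x = e_x - e_x ** 2  # E(X^2) = E(X) because X is 0 or 1
var_y = e_y - e_y ** 2
# Law of Iterated Expectations: average E(X | Y = y) over the distribution of Y.
lie = sum(cond_mean_x_given_y(y) * marginal_y(y) for y in (0, 1))

print("E(Y) =", round(e_y, 3), " unemployment rate =", round(1 - e_y, 3))
for x in (0, 1):
    print(f"E(Y|X={x}) =", round(cond_mean_y_given_x(x), 3))
print("Pr(X=1|Y=0) =", round(cond_mean_x_given_y(0), 3))
print("Var(X) =", round(var_x, 4), " Var(Y) =", round(var_y, 4))
print("E[E(X|Y)] =", round(lie, 3), " E(X) =", round(e_x, 3))
```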
4. (5p) Describe what is wrong with the following statement:
"The central limit theorem implies that, as sample size grows, the error distribution approaches normality."
This is a common misstatement. The distribution of any random draw, $u_i$, is the population distribution of $u$, and the population distribution of $u$ does not change with the sample size; it is what it is (normal or any other distribution). Therefore, the random draws $u_i$ have the same distribution regardless of sample size.
A correct statement is that the standardized average of the errors,
$$\frac{\sum_{i=1}^{n} u_i}{\sqrt{n}} = \sqrt{n}\,\bar{u},$$
approaches normality as $n \to \infty$. This is a much different statement. (In regression analysis, we use the fact that $\sum_{i} x_i' u_i / \sqrt{n}$ generally converges to a multivariate normal distribution, which implies the convergence of $\sum_{i} u_i / \sqrt{n}$ to normality when $x_i$ contains unity.)
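The distinction in Problem 4 can be illustrated by simulation. The sketch below assumes, purely for illustration, that the errors are centered exponential draws (mean 0, variance 1, population skewness 2). For $n = 1$ the standardized average is a single error, so its skewness reflects the skewed error distribution itself; as $n$ grows, the skewness of $\sqrt{n}\,\bar{u}$ shrinks toward the normal value of zero. The seed, replication count, and function names are arbitrary choices.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def standardized_mean(n, rng):
    """sqrt(n) * u_bar for n errors u_i = Exp(1) - 1 (mean 0, variance 1)."""
    u_bar = statistics.fmean(rng.expovariate(1.0) - 1.0 for _ in range(n))
    return math.sqrt(n) * u_bar


rng = random.Random(5411)  # arbitrary seed for reproducibility
reps = 4000                # number of simulated samples per n
skew_by_n = {}
for n in (1, 10, 100):
    draws = [standardized_mean(n, rng) for _ in range(reps)]
    skew_by_n[n] = skewness(draws)
    print(f"n = {n:3d}: skewness of sqrt(n) * u_bar = {skew_by_n[n]:.3f}")
```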
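5. (10p) Suppose that $Y_i \sim iid(\mu_Y, \sigma_Y^2)$ and $0 < \sigma_Y^2 < \infty$.
Let an estimator of $\mu_Y$ be $\hat{\mu}$ where
$$\hat{\mu} = \left(\frac{1 - a}{1 - a^n}\right) \sum_{i=1}^{n} a^{i-1} Y_i$$
where $0 < a < 1$.
(a) Is $\hat{\mu}$ unbiased? Why?
(b) Is $\hat{\mu}$ consistent? Why?
Solution:
(a) Yes! It is unbiased:
$$E[\hat{\mu}] = E\left[\left(\frac{1 - a}{1 - a^n}\right) \sum_{i=1}^{n} a^{i-1} Y_i\right] = \left(\frac{1 - a}{1 - a^n}\right) \sum_{i=1}^{n} a^{i-1} \mu_Y = \mu_Y$$
since
$$\sum_{i=1}^{n} a^{i-1} = (1 - a^n) \sum_{i=0}^{\infty} a^i = \frac{1 - a^n}{1 - a}.$$
(b) No! Not consistent:
$$var(\hat{\mu}) = var\left[\left(\frac{1 - a}{1 - a^n}\right) \sum_{i=1}^{n} a^{i-1} Y_i\right] = \left(\frac{1 - a}{1 - a^n}\right)^2 \sum_{i=1}^{n} a^{2(i-1)} \sigma_Y^2$$
$$= \sigma_Y^2 \frac{(1 - a^{2n})(1 - a)^2}{(1 - a^2)(1 - a^n)^2}$$
$$= \sigma_Y^2 \frac{(1 - a^n)(1 + a^n)(1 - a)^2}{(1 - a)(1 + a)(1 - a^n)^2}$$
$$= \sigma_Y^2 \frac{(1 + a^n)(1 - a)}{(1 - a^n)(1 + a)}$$
which has the limit $var(\hat{\mu}) \to \sigma_Y^2 \frac{1 - a}{1 + a} > 0$ as $n \to \infty$.
If $Y_i$ is normally distributed, $\hat{\mu}$ is normally distributed with mean $\mu_Y$ and variance given above. Thus, $\hat{\mu}$ has positive probability of falling outside any interval around $\mu_Y$, so $\Pr(|\hat{\mu} - \mu_Y| \ge \delta)$ does not tend to zero and $\hat{\mu}$ is inconsistent. Without normality the conclusion is the same: $\hat{\mu} - \mu_Y$ converges in mean square to $(1 - a) \sum_{i=1}^{\infty} a^{i-1} (Y_i - \mu_Y)$, which has variance $\sigma_Y^2 \frac{1 - a}{1 + a} > 0$ and so is not degenerate at zero.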
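The variance algebra in Problem 5 can be checked without simulation, since $\hat{\mu}$ is a weighted sum of independent draws and $var(\hat{\mu}) / \sigma_Y^2$ is the sum of squared weights. The sketch below compares that direct sum with the closed form and with the limit $(1 - a)/(1 + a)$; the value $a = 0.5$ and the function names are illustrative assumptions.

```python
def weights(n, a):
    """Weights w_i = (1 - a) / (1 - a**n) * a**(i - 1), so mu_hat = sum(w_i * Y_i)."""
    scale = (1 - a) / (1 - a ** n)
    return [scale * a ** (i - 1) for i in range(1, n + 1)]


def variance_factor_by_sum(n, a):
    """var(mu_hat) / sigma_Y^2 computed directly as the sum of squared weights."""
    return sum(w * w for w in weights(n, a))


def variance_factor_closed_form(n, a):
    """var(mu_hat) / sigma_Y^2 from the simplified expression in part (b)."""
    return (1 + a ** n) * (1 - a) / ((1 - a ** n) * (1 + a))


a = 0.5  # assumed value in (0, 1), chosen only for illustration
for n in (5, 50, 500):
    print(n, sum(weights(n, a)), variance_factor_by_sum(n, a),
          variance_factor_closed_form(n, a))
print("limit (1 - a) / (1 + a):", (1 - a) / (1 + a))
```

The weights sum to one for every $n$ (unbiasedness), while the variance factor settles at a positive constant instead of shrinking to zero.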
6. (18p) Verifying CLT with Stata: Let $U_i$ $(i = 1, 2, \dots, n)$ be i.i.d. uniform random variables on $[0, 1]$. By the CLT, the random variable $Z_n$, defined as
$$Z_n = \frac{1}{\sqrt{n\sigma^2}} \sum_{i=1}^{n} (U_i - \mu) = \frac{\sqrt{n}(\bar{U} - \mu)}{\sigma},$$
behaves like a standard normal random variable for large $n$, where $\mu$ and $\sigma^2$ are the mean and variance of $U_i$, respectively, and $\bar{U}$ is the sample mean.
(a) (6p) What is $\mu$ and what is $\sigma^2$?
(b) (6p) Set $n = 50$, draw a random sample from a uniform distribution, and compute $Z_n$ for this sample.
(c) (6p) Repeat part (b) 1000 times and plot the histogram of the $Z_n$'s. Attach your .do file to the problem set.
Solution:
(a) With endpoints $a = 0$ and $b = 1$: $\mu = \frac{1}{2}(a + b) = 0.5$, $\sigma^2 = \frac{1}{12}(b - a)^2 = \frac{1}{12}$
(b) See .do file
(c) See .do file
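The .do file is not reproduced in this document. As a rough Python analogue of parts (b) and (c), not the Stata solution itself, the sketch below draws 1000 samples of size $n = 50$ from Uniform$[0, 1]$, computes $Z_n$ for each, and prints a crude text histogram. The seed, the function names, and the text-based plot are assumptions made for illustration.

```python
import math
import random
import statistics

MU = 0.5                   # mean of Uniform[0, 1]
SIGMA = math.sqrt(1 / 12)  # standard deviation of Uniform[0, 1]


def draw_z(n, rng):
    """Draw one sample of size n and return Z_n = sqrt(n) * (mean - MU) / SIGMA."""
    sample_mean = statistics.fmean(rng.random() for _ in range(n))
    return math.sqrt(n) * (sample_mean - MU) / SIGMA


def simulate_z(n, reps, seed):
    """Repeat the experiment reps times with a seeded generator."""
    rng = random.Random(seed)
    return [draw_z(n, rng) for _ in range(reps)]


def text_histogram(values, lo=-3.0, hi=3.0, bins=12, width=40):
    """Return a text histogram; values outside [lo, hi) are ignored."""
    counts = [0] * bins
    for v in values:
        if lo <= v < hi:
            k = min(int((v - lo) / (hi - lo) * bins), bins - 1)
            counts[k] += 1
    scale = max(counts) or 1
    lines = []
    for k, c in enumerate(counts):
        left = lo + k * (hi - lo) / bins
        lines.append(f"{left:6.2f} | {'#' * round(width * c / scale)}")
    return "\n".join(lines)


z_draws = simulate_z(n=50, reps=1000, seed=12345)  # arbitrary seed
print("mean of Z_n draws:    ", statistics.fmean(z_draws))
print("variance of Z_n draws:", statistics.pvariance(z_draws))
print(text_histogram(z_draws))
```

If the CLT approximation is good at $n = 50$, the mean of the draws should be near 0, the variance near 1, and the histogram roughly bell-shaped.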
7. (27p) Verifying CLT and LLN with Stata:
(a) (7p) Generate a sample of 150 observations from a chi-squared distribution with 10 degrees of freedom. Calculate its mean and variance.
(b) (10p) Repeat exercise (a) for 300 observations and for 3000 observations. Does the LLN hold? Why?
(c) (10p) By the CLT, the random variable $Z_n$, defined as
$$Z_n = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma},$$
behaves like a standard normal random variable for large $n$ ($n \ge 50$). To verify this with simulation, generate 1000 observations of $Z_n$ and plot their histogram. Discuss whether the histogram approximates the normal density.
Solutions: See .do file
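For readers without the .do file, parts (a) and (b) can be sketched in Python as follows; this is an illustrative analogue, not the Stata solution. It relies on the fact that a chi-squared variable with $k$ degrees of freedom is a gamma variable with shape $k/2$ and scale 2, so its mean is $k = 10$ and its variance is $2k = 20$. The seed and names are arbitrary.

```python
import random
import statistics

DF = 10  # degrees of freedom: population mean is DF, population variance is 2 * DF


def chi2_sample(n, rng):
    """Draw n chi-squared(DF) values via the Gamma(shape=DF/2, scale=2) representation."""
    return [rng.gammavariate(DF / 2, 2.0) for _ in range(n)]


rng = random.Random(2018)  # arbitrary seed for reproducibility
results = {}
for n in (150, 300, 3000):
    sample = chi2_sample(n, rng)
    results[n] = (statistics.fmean(sample), statistics.variance(sample))
    print(f"n = {n:4d}: mean = {results[n][0]:.3f}, variance = {results[n][1]:.3f}")
```

Under the LLN the sample mean and sample variance should tend to settle closer to 10 and 20 as $n$ increases, although any single run can deviate.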
8. (Practice question; it will not be graded, so you do not need to submit solutions. The answer to this question will be covered in recitations this week.) Verifying the Central Limit Theorem (CLT) and Law of Large Numbers (LLN) with Stata: Let $X_1, X_2, \dots, X_n$ be i.i.d. Bernoulli with $p = 0.74$. Recall that the variance of a Bernoulli random variable is $\sigma^2 = p(1 - p)$.
(a) By the CLT, the random variable $Z_n$, defined as
$$Z_n = \frac{1}{\sqrt{np(1 - p)}} \sum_{i=1}^{n} (X_i - p),$$
behaves like a standard normal random variable for large $n$ ($n \ge 50$). To verify this with simulation, generate a large number $m$ of observations of $Z_n$ and plot the histogram of the $m$ observations. Discuss whether the histogram approximates the normal density.
(b) Generate a sample of 150 observations from the Bernoulli distribution. Calculate its mean and variance. Repeat the exercise for 100 observations. Does the LLN hold?
Solution: See .do file