Columbia University MA in Economics
GR 5411 Econometrics I Seyhan Erden
Solutions Problem Set 3
Due on Oct. 26th at 10am through Gradescope (please specify the page number of each question when submitting your problem set to Gradescope)
___________________________________________________________________
1. (12p) Consider the regression model Y_i = β₁X_i + β₂W_i + u_i where for simplicity the intercept is omitted and all variables are assumed to have a mean of 0. Suppose that X_i is distributed independently of (W_i, u_i) but W_i and u_i might be correlated, and let β̂₁ and β̂₂ be the OLS estimators for this model.

(a) (5p) Show that whether or not W_i and u_i are correlated, β̂₁ →p β₁.

(b) (2p) Show that if W_i and u_i are correlated, then β̂₂ is inconsistent.

(c) (5p) Let β̂₁^r be the OLS estimator from the regression of Y on X, the restricted regression that excludes W. Will β̂₁ have a smaller asymptotic variance than β̂₁^r, allowing for the possibility that W_i and u_i are correlated? Explain.
Solutions:
(a) We write the regression model, Y_i = β₁X_i + β₂W_i + u_i, in matrix form as

Y = Xβ₁ + Wβ₂ + U

where Y = (Y₁, …, Y_n)', X = (X₁, …, X_n)', W = (W₁, …, W_n)', and U = (u₁, …, u_n)'. The OLS estimator is

(β̂₁, β̂₂)' = [X'X  X'W; W'X  W'W]^{-1} (X'Y, W'Y)'
           = (β₁, β₂)' + [X'X  X'W; W'X  W'W]^{-1} (X'U, W'U)'
           = (β₁, β₂)' + [(1/n)X'X  (1/n)X'W; (1/n)W'X  (1/n)W'W]^{-1} ((1/n)X'U, (1/n)W'U)'
           = (β₁, β₂)' + [(1/n)ΣX_i²  (1/n)ΣX_iW_i; (1/n)ΣW_iX_i  (1/n)ΣW_i²]^{-1} ((1/n)ΣX_iu_i, (1/n)ΣW_iu_i)'

By the law of large numbers,
(1/n)ΣX_i² →p E(X_i²)
(1/n)ΣW_i² →p E(W_i²)
(1/n)ΣX_iW_i →p E(X_iW_i) = 0 (because X and W are independent with means of zero);
(1/n)ΣX_iu_i →p E(X_iu_i) = 0 (because X and u are independent with means of zero);
(1/n)ΣW_iu_i →p E(W_iu_i) ≠ 0 (because W and u are not independent).

Thus,

(β̂₁, β̂₂)' →p (β₁, β₂)' + [E(X_i²)  0; 0  E(W_i²)]^{-1} (0, E(W_iu_i))' = (β₁, β₂ + E(W_iu_i)/E(W_i²))'

In particular, β̂₁ →p β₁ whether or not W_i and u_i are correlated.
(b) From the answer to (a), β̂₂ →p β₂ + E(W_iu_i)/E(W_i²) ≠ β₂ if E(W_iu_i) ≠ 0, so β̂₂ is inconsistent.
(c) Consider the population linear regression of u_i onto W_i:

u_i = λW_i + a_i

where λ = E(W_iu_i)/E(W_i²). In this population regression, by construction, E(W_ia_i) = 0. Using this expression for u_i, rewrite the equation to be estimated as

Y_i = X_iβ₁ + W_iβ₂ + u_i
    = X_iβ₁ + W_i(β₂ + λ) + a_i
    = X_iβ₁ + W_iθ + a_i

where θ = β₂ + λ. A calculation like that used in part (a) can be used to show that

(√n(β̂₁ − β₁), √n(β̂₂ − θ))' = [(1/n)ΣX_i²  (1/n)ΣX_iW_i; (1/n)ΣW_iX_i  (1/n)ΣW_i²]^{-1} ((1/√n)ΣX_ia_i, (1/√n)ΣW_ia_i)'
                             →d [E(X_i²)  0; 0  E(W_i²)]^{-1} (S₁, S₂)'

where S₁ is distributed N(0, σ_a²E(X_i²)). Thus by Slutsky's theorem,

√n(β̂₁ − β₁) →d N(0, σ_a²/E(X_i²))

Now consider the regression that omits W, which can be written as

Y_i = X_iβ₁ + d_i

where d_i = W_iθ + a_i. Calculations like those used above imply that

√n(β̂₁^r − β₁) →d N(0, σ_d²/E(X_i²))

Since σ_d² = σ_a² + θ²E(W_i²), the asymptotic variance of β̂₁^r is never smaller than the asymptotic variance of β̂₁.
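A minimal Monte Carlo sketch of parts (a) and (b), not part of the original solution; the data-generating design, seed, and coefficient values are illustrative assumptions:

clear
set seed 12345
set obs 100000
gen x = rnormal()               // X independent of (W, u)
gen w = rnormal()
gen u = 0.5*w + rnormal()       // E(Wu) = 0.5, E(W^2) = 1 by construction
gen y = x + w + u               // true beta1 = beta2 = 1
reg y x w, noconstant
* expected: _b[x] near 1 (consistent); _b[w] near 1 + 0.5/1 = 1.5 (inconsistent)

With n = 100,000 the coefficient on w should sit close to 1.5, matching the probability limit β₂ + E(Wu)/E(W²) derived above.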
2. (43p) Consider the problem of minimizing the sum of squared residuals, (Y − Xb)'(Y − Xb), subject to the constraint that Rb = r, where R is q × k with rank q. Let β̃ be the value of b that solves the constrained minimization problem.

(a) (3p) Write the Lagrangian for the minimization problem.

(b) (3p) Show that β̃ = β̂ − (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)

(c) (4p) Show that

(Y − Xβ̃)'(Y − Xβ̃) − (Y − Xβ̂)'(Y − Xβ̂) = (Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)
(d) (3p) Show that F̃ is equivalent to the homoskedasticity-only F statistic, where

F̃ = [(Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)/q] / s²_û

and

F = [(SSR_restricted − SSR_unrestricted)/q] / [SSR_unrestricted/(n − k_unrestricted − 1)]

Solution:
(a) The Lagrangian is

L(b, λ) = ½(Y − Xb)'(Y − Xb) + λ'(Rb − r)

where λ is a q × 1 vector of Lagrange multipliers.

(b) The first order conditions are
(∗) X'(Y − Xβ̃) − R'λ̃ = 0

and

(∗∗) Rβ̃ − r = 0

Solving (∗) yields

X'Y − X'Xβ̃ − R'λ̃ = 0

Multiplying by (X'X)^{-1} gives

(∗∗∗) β̃ = β̂ − (X'X)^{-1}R'λ̃

Multiplying (∗∗∗) by R and using (∗∗) yields

r = Rβ̂ − R(X'X)^{-1}R'λ̃

so that

λ̃ = [R(X'X)^{-1}R']^{-1}(Rβ̂ − r)

Substituting this into (∗∗∗) yields the result:

β̃ = β̂ − (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)
(c) Using the result in (b),

Y − Xβ̃ = (Y − Xβ̂) + X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)

so that

(Y − Xβ̃)'(Y − Xβ̃) = (Y − Xβ̂)'(Y − Xβ̂)
  + (Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}R(X'X)^{-1}X'X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)
  + 2(Y − Xβ̂)'X(X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)

The last term is zero because (Y − Xβ̂)'X = ε̂'X = (MY)'X = Y'MX = 0, and the middle term simplifies because R(X'X)^{-1}X'X(X'X)^{-1}R' = R(X'X)^{-1}R'. Thus,

(Y − Xβ̃)'(Y − Xβ̃) = (Y − Xβ̂)'(Y − Xβ̂) + (Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)
(d) The result in (c) shows that

(Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r) = ε̃'ε̃ − ε̂'ε̂ = SSR_restricted − SSR_unrestricted

Also,

s²_û = SSR_unrestricted/(n − k_unrestricted − 1)

Then,

F̃ = [(Rβ̂ − r)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − r)/q] / s²_û
   = [(SSR_restricted − SSR_unrestricted)/q] / [SSR_unrestricted/(n − k_unrestricted − 1)] = F
(e) (5p) Use the California School data set¹ (caschool.dta) to answer the rest of the parts of this question. The description of the data set can be found in the following table.
Regressor name:   Description:
dist_code         District code
read_scr          Average reading score
math_scr          Average math score
county            County
district          District
enrl_tot          Total enrollment
teachers          Number of teachers
computer          Number of computers
testscr           Average test score (= (read_scr + math_scr)/2)
comp_stu          Computers per student (= computer/enrl_tot)
expn_stu          Expenditures per student ($'s)
str               Student teacher ratio (= enrl_tot/teachers)
el_pct            Percent of English learners
meal_pct          Percent qualifying for reduced-price lunch
calw_pct          Percent qualifying for CalWorks
avginc            District average income (in $1,000's)
Regress the test score average on the student teacher ratio, the percent of English learners, and the percent qualifying for reduced-price lunch (hint: Stata code: reg testscr str el_pct meal_pct). Copy/paste your regression results (use Bold Courier New size 9 for Word documents). Find the SSR (sum of squared residuals), ESS (explained sum of squares), and TSS (total sum of squares) on the output and, using these, calculate the R-squared and adjusted R-squared; compare them to the R-squared and adjusted R-squared reported on the output.
. reg testscr str el_pct meal_pct

      Source |       SS           df       MS      Number of obs   =       420
-------------+----------------------------------   F(3, 416)       =    476.31
       Model |  117811.294         3  39270.4312   Prob > F        =    0.0000
    Residual |  34298.3001       416  82.4478368   R-squared       =    0.7745
-------------+----------------------------------   Adj R-squared   =    0.7729
       Total |  152109.594       419  363.030056   Root MSE        =    9.0801

------------------------------------------------------------------------------
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -.9983092   .2387543    -4.18   0.000    -1.467624     -.528994
      el_pct |  -.1215733   .0323173    -3.76   0.000    -.1850988    -.0580478
    meal_pct |  -.5473456   .0215988   -25.34   0.000     -.589802    -.5048891
       _cons |     700.15   4.685687   149.42   0.000     690.9394     709.3605
------------------------------------------------------------------------------
SSR = 34298.3001, ESS = 117811.294, TSS = 152109.594

R² = ESS/TSS = 117811.294/152109.594 = 0.7745

or (note to grader: either one gets full credit)

R² = 1 − SSR/TSS = 1 − 34298.3001/152109.594 = 0.7745

R̄² = 1 − [(n − 1)/(n − k)](1 − R²) = 1 − [(420 − 1)/(420 − 4)](1 − 0.7745) = 0.77287

(here k = 4 counts the intercept). Both agree with the R-squared and Adj R-squared reported on the output.

¹ Obtained from the California Department of Education (www.cde.ca.gov)
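As a quick arithmetic check (not in the original solution), the same numbers can be reproduced with Stata's display calculator:

display 1 - 34298.3001/152109.594      // R-squared = .7745...
display 1 - (419/416)*(1 - 0.7745)     // adjusted R-squared = .77287...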
(f) (15p) Now assume you want to test whether the per unit effect of str on test scores is significantly different from the per unit effect of the English learner percentage on test scores. (i) (3p) Set up the appropriate test using Rβ = q; that is, write what R, β, and q are in matrix/vector notation. (ii) (2p) Write the Wald test. (iii) (7p) Calculate the Wald test; note that you will need to use the matrix functions of Stata here, just as you did for problem set #2 questions 1 and 2. (iv) (3p) Run the same test using the "test" command in Stata and compare your F result to (iii).
(i)

H₀: Rβ = q vs. H₁: Rβ ≠ q

with

β = (β_c, β_str, β_el_pct, β_meal_pct)'
R = [0  1  −1  0]
q = [0]

(ii)
The test statistic is

F = (Rβ̂ − q)'[R(X'X)^{-1}R']^{-1}(Rβ̂ − q) / s²_û ~ F(1, 416)

(with a single restriction, the division by the number of restrictions is division by 1).
(iii)

. * generate the column of ones for the intercept
. gen c=1

. * create the X matrix, X'X, and inv(X'X)
. mkmat c str el_pct meal_pct, matrix(X)

. matrix XX=X'*X

. mat list XX

symmetric XX[4,4]
                    c         str      el_pct    meal_pct
       c          420
     str    8248.9786   163513.03
  el_pct    6622.6252   132790.99   244529.77
meal_pct    18776.199    371679.4    431781.2   1147643.4

. matrix XX_inv=syminv(XX)

. mat list XX_inv

symmetric XX_inv[4,4]
                    c         str      el_pct    meal_pct
       c    .26629757
     str   -.01333622   .00069139
  el_pct    .00028772  -.00001239   .00001267
meal_pct   -.00014594  -1.065e-06  -5.460e-06   5.658e-06

. * create the R vector
. matrix R=(0,1,-1,0)

. matrix list R

R[1,4]
    c1  c2  c3  c4
r1   0   1  -1   0

. * create the beta hat vector
. matrix b_hat=[_b[_cons]\_b[str]\_b[el_pct]\_b[meal_pct]]

. matrix list b_hat

b_hat[4,1]
            c1
r1   700.14996
r2  -.99830922
r3  -.12157329
r4  -.54734555

. * create q
. matrix q=[0]

. * create (Rb-q)
. matrix Rbq=(R*b_hat)-q

. matrix list Rbq

symmetric Rbq[1,1]
            c1
r1  -.87673593

. * create the inverse of R*inv(X'X)*R'
. matrix inv_RXXR=syminv(R*XX_inv*R')

. matrix list inv_RXXR

symmetric inv_RXXR[1,1]
           r1
r1  1372.0458

. * create the numerator of the F test
. matrix num=Rbq'*inv_RXXR*Rbq

. matrix list num

symmetric num[1,1]
           c1
c1  1054.6448

. * residuals (for the denominator)
. mkmat testscr, matrix(Y)

. matrix e=Y-X*b_hat

. * residual variance (s squared)
. mat ss=(e'*e)/(420-3-1)

. mat list ss

symmetric ss[1,1]
           c1
c1  82.447837

. matrix F=num*syminv(ss)

. matrix list F

symmetric F[1,1]
           c1
c1  12.791661

(iv)
Running the test directly in Stata:

. test str=el_pct

 ( 1)  str - el_pct = 0

       F(  1,   416) =   12.79
            Prob > F =    0.0004

The F statistic from the matrix calculation in (iii), 12.79, matches the F statistic reported by the test command.
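The same F statistic can also be recovered from the SSR form derived in part (d). A hedged sketch (not in the original solution; sum_str_el is an assumed helper name): imposing β_str = β_el_pct is the same as regressing on str + el_pct.

* restricted model: common coefficient on str and el_pct
gen sum_str_el = str + el_pct
quietly reg testscr sum_str_el meal_pct
scalar ssr_r = e(rss)
* unrestricted model
quietly reg testscr str el_pct meal_pct
scalar ssr_u = e(rss)
* homoskedasticity-only F with q = 1 restriction
display ((ssr_r - ssr_u)/1)/(ssr_u/(420 - 3 - 1))   // should print 12.79...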
(g) (5p) Run the same regression as in part (e) with heteroskedasticity-robust standard errors and report (copy/paste) your results (hint: Stata code: reg testscr str el_pct meal_pct, r; use Bold Courier New size 9 for Word documents). Is str significantly different from zero? Set up your null and alternative hypotheses and answer this question by (i) using critical values with significance level 0.05, (ii) using the p-value, (iii) using confidence intervals.
. reg testscr str el_pct meal_pct, r

Linear regression                               Number of obs     =        420
                                                F(3, 416)         =     453.48
                                                Prob > F          =     0.0000
                                                R-squared         =     0.7745
                                                Root MSE          =     9.0801

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -.9983092   .2700799    -3.70   0.000    -1.529201    -.4674178
      el_pct |  -.1215733   .0328317    -3.70   0.000      -.18611    -.0570366
    meal_pct |  -.5473456   .0241072   -22.70   0.000    -.5947328    -.4999583
       _cons |     700.15    5.56845   125.74   0.000     689.2042     711.0958
------------------------------------------------------------------------------
H₀: β_str = 0 vs. H₁: β_str ≠ 0

t = (β̂_str − 0)/s.e.(β̂_str) = (−.998 − 0)/0.27 = −3.70

(i) Reject H₀, because |−3.70| > 1.96, the 5% two-sided critical value.
(ii) The two-sided p-value is 2 × Pr(t₄₁₆ > 3.70) ≈ 0.00024 < 0.05, so reject H₀.
(iii) The 95% confidence interval is −1.529 < β_str < −0.467, which excludes zero.

All three results support rejecting the null hypothesis; hence str is significantly different from zero.
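A hedged one-line check of the t statistic and p-value in Stata (ttail gives the upper-tail probability of the t distribution):

display -.9983092/.2700799          // t = -3.696...
display 2*ttail(416, 3.6963)        // two-sided p-value, about .0002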
(h) (5p) Regress the test score average on the student teacher ratio, expenditure per student, percent of English learners, and percent qualifying for reduced-price lunch (hint: Stata code: reg testscr str expn_stu el_pct meal_pct, r). Copy/paste your results (use Bold Courier New size 9 for Word documents). Why is str not statistically different from zero in this regression? Why is there a difference in the significance of str between this regression and the regression in part (g)? If you want to continue with this research, what would your next step be?
. reg testscr str expn_stu el_pct meal_pct, r

Linear regression                               Number of obs     =        420
                                                F(4, 415)         =     364.67
                                                Prob > F          =     0.0000
                                                R-squared         =     0.7834
                                                Root MSE          =     8.9096

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -.2353884   .3248148    -0.72   0.469    -.8738757     .4030989
    expn_stu |    .003622   .0009447     3.83   0.000      .001765     .0054791
      el_pct |  -.1283415   .0324397    -3.96   0.000    -.1921082    -.0645748
    meal_pct |  -.5463929   .0231685   -23.58   0.000    -.5919351    -.5008506
       _cons |   665.9882   10.37683    64.18   0.000     645.5905     686.3859
------------------------------------------------------------------------------
str is no longer statistically significant, very likely because str and expn_stu are highly correlated. We check the correlation between them:

. corr str expn_stu
(obs=420)

             |      str expn_stu
-------------+------------------
         str |   1.0000
    expn_stu |  -0.6200   1.0000

It is likely that, due to this multicollinearity, the standard error of str is inflated, which makes it hard to reject the null hypothesis in the significance test for β_str.
We must check whether they are jointly significant:

H₀: β_str = β_expn_stu = 0 vs. H₁: not H₀

. test str expn_stu

 ( 1)  str = 0
 ( 2)  expn_stu = 0

       F(  2,   415) =   14.94
            Prob > F =    0.0000

Reject H₀, so they are jointly significant. We have an "excuse" to keep both of them in the regression.
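As one hedged next step (not in the original solution), the degree of collinearity could be quantified with variance inflation factors; estat vif is available after a non-robust regress:

quietly reg testscr str expn_stu el_pct meal_pct
estat vif      // high VIFs for str and expn_stu would confirm the diagnosis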
3. (Practice question for recitation; you do not need to hand this in. Restricted estimation.) When we have a set of linear restrictions, instead of estimating the unrestricted model and then doing a hypothesis test to see whether they hold, we could also impose them and choose the best least-squares estimate that satisfies the equations exactly.

The minimization problem is

min_b (y − Xb)'(y − Xb) subject to R'b = q

(note that this question writes the restrictions as R'b = q, with R of dimension k × J, whereas question 2 wrote them as Rb = r). The Lagrangian is

L(b, λ) = (y − Xb)'(y − Xb) + 2λ'(R'b − q)

The solutions are

b* = β̂_ols − (X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}(R'β̂_ols − q)
λ* = [R'(X'X)^{-1}R]^{-1}(R'β̂_ols − q)
(a) The expression for the restricted coefficient vector may be written in the form

b* = (I − CR')β̂_ols + w

where w does not involve β̂_ols. What is the matrix C?

(b) Show that the covariance matrix of the restricted least squares estimator is

σ²(X'X)^{-1} − σ²(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}

and that this can be written as

Var(β̂_ols|X)[Var(β̂_ols|X)^{-1} − R Var(R'β̂_ols|X)^{-1} R'] Var(β̂_ols|X)

(c) Prove that the restricted least squares estimator never has a larger variance-covariance matrix than the unrestricted least squares estimator.

(d) Prove that the R² associated with a restricted least squares estimator is never larger than that associated with the unrestricted least squares estimator, and thus the restriction never improves the fit of the regression.
Solution:
(a) Using the formula above, C = (X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}. Meanwhile, w = Cq.

(b) The variance of b* is (I − CR')Var(β̂_ols)(I − CR')', which is

Var(β̂_ols) − CR'Var(β̂_ols) − Var(β̂_ols)RC' + CR'Var(β̂_ols)RC'

Each of the last three terms equals σ²(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1} (for the last one, use [R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}R = I), so this is

= σ²(X'X)^{-1} − σ²(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}
= σ²(X'X)^{-1}[(σ²(X'X)^{-1})^{-1} − σ^{-2}R[R'(X'X)^{-1}R]^{-1}R'] σ²(X'X)^{-1}
= Var(β̂_ols|X)[Var(β̂_ols|X)^{-1} − R Var(R'β̂_ols|X)^{-1} R'] Var(β̂_ols|X)

where the last equality uses Var(R'β̂_ols|X) = σ²R'(X'X)^{-1}R.

(c) The difference between the two is

Var(β̂_ols) − Var(b*) = σ²(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}

which is a positive semidefinite matrix, so Var(β̂_ols) ≥ Var(b*). (Writing A = R'(X'X)^{-1}, the difference is σ²A'[R'(X'X)^{-1}R]^{-1}A, and any matrix of the form A'BA with B positive definite is positive semidefinite.)
(d) To see that the R² is lower under the restriction, write out (y − Xb)'(y − Xb) for any other b, using b = (b − β̂) + β̂:

(y − Xb)'(y − Xb)
= [y − Xβ̂ − X(b − β̂)]'[y − Xβ̂ − X(b − β̂)]
= (y − Xβ̂)'(y − Xβ̂) − 2(y − Xβ̂)'X(b − β̂) + (b − β̂)'X'X(b − β̂)
= û'û − 2 × 0 + (b − β̂)'X'X(b − β̂)

Therefore, the difference in R² will be (y'M₀y is the total sum of squares, with M₀ the demeaning matrix):

R²_u − R²_r = [1 − û'û/(y'M₀y)] − [1 − (y − Xb)'(y − Xb)/(y'M₀y)]
            = [û'û + (b − β̂)'X'X(b − β̂) − û'û]/(y'M₀y)
            = (b − β̂)'X'X(b − β̂)/(y'M₀y) ≥ 0

and the equality holds only if b = β̂.

Nothing about the particular restricted b* was used here. Any other b from the same class must have a lower R², since the way β̂_ols is chosen is equivalent to maximizing the R².
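Restricted least squares can also be computed directly in Stata with cnsreg. A minimal sketch, assuming the caschool data from question 2 and the illustrative restriction β_str = β_el_pct:

constraint 1 str = el_pct
cnsreg testscr str el_pct meal_pct, constraints(1)
* by part (d), the restricted fit can never have a smaller SSR
* (equivalently, a larger R-squared) than the unrestricted regression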
4. (10p) (Attempt this question after you attend a recitation, because it is related to the question before this one.) Using the same notation as in the previous question, prove that under the hypothesis that R'β = q, the estimator s² = (y − Xb*)'(y − Xb*)/(n − K + J), where J is the number of restrictions, is unbiased for σ². (Assume E[εε'] = σ²I.) (Hint: it helps to show that in general e*'e* = e'e + (R'b − q)'[R'(X'X)^{-1}R]^{-1}(R'b − q).)
Solution:
b* = (I − CR')β̂_ols + Cq
e* = y − Xb*
   = y − X(I − CR')β̂_ols − XCq
   = y − Xβ̂_ols + XC(R'β̂_ols − q)
   = e + XC(R'β̂_ols − q)

so

e*'e* = [e' + (R'β̂_ols − q)'C'X'][e + XC(R'β̂_ols − q)]
      = e'e + e'XC(R'β̂_ols − q) + (R'β̂_ols − q)'C'X'e + (R'β̂_ols − q)'C'X'XC(R'β̂_ols − q)
      = e'e + (R'β̂_ols − q)'C'X'XC(R'β̂_ols − q)

since X'e = 0. Hence,

e*'e* = e'e + (R'β̂_ols − q)'[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}X'X(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}(R'β̂_ols − q)
      = e'e + (R'β̂_ols − q)'[R'(X'X)^{-1}R]^{-1}(R'β̂_ols − q)

Now the result is that E[e'e] = (n − K)σ², so

E[e*'e*] = (n − K)σ² + E[(R'β̂_ols − q)'[R'(X'X)^{-1}R]^{-1}(R'β̂_ols − q)]

Now β̂_ols = β + (X'X)^{-1}X'ε, so R'β̂_ols − q = R'β − q + R'(X'X)^{-1}X'ε; but R'β − q = 0 under this hypothesis, so R'β̂_ols − q = R'(X'X)^{-1}X'ε.

Insert this in the result above to obtain

E[e*'e*] = (n − K)σ² + E[ε'X(X'X)^{-1}R[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}X'ε]

The quantity in the expectation is a scalar, so it is equal to its trace. Permute ε'X(X'X)^{-1}R inside the trace to obtain

E[e*'e*] = (n − K)σ² + E[tr{[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}X'εε'X(X'X)^{-1}R}]

We may now carry the expectation inside the trace and use E[εε'] = σ²I to obtain

E[e*'e*] = (n − K)σ² + tr{[R'(X'X)^{-1}R]^{-1}R'(X'X)^{-1}X'σ²IX(X'X)^{-1}R}

Carry the σ² outside the trace operator and, after cancellation of the products of matrices with their inverses, we obtain

E[e*'e*] = (n − K)σ² + σ²tr(I_J) = σ²(n − K + J)

Dividing by (n − K + J) shows that s² = e*'e*/(n − K + J) is unbiased for σ².
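A hedged numerical illustration (not in the original solution) using the caschool restriction from question 2(f), where n = 420, K = 4, and J = 1, so the restricted SSR is divided by n − K + J = 417; e_star is an assumed helper name:

constraint 2 str = el_pct
quietly cnsreg testscr str el_pct meal_pct, constraints(2)
predict double e_star, residuals
gen double e_star2 = e_star^2
quietly summarize e_star2
display r(sum)/(420 - 4 + 1)    // unbiased estimate of sigma^2 under H0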
5. (4p) Show that, in the multiple regression of y on a constant, x₁, and x₂, imposing the restriction β₁ + β₂ = 1 leads to the regression of y − x₁ on a constant and x₂ − x₁.

Solution: For convenience, we put the constant term last instead of first in the parameter vector, b = (β₁, β₂, α)'. The constraint is R'b − q = 0 where R' = [1 1 0] and q = 1. Partitioning R' = [R₁ R₂] with R₁ = [1] and R₂ = [1 0], we get

β₁ = R₁^{-1}[q − R₂(β₂, α)'] = [1]^{-1}[1 − β₂] = 1 − β₂

Thus y = (1 − β₂)x₁ + β₂x₂ + α + ε, or y − x₁ = β₂(x₂ − x₁) + α + ε.
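A minimal Stata sketch of the equivalence (assuming generic variables y, x1, and x2 exist; the transformed-variable names are illustrative):

* restricted fit via the transformed regression
gen y_new  = y - x1
gen x2_new = x2 - x1
reg y_new x2_new
* the slope estimates beta2, and the implied beta1 = 1 - _b[x2_new]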
6. (Practice question. This question will not be graded, so you do not need to submit solutions. The answer to this question will be covered in recitations this week.)

Prove that

(n − k)s²/σ² ~ χ²_{n−k}

where s² is the sample variance of the errors in y = Xβ + ε with ε ~ N(0, σ²I).

Solution: Since ε ~ N(0, σ²I), we have ε/σ ~ N(0, I). If Z is normal and M is idempotent with rank n − k, then Z'MZ ~ χ²_{n−k}. Here M = I − X(X'X)^{-1}X' is idempotent and the rank of M is n − k; then, since (n − k)s² = ε'Mε,

(n − k)s²/σ² = ε'Mε/σ² = (ε/σ)'M(ε/σ) = Z'MZ ~ χ²_{n−k}

7. (15p) In matrix form, the model is

y = Xβ + e

where E(e|X) = 0.

(a) (3p) Prove that X'ê = 0, where ê = y − Xβ̂ and β̂ is the least squares estimator.

(b) (3p) Taking the same model as in part (a), prove that ê = Me, where M = I − X(X'X)^{-1}X'.

(c) (5p) Prove that ŷ'ê = 0 given the model in part (a) and ŷ = Xβ̂. Also prove that (ŷ − i·ȳ)'ê = 0, where i is an n × 1 vector of ones and ȳ is the sample mean of y_i, i = 1, 2, …, n.

(d) (4p) Assume that the true model is

y = X₁β₁ + X₂β₂ + u,   (1)

where E(u|X₁, X₂) = 0, but we mistakenly fit

y = X₁β₁ + e,   (2)

where the error term in (2) is e = X₂β₂ + u.

Is β̂₁ from (2) an unbiased estimate in all scenarios? Prove your answer.
Solutions:
(a) To prove X'ê = 0, note that

X'ê = X'(y − Xβ̂)
    = X'y − X'Xβ̂
    = X'y − X'X[(X'X)^{-1}X'y]
    = X'y − X'y
    = 0

(b) To prove ê = Me, note that

ê = y − Xβ̂
  = y − X[(X'X)^{-1}X'y]
  = [I − X(X'X)^{-1}X']y
  = My
  = M(Xβ + e)
  = Me

since MX = 0.

(c) To prove ŷ'ê = 0, note that ê = Me = My; then

ŷ'ê = ŷ'My = (Xβ̂)'My = β̂'X'My = 0

since MX = 0 (and hence X'M = 0).

Also, to prove that (ŷ − i·ȳ)'ê = 0, note that

(ŷ − i·ȳ)' = [ŷ₁ − ȳ  …  ŷ_n − ȳ]

and

ê = y − Xβ̂ = y − ŷ = (y₁ − ŷ₁, …, y_n − ŷ_n)'

Then

(ŷ − i·ȳ)'ê = Σᵢ(ŷᵢ − ȳ)(yᵢ − ŷᵢ)
            = Σᵢŷᵢ(yᵢ − ŷᵢ) − ȳΣᵢ(yᵢ − ŷᵢ)
            = ŷ'ê − ȳ·i'ê
            = 0

since ŷ'ê = 0 from above, and i'ê = 0 when X contains a constant (it is the corresponding element of X'ê = 0).

(d) If we estimate (2) using OLS,

β̂₁ = (X₁'X₁)^{-1}X₁'y
    = (X₁'X₁)^{-1}X₁'(X₁β₁ + X₂β₂ + u)
    = β₁ + (X₁'X₁)^{-1}X₁'X₂β₂ + (X₁'X₁)^{-1}X₁'u

so

E[β̂₁] = β₁ + E[(X₁'X₁)^{-1}X₁'X₂]β₂ + E[(X₁'X₁)^{-1}X₁'u]

Using the law of iterated expectations (LIE),

E[(X₁'X₁)^{-1}X₁'u] = E[E{(X₁'X₁)^{-1}X₁'u|X₁, X₂}]
                     = E[(X₁'X₁)^{-1}X₁'E{u|X₁, X₂}]
                     = E[(X₁'X₁)^{-1}X₁' · 0]
                     = 0

However, E[(X₁'X₁)^{-1}X₁'X₂]β₂ ≠ 0 in general. Therefore, β̂₁ is biased unless X₁'X₂ = 0 (the omitted regressors are orthogonal to X₁) or β₂ = 0.
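A hedged empirical sanity check of the identities in (a)-(c), using the caschool regression from question 2 (any fitted OLS regression with a constant would do):

quietly reg testscr str el_pct meal_pct
predict double ehat, residuals
predict double yhat, xb
* (a) and (c): residuals are orthogonal to the regressors and fitted values,
* and have mean zero, so all of these correlations are zero up to rounding
correlate ehat str el_pct meal_pct yhat
quietly summarize ehat
display r(mean)      // i'e_hat = 0 because the regression includes a constant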
8. (4p) Consider the population regression of test scores against income and the square of income, in non-matrix form as follows:

TestScore_i = β₀ + β₁Income_i + β₂Income_i² + ε_i

Explain how to test the null hypothesis that the relationship between test scores and income is linear against the alternative that it is quadratic. Write the null hypothesis in matrix form and define R, q, and j.
Solutions:
The null hypothesis is

H₀: R'β = q

versus

H₁: R'β ≠ q

with

R' = (0  0  1),  q = 0,  β = (β₀, β₁, β₂)'

The heteroskedasticity-robust F-statistic testing the null hypothesis is

F = (R'β̂ − q)'[R'Σ̂_β̂ R]^{-1}(R'β̂ − q)/j

with j = 1, where Σ̂_β̂ is the heteroskedasticity-robust estimator of the covariance matrix of β̂. Under the null hypothesis,

F →d F_{1,∞}

We reject the null hypothesis if the calculated F-statistic is larger than the critical value of the F_{1,∞} distribution at a given significance level.
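With the caschool data, a hedged Stata version of this test (avginc is the district income variable from the table in 2(e); avginc2 is an assumed helper name):

gen avginc2 = avginc^2
reg testscr avginc avginc2, r     // robust, as the F statistic above requires
test avginc2 = 0                  // j = 1 restriction: linear vs. quadratic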
9. (12p) Suppose that a sample of n = 20 households has the sample means and sample covariances below for a dependent variable and two regressors:

                         Sample Covariances
         Sample Means      Y       X1      X2
  Y          6.39         0.26    0.22    0.32
  X1         7.24                 0.80    0.28
  X2         4.00                         2.40
(a) Calculate the OLS estimates of β₀, β₁, and β₂. Calculate the sample variance of the residuals, s²_û. Calculate the R² of the regression. (You answered this part in problem set #2 already; you will need that answer to be able to answer part (b).)

(b) (12p) Suppose that all six least squares assumptions are satisfied. Test the hypothesis that β₁ = 0 at the 5% significance level.
Solutions:
(a) The sample size is n = 20. We write the regression in matrix form as

Y = Xβ + U

with

X = [1  X_{1,1}  X_{2,1}; 1  X_{1,2}  X_{2,2}; …; 1  X_{1,n}  X_{2,n}],
Y = (Y₁, Y₂, …, Y_n)',  U = (u₁, u₂, …, u_n)',  β = (β₀, β₁, β₂)'

The OLS estimator of the coefficient vector is

β̂ = (X'X)^{-1}X'Y

with

X'X = [n  ΣX_{1i}  ΣX_{2i}; ΣX_{1i}  ΣX_{1i}²  ΣX_{1i}X_{2i}; ΣX_{2i}  ΣX_{1i}X_{2i}  ΣX_{2i}²]

and

X'Y = (ΣY_i, ΣX_{1i}Y_i, ΣX_{2i}Y_i)'

(all sums run over i = 1, …, n).
Note that

ΣX_{1i} = nX̄₁ = 20 × 7.24 = 144.8,  ΣX_{2i} = nX̄₂ = 20 × 4.00 = 80.0,  ΣY_i = nȲ = 20 × 6.39 = 127.8

By the definition of the sample variance,

s²_Y = (1/(n−1))Σ(Y_i − Ȳ)² = (1/(n−1))ΣY_i² − (n/(n−1))Ȳ²

we know

ΣY_i² = (n−1)s²_Y + nȲ²

Thus, using the sample means and sample variances, we can get

ΣX_{1i}² = (n−1)s²_{X1} + nX̄₁² = (20−1) × 0.80 + 20 × 7.24² = 1063.6

and

ΣX_{2i}² = (n−1)s²_{X2} + nX̄₂² = (20−1) × 2.40 + 20 × 4.00² = 365.6

By the definition of the sample covariance,

s_{XY} = (1/(n−1))Σ(X_i − X̄)(Y_i − Ȳ) = (1/(n−1))ΣX_iY_i − (n/(n−1))X̄Ȳ

we know

ΣX_iY_i = (n−1)s_{XY} + nX̄Ȳ

Thus, using the sample means and sample covariances, we can get

ΣX_{1i}Y_i = (n−1)s_{X1Y} + nX̄₁Ȳ = (20−1) × 0.22 + 20 × 7.24 × 6.39 = 929.45
ΣX_{2i}Y_i = (n−1)s_{X2Y} + nX̄₂Ȳ = (20−1) × 0.32 + 20 × 4.00 × 6.39 = 517.28

and

ΣX_{1i}X_{2i} = (n−1)s_{X1X2} + nX̄₁X̄₂ = (20−1) × 0.28 + 20 × 7.24 × 4.00 = 584.52
Therefore we have

X'X = [20.0  144.8  80.0; 144.8  1063.6  584.52; 80.0  584.52  365.6]
X'Y = (127.8, 929.45, 517.28)'

The inverse of the matrix X'X is

(X'X)^{-1} = [3.5373  −0.4631  −0.0337; −0.4631  0.0684  −0.0080; −0.0337  −0.0080  0.0229]

The OLS estimator of the coefficient vector is

β̂ = (X'X)^{-1}X'Y = [3.5373  −0.4631  −0.0337; −0.4631  0.0684  −0.0080; −0.0337  −0.0080  0.0229](127.8, 929.45, 517.28)' = (4.2063, 0.2520, 0.1033)'

That is, β̂₀ = 4.2063, β̂₁ = 0.2520, and β̂₂ = 0.1033.

With the number of slope coefficients k = 2, the squared standard error of the regression, s²_û, is

s²_û = (1/(n−k−1))Σ_{i=1}^n û_i² = (1/(n−k−1))Û'Û

The OLS residuals are Û = Y − Ŷ = Y − Xβ̂, so

Û'Û = (Y − Xβ̂)'(Y − Xβ̂) = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂

We have

Y'Y = ΣY_i² = (n−1)s²_Y + nȲ² = (20−1) × 0.26 + 20 × 6.39² = 821.58
β̂'X'Y = (4.2063, 0.2520, 0.1033)(127.8, 929.45, 517.28)' = 825.22
β̂'X'Xβ̂ = (4.2063, 0.2520, 0.1033)[20  144.8  80.0; 144.8  1063.6  584.52; 80.0  584.52  365.6](4.2063, 0.2520, 0.1033)' = 832.23

Therefore the sum of squared residuals is

SSR = Σû_i² = Û'Û = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂ = 821.58 − 2 × 825.22 + 832.23 = 3.37

The squared standard error of the regression, s²_û, is

s²_û = Û'Û/(n − k − 1) = 3.37/(20 − 2 − 1) = 0.1982

With the total sum of squares

TSS = Σ(Y_i − Ȳ)² = (n−1)s²_Y = (20−1) × 0.26 = 4.94

the R² of the regression is

R² = 1 − SSR/TSS = 1 − 3.37/4.94 = 0.3178
(b) When all six assumptions in Key Concept 16.1 hold, we can use the homoskedasticity-only estimator Σ̂_β̂ of the covariance matrix of β̂, conditional on X, which is

Σ̂_β̂ = (X'X)^{-1}s²_û = [3.5373  −0.4631  −0.0337; −0.4631  0.0684  −0.0080; −0.0337  −0.0080  0.0229] × 0.1982
     = [0.7011  −0.0918  −0.0067; −0.0918  0.0136  −0.0016; −0.0067  −0.0016  0.0045]

The homoskedasticity-only standard error of β̂₁ is

SE(β̂₁) = √0.0136 = 0.1166

The t-statistic testing the hypothesis β₁ = 0 has a t_{n−k−1} = t₁₇ distribution under the null hypothesis. The value of the t-statistic is

t = β̂₁/SE(β̂₁) = 0.2520/0.1166 = 2.1612

and the 5% two-sided critical value is 2.11. Thus we can reject the null hypothesis β₁ = 0 at the 5% significance level.
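The matrix arithmetic in part (a) and (b) can be reproduced in Stata's matrix language; a minimal sketch, not part of the original solution:

matrix XX = (20, 144.8, 80 \ 144.8, 1063.6, 584.52 \ 80, 584.52, 365.6)
matrix XY = (127.8 \ 929.45 \ 517.28)
matrix b = invsym(XX)*XY
matrix list b                          // (4.2063, 0.2520, 0.1033)'
matrix V = invsym(XX)*0.1982           // homoskedasticity-only covariance
display sqrt(el(V,2,2))                // SE(beta1_hat), about 0.1166
display el(b,2,1)/sqrt(el(V,2,2))      // t statistic, about 2.16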