
The Multivariate Linear Regression Analysis and Inference
Zhenhao Gong, University of Connecticut

Welcome
This course is designed to be:
1. Introductory
2. Led by interesting questions and applications
3. Less math, useful, and fun!
Most important:
Feel free to ask any questions!
Enjoy!

Multivariate regression analysis

Unbiasedness and Consistency
Two properties are often used to decide whether an estimator is good or not:
- Unbiasedness: An estimator is unbiased if the mean of its sampling distribution equals the true parameter value.
- Consistency: An estimator is consistent if, as the sample size increases, its sampling distribution becomes increasingly concentrated at the true parameter value.
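To make these two properties concrete, here is a minimal simulation sketch (assuming Python with NumPy, which the slides do not prescribe). It repeatedly draws samples from Y = 1 + 2X + u and computes the OLS slope: the mean of the estimates stays near 2 for every sample size (unbiasedness), while their spread shrinks as n grows (consistency).

```python
import numpy as np

rng = np.random.default_rng(0)
beta1 = 2.0  # true slope in Y = 1 + 2*X + u

def ols_slope(n):
    """Draw one sample of size n and return the OLS slope estimate."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y = 1.0 + beta1 * x + u
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

for n in (20, 200, 2000):
    estimates = np.array([ols_slope(n) for _ in range(5000)])
    # Mean stays near 2 for every n (unbiasedness); the spread shrinks as n grows (consistency).
    print(n, round(estimates.mean(), 3), round(estimates.std(), 3))
```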

Omitted Variable Bias
The OLS estimator will have omitted variable bias when two conditions hold:
- The omitted variable is a determinant of the dependent variable
- The omitted variable is correlated with the included regressor
Remark: Omitted variable bias means that the first least squares assumption, E(ui|Xi) = 0, does not hold.

Bias Formula
Let the correlation between Xi and ui be corr(Xi, ui) = ρxu. Then the OLS estimator has the limit
β̂1 →p β1 + ρxu(σu/σx).
That is, as the sample size increases, β̂1 is close to β1 + ρxu(σu/σx) with increasingly high probability (β̂1 is biased and inconsistent).
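The bias formula can be checked numerically. The sketch below (Python/NumPy, with made-up parameter values) omits a variable W that both determines Y and is correlated with X; the slope from regressing Y on X alone lands near the limit β1 + ρxu(σu/σx) rather than near β1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta1 = 1.0   # true effect of X on Y

x = rng.normal(size=n)
w = 0.5 * x + rng.normal(size=n)               # omitted variable, correlated with X
y = beta1 * x + 2.0 * w + rng.normal(size=n)   # W is also a determinant of Y

# Short regression of Y on X alone; the error term u = 2W + noise is correlated with X.
slope_short = np.cov(x, y, bias=True)[0, 1] / np.var(x)

u = y - beta1 * x                              # error term when only X is included
rho_xu = np.corrcoef(x, u)[0, 1]
predicted_limit = beta1 + rho_xu * u.std() / x.std()

print(slope_short, predicted_limit)            # both approximately 2.0, well above beta1 = 1.0
```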

Summary
- Omitted variable bias is a problem whether the sample size is large or small.
- Whether this bias is large or small in practice depends on the correlation ρxu between the regressor and the error term. The larger |ρxu| is, the larger the bias.
- The direction of the bias in β̂1 depends on whether X and u are positively or negatively correlated.
Question: What can we do about omitted variable bias?

The Multiple Regression Model
Consider the case of two regressors:
Yi = β0 + β1X1i + β2X2i + ui,  i = 1, 2, ···, n
- X1, X2 are the two independent variables (regressors)
- β0 = unknown population intercept
- β1 = effect on Y of a change in X1, holding X2 constant
- β2 = effect on Y of a change in X2, holding X1 constant
- ui = the regression error (omitted factors)
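As an illustration of estimating this model, here is a minimal sketch assuming Python with statsmodels; the data are simulated with known coefficients rather than taken from any real application.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)   # simulated data with known coefficients

X = sm.add_constant(np.column_stack([x1, x2]))        # design matrix: [1, X1, X2]
fit = sm.OLS(y, X).fit()
print(fit.params)     # estimates of beta0, beta1, beta2 (close to 1.0, 2.0, -0.5)
print(fit.summary())  # coefficients, standard errors, t-statistics, R-squared, ...
```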

The OLS Estimators
The OLS estimators β̂0, β̂1, and β̂2 solve:
minβ0,β1,β2 Σ(i=1 to n) ui² = minβ0,β1,β2 Σ(i=1 to n) [Yi − (β0 + β1X1i + β2X2i)]².
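For intuition about this minimization, the following sketch (Python/NumPy, simulated placeholder data) uses the standard closed-form result, not derived in these slides, that the minimizer satisfies the normal equations X'X b = X'y, and checks that perturbing the solution increases the sum of squared residuals.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)    # simulated placeholder data
X = np.column_stack([np.ones(n), x1, x2])              # columns: intercept, X1, X2

# The minimizer of the sum of squared residuals satisfies the normal equations X'X b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Any other coefficient vector yields a larger sum of squared residuals.
ssr = lambda b: np.sum((y - X @ b) ** 2)
print(beta_hat, ssr(beta_hat) < ssr(beta_hat + 0.1))   # True
```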

Measures of Fit
Three commonly used summary statistics in multiple regression are the standard error of the regression (SER), the regression R², and the adjusted R² (also known as R̄²).
- R̄² = R² with a degrees-of-freedom correction that adjusts for estimation uncertainty; R̄² < R²
Remark: The R² always increases when you add another regressor, which is a bit of a problem for a measure of "fit".

The Least Squares Assumptions
For the multiple linear regression model
Yi = β0 + β1X1i + β2X2i + ··· + βkXki + ui,  i = 1, 2, ···, n
- Assumption 1: The conditional distribution of ui given X1i, ···, Xki has a mean of zero: E(ui|X1i, ···, Xki) = 0. Failure of this condition leads to omitted variable bias.
- Assumption 2: (X1i, ···, Xki, Yi), i = 1, ···, n, are independent and identically distributed (i.i.d.).
- Assumption 3: Large outliers are unlikely.
- Assumption 4: No perfect multicollinearity.

Statistical Inference in Multiple Regression

Inference for a single coefficient
Hypothesis tests and confidence intervals for a single coefficient in multiple regression follow the same logic and recipe as for the slope coefficient in a single-regressor model.
- When n is large, β̂1 is normally distributed (CLT). That is,
  (β̂1 − E(β̂1)) / √Var(β̂1) ∼ N(0, 1).
- Thus hypotheses on β1 can be tested using the usual t-statistic, and 95% confidence intervals are constructed as {β̂1 ± 1.96 × SE(β̂1)}.
- So too for β2, ···, βk!

Tests of Joint Hypotheses
Consider the multiple regression model:
Wagei = β0 + β1Educi + β2Experi + ui
The null hypothesis that "education and work experience don't matter," and the alternative that they do, correspond to:
H0: β1 = 0 and β2 = 0  vs.  H1: either β1 ≠ 0 or β2 ≠ 0, or both

F-statistic
The F-statistic tests all parts of a joint hypothesis at once. The formula for the special case of the joint hypothesis β1 = 0 and β2 = 0 in a regression with two regressors is
F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂t1,t2²),
where ρ̂t1,t2 estimates the correlation between t1 and t2, and
t1 = (β̂1 − 0)/SE(β̂1),  t2 = (β̂2 − 0)/SE(β̂2).

F-statistic (cont.)
When t1 and t2 are uncorrelated, ρ̂t1,t2 →p 0, and in large samples the formula becomes
F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂t1,t2²) ≈ (1/2)(t1² + t2²).
- F is then the average of two squared standard normal random variables. The sum of two independent squared standard normals has the chi-squared distribution with 2 degrees of freedom, so F is distributed as χ²2/2.

F-statistic (cont.)
In large samples, F is distributed as χ²q/q, where q is the number of restrictions being tested (rule of thumb: n ≥ 100).

Extension: nonlinear regression models
If the relation between Y and X is nonlinear:
- The effect on Y of a change in X depends on the value of X; that is, the marginal effect of X is not constant.
- A linear regression is mis-specified: the functional form is wrong.
- The estimator of the effect on Y of X is biased.
- The solution is to estimate a regression function that is nonlinear in X.

Difference in slopes
- In Figure (a), the population regression function has a constant slope.
- In Figure (b), the slope of the population regression function depends on the value of X1.

Real data
The TestScore – average district income relation looks nonlinear, so the linear OLS regression line does not adequately describe the relationship between these variables.

Quadratic Regression Model
A quadratic population regression model relating test scores and income can be written as
TestScorei = β0 + β1Incomei + β2Income²i + ui,
- β0, β1, and β2 are coefficients
- Income²i is the square of income in the i-th district
- ui is the error term, representing all other factors
- the population regression function is f(Incomei) = β0 + β1Incomei + β2Income²i
The quadratic OLS regression function fits the data better than the linear OLS regression function.
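A minimal sketch of estimating the quadratic specification (assuming Python with statsmodels; the income and test_score values below are simulated placeholders with made-up coefficients, not the actual district data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 400
income = rng.uniform(5, 55, size=n)           # hypothetical district income (thousands of dollars)
test_score = (620 + 4.0 * income - 0.05 * income**2
              + rng.normal(scale=9, size=n))  # made-up coefficients, for illustration only

X_lin = sm.add_constant(income)                                  # linear specification
X_quad = sm.add_constant(np.column_stack([income, income**2]))   # quadratic specification

lin = sm.OLS(test_score, X_lin).fit()
quad = sm.OLS(test_score, X_quad).fit()
print(lin.rsquared_adj, quad.rsquared_adj)  # here the quadratic fit has the higher adjusted R-squared
print(quad.params)                          # beta0, beta1 (Income), beta2 (Income^2)
```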
Quadratic Regression Model (cont.)
The quadratic population regression model above is in fact a version of the multiple regression model with two regressors: the first regressor is Income, and the second regressor is Income². So, just as in the multiple regression model, we can estimate and test β0, β1, and β2 in the quadratic population regression model using the OLS methods we have learned before.

Log function
(figure omitted)

Log regression models
- The coefficients can be estimated by OLS.
- Hypothesis tests and confidence intervals are computed as usual.
- The choice of specification should be guided by judgment (which interpretation makes the most sense in your application?), tests, and plotting predicted values.

Linear-log model
Linear-log regression model:
Yi = β0 + β1 ln(Xi) + ui,  i = 1, ..., n
Interpretation: a 1% change in X (∆X/X = 0.01) is associated with a change in Y of 0.01β1.
∆Y = [β0 + β1 ln(X + ∆X)] − [β0 + β1 ln(X)] ≅ β1(∆X/X) = 0.01β1.
Remark: ln(x + ∆x) − ln(x) ≅ ∆x/x when ∆x/x is small.

Log-linear model
Log-linear regression model:
ln(Yi) = β0 + β1Xi + ui,  i = 1, ..., n
Interpretation: a one-unit change in X (∆X = 1) is associated with a (100 × β1)% change in Y.
ln(Y + ∆Y) − ln(Y) ≅ ∆Y/Y = β1∆X
Remark: The log-linear regression model is a linear regression model!

Log-log model
Log-log regression model:
ln(Yi) = β0 + β1 ln(Xi) + ui,  i = 1, ..., n
Interpretation: a 1% change in X (∆X/X = 1%) is associated with a β1% change in Y.
ln(Y + ∆Y) − ln(Y) ≅ ∆Y/Y = β1(∆X/X)
Thus in the log-log specification, β1 is the ratio of the percentage change in Y to the percentage change in X, i.e., an elasticity (price elasticity!). The log-log specification fits better than the log-linear specification.
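As a sketch of the log-log specification (again Python with statsmodels; the price and quantity variables and the elasticity of −0.7 are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 400
price = rng.uniform(1.0, 100.0, size=n)                  # hypothetical price (must be positive)
quantity = np.exp(2.0 - 0.7 * np.log(price)
                  + rng.normal(scale=0.2, size=n))       # made-up elasticity of -0.7

X = sm.add_constant(np.log(price))
loglog = sm.OLS(np.log(quantity), X).fit()
print(loglog.params[1])   # close to -0.7: a 1% price increase goes with roughly a 0.7% fall in quantity
```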