
BIA 652
Simple + Multiple Linear Regression

Simple Regression

Introduction to Regression Analysis
• Regression analysis is used to:
– Predict the value of a dependent variable based on the value of at least one independent variable
– Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to explain (also called the endogenous variable)
Independent variable: the variable used to explain the dependent variable
(also called the exogenous variable)

Aims
• Describe the relationship between an independent variable X, and a continuous dependent variable Y as a straight line in R2
– Two Cases:
• Fixed X: values of X are preselected by the investigator
• Variable X: a random sample of (X, Y) pairs
• Draw inferences regarding the relationship
• Predict the value of Y for a given X

Simple Linear Regression Model
The population regression model:
yi = β0 + β1xi + εi

where:
– yi = dependent variable
– β0 = population Y intercept
– β1 = population slope coefficient
– xi = independent variable
– εi = random error term

β0 + β1xi is the linear component; εi is the random error component.

Linear Regression Assumptions
• The true relationship form is linear (Y is a linear function of X, plus random error)
• The error terms, εi are independent of the x values
• The error terms are random variables with mean 0 and constant variance, σ² (the uniform variance property is called homoscedasticity)
• The random error terms, εi, are not correlated with one another, so that

E[εi] = 0 and E[εi²] = σ² for i = 1, …, n
E[εiεj] = 0 for all i ≠ j

Graphically (p 85)

Simple Linear Regression Model (continued)

Yi = β0 + β1Xi + εi

[Figure: the regression line with intercept β0 and slope β1, showing the observed value of Y at xi, the predicted value of Y at xi, and the random error εi for that Xi value.]

α and β (p 86)

Simple Linear Regression Equation

The simple linear regression equation provides an estimate of the population regression line:

ŷi = b0 + b1xi

where ŷi = estimated (or predicted) y value for observation i, b0 = estimate of the regression intercept, b1 = estimate of the regression slope, and xi = value of x for observation i.

The individual random error terms ei have a mean of zero:

ei = (yi − ŷi) = yi − (b0 + b1xi)

11.3
Least Squares Coefficient Estimators
• b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared residuals (errors), SSE:
n min SSEmine2
i i1

ˆ
min (yy)2
ii
min[y(b bx)]2 i01i
Differential calculus is used to obtain the coefficient
Copyright © 2013 Pearson estimators b0 and b1 that minimize SSE Education, Inc. Publishing as Ch. 11-11
Prentice Hall

11.6
Prediction
• The regression equation can be used to predict a value for y, given a particular x
• For a specified value, xn+1 , the predicted value is
ŷn+1 = b0 + b1xn+1

Least Squares Coefficient Estimators
• The slope coefficient estimator is

b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = Cov(x, y) / sx² = r · (sy / sx)

• And the constant or y-intercept is

b0 = ȳ − b1x̄

• The regression line always goes through the mean (x̄, ȳ)
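As a worked illustration of these formulas, here is a minimal NumPy sketch. The data are hypothetical (they are not the text's lung-function data), and all variable names are illustrative.

```python
import numpy as np

# Hypothetical data for illustration (not the textbook's data set)
x = np.array([60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0])
y = np.array([2.5, 2.8, 3.0, 3.4, 3.6, 3.9, 4.1])

# Least squares estimators from the formulas above
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Equivalent form: b1 = r * (sy / sx)
r = np.corrcoef(x, y)[0, 1]
assert np.isclose(b1, r * y.std(ddof=1) / x.std(ddof=1))

# Prediction at a new x value: y-hat = b0 + b1 * x
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, prediction at x = 65: {b0 + b1 * 65:.3f}")
```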

Analysis of Variance (continued)

SST = Σ(yi − ȳ)²   (total variation)
SSR = Σ(ŷi − ȳ)²   (explained variation)
SSE = Σ(yi − ŷi)²   (unexplained variation)

[Figure: at an observation (xi, yi), the total deviation of yi from ȳ splits into the explained part (ŷi − ȳ) and the unexplained part (yi − ŷi).]

11.4
Explanatory Power of a Linear Regression Equation
• Total variation is made up of two parts:

SST = SSR + SSE

Total Sum of Squares: SST = Σ(yi − ȳ)²
Regression Sum of Squares: SSR = Σ(ŷi − ȳ)²
Error (residual) Sum of Squares: SSE = Σ(yi − ŷi)²

where:
ȳ = average value of the dependent variable
yi = observed values of the dependent variable
ŷi = predicted value of y for the given xi value
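Continuing the NumPy sketch from earlier (same x, y, b0, b1), the decomposition can be verified numerically:

```python
import numpy as np

# Fitted values for the observed x's (b0, b1, x, y from the earlier sketch)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares
sse = np.sum((y - y_hat) ** 2)         # error (residual) sum of squares

assert np.isclose(sst, ssr + sse)      # SST = SSR + SSE
print(f"R^2 = {ssr / sst:.3f}")        # coefficient of determination
```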

Proof

Hypothesis Test for Population Slope Using the F Distribution
• F test statistic:

F = MSR / MSE

where MSR = SSR / k and MSE = SSE / (n − k − 1)

F follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom (k = the number of independent variables in the regression model).
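A sketch of this test using the sums of squares computed above; SciPy is assumed to be available, and k = 1 for simple regression.

```python
from scipy import stats

n, k = len(y), 1                 # one independent variable in simple regression
msr = ssr / k                    # mean square due to regression
mse = sse / (n - k - 1)          # mean square error
F = msr / mse
p_value = stats.f.sf(F, k, n - k - 1)   # upper-tail area of the F distribution
print(f"F = {F:.2f}, p = {p_value:.4g}")
```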

Computer Analysis

Results:
• Estimates of slope (β1) and intercept (β0), using least squares
• Residual mean square = estimate of variance (S²)
• Test if β = β0
• Usually, test β = 0, i.e., X has no effect on Y

Hypothesis Test for Population Slope
Using the F Distribution
(continued)

• An alternate test for the hypothesis that the slope is zero:

H0: β1 = 0
H1: β1 ≠ 0

• Use the F statistic: F = MSR / MSE = SSR / se²
• The decision rule is: reject H0 if F ≥ F1,n−2,α

Steps in Simple Regression
1. State the research hypothesis.
2. State the null hypothesis
3. Gather the data
4. Assess each variable separately first (obtain measures of central tendency and dispersion; frequency distributions; graphs); is the variable normally distributed?
5. Calculate the regression equation from the data
6. Calculate and examine appropriate measures of association and tests of statistical significance for each coefficient and for the equation as a whole
7. Accept or reject the null hypothesis
8. Reject or accept the research hypothesis
9. Explain the practical implications of the findings

Effect of Outliers (p 102)

Leverage

Influence Measures
• Cook’s distance: “distance” between B with and without the ith observation
• DFFITS: “distance” between Ŷ with and without the ith observation

Cook’s Distance

Influential observations
An observation is influential if:
– It is an outlier in X and Y
– Cook’s distance > F0.5(P+1, N−P−1)
– DFFITS > 2·√[(P+1) / (N−P−1)]

Try the analysis with and without influential observations and compare results.
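Both measures are available from statsmodels; a hedged sketch, reusing the hypothetical x and y arrays from the earlier block.

```python
import numpy as np
import statsmodels.api as sm

X = sm.add_constant(x)                  # design matrix: intercept + x
results = sm.OLS(y, X).fit()
influence = results.get_influence()

cooks_d = influence.cooks_distance[0]   # Cook's distance per observation
dffits = influence.dffits[0]            # DFFITS per observation

# Flag observations exceeding the DFFITS cutoff above (P = 1 predictor)
P, N = 1, len(y)
cutoff = 2 * np.sqrt((P + 1) / (N - P - 1))
print(np.where(np.abs(dffits) > cutoff)[0])
```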


Confidence & Prediction Intervals

• Confidence interval (CI) for the mean of Y
• Prediction interval (PI) for an individual Y

The PI is wider than the CI.

Confidence Interval for the Average Y, Given X
Confidence interval estimate for the expected value of y given a particular xn+1:

ŷn+1 ± tn−2,α/2 · se · √[ 1/n + (xn+1 − x̄)² / Σ(xi − x̄)² ]

Notice that the formula involves the term (xn+1 − x̄)², so the size of the interval varies according to the distance of xn+1 from the mean x̄.

Prediction Interval for an Individual Y, Given X
Prediction interval estimate for an actual observed value of y given a particular xn+1:

ŷn+1 ± tn−2,α/2 · se · √[ 1 + 1/n + (xn+1 − x̄)² / Σ(xi − x̄)² ]

The extra 1 under the square root adds to the interval width to reflect the added uncertainty of predicting an individual case.
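One way to compute both intervals, continuing from the statsmodels fit (`results`) in the earlier sketch; the new x value is illustrative.

```python
import numpy as np

X_new = np.array([[1.0, 65.0]])          # intercept column + new x value
pred = results.get_prediction(X_new)
frame = pred.summary_frame(alpha=0.05)   # 95% intervals

print(frame[['mean', 'mean_ci_lower', 'mean_ci_upper']])  # CI for E(Y | x)
print(frame[['obs_ci_lower', 'obs_ci_upper']])             # PI for individual Y
```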

Estimating Mean Values and Predicting Individual Values
Goal: form intervals around ŷ to express uncertainty about the value of y for a given xi

[Figure: the fitted line ŷ = b0 + b1xi, with the confidence interval for the expected value of y at xi nested inside the wider prediction interval for a single observed y at xi.]

Relevant Data Range
• When using a regression model for prediction, only predict within the relevant range of the data
• It is risky to try to extrapolate far beyond the range of observed x values

[Figure: scatter plot of house price ($1000s) versus square feet, with the relevant data range marked.]

Multiple Regression

Multiple Regression

Adjusted R-Sqr

VIF

VIF

Correlation Coefficient – ρ
• The correlation coefficient ρ measures the strength of linear association between X and Y in the population.
• It is estimated by the sample correlation coefficient r.

11.7
Correlation Analysis
• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
– Correlation is only concerned with the strength of the relationship
– No causal effect is implied with correlation
– Correlation was first presented in Chapter 4

Correlation Analysis
• The population correlation coefficient is denoted ρ (the Greek letter rho)
• The sample correlation coefficient is

r = sxy / (sx·sy)

where sxy = Σ(xi − x̄)(yi − ȳ) / (n − 1)


Calculating the value of ρ

σ² = σy²(1 − ρ²)  ⇒  σ = σy√(1 − ρ²)  ⇒  ρ² = (σy² − σ²) / σy²

100(1 − ρ²)^(1/2) = % of standard deviation NOT “explained” by X

Graphically (p 92)



Interpretation of ρ
• ρ² = (reduction in variance of Y associated with knowledge of X) / (original variance of Y)
• 100ρ² = % of variance of Y “explained by X”
• Caveat: correlation vs. causation

Estimating the value of ρ (Pearson’s Correlation Coefficient)
ρ = σXY / (σX σY)

r = SXY / (SX SY)

where SXY = Σ(X − X̄)(Y − Ȳ) / (N − 1)

Interpretation of ρ
ρ        % of variance    % of variance        % of SD        % of SD
         “explained”      not “explained”      “explained”    not “explained”
±0.3          9%               91%                 5%              95%
±0.5         25%               75%                13%              87%
±0.71        50%               50%                29%              71%
±0.95        90%               10%                69%              31%

Test for Zero Population Correlation
• To test the null hypothesis of no linear association,

H0: ρ = 0

the test statistic follows the Student’s t distribution with (n − 2) degrees of freedom:

t = r·√(n − 2) / √(1 − r²)
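A sketch of this test, reusing the hypothetical x and y arrays from the earlier block; SciPy's pearsonr reports the two-sided p-value for H0: ρ = 0, and the manual t statistic matches the formula above.

```python
import numpy as np
from scipy import stats

r, p_value = stats.pearsonr(x, y)             # two-sided test of rho = 0

n = len(x)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)    # same test via the t formula
p_manual = 2 * stats.t.sf(abs(t), df=n - 2)
assert np.isclose(p_value, p_manual)
print(f"r = {r:.3f}, t = {t:.2f}, p = {p_value:.4g}")
```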

Example from Text: Lung Function
• Data from an epidemiological study of households living in four areas with different amounts and types of air pollution (Appendix A)
• Data only on non-smoking fathers
– X = height in inches
– Y = forced expiratory volume in 1 second (FEV1)

Scatter Plot (p 83)

Example Results
• Least Squares Equation: Y = -4.087 + 0.118X
• Correlation r = 0.504
• Test ρ = 0:
– t = 7.1 (p 94), p < 0.0001
– t test can be one or two sided

Analysis of Variance

• SST = total sum of squares
– Measures the variation of the yi values around their mean, ȳ
• SSR = regression sum of squares
– Explained variation attributable to the linear relationship between x and y
• SSE = error sum of squares
– Variation attributable to factors other than the linear relationship between x and y

Coefficient of Determination, R²

• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
• It is also called R-squared and is denoted R²:

R² = SSR / SST = (regression sum of squares) / (total sum of squares),   0 ≤ R² ≤ 1

Correlation and R²

• The coefficient of determination, R², for a simple regression is equal to the simple correlation squared: R² = r²

Examples of Approximate r² Values

• r² = 1: perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X
• 0 < r² < 1: weaker linear relationship between X and Y; some but not all of the variation in Y is explained by variation in X
• r² = 0: no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X)

Estimation of Model Error Variance

• An estimator for the variance of the population model error is

σ̂² = se² = SSE / (n − 2) = Σei² / (n − 2)

• Division by n − 2 instead of n − 1 is because the simple regression model uses two estimated parameters, b0 and b1, instead of one
• se = √(se²) is called the standard error of the estimate

Comparing Standard Errors

• se is a measure of the variation of observed y values from the regression line
• [Figure: two scatter plots contrasting a small se with a large se]
• The magnitude of se should always be judged relative to the size of the y values in the sample data

11.5
Statistical Inference: Hypothesis Tests and Confidence Intervals

• The variance of the regression slope coefficient (b1) is estimated by

sb1² = se² / Σ(xi − x̄)² = se² / [(n − 1)sx²]

where sb1 = estimate of the standard error of the least squares slope, and se = √(SSE / (n − 2)) = standard error of the estimate

ANOVA Overview

Test β = 0

• From the ANOVA table: F = 50.5
– Gives a 2-sided test, p-value < 0.0001
• The one-sided test is: t = F^(1/2) = 7.1
• Same as the test for ρ = 0

Outliers

• Outlier in Y: studentized (or deleted studentized) residual > 2
• Leverage: h = 1/N + (X − X̄)² / Σ(Xi − X̄)²
– X’s far from the mean of X have large leverage (h)
– Observations with large leverage have a large effect on the slope of the line
• Outlier in X if h > 4/N

Residual Analysis

• Residual: e = Y − Ŷ
• Studentized residual = e / [S(1 − h)^(1/2)], where h is called the “leverage”
• Deleted studentized residual = studentized residual with the observation deleted when computing the regression and S
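These quantities are exposed by statsmodels' influence object; a sketch continuing from the fitted `results` above.

```python
import numpy as np

influence = results.get_influence()
leverage = influence.hat_matrix_diag             # h for each observation
student = influence.resid_studentized_internal   # e / [S(1 - h)^(1/2)]
deleted = influence.resid_studentized_external   # observation left out

N = len(y)
print(np.where(np.abs(deleted) > 2)[0])   # outliers in Y
print(np.where(leverage > 4 / N)[0])      # outliers in X (simple regression)
```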

Influential observations
An observation is influential if:
– It is an outlier in X and Y
– Cook’s distance > F0.5(2, N−2)
– DFFITS > 2·√[2 / (N−2)]

Try the analysis with and without influential observations and compare results.

Observations
• Point 1 is an outlier in Y with low leverage
– Impacts the estimate of the intercept but not the slope
– Tends to increase the estimates of S and the SE of B
• Point 2 has high leverage but is not an outlier in Y
– Doesn’t impact the estimate of B or A
• Point 3 has high leverage and is an outlier in Y
– Impacts the values of B, A, and S

Assumptions
• Homogeneity of variance (same σ²)
– Not extremely serious
– Can be achieved through transformations if necessary
• Normal residuals
– Slight departures are OK
– Can use transformations to achieve it
• Randomness
– Violations are serious
– Can use hierarchical models for clustered samples

Checking Assumptions
• Plot residuals vs X or vs the predicted Y to check linearity and homogeneity of variance
• Create normal probability plots of residuals to check for normality
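A minimal sketch of both checks for the fitted model above; matplotlib and statsmodels are assumed available.

```python
import matplotlib.pyplot as plt
import statsmodels.api as sm

resid = results.resid
fitted = results.fittedvalues

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.scatter(fitted, resid)                 # look for curvature or a funnel shape
ax1.axhline(0, color='gray')
ax1.set(xlabel='Predicted Y', ylabel='Residual')

sm.qqplot(resid, line='45', fit=True, ax=ax2)   # normal probability plot
plt.tight_layout()
plt.show()
```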

Residual Plots (p 98)

Transformations (p 105)

Weighted Regression
• If the σ² are not equal, weight each residual in the sum of squares used in the least squares process
• Weight = 1/σ²
• Gives an unbiased estimate with smaller variance

Weighted Regression – Caveat
• Solution: standardize the weights (w) to add up to the sample size (N)
– e.g., N = 5, w = 4, 1, 8, 2, 4; sum of w = 19
– define standardized weight: sw = w·5/19
– sum of sw = 1.05 + 0.26 + 2.11 + 0.53 + 1.05 = 5
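A sketch of the standardization and a weighted fit; the five x/y values are invented for illustration, and the weights are the ones from the slide.

```python
import numpy as np
import statsmodels.api as sm

w = np.array([4.0, 1.0, 8.0, 2.0, 4.0])   # raw weights (proportional to 1/sigma^2)
sw = w * len(w) / w.sum()                  # standardized to sum to N
print(sw.round(2))                          # [1.05 0.26 2.11 0.53 1.05]

# Hypothetical data, for illustration only
x5 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y5 = np.array([1.1, 2.3, 2.9, 4.2, 4.8])
wls = sm.WLS(y5, sm.add_constant(x5), weights=sw).fit()
print(wls.params)                           # intercept and slope
```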

What to watch for
• Need representative sample
• Range of prediction should match observed range in X in sample
• Use of nominal or ordinal, rather than interval or ratio, data
• Errors in variables
• Correlation does not imply causation
• Violation of assumptions
• Influential points
• Appropriate model

Multiple Linear Regression

Keywords for OUTPUT Statement
Keyword: Description

COOKD=names: Cook’s influence statistic
COVRATIO=names: standard influence of observation on covariance of betas
DFFITS=names: standard influence of observation on predicted value
H=names: leverage
LCL=names: lower bound of a 100(1 − α)% confidence interval for an individual prediction; this includes the variance of the error as well as the variance of the parameter estimates
LCLM=names: lower bound of a 100(1 − α)% confidence interval for the expected value (mean) of the dependent variable
PREDICTED | P=names: predicted values
PRESS=names: ith residual divided by (1 − h), where h is the leverage, and where the model has been refit without the ith observation
RESIDUAL | R=names: residuals, calculated as ACTUAL minus PREDICTED
RSTUDENT=names: a studentized residual with the current observation deleted
STDI=names: standard error of the individual predicted value
STDP=names: standard error of the mean predicted value
STDR=names: standard error of the residual
STUDENT=names: studentized residuals, which are the residuals divided by their standard errors
UCL=names: upper bound of a 100(1 − α)% confidence interval for an individual prediction
UCLM=names: upper bound of a 100(1 − α)% confidence interval for the expected value (mean) of the dependent variable

Aims
• Extend simple linear regression to multiple independent variables.
• Describe a linear relationship between:
– A single continuous Y variable, and
– Several X variables
• Draw inferences regarding the relationship
• Predict the value of Y from X1, X2, …, Xp.
• Research question: to what extent does some combination of the IVs predict the DV?
– E.g., to what extent do age, gender, and type/amount of food consumption predict low-density lipid level?

Assumptions
• Level of measurement:
– IVs: two or more, continuous or dichotomous
– DV: continuous
• Sample size: enough cases per IV
• Linearity: are bivariate relationships linear?
• Constant variance (about the line of best fit): homoscedasticity
• Multicollinearity: between the IVs
• Multivariate outliers
• Normality of residuals about the predicted value

Approaches
• Direct: All IVs entered simultaneously
• Forward: IVs entered one by one until there are no significant IVs to be entered.
• Backward: IVs removed one by one until there are no significant IVs to be removed.
• Stepwise: Combination of Forward and Backward
• Hierarchical: IVs entered in steps.

Write ups
• Assumptions: how tested, extent met
• Correlations: what are they, what conclusions
• Regression coefficients: report and interpret
• Conclusions and caveats

Steps in Multiple Regression
1. State the research hypothesis.
2. State the null hypothesis
3. Gather the data
4. Assess each variable separately first (obtain measures of central tendency and dispersion; frequency distributions; graphs); is the variable normally distributed?
5. Assess the relationship of each independent variable, one at a time, with the dependent variable (calculate the correlation coefficient; obtain a scatter plot); are the two variables linearly related?
6. Assess the relationships between all of the independent variables with each other (obtain a correlation coefficient matrix for all the independent variables); are the independent variables too highly correlated with one another?
7. Calculate the regression equation from the data
8. Calculate and examine appropriate measures of association and tests of statistical significance for each coefficient and for the equation as a whole
9. Accept or reject the null hypothesis
10. Reject or accept the research hypothesis
11. Explain the practical implications of the findings
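A hedged sketch of steps 4 through 8 using statsmodels; the data frame, its column names, and the values are hypothetical, loosely echoing the lung-function example.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data (column names and values are illustrative)
df = pd.DataFrame({
    'height': [66, 70, 68, 72, 64, 69, 71, 67],
    'weight': [150, 180, 165, 190, 140, 172, 185, 158],
    'fev1':   [3.2, 4.0, 3.6, 4.3, 3.0, 3.8, 4.1, 3.4],
})

print(df.describe())    # step 4: assess each variable separately
print(df.corr())        # steps 5-6: correlations among all variables

X = sm.add_constant(df[['height', 'weight']])
model = sm.OLS(df['fev1'], X).fit()   # step 7: the regression equation
print(model.summary())                # step 8: coefficients, t tests, F, R^2
```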

Example (p 121)

Example (p 122)

Mathematical Model

• The mean of the Y values at a given set of X’s is:

α + β1X1 + β2X2 + … + βpXp

• The variance of the Y values at any set of X’s is σ² (for all X)
• Y values are normally distributed at each X (needed for inference)

Types of X (independent) variables
• Fixed: selected in advance
• Variable: as in most studies
• X’s can be continuous or discrete (categorical)
• X’s can be transformations of other X’s, e.g., polynomial regression.

Computer Analysis
• Estimates of α, β1, β2, …, βp using least squares
• Residual mean square (S²) is the estimate of the variance σ²
• Confidence intervals for the mean of Y
• Prediction intervals for individual Y

Example of Bonferroni
• Test 3 hypotheses
• P-values are: 0.014, 0.036, 0.075
• Let the nominal significance level = 0.05
– ∴ the first 2 are significant
• Bonferroni-adjusted p-values: multiply by 3, giving 0.042, 0.108, 0.225
– Only the first is significant
– Bonferroni controls the probability of falsely rejecting at least 1 out of the m hypotheses
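The adjustment is just multiplication by m, capped at 1; a minimal sketch of the slide's numbers.

```python
import numpy as np

p_values = np.array([0.014, 0.036, 0.075])
m = len(p_values)
adjusted = np.minimum(p_values * m, 1.0)   # Bonferroni: multiply by m, cap at 1
print(adjusted)                             # [0.042 0.108 0.225]
print(adjusted < 0.05)                      # [ True False False]: only the first
```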

Analysis of variance (p 132)
• Does the regression plane help in predicting values of Y?
• Test the hypothesis that all βi’s = 0

Example: Reg of FEV1 on height and weight (p 132)
• F = 36.81; df = 2, 147; p-value < 0.0001
• Use the percentile link from the web site: http://faculty.vassar.edu/lowry/tabs.html#f

Venn Diagrams

• Multiple R²
• Bivariate correlation between IV1 and DV
• Bivariate correlation between IV2 and DV
• Correlation between IV1 and IV2
• Target: IVs that highly correlate with the DV, but don’t highly correlate with each other
[Figure: Venn diagram of DV, IV1, and IV2.]

Correlation Coefficient

• The multiple correlation coefficient (R) measures the strength of association between Y and the set of X’s in the population.
• It is estimated as the simple correlation coefficient between the Y’s and their predicted values (Ŷ’s)

Coefficient of Determination

• R² = coefficient of determination = SS due to regression / SS total
• R² = (reduction in variance of Y due to X’s) / (original variance of Y)
• Therefore 100R² = % of variance of Y “explained by X’s”
• And 100(1 − R²)^(1/2) = % of standard deviation NOT “explained” by X’s

Regression

Standard Deviation of beta1

Confidence Interval – Mean Value

Confidence Interval for Prediction

Adjusted R-square

Sequential SS vs. Partial SS

VIF

Interpretation of R

R        % of variance    % of variance        % of SD        % of SD
         “explained”      not “explained”      “explained”    not “explained”
±0.3          9%               91%                 5%              95%
±0.5         25%               75%                13%              87%
±0.71        50%               50%                29%              71%
±0.95        90%               10%                69%              31%

Partial Correlation

• The correlation coefficient measuring the degree of dependence between two variables, after adjusting for the linear effect of one or more of the other X variables
• Example: T1 and T2 are test scores; find the partial R between T1 and T2 after adjusting for IQ

Visually (p 130)

• Partial R = simple R between the two residuals

Interpretation of regression coefficients

• In the model α + β1X1 + β2X2 + … + βpXp, if ρ is the partial correlation between Y and X1, given X2, …, Xp, then testing that β1 = 0 is equivalent to testing that ρ = 0
• Hence, bi is called the partial regression coefficient of Y on Xi, given the other X’s

Values of regression coefficients

• Problem: values of the bi’s are not directly comparable
• Hence, standardized coefficients:
– Standardized bi = bi · (SD(Xi) / SD(Y))
• Standardized bi are directly comparable

Multicollinearity

• The case where some of the X variables are highly correlated
• This will impact the estimates and their SE’s (p 143)
• Consider Tolerance, and its inverse, the Variance Inflation Factor (VIF)
• Warning signs: Tolerance < 0.01, or VIF > 100
• Remedy: use variable selection to delete some X variables, or a dimension reduction technique such as Principal Components.
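Tolerance and VIF can be computed with statsmodels; a sketch reusing the hypothetical df from the multiple-regression block above.

```python
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(df[['height', 'weight']])   # df from the earlier sketch
for i, name in enumerate(X.columns):
    if name == 'const':
        continue
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.3f}")
```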

Misleading Correlations
• Example (Lung Function data, Appendix A): FEV1 vs height and age
• Depends on gender

Total vs Stratified Correlation
Correlation between FEV1 and:

Gender     Height     Age
Total      0.739     −0.073
Male       0.504     −0.310
Female     0.465     −0.267

FEV1 vs height

FEV1 vs height – Regression lines

FEV1 vs age

FEV1 vs age – Regression lines


Outliers
• Outlier in Y is studentized (or deleted studentized) residual >2 (same as simple case)
• Outlier in X if h > 2(P+1)/N

Some Caveats
• See list for simple regression
• Need representative sample
• Violations of assumptions, outliers
• Multicollinearity: coefficient of any one variable can vary widely, depending on what others are included in the model
• Missing values
• Number of observations in the sample should be large enough relative to number of variables in the model.

Outline
• Matrix Review: (A – λ I) X = 0; Eigenvalues
• Simple linear regression
• Visit http://www.ats.ucla.edu/stat/sas/output/reg.htm
• Assign HW 6.1,2,5 for next week
• If we get to Chapter 7, assign HW 7.2, 7.4, 7.5, 7.6
(Hand in 7.2, 7.4, 7.5.) 7.7 will be assigned next week.
• Start Multiple Regression Lecture
• Go over Multiple Regression Example – 7.1

Quick Matrix Review: (A − λI)X = 0

A = [ 3  1 ]        A − λI = [ 3−λ   1  ]
    [ 2  2 ]                 [  2   2−λ ]

Eigenvalues: λ = 1, 4
λ = 1 ⇒ y = −2x
λ = 4 ⇒ y = x

Quick Matrix Review (continued): (A − λI)X = 0

A = [ 3  1  3 ]        A − λI = [ 3−λ   1    3  ]
    [ 2  2  5 ]                 [  2   2−λ   5  ]
    [ 1  3  2 ]                 [  1    3   2−λ ]

det(A − λI) = (3−λ)(2−λ)(2−λ) + 1·5·1 + 3·3·2 − [1·(2−λ)·3 + 2·1·(2−λ) + (3−λ)·5·3] = 0
(12 − 16λ + 7λ² − λ³ + 5 + 18) − [(6 − 3λ) + (4 − 2λ) + (45 − 15λ)] = 0
−λ³ + 7λ² + 4λ − 20 = 0
λ ≈ 7.17, −1.76, 1.59
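The eigenvalue calculations above are easy to check numerically; the order of NumPy's returned eigenvalues may vary.

```python
import numpy as np

A2 = np.array([[3.0, 1.0],
               [2.0, 2.0]])
print(np.linalg.eigvals(A2))    # eigenvalues 4 and 1

A3 = np.array([[3.0, 1.0, 3.0],
               [2.0, 2.0, 5.0],
               [1.0, 3.0, 2.0]])
vals, _ = np.linalg.eig(A3)
print(np.round(vals.real, 2))   # approximately 7.17, -1.76, 1.59
```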

Analysis of Variance

[Figure: observed values of Y for three groups.]