Introduction to information system

Linear Regression

Deema Abdal Hafeth

Bowei Chen

CMP3036M/CMP9063M Data Science

2016 – 2017

Today’s Objectives

• Simple Linear Regression

– Formulation

– Parameter Estimation: Least Squares Estimation (LSE)

• Multiple Linear Regression

• Appendix: Derivation of LSE

References

• James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer. (Chapters 3 and 4)

• Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning. Springer. (Chapter 3)

Dataset

Index  Price  House size (sq ft)
1      420    5850
2      385    4000
3      495    3060
4      605    6650
5      610    6360
6      660    4160
7      660    3880
8      690    4160
9      838    4800
10     885    5500
…      …      …

For a new house of size 4050 sq ft, can we predict its rent price?

Simple Linear Regression

Response variable (y): Price

Predictor variable (x): House size, also called the independent variable, feature, or explanatory variable

New observation: a house of size 4050 sq ft

$y := f(x) = \beta_0 + \beta_1 x + \varepsilon$

where $\beta_0$ is the intercept and $\beta_1$ is the slope.

Simple Linear Regression Line

[Figure panels: Effect of $\beta_0$; Effect of $\beta_1$]

$y := f(x) = \beta_0 + \beta_1 x + \varepsilon$

Which Line Fits the Data “Best”?

[Figures: Fig. (1), Fig. (2), Fig. (3)]

Error

$\varepsilon_i = y_i - \hat{y}_i$ for the $i$-th data point

Sum of squared errors (SSE) $= (2.3 - 2.8)^2 + (4 - 2.9)^2 + (2.8 - 3.4)^2 = 1.82$
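The SSE arithmetic can be checked with a few lines of Python. This is a minimal sketch; the names `observed`, `predicted` and `sum_squared_errors` are illustrative, and reading the first number of each pair as the observed value is an assumption based on the definition of the error.

```python
# The three pairs from the SSE example. Treating the first number of each
# pair as the observed value and the second as the value on the line is an
# assumption; SSE is the same either way because the differences are squared.
observed = [2.3, 4.0, 2.8]
predicted = [2.8, 2.9, 3.4]


def sum_squared_errors(y, y_hat):
    """Return the sum of squared differences between y and y_hat."""
    return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))


sse = sum_squared_errors(observed, predicted)
print(round(sse, 2))  # 1.82
```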

Sum of Squared Errors (SSE)

How do we find the line with the smallest SSE?

That line is the “best” line.

The method of finding it is called Least Squares Estimation (LSE).

Index (i)  Price (y)  House size (x)
1          420        5850
2          385        4000
3          495        3060
4          605        6650
5          610        6360
6          660        4160
7          660        3880
8          690        4160
9          838        4800
10         885        5500

Each row is a data point $(x_i, y_i)$: $(5850, 420)$, $(4000, 385)$, …, $(5500, 885)$.

Fitted line: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

Expression of SSE

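$$\mathrm{SSE} = \varepsilon_1^2 + \varepsilon_2^2 + \dots + \varepsilon_n^2 = \sum_{i=1}^{n} \varepsilon_i^2,$$

where $\varepsilon_i = y_i - \hat{y}_i = y_i - \beta_0 - \beta_1 x_i$. Then

$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$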

SSE can be viewed as a function of $\beta_0$ and $\beta_1$.

Specifically, SSE is a quadratic function of $\beta_0$ and $\beta_1$.
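To make "SSE as a function of the two parameters" concrete, the sketch below evaluates it on the ten rows shown in the table. The function name `sse` and the two candidate lines are illustrative choices, not values from the slides.

```python
# The ten (house size, price) rows shown in the table.
sizes = [5850, 4000, 3060, 6650, 6360, 4160, 3880, 4160, 4800, 5500]
prices = [420, 385, 495, 605, 610, 660, 660, 690, 838, 885]


def sse(beta0, beta1, x=sizes, y=prices):
    """Sum of squared errors of the line beta0 + beta1 * x on the data."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))


# Two arbitrary candidate lines, chosen only for illustration:
# each (beta0, beta1) pair gives a different SSE value.
print(sse(0.0, 0.1))
print(sse(300.0, 0.05))
```

Because SSE is quadratic in each parameter, its second difference along `beta1` (for a fixed step) is the same wherever it is measured, which is one quick way to see the bowl shape numerically.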

Estimation of 𝛽0 and 𝛽1

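Setting the partial derivatives of SSE with respect to $\beta_0$ and $\beta_1$ to zero gives

$$\frac{\partial\,\mathrm{SSE}}{\partial \beta_0} = 0, \qquad \frac{\partial\,\mathrm{SSE}}{\partial \beta_1} = 0.$$

Solving this system of linear equations, we have

$$\beta_1 = \frac{\sum_{i=1}^{n} y_i x_i - \dfrac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x}.$$

Please see the Appendix for the detailed derivation.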
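The closed-form estimates translate directly into code. The sketch below applies them to the ten rows shown in the table; the function name `fit_simple_lse` is illustrative. Since the table is only an excerpt of the dataset (it ends with "…"), the coefficients computed here should not be expected to match those used in the price prediction.

```python
# The ten (house size, price) rows shown in the table.
sizes = [5850, 4000, 3060, 6650, 6360, 4160, 3880, 4160, 4800, 5500]
prices = [420, 385, 495, 605, 610, 660, 660, 690, 838, 885]


def fit_simple_lse(x, y):
    """Return (beta0, beta1) from the closed-form least squares formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    # Slope: (sum xy - sum x * sum y / n) / (sum x^2 - (sum x)^2 / n)
    beta1 = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # Intercept: mean(y) - beta1 * mean(x)
    beta0 = sum_y / n - beta1 * sum_x / n
    return beta0, beta1


beta0, beta1 = fit_simple_lse(sizes, prices)

# Predict the price of a new house of size 4050 sq ft.
predicted_price = beta0 + beta1 * 4050
print(beta0, beta1, predicted_price)
```

A useful sanity check is that the residuals of the fitted line sum to zero and are uncorrelated with `x`, which is exactly what the two zero-derivative conditions state.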

For a new house of size 4050 sq ft, can we predict its rent price?

Price prediction:
$341.36 + 0.066 \times 4050 \approx 608.61$ (coefficients rounded)

Simple Linear Regression Solution

For a new house of size 4050 sq ft with 4 bedrooms and 2 bathrooms, can we predict its rent price?

There Are Other Features of Houses

Index  Price  House size  Bedrooms  Bathrms  Stories  Driveway  Recroom  Fullbase
1      420    5850        3         1        2        1         0        1
2      385    4000        2         1        1        1         0        0
3      495    3060        3         1        1        1         0        0
4      605    6650        3         1        2        1         1        0
5      610    6360        2         1        1        1         0        0
6      660    4160        3         1        1        1         1        1
7      660    3880        3         2        2        1         0        1
8      690    4160        3         1        3        1         0        0
9      838    4800        3         1        1        1         1        1
10     885    5500        3         2        4        1         1        0
…      …      …           …         …        …        …         …        …

Multiple Linear Regression

Simple expression:

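$$y_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \dots + \beta_p x_{i,p} + \varepsilon_i,$$

Matrix expression:

$$\boldsymbol{y} = \boldsymbol{x}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$

where

$$\boldsymbol{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \boldsymbol{x} = \begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,p} \end{pmatrix}, \quad \boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$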

Multiple Linear Regression Solution

Price prediction:

$-24.18293 + 0.05411 \times 4050 + 58.26 \times 4 + 197.5 \times 2 \approx 823.047$ (coefficients rounded)
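The slides do not spell out how the coefficients are estimated in the multiple case. One standard approach, assumed here, minimises the same SSE by solving the normal equations $(\boldsymbol{x}^\top \boldsymbol{x})\boldsymbol{\beta} = \boldsymbol{x}^\top \boldsymbol{y}$. The sketch below does this in plain Python for three features (house size, bedrooms, bathrooms) on the ten rows shown; the function names are illustrative, and because the table is only an excerpt, the result should not be expected to reproduce the coefficients in the prediction.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def fit_multiple_lse(features, y):
    """Return [beta0, beta1, ..., betap] from the normal equations."""
    # Design matrix: a leading column of ones for the intercept.
    design = [[1.0] + [float(v) for v in row] for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# (house size, bedrooms, bathrooms) for the ten rows shown in the table.
features = [
    (5850, 3, 1), (4000, 2, 1), (3060, 3, 1), (6650, 3, 1), (6360, 2, 1),
    (4160, 3, 1), (3880, 3, 2), (4160, 3, 1), (4800, 3, 1), (5500, 3, 2),
]
prices = [420, 385, 495, 605, 610, 660, 660, 690, 838, 885]

beta = fit_multiple_lse(features, prices)

# New house: 4050 sq ft, 4 bedrooms, 2 bathrooms (1.0 is the intercept term).
new_house = [1.0, 4050, 4, 2]
predicted_price = sum(b * v for b, v in zip(beta, new_house))
print(beta, predicted_price)
```

In practice a numerical library's least squares routine would usually be preferred over hand-written elimination; the plain-Python version is used here only to keep the example self-contained.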

Conclusion

• Simple Linear Regression

– Formulation

– Parameter Estimation: Least Squares Estimation (LSE)

• Multiple Linear Regression

• Appendix: Derivation of LSE

Thank You!

Appendix: Derivation of LSE (1/2)

The SSE can be obtained as follows:

$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$

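Taking the partial derivatives of SSE with respect to $\beta_0$ and $\beta_1$ and setting them to zero then gives

$$\frac{\partial\,\mathrm{SSE}}{\partial \beta_0} = \frac{\partial}{\partial \beta_0} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0,$$

$$\frac{\partial\,\mathrm{SSE}}{\partial \beta_1} = \frac{\partial}{\partial \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = -2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0.$$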
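The two partial derivatives can be sanity-checked numerically. The sketch below compares the analytic expressions with central finite differences on the ten rows shown earlier; the evaluation point (300, 0.05) and the step size are arbitrary illustrative choices.

```python
# The ten (house size, price) rows shown in the table.
sizes = [5850, 4000, 3060, 6650, 6360, 4160, 3880, 4160, 4800, 5500]
prices = [420, 385, 495, 605, 610, 660, 660, 690, 838, 885]


def sse(beta0, beta1):
    """Sum of squared errors of the line beta0 + beta1 * x."""
    return sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(sizes, prices))


def grad_sse(beta0, beta1):
    """Analytic partial derivatives of SSE from the derivation."""
    d_beta0 = -2 * sum(y - beta0 - beta1 * x for x, y in zip(sizes, prices))
    d_beta1 = -2 * sum(x * (y - beta0 - beta1 * x)
                       for x, y in zip(sizes, prices))
    return d_beta0, d_beta1


def numeric_grad(beta0, beta1, h=1e-3):
    """Central finite-difference approximation of the same derivatives."""
    d_beta0 = (sse(beta0 + h, beta1) - sse(beta0 - h, beta1)) / (2 * h)
    d_beta1 = (sse(beta0, beta1 + h) - sse(beta0, beta1 - h)) / (2 * h)
    return d_beta0, d_beta1


analytic = grad_sse(300.0, 0.05)
numeric = numeric_grad(300.0, 0.05)
print(analytic)
print(numeric)
```

Because SSE is quadratic, the central difference agrees with the analytic derivative up to floating-point rounding.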

Appendix: Derivation of LSE (2/2)

Solving the system of linear equations then gives

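$$\beta_1 = \frac{\sum_{i=1}^{n} y_i x_i - \dfrac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x}.$$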
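The intermediate algebra is not shown on the slide. One way to fill it in, written with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, is to solve the first condition for $\beta_0$ and substitute it into the second:

```latex
\begin{aligned}
\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0
  &\;\Rightarrow\; n\beta_0 = \sum_{i=1}^{n} y_i - \beta_1 \sum_{i=1}^{n} x_i
  \;\Rightarrow\; \beta_0 = \bar{y} - \beta_1 \bar{x}, \\[4pt]
\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} x_i y_i
     - (\bar{y} - \beta_1 \bar{x}) \sum_{i=1}^{n} x_i
     - \beta_1 \sum_{i=1}^{n} x_i^2 = 0 \\[4pt]
  &\;\Rightarrow\; \beta_1 \left( \sum_{i=1}^{n} x_i^2
     - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \right)
     = \sum_{i=1}^{n} x_i y_i
     - \frac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i}{n}.
\end{aligned}
```

Dividing both sides of the last line by the bracketed term gives the expression for $\beta_1$, provided the $x_i$ are not all equal (otherwise the bracketed term is zero).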