
Linear Algebra

Gerhard Neumann

School of Computer Science

University of Lincoln

CMP3036M/CMP9063M Data Science

Today's Agenda!

• Refresh your memory of linear algebra

• Mostly easy, but most of us will have forgotten parts of it

• Introduction to:

– Vectors

– Matrices

– Matrix Calculus

Revisiting Linear Regression

Why do you hate us???

• Uff… not math again!

– Well, math is important and we cannot fully avoid it

– We only cover material that we can directly apply to derive our algorithms

– We will go step by step so that you can really follow

– Who knows… maybe you will even like it

• Ok… but why linear algebra?

– Most data is represented as a matrix

– Talking about matrix operations is talking about manipulating data!

– Algorithms are often easier to understand in matrix form

– Linear Regression is one of the most basic algorithms for data science!

Ask questions!!!

• Even though I am Austrian, I am actually a nice guy…

And give feedback!

Feed the feedbag!

• If it is not clear… tell me!!

• If it is too fast… tell me!!

• If you cannot understand "Austrian English"… tell me!

Vectors

• A vector is a multi-dimensional quantity

• Each dimension contains different information (Age, Height, Weight…)

Some notation

• Vectors will always be represented as bold symbols

• A vector is always a column vector

• A transposed vector is always a row vector

What can we do with vectors?

• Multiplication by scalars

• Addition of vectors

Scalar products and length of vectors

• Scalar (Inner) products:

– Sum the element-wise products

• Length of a vector

– Square root of the inner product with itself
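The vector operations above can be sketched in NumPy (an illustrative assumption; the module may use a different tool). The vector entries here are made up:

```python
import numpy as np

x = np.array([25.0, 1.80, 75.0])   # a 3-dimensional vector (e.g. age, height, weight)
y = np.array([30.0, 1.65, 60.0])

scaled = 2.0 * x                   # multiplication by a scalar (element-wise)
total = x + y                      # addition of vectors (element-wise)
inner = x @ y                      # scalar (inner) product: sum of element-wise products
length = np.sqrt(x @ x)            # length: square root of the inner product with itself
```

Note that `np.linalg.norm(x)` computes the same length directly.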

Matrices

• A matrix is a rectangular array of numbers arranged in rows and columns.

– For example, a matrix with 3 rows and 2 columns is a 3 x 2 matrix; one with 2 rows and 4 columns is a 2 x 4 matrix

– The dimension of a matrix is always num rows times num columns

– Matrices will be denoted with bold upper-case letters (A,B,W)

– Vectors are special cases of matrices

Matrices in Data Science

• Our data set can be represented as a matrix, where the single samples are vectors

• Most typical representation:

– Each row represents a data sample (e.g., Joe)

– Each column represents a data entry (e.g., age)

– X is a num samples x num entries matrix
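This row-per-sample layout can be sketched in NumPy with a small made-up data set:

```python
import numpy as np

# Hypothetical data set: 3 samples (rows), 2 entries per sample (columns: age, height)
X = np.array([[25.0, 1.80],    # Joe
              [30.0, 1.65],    # second sample
              [22.0, 1.75]])   # third sample

shape = X.shape      # (num samples, num entries)
joe = X[0, :]        # one row: all entries for one sample
ages = X[:, 0]       # one column: one entry (age) for all samples
```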

What can you do with matrices?

• Multiplication with scalar

• Addition of matrices

• Matrices can also be transposed

Multiplication of a vector with a matrix

• Matrix-Vector Product: u = Wx, with elements u_i = Σ_j w_ij x_j

• Think of it as: u = x_1 w_1 + x_2 w_2 + …, where w_j is the j-th column of W

– Hence: we sum over the columns of W, weighted by the entries of x

• The vector needs to have the same dimensionality as the number of columns of the matrix!
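The "weighted sum over columns" view can be checked numerically (example values are made up):

```python
import numpy as np

W = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])     # 3 x 2 matrix
x = np.array([10.0, 1.0])      # dimensionality = number of columns of W

u = W @ x                      # matrix-vector product: u_i = sum_j W_ij * x_j

# same result as the sum over the columns of W, weighted by the entries of x
u_cols = x[0] * W[:, 0] + x[1] * W[:, 1]
```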

Multiplication of a matrix with a matrix

• Matrix-Matrix Product: U = WV, with elements u_ij = Σ_k w_ik v_kj

• Think of it as: u_j = W v_j, where u_j and v_j are the j-th columns of U and V

– Hence: each column of U can be computed by a matrix-vector product

Multiplication of a matrix with a matrix

• Dimensions:

– The number of columns of the left matrix must match the number of rows of the right matrix

• Non-commutative (in general): AB ≠ BA

• Associative: (AB)C = A(BC)

• Transpose of a product: (AB)ᵀ = BᵀAᵀ
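These properties can be verified on small random matrices (a sketch, with arbitrary shapes chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 2))

# dimensions: columns of the left matrix (3) match rows of the right matrix (3)
U = A @ B
assert U.shape == (2, 2)

# non-commutative in general: A @ B and B @ A need not even have the same shape
assert (A @ B).shape != (B @ A).shape

# associative
assert np.allclose((A @ B) @ C, A @ (B @ C))

# transpose of a product reverses the order
assert np.allclose((A @ B).T, B.T @ A.T)
```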

Important special cases

• Scalar (Inner) product: xᵀy = Σ_i x_i y_i

– The scalar product can be written as a row-vector times column-vector product

Important special cases

• Compute row/column averages of a matrix X with N samples (rows) and D entries (columns), using a vector of ones 1:

– Vector of row averages (average over all entries per sample): r = (1/D) X 1

– Vector of column averages (average over all samples per entry): c = (1/N) Xᵀ 1
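A quick numerical check of both averages, comparing the matrix-vector form against NumPy's built-in means (example data is arbitrary):

```python
import numpy as np

N, D = 4, 3                            # num samples, num entries
X = np.arange(12, dtype=float).reshape(N, D)

row_avg = X @ np.ones(D) / D           # average over all entries per sample
col_avg = X.T @ np.ones(N) / N         # average over all samples per entry

# agrees with the built-in per-axis means
assert np.allclose(row_avg, X.mean(axis=1))
assert np.allclose(col_avg, X.mean(axis=0))
```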

Matrix Inverse

• Definition: A⁻¹A = AA⁻¹ = I

• Unit element: the identity matrix I, with ones on the diagonal and zeros elsewhere, e.g., 3 x 3

• Verify it!

• Note: We can only invert quadratic matrices (num rows = num cols)

• For a scalar a, the inverse is simply 1/a; for matrices, it has to be computed
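The defining property A⁻¹A = AA⁻¹ = I can be verified directly (a sketch with an arbitrary invertible matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # a quadratic (2 x 2) matrix

A_inv = np.linalg.inv(A)          # matrix inverse
I = np.eye(2)                     # 2 x 2 identity matrix

# both products give the identity
assert np.allclose(A @ A_inv, I)
assert np.allclose(A_inv @ A, I)
```

For a non-square (or singular) matrix, `np.linalg.inv` raises a `LinAlgError`.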

Linear regression models revisited

• In linear regression, the output y is modelled as a linear function of the input x_i: y_i = β0 + β1 x_i

[Figure: β0 shifts the line up and down (intercept); β1 changes its slope]

Linear regression models in matrix form

• Equation for the i-th sample: y_i = β0 + β1 x_i + e_i

• Equation for the full data set: y = Xβ + e

– y is a vector containing the output for each sample

– X is the data matrix, containing a vector of ones as the first column for the bias β0

Linear regression models in matrix form

• Error vector: e = y − Xβ

• Sum of squared errors (SSE): SSE(β) = Σ_i e_i² = eᵀe = (y − Xβ)ᵀ(y − Xβ)

• We have now written the SSE completely in matrix form!

Deriving Linear Regression

• How do we obtain the optimal β (the one that minimizes the SSE)?

• At a minimum of a function, its derivative is zero

• I.e., find the β where ∂SSE(β)/∂β = 0

Calculus

Ok, we need to talk about derivatives…

"The derivative of a function of a real variable measures the sensitivity to change of a quantity (a function value or dependent variable) which is determined by another quantity (the independent variable)" (Wikipedia)

Function: f(x), where x is a scalar, e.g., f(x) = (x − 2)²

Derivative: f′(x) = df(x)/dx, here f′(x) = 2(x − 2)

Minimum: set f′(x) = 0, here x = 2

Derivatives and Gradients

Function: f(x), where x is now a vector

Derivative: the vector of partial derivatives ∂f(x)/∂x = [∂f/∂x_1, …, ∂f/∂x_d]ᵀ

Minimum: set ∂f(x)/∂x = 0 (the zero vector)

∂f(x)/∂x is also called the gradient of the function f at the point x

Matrix Calculus

• How do we compute ∂SSE(β)/∂β?

• We need to know some rules from matrix calculus (see the Wikipedia article on matrix calculus)

– Linear: ∂(aᵀx)/∂x = a

– Quadratic: ∂(xᵀAx)/∂x = (A + Aᵀ)x, which equals 2Ax for symmetric A

– Chain rule: ∂f(g(x))/∂x = (∂g(x)/∂x)ᵀ · ∂f(g)/∂g

• These rules apply to vectors instead of scalars, but mirror the familiar scalar rules
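One way to build trust in the quadratic rule ∂(xᵀAx)/∂x = (A + Aᵀ)x is to compare it against a finite-difference approximation of the gradient (a sketch with random values):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))     # a (generally non-symmetric) matrix
x = rng.standard_normal(3)

analytic = (A + A.T) @ x            # the quadratic rule

# central finite differences of f(v) = v^T A v, one coordinate at a time
f = lambda v: v @ A @ v
eps = 1e-6
numeric = np.array([
    (f(x + eps * np.eye(3)[i]) - f(x - eps * np.eye(3)[i])) / (2 * eps)
    for i in range(3)
])
```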

Derivation of the SSE

How do we compute ∂SSE(β)/∂β, with SSE(β) = eᵀe and e = y − Xβ?

– Chain rule: ∂SSE(β)/∂β = (∂e/∂β)ᵀ · ∂(eᵀe)/∂e

– 1st (inner) derivative: ∂e/∂β = ∂(y − Xβ)/∂β = −X

– 2nd (outer) derivative: ∂(eᵀe)/∂e = 2e

Putting it together…

• Chain rule: ∂SSE(β)/∂β = (∂e/∂β)ᵀ · ∂(eᵀe)/∂e = (−X)ᵀ · 2e = −2Xᵀ(y − Xβ)

• Set it to zero: −2Xᵀ(y − Xβ) = 0

Cancel the constant factor −2: Xᵀ(y − Xβ) = 0

Multiply out the brackets: Xᵀy − XᵀXβ = 0

Bring XᵀXβ on the other side: XᵀXβ = Xᵀy

(Left-)multiply with the inverse (XᵀX)⁻¹: β = (XᵀX)⁻¹Xᵀy

General solution to the least-squares problem!
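The closed-form solution β = (XᵀX)⁻¹Xᵀy can be sketched directly in NumPy and compared against NumPy's least-squares solver (the data below is made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 6.2, 7.9, 10.1])

X = np.column_stack([np.ones_like(x), x])    # ones column for the bias beta_0

# normal-equation solution: beta = (X^T X)^{-1} X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y

# same result via NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In practice, `np.linalg.solve(X.T @ X, X.T @ y)` or `lstsq` is preferred over forming the inverse explicitly, for numerical stability.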