CS 189 (CDSS offering)

Lecture 2: Math review
2022/01/21

Today’s lecture


• If today’s lecture has you feeling completely lost, this course will be difficult
• But you don’t necessarily need to know everything off the top of your head
• Today’s lecture is not comprehensive, and other mathematical concepts will pop up throughout the course
• The content in this lecture should help with HW0, which is already out
• Today we will review topics in probability, linear algebra, and vector calculus

Probability review

Random variables
• A random variable takes on values based on the outcome of a random event
• E.g., outcome of a coin flip, dice roll, random sample from a population, …
• Random variables are typically denoted with a capital letter
• A sample (realization) of a random variable is typically lower case
• We will often talk about independent and identically distributed (i.i.d.) samples
• We will often consider quantities such as expected value and variance
• We will often rely on formulas such as Bayes’ rule, Jensen’s inequality, … (both stated below)
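For reference, here are those two formulas in their standard forms:

Bayes’ rule: $P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$

Jensen’s inequality: for a convex function $f$, $f(\mathbb{E}[X]) \le \mathbb{E}[f(X)]$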

Random variables
let $X$ be a R.V. denoting the outcome of flipping a biased coin, with $X = 1$ for heads and $X = 0$ for tails

$X$ is distributed according to $P(X = 1) = 0.75$; here, $X \sim \mathrm{Bernoulli}(0.75)$

flipping the coin $N$ times gives i.i.d. samples $x_1, \dots, x_N$

expected value: $\mathbb{E}[X] = \sum_{x \in \{0, 1\}} x \, P(X = x) = 0.75$

variance: $\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = 0.75 - 0.75^2 = 0.1875$
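As a quick sanity check on these numbers, here is a minimal simulation sketch in Python (not from the lecture; the seed and sample size are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
p = 0.75            # P(X = 1), the bias of the coin
N = 100_000         # number of i.i.d. flips (illustrative)

# Draw N i.i.d. Bernoulli(p) samples: 1 = heads, 0 = tails
x = rng.binomial(n=1, p=p, size=N)

print(x.mean())     # ≈ 0.75, the expected value E[X]
print(x.var())      # ≈ 0.1875, the variance p(1 - p)

The empirical mean and variance approach the true values as $N$ grows, by the law of large numbers.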

Information theory
• In this class, we will also use some basic concepts from information theory
• For example, the entropy of a random variable, sometimes referred to as the “expected surprise”
• Other relevant concepts include cross-entropy and Kullback-Leibler (KL) divergence (relative entropy)

Information theory
entropy: $H(X) = -\sum_x P(X = x) \log P(X = x) = \mathbb{E}[-\log P(X = x)]$

for $X \sim \mathrm{Bernoulli}(0.75)$: $H(X) = -0.75 \log 0.75 - 0.25 \log 0.25$

we often write the entropy of the probability distribution as $H(P)$

cross-entropy: $H(P, Q) = -\sum_x P(X = x) \log Q(X = x) = \mathbb{E}_{X \sim P}[-\log Q(X = x)]$

KL divergence: $D_{\mathrm{KL}}(P \,\|\, Q) = \mathbb{E}_{X \sim P}\!\left[\log \frac{P(X = x)}{Q(X = x)}\right] = H(P, Q) - H(P) \ge 0$

an aside: Monte Carlo estimation: $\mathbb{E}[f(X)] \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i)$ for $x_1, \dots, x_N \overset{\text{i.i.d.}}{\sim} P$
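These quantities are easy to compute directly for Bernoulli distributions. A minimal sketch (the function names are illustrative; entropies are in nats, i.e., natural log):

import numpy as np

def entropy(p):
    """Entropy H(P) of a Bernoulli(p) distribution, in nats."""
    P = np.array([1 - p, p])
    return -np.sum(P * np.log(P))

def cross_entropy(p, q):
    """Cross-entropy H(P, Q) for Bernoulli(p) and Bernoulli(q)."""
    P, Q = np.array([1 - p, p]), np.array([1 - q, q])
    return -np.sum(P * np.log(Q))

p, q = 0.75, 0.5
print(entropy(p))                        # -0.75 log 0.75 - 0.25 log 0.25 ≈ 0.562
print(cross_entropy(p, q) - entropy(p))  # D_KL(P || Q) = H(P, Q) - H(P) >= 0

# Monte Carlo estimate of H(P) = E[-log P(X = x)] from samples x ~ P
rng = np.random.default_rng(0)
x = rng.binomial(1, p, size=100_000)
print(np.mean(-np.log(np.where(x == 1, p, 1 - p))))  # ≈ entropy(p)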

Linear algebra review

Vectors, matrices, and tensors
• A vector is a “one dimensional” row or column of (usually) numbers
• A matrix is a “two dimensional” table
• Sometimes, “higher dimensional” objects are called tensors
• These are all typically denoted with bold, non-cursive letters, when possible
• Vectors are typically lower case, matrices and tensors are typically upper case
• In this class, we are going to use subscripts for a lot of different purposes

Vectors, matrices, and tensors
$d$-dimensional row vector: $v = [v_1 \;\; v_2 \;\; \cdots \;\; v_d]$

$k$-dimensional column vector: $u = [u_1, u_2, \dots, u_k]^\top$

$\ell_1$ norm: $\|v\|_1 = \sum_i |v_i|$

$\ell_p$ norm: $\|v\|_p = \left(\sum_i |v_i|^p\right)^{1/p}$

square matrix: $A \in \mathbb{R}^{n \times n}$

trace of a square matrix: $\mathrm{tr}(A) = \sum_i A_{ii}$

symmetric matrix: $A = A^\top$

positive semidefinite (PSD) matrix: a square, symmetric matrix $A$ for which $x^\top A x \ge 0$ for all vectors $x$

Frobenius norm: $\|A\|_F = \sqrt{\sum_{i,j} A_{ij}^2} = \sqrt{\mathrm{tr}(A^\top A)}$
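A minimal numpy sketch of these definitions (the particular vectors and matrices are illustrative):

import numpy as np

v = np.array([3.0, -4.0])
print(np.linalg.norm(v, 1))    # l1 norm: |3| + |-4| = 7
print(np.linalg.norm(v, 2))    # l2 norm: sqrt(9 + 16) = 5

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])     # square and symmetric: A == A.T
print(np.trace(A))             # trace = sum of diagonal entries = 4

# A symmetric matrix is PSD iff all of its eigenvalues are >= 0
print(np.all(np.linalg.eigvalsh(A) >= 0))   # True

# Frobenius norm two ways: entrywise, and via the trace identity
print(np.linalg.norm(A, 'fro'))
print(np.sqrt(np.trace(A.T @ A)))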

Eigenvalues and eigenvectors
• An important concept involving square matrices is the set of eigenvalues and corresponding eigenvectors of that matrix
• Though non-square matrices have neither eigenvalues nor eigenvectors, we will also commonly discuss their singular values and singular vectors

Eigenvalues and eigenvectors
for a square matrix $A$: $A v = \lambda v$, where $v$ is an eigenvector of $A$ and $\lambda$ is the corresponding eigenvalue

eigenvalues are typically indexed in non-increasing order: $\lambda_1$ is typically the largest

the very useful singular value decomposition (SVD) for any matrix $A$: $A = U \Sigma V^\top$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix

for any matrix $A$: $\sigma_i(A) = \sqrt{\lambda_i(A^\top A)} = \sqrt{\lambda_i(A A^\top)}$, i.e., the $i$-th singular value is the square root of the $i$-th eigenvalue of $A^\top A$ (or $A A^\top$)

the spectral norm: $\|A\|_2 = \sigma_1(A)$

also, $\|A\|_F = \sqrt{\sum_i \sigma_i(A)^2}$
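A short numpy sketch tying these facts together (the random matrix is illustrative):

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))      # a non-square matrix

# SVD: A = U @ diag(s) @ Vt, with s in non-increasing order
U, s, Vt = np.linalg.svd(A)
print(s)

# Singular values of A = square roots of eigenvalues of A^T A
print(np.sqrt(np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]))

print(np.linalg.norm(A, 2))      # spectral norm = sigma_1(A)
print(np.linalg.norm(A, 'fro'))  # Frobenius norm
print(np.sqrt(np.sum(s ** 2)))   # equals the Frobenius norm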

Vector calculus review

Gradients and Hessians
• For functions with vector inputs and scalar outputs, the (column) vector of partial derivatives of the function with respect to each entry of the input is the gradient
• If the function has vector outputs, then we have a matrix of partial derivatives, which is referred to as the Jacobian matrix
• The Jacobian of the gradient function is the Hessian matrix — this is the matrix of all second partial derivatives for a vector-input, scalar-output function (see the worked example after this list)
• A useful resource is Section 2 of the matrix cookbook:
https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
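As a concrete worked example (a standard identity, also found in the matrix cookbook): for $f(x) = x^\top A x$ with $x \in \mathbb{R}^d$ and square $A$,

$\nabla f(x) = (A + A^\top)\, x, \qquad \nabla^2 f(x) = A + A^\top$

When $A$ is symmetric, these simplify to $2Ax$ and $2A$; note that the Hessian is symmetric either way.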

Gradients and Hessians
gradient: for $f : \mathbb{R}^d \to \mathbb{R}$, $\nabla f(x) = \left[\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_d}\right]^\top \in \mathbb{R}^d$

Hessian: $\nabla^2 f(x) \in \mathbb{R}^{d \times d}$, with $[\nabla^2 f(x)]_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$; commonly a symmetric matrix
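A minimal sketch checking the worked example above numerically via central finite differences (the function names and test matrix are illustrative):

import numpy as np

def f(x, A):
    return x @ A @ x            # f(x) = x^T A x

def grad_f(x, A):
    return (A + A.T) @ x        # analytic gradient

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
x = rng.normal(size=3)
eps = 1e-6

# Central differences: df/dx_i ≈ [f(x + eps e_i) - f(x - eps e_i)] / (2 eps)
num_grad = np.array([(f(x + eps * e, A) - f(x - eps * e, A)) / (2 * eps)
                     for e in np.eye(3)])
print(np.allclose(num_grad, grad_f(x, A)))   # True (up to rounding)

H = A + A.T                      # analytic Hessian
print(np.allclose(H, H.T))       # the Hessian is symmetric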
