MIE1624H – Introduction to Data Science and Analytics Lecture 4

Lead Research Scientist, Financial Risk Quantitative Research, SS&C Algorithmics
Adjunct Professor, University of Toronto
MIE1624H – Introduction to Data Science and Analytics Lecture 4 – Linear Algebra and Matrix Computations
University of Toronto February 1, 2022

Lecture outline
Matrix computations
▪ Matrix operations
▪ Computing determinants and eigenvalues
Linear algebra
▪ Solving systems of linear equations
▪ Solving non-linear equations (Bisection method, Newton’s method)
▪ Solving systems of non-linear equations
▪ Solving unconstrained non-linear optimization problems
Derivatives
▪ Gradients and Hessians
▪ Taylor series expansion
Functions and convexity
▪ Convex and concave functions
▪ Checking convexity
▪ Properties of convex functions

Math for Data Science

Functions, variables, equations, graphs
What: fundamentals, from the equation of a line to the binomial theorem and its properties
▪ Logarithmic, exponential, and polynomial functions; rational numbers
▪ Basic geometry and theorems, trigonometric identities
▪ Real and complex numbers and their basic properties
▪ Series, sums, and inequalities
▪ Graphing and plotting, Cartesian and polar coordinate systems, conic sections
Online resources:
❑ Data Science Math Skills – Coursera
❑ Introduction to Algebra – edX
❑ Algebra
Usage examples: to understand why a search runs faster on a million-item database once it has been sorted, you will come across binary search, and understanding its dynamics requires logarithms and recurrence equations; to analyze a time series, you may come across concepts such as periodic functions and exponential decay.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”
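The binary search mentioned in the usage example can be sketched as follows. This is a minimal illustration, not code from the lecture: the function name `binary_search`, the step counter, and the million-item list are assumptions made for the example. The point is that the number of comparisons grows like log2(n), not n.

```python
import math


def binary_search(sorted_items, target):
    """Return (index, steps); index is -1 if target is not in sorted_items."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1, steps


# Hypothetical "database": one million sorted integers.
data = list(range(1_000_000))
index, steps = binary_search(data, 765_432)

# Each step halves the search interval, so steps <= floor(log2(n)) + 1.
max_steps = math.floor(math.log2(len(data))) + 1
print(index, steps, max_steps)
```

A linear scan of the same list could need up to a million comparisons; here the bound is 20.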

Statistics
What: a solid grasp of the essential concepts of statistics and probability; many practitioners call classical (non-neural-network) machine learning nothing but statistical learning.
▪ Data summaries and descriptive statistics: central tendency, variance, covariance, correlation
▪ Basic probability: basic idea, expectation, probability calculus, Bayes’ theorem, conditional probability
▪ Probability distribution functions (uniform, normal, binomial, chi-square, Student’s t-distribution), central limit theorem (CLT)
▪ Sampling, measurement, error, random number generation
▪ Hypothesis testing, A/B testing, confidence intervals, p-values
▪ ANOVA, t-test, chi-square test
▪ Linear regression, regularization
Online resources:
❑ Statistics with R Specialization – Coursera
❑ Statistics and Probability in Data Science using Python – edX
❑ Business Statistics and Analysis Specialization – Coursera
Usage examples: in interviews, a prospective data scientist who has mastered the concepts above will quickly impress the other side of the table; on the job, you will use one or another of these concepts practically every day.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Linear algebra
What: friend suggestions on Facebook, song recommendations on Spotify, transferring your selfie to a portrait drawing style using deep transfer learning: all of them rely on matrices and matrix algebra; this is an essential branch of mathematics for understanding how most machine learning algorithms work on a stream of data to create insight.
▪ Basic properties of matrices and vectors – scalar multiplication, linear transformation, transpose, conjugate, rank, determinant
▪ Matrix computations – inner and outer products, matrix multiplication rule and various algorithms, matrix inverse
▪ Special matrices – square matrix, identity matrix, triangular matrix, idea about sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices
▪ Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving systems of linear equations (Ax=b)
▪ Vector space, basis, span, orthogonality, orthonormality, linear least squares
▪ Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
▪ Solving systems of nonlinear equations, bisection and Newton algorithms
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Linear algebra (continued)
Online resources:
❑ Linear Algebra: Foundation to Frontier
❑ Mathematics for Machine Learning: Linear Algebra – Coursera
Usage examples: if you have used the dimensionality reduction technique Principal Component Analysis (PCA), then you have likely used the singular value decomposition to obtain a compact representation of your dataset with fewer parameters; all neural network algorithms use linear algebra techniques to represent and process the network structures and learning operations.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Calculus
What: concepts and applications of calculus pop up in numerous places in data science and machine learning; calculus is behind the simple-looking analytical solution of the ordinary least squares problem in linear regression, and it is embedded in every back-propagation step your neural network makes to learn a new pattern.
▪ Functions of a single variable, limit, continuity and differentiability
▪ Mean value theorems, indeterminate forms and L’Hospital’s rule
▪ Maxima and minima
▪ Product and chain rule
▪ Taylor’s series, infinite series summation/integration concepts
▪ Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals
▪ Beta and Gamma functions
▪ Functions of multiple variables, limit, continuity, partial derivatives, gradient vector, Hessian matrix
▪ Basics of ordinary and partial differential equations (not too advanced)
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Calculus (continued)
Online resources:
❑ Pre-University Calculus – edX
❑ Calculus all content
❑ Mathematics for Machine Learning: Multivariable Calculus – Coursera
Usage examples: if you have ever wondered how exactly a logistic regression algorithm is implemented, there is a high chance it uses a method called ‘gradient descent’ to find the minimum of the loss function; to understand how it works, you need concepts from calculus: gradients, derivatives, limits, and the chain rule.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”
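The gradient descent idea from the usage example can be sketched for a one-feature logistic regression. Everything specific here is an assumption made for illustration: the names `sigmoid` and `fit_logistic`, the toy data, the learning rate, and the iteration count. The gradient of the average log-loss follows from the chain rule: for each sample it is (prediction − label) times the input.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, n_iters=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y   # d(loss)/d(w*x + b) for one sample
            grad_w += error * x              # chain rule: times d(w*x + b)/dw
            grad_b += error                  # chain rule: times d(w*x + b)/db
        # Step against the average gradient.
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical toy data: larger x tends to mean label 1 (not perfectly separable).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(w, b, sigmoid(w * 0.5 + b), sigmoid(w * 4.0 + b))
```

Production libraries typically use more refined solvers, but the update rule above is the core of the method.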

Discrete mathematics
What: all modern data science is done with the help of computational systems, and discrete math is at the heart of such systems; a refresher in discrete math will equip the learner with concepts critical to the daily use of algorithms and data structures in analytics projects.
▪ Sets, subsets, power sets
▪ Counting functions, combinatorics, countability
▪ Basic proof techniques – induction, proof by contradiction
▪ Basics of inductive, deductive, and propositional logic
▪ Basic data structures – stacks, queues, graphs, arrays, hash tables, trees
▪ Graph properties – connected components, degree, maximum flow/minimum cut concepts, graph coloring
▪ Recurrence relations and equations
▪ Growth of functions and the O(n) (Big-O) concept
Usage examples: in any social network analysis you need to know the properties of graphs and fast algorithms to search and traverse the network; to choose an algorithm you need to understand its time and space complexity, i.e., how the running time and space requirements grow with input data size, expressed in O(n) (Big-O) notation.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”
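As one possible illustration of traversing a social network, the sketch below runs a breadth-first search over a small friendship graph. The graph, the names in it, and the function `bfs_distances` are hypothetical. With an adjacency-list representation, the traversal visits each person and each friendship a bounded number of times, so its running time is O(V + E) for V people and E friendships.

```python
from collections import deque


def bfs_distances(graph, start):
    """Return {node: number of hops from start} for every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in distances:       # first visit is the shortest path
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical friendship graph stored as adjacency lists.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana", "dev"],
    "dev": ["ben", "cho", "eli"],
    "eli": ["dev"],
    "fay": [],                                   # no connections
}

hops = bfs_distances(friends, "ana")
print(hops)
```

People missing from the result (here `fay`) are in a different connected component from the starting person.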

Discrete mathematics (continued)
Online resources:
❑ Introduction to Discrete Mathematics for Computer Science Specialization – Coursera
❑ Introduction to Mathematical Thinking – Coursera
❑ Master Discrete Mathematics: Sets, Math Logic, and More – Udemy
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Optimization, operations research topics
What: these topics differ a little from the traditional discourse in applied mathematics, as they are most relevant and most widely used in specialized fields of study such as theoretical computer science, control theory, and operations research; however, a basic understanding of these powerful techniques can be immensely fruitful in the practice of machine learning. Virtually every (supervised) machine learning algorithm/technique aims to minimize some kind of estimation error subject to various constraints, and that is an optimization problem.
▪ Basics of optimization – how to formulate the problem, unconstrained vs. constrained optimization, nonlinear vs. linear/quadratic optimization
▪ Maxima, minima, convex functions, local and global optimum
▪ Linear, quadratic and second-order conic optimization (programming), simplex algorithm, interior-point method (IPM)
▪ Nonlinear optimization – gradient descent algorithm, Newton and quasi-Newton algorithms, derivative-free optimization
▪ Integer optimization, mixed-integer optimization
▪ Constraint programming, knapsack problem
▪ Randomized optimization techniques – hill climbing, simulated annealing, genetic algorithms
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”

Optimization, operations research topics (continued)
Online resources:
❑ Optimization Methods in Business Analytics – edX
❑ Discrete Optimization – Coursera
❑ Deterministic Optimization – edX
Usage examples: simple linear regression problems with a least-squares loss function often have an exact analytical solution, but logistic regression problems don’t; to understand the reason, you need to know the concept of convexity in optimization; this line of thinking also explains why we have to remain satisfied with ‘approximate’ solutions in most machine learning problems.
Source: “Essential Math for Data Science – ‘Why’ and ‘How’”
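To make the "exact analytical solution" concrete, here is a minimal sketch of the closed-form least-squares fit for a single-feature linear regression. The function name `fit_line` and the data are assumptions for illustration. Setting the derivatives of the squared-error loss to zero gives slope = S_xy / S_xx and intercept = mean(y) − slope · mean(x), so no iteration is needed.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred cross-products and centred squares.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

No comparable formula exists for the logistic regression loss, which is why iterative methods such as gradient descent are used there.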

Solving Linear Equations

Systems of linear equations
◼ System of linear equations
◼ To solve this system of equations we express one of the variables through the other from one of the equations and plug into the other equation:
◼ Therefore
◼ Matrix notation:
◼ System of linear equations in matrix form:
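As an illustration of the steps above, consider a hypothetical 2×2 system (the numbers are chosen for this example and are not taken from the lecture). One variable is expressed through the other from the first equation and substituted into the second; the same system is then written in the matrix form Ax = b.

```latex
% Hypothetical system of two linear equations
\[
\begin{aligned}
x + 2y &= 5 \\
3x - y &= 1
\end{aligned}
\]

% Express x from the first equation and substitute into the second
\[
x = 5 - 2y
\;\Rightarrow\;
3(5 - 2y) - y = 1
\;\Rightarrow\;
15 - 7y = 1
\;\Rightarrow\;
y = 2
\]

% Therefore
\[
x = 5 - 2 \cdot 2 = 1
\]

% Matrix notation and the system in matrix form
\[
A = \begin{pmatrix} 1 & 2 \\ 3 & -1 \end{pmatrix},
\quad
\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix},
\quad
\mathbf{b} = \begin{pmatrix} 5 \\ 1 \end{pmatrix},
\quad
A\mathbf{x} = \mathbf{b}
\]
```

Checking the solution: 1 + 2·2 = 5 and 3·1 − 2 = 1, so both equations hold.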

Gaussian elimination
Gaussian elimination
Back substitution
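The two phases named on this slide can be sketched in plain Python. This is one common variant, not necessarily the one presented in the lecture: the function name `gaussian_elimination`, the use of partial pivoting, the singularity tolerance, and the 3×3 example system are all assumptions made for illustration.

```python
def gaussian_elimination(a, b):
    """Solve a x = b for a square system given as lists; returns x as a list."""
    n = len(b)
    # Augmented matrix [a | b], built on copies so the inputs stay unchanged.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Gaussian elimination: reduce to upper-triangular form.
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution: solve from the last equation upwards.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


# Hypothetical 3x3 system whose exact solution is x = 2, y = 3, z = -1.
a = [[2, 1, -1],
     [-3, -1, 2],
     [-2, 1, 2]]
b = [8, -11, -3]
solution = gaussian_elimination(a, b)
print(solution)
```

Elimination costs on the order of n³ operations and back substitution on the order of n², so elimination dominates for large systems.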
