SML lecture (starting soon)
https://xkcd.com/605/
On the topic of extrapolation and train-test mismatch, see https://www.youtube.com/watch?v=es6p6NuxOnY and http://ciml.info/dl/v0_99/ciml-v0_99-ch08.pdf
Plan for Today
● ML 101: Polynomial curve fitting: model, loss/error function, over-fitting, regularisation
● Model selection
● Probabilities: sum rule, product rule, Gaussians – 1D, maximum likelihood estimates (MLE), bias-variance → and how this helps curve-fitting
● Bernoulli, Binomial, Exponential family distributions – will be in assignment
● Gaussians (multidimensional), various matrix identities, geometric intuitions
● Review: probabilities, derivatives and finding stationary points, eigenvalues and eigenvectors
About the book
The machine sees:
Our guess: M-th order polynomials
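For reference, the model in standard notation (as in Bishop Ch. 1): a polynomial of order M, linear in the coefficients w:

```latex
y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j
```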
Test error and learning curves
Training set: 10 points
Separate test set of 100 points
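A minimal sketch of this experiment, assuming the classic noisy sin(2πx) toy data (the slides' exact data is not reproduced here); numpy's polyfit stands in for the least-squares fit:

```python
# Fit M-th order polynomials to a small training set and compare
# RMS error on the training set vs. a larger held-out test set.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(0, 1, n)
    t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, t

x_train, t_train = make_data(10)   # training set: 10 points
x_test, t_test = make_data(100)    # separate test set: 100 points

def rms_error(w, x, t):
    # np.polyval takes coefficients ordered from highest to lowest degree
    return np.sqrt(np.mean((np.polyval(w, x) - t) ** 2))

for M in [0, 1, 3, 9]:
    w = np.polyfit(x_train, t_train, deg=M)  # least-squares fit
    print(f"M={M}: train RMS = {rms_error(w, x_train, t_train):.3f}, "
          f"test RMS = {rms_error(w, x_test, t_test):.3f}")
```

For large M the training error drops towards zero while the test error grows: the over-fitting picture the learning curves illustrate.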
Remedy 1: more data
Remedy 2: regularisation
Minimize regularised error function
(more in Bayesian regression next week)
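The regularised sum-of-squares error being minimized, where λ governs the relative importance of the penalty (the weight-decay/ridge form):

```latex
\widetilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}) - t_n \right\}^2 + \frac{\lambda}{2} \lVert \mathbf{w} \rVert^2
```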
Model selection (an empirical view)
Minimizing square error / maximizing data likelihood can be a poor indication of performance on new data (generalisation) – cause: overfitting.
In the curve-fitting example: the order of the polynomial controls the number of free parameters in the model and thereby governs the model complexity.
Training set – used to fit model parameters
Validation set – used to select the model (complexity)
Testing set – used to estimate how well the model generalises
Question: how reliable are the estimates of validation and generalisation performance?
[source: MML book]
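As a sketch of the procedure above (toy data, split sizes, and candidate orders are assumptions of this writeup, not the slides):

```python
# Empirical model selection with a three-way split:
# fit on the training set, choose the polynomial order on the
# validation set, report generalisation error on the test set.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 200)

x_tr, x_val, x_te = x[:100], x[100:150], x[150:]
t_tr, t_val, t_te = t[:100], t[100:150], t[150:]

def rms_error(w, x, t):
    return np.sqrt(np.mean((np.polyval(w, x) - t) ** 2))

fits = {M: np.polyfit(x_tr, t_tr, deg=M) for M in range(10)}
best_M = min(fits, key=lambda M: rms_error(fits[M], x_val, t_val))
print(f"selected M = {best_M}, "
      f"test RMS = {rms_error(fits[best_M], x_te, t_te):.3f}")
```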
● Polynomial curve fitting: model, loss/error function, over-fitting, regularisation
● Model selection
● Probabilities: sum rule, product rule, Gaussians – 1D, maximum likelihood estimates (MLE), bias-variance → and how this helps curve-fitting
● Bernoulli, Binomial, Exponential family distributions
● Gaussians (multidimensional), various matrix identities, geometric intuitions
● Review: probabilities, derivatives and finding stationary points, eigenvalues and eigenvectors
Bayes Theorem
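The two basic rules and the theorem they imply, in symbols:

```latex
p(X) = \sum_{Y} p(X, Y) \qquad \text{(sum rule)}
p(X, Y) = p(Y \mid X)\, p(X) \qquad \text{(product rule)}
p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \qquad \text{(Bayes' theorem)}
```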
Continuous variables
Bayes' theorem, restated (Sec. …)
Expectations, variance, covariance
For review: what is the expectation taken over? The probability distribution p is often left implicit.
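The definitions under review, for a continuous random variable x ~ p(x):

```latex
\mathbb{E}[x] = \int x\, p(x)\, \mathrm{d}x, \qquad \operatorname{var}[x] = \mathbb{E}\!\left[ (x - \mathbb{E}[x])^2 \right], \qquad \operatorname{cov}[x, y] = \mathbb{E}\!\left[ (x - \mathbb{E}[x])(y - \mathbb{E}[y]) \right]
```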
Question: for a random variable x ~ p(x), do E[x] and var[x] always exist?
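A quick numerical probe of this question (this example is an addition of this writeup, not from the slides): for the standard Cauchy distribution neither E[x] nor var[x] exists, and the running sample mean never settles down:

```python
# The standard Cauchy distribution has no finite mean or variance,
# so the running sample mean keeps jumping regardless of sample size.
import numpy as np

rng = np.random.default_rng(2)
samples = rng.standard_cauchy(1_000_000)
for n in (10, 1_000, 100_000, 1_000_000):
    print(f"mean of first {n:>9,} samples: {samples[:n].mean():10.3f}")
```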
The Gaussian Distribution
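Its density, parameterised by mean μ and variance σ²:

```latex
\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}
```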
Maximum likelihood for univariate Gaussian
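Maximising the log-likelihood of N i.i.d. observations x₁, …, x_N gives the standard estimates:

```latex
\ln p(\mathbf{x} \mid \mu, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 - \frac{N}{2} \ln \sigma^2 - \frac{N}{2} \ln(2\pi)
\mu_{\text{ML}} = \frac{1}{N} \sum_{n=1}^{N} x_n, \qquad \sigma^2_{\text{ML}} = \frac{1}{N} \sum_{n=1}^{N} (x_n - \mu_{\text{ML}})^2
```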
Maximum likelihood and bias
In statistics, the bias (or bias function) of an estimator is the difference between the estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called an unbiased estimator. "Bias" is an objective property of an estimator.
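The relevant example here: for the univariate Gaussian, the ML estimate of the mean is unbiased, but the ML estimate of the variance systematically underestimates the true variance:

```latex
\mathbb{E}[\mu_{\text{ML}}] = \mu, \qquad \mathbb{E}[\sigma^2_{\text{ML}}] = \frac{N-1}{N}\, \sigma^2
```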
“Bias” is not necessarily bad!
Q: does high bias/variance mean that the model is overfitted, or vice versa?
Bringing it together: curve fitting by maximum likelihood – estimate β
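Assuming Gaussian observation noise with precision β (inverse variance), the likelihood of a target t and the resulting ML estimate of β are:

```latex
p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}\!\left( t \mid y(x, \mathbf{w}),\, \beta^{-1} \right), \qquad \frac{1}{\beta_{\text{ML}}} = \frac{1}{N} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}_{\text{ML}}) - t_n \right\}^2
```

Maximising this likelihood in w recovers exactly the sum-of-squares fit from earlier.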
Curve-fitting: predictive distribution (will cover next week in Bayesian linear regression)
● ML 101: Polynomial curve fitting: model, loss/error function, over-fitting, regularisation
● Probabilities: sum rule, product rule, Gaussians – 1D, MLE, bias-variance → and how this helps curve-fitting
● Bernoulli, Binomial
● Gaussians (multidimensional), various matrix identities, geometric intuitions
● Exponential family
● Review: vectors, probabilities, derivatives and finding stationary points, eigenvalues and eigenvectors
Bernoulli to Binomial for increasingly large N
Gaussians again: why?
n coin tosses with probability p
CLT – central limit theorem
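A sketch of the Binomial → Gaussian picture for n coin tosses with heads probability p (scipy and the chosen p, n values are assumptions of this writeup):

```python
# Compare the Bin(n, p) pmf against the CLT Gaussian N(np, np(1-p))
# evaluated at the integers: the gap shrinks as n grows.
import numpy as np
from scipy import stats

p = 0.3
for n in (10, 100, 1000):
    k = np.arange(n + 1)
    binom = stats.binom.pmf(k, n, p)
    gauss = stats.norm.pdf(k, loc=n * p, scale=np.sqrt(n * p * (1 - p)))
    gap = np.max(np.abs(binom - gauss))
    print(f"n={n:5d}: max |Binomial pmf - Gaussian pdf| = {gap:.4f}")
```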
Gaussians – multidimensional
Eigendecomposition of the covariance matrix
Mahalanobis distance
Contours of general 2-D Gaussians – rotated ellipses
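In formulas, for the D-dimensional case: the density, the squared Mahalanobis distance Δ², and the eigendecomposition of the covariance matrix:

```latex
\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\}
\Delta^2 = (\mathbf{x} - \boldsymbol{\mu})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}), \qquad \boldsymbol{\Sigma} = \sum_{i=1}^{D} \lambda_i\, \mathbf{u}_i \mathbf{u}_i^{\top}
```

Surfaces of constant Δ² are ellipsoids aligned with the eigenvectors uᵢ, with axis lengths set by the eigenvalues λᵢ: hence the rotated ellipses.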
The Exponential family
Beyond Gaussians: what is a class of 'nice' distributions for statistical machine learning?
● More expressive
● “Easy” to estimate
● Normalisation
● MLE and sufficient stats
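The family in question, in Bishop's notation, where g(η) ensures normalisation and u(x) are the sufficient statistics:

```latex
p(\mathbf{x} \mid \boldsymbol{\eta}) = h(\mathbf{x})\, g(\boldsymbol{\eta}) \exp\left\{ \boldsymbol{\eta}^{\top} \mathbf{u}(\mathbf{x}) \right\}
```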
Exponential family: a note – some treatments set h(x) = 1 and write g(η) = exp(−ψ(η)), where ψ(η) is the log-partition function.
Assignment 1
About these lecture notes:
● They are designed to be a visual aid, but not reading material (you have the book for that).
● They are generally focused on derivations + plots, and less on the "story" of the model.
● I do not aim to produce new equations nor new plots (they don't necessarily help you learn about the data/plots in the book 🙂).
● Reasoning about ML models on toy data is a core skill of a good ML engineer.
● Designing appropriate toy data is a core research skill in ML.
● Polynomial curve fitting: model, loss/error function, over-fitting, regularisation
● Model selection
● Probabilities: sum rule, product rule, Gaussians – 1D, MLE, bias-variance → and how this helps curve-fitting
● Gaussians (multidimensional), various matrix identities, geometric intuitions
● Bernoulli, Binomial, Exponential family distributions
● Review: probabilities, derivatives and finding stationary points, eigenvalues and eigenvectors