https://xkcd.com/780/
Class update:
Exam info posted. Friday “week 13”.
Sampling can be hard to get right …
Week 12 Monday – no lecture, finish your video assignment + good luck with all end-of-S1 deadlines!
Week 12 Wednesday – Q&A, talk about selected questions in the 2021 exam.
“Week 13” – drop-in sessions will be announced.
Sampling methods
Motivation: why sampling? Sampling and ML
Basic sampling algorithms
● Sampling standard distributions from U(0, 1)
● Rejection sampling
● Importance sampling
Markov chain Monte Carlo (MCMC)
● Markov chains
● Metropolis-Hastings sampling
● Gibbs sampling
Bishop Chapter 11
Intro, 11.1.1, 11.1.2, 11.1.4, 11.1.6
11.2, 11.2.1, 11.2.2, 11.2.3, 11.3
Estimating 𝛑 – Buffon’s needle
https://mathworld.wolfram.com/BuffonsNeedlePro
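Buffon's needle drops a needle of length L onto a floor ruled with parallel lines a distance d apart (L ≤ d); the needle crosses a line with probability 2L/(πd), so counting crossings gives an estimate of π. A minimal simulation sketch (drawing the angle with `math.pi` is a convenience of the demo, not part of the estimate itself):

```python
import math, random

def estimate_pi_buffon(n_drops, needle_len=1.0, line_gap=1.0, seed=0):
    """Estimate pi via Buffon's needle (requires needle_len <= line_gap)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_drops):
        x = rng.uniform(0.0, line_gap / 2)       # centre's distance to the nearest line
        theta = rng.uniform(0.0, math.pi / 2)    # acute angle between needle and lines
        if x <= (needle_len / 2) * math.sin(theta):
            hits += 1
    # P(cross) = 2*L / (pi*d)  =>  pi ≈ 2*L*n / (d*hits)
    return 2 * needle_len * n_drops / (line_gap * hits)
```

With a few hundred thousand drops the estimate typically lands within a few hundredths of π; convergence is slow, at the usual Monte Carlo 1/√n rate.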
Why do we need sampling?
Distributions can be quite complex, e.g. posterior distributions, mixture distributions, graphical models (coming next).
Many ML tasks can be stated as estimating the expectation of functions under a distribution, e.g. posterior mean and variance, moments.
Example: the posterior distribution in Bayesian logistic regression, which we previously handled with the Laplace approximation.
Goal: estimate the expectation E[f] = ∫ f(z) p(z) dz.
Draw samples z(l) ~ p(z), l = 1, …, L, and use the estimator f̂ = (1/L) Σl f(z(l)).
The good:
● the accuracy of the estimator does not depend on the dimensionality of z
● in principle, high accuracy may be achievable with a relatively small number of samples
The bad:
● samples {z(l)} might not be independent, and so the effective sample size might be smaller than the apparent sample size
● if f(z) is small in regions where p(z) is large, and vice versa, then the expectation may be dominated by regions of small probability, implying that relatively large sample sizes will be required to achieve sufficient accuracy
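The basic estimator above can be sketched in a few lines. The standard-normal target and f(z) = z² are illustrative choices; the true value is E[z²] = 1 (the variance):

```python
import random

def mc_expectation(f, sampler, n):
    """Plain Monte Carlo: E[f] ≈ (1/L) * sum of f(z_l), with z_l drawn from p."""
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(1)
# f(z) = z^2 under a standard normal; the true expectation is 1
est = mc_expectation(lambda z: z * z, lambda: rng.gauss(0.0, 1.0), 100_000)
```

The estimator's standard error shrinks like 1/√L regardless of the dimensionality of z, which is exactly the "good" property listed above.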
Sampling standard distributions from U(0, 1)
For this class: assume we can generate z ~ U(0, 1).
To obtain y ~ p(y):
● Generate z ~ U(0, 1)
● Use the inverse CDF, y = h⁻¹(z), where h is the CDF of p, to obtain y ~ p(y)
Multiple variables
Example: the exponential distribution
https://en.wikipedia.org/wiki/Exponential_distribution
(Typo in the book's online PDF version, fixed in the latest print version.)
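For the exponential distribution the inverse-CDF recipe works out in closed form: the CDF is h(y) = 1 − exp(−λy), so y = −ln(1 − z)/λ with z ~ U(0, 1). A minimal sketch:

```python
import math, random

def sample_exponential(lam, rng):
    """Inverse-CDF sampling: h(y) = 1 - exp(-lam*y), so y = -ln(1 - z) / lam."""
    z = rng.random()                 # z ~ U(0, 1)
    return -math.log(1.0 - z) / lam

rng = random.Random(0)
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)   # should be close to 1/lam = 0.5
```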
Full derivation: through change of variable to the polar coordinates (r, 𝛉), somewhat involved
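The polar change of variables alluded to here is (presumably) the Box-Muller transform for Gaussian samples: two U(0, 1) draws become a radius and an angle, which become two independent N(0, 1) samples. A minimal sketch:

```python
import math, random

def box_muller(rng):
    """Two independent N(0, 1) samples from two U(0, 1) samples,
    via the change of variables to polar coordinates (r, theta)."""
    z1 = 1.0 - rng.random()              # in (0, 1], avoids log(0)
    z2 = rng.random()
    r = math.sqrt(-2.0 * math.log(z1))   # radius: r^2 = -2 ln z1
    theta = 2.0 * math.pi * z2           # angle: uniform on [0, 2*pi)
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(0)
xs = [x for _ in range(50_000) for x in box_muller(rng)]
mean = sum(xs) / len(xs)                            # ≈ 0
var = sum(x * x for x in xs) / len(xs) - mean ** 2  # ≈ 1
```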
Rejection sampling – example
Gamma distribution with a > 1
Proposal distribution: Cauchy
Generate u ~ U[0, 1]; then z = b tan(π(u − ½)) + c is Cauchy-distributed with location c and scale b.
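A sketch of this rejection sampler, following Bishop's choices c = a − 1 and b² = 2a − 1 so that the Cauchy proposal matches the Gamma mode and spread. Finding the envelope constant k on a numerical grid is an assumption of this demo (Bishop derives it analytically):

```python
import math, random

def p_tilde(z, a):
    # unnormalised Gamma(a, 1) density, zero for z <= 0
    return z ** (a - 1) * math.exp(-z) if z > 0 else 0.0

def cauchy_q(z, c, b):
    # Cauchy proposal density with location c and scale b
    return 1.0 / (math.pi * b * (1.0 + ((z - c) / b) ** 2))

def rejection_sample_gamma(a, n, seed=0):
    rng = random.Random(seed)
    c, b = a - 1.0, math.sqrt(2.0 * a - 1.0)
    # envelope constant k ≈ sup p_tilde/q, found on a grid (assumes the grid is wide enough)
    k = max(p_tilde(0.01 * i, a) / cauchy_q(0.01 * i, c, b) for i in range(1, 5001))
    samples = []
    while len(samples) < n:
        u = rng.random()
        z = b * math.tan(math.pi * (u - 0.5)) + c   # Cauchy sample via inverse CDF
        # accept z with probability p_tilde(z) / (k * q(z))
        if rng.random() * k * cauchy_q(z, c, b) <= p_tilde(z, a):
            samples.append(z)
    return samples

s = rejection_sample_gamma(3.0, 20_000)
mean = sum(s) / len(s)   # Gamma(a, 1) has mean a, so ≈ 3 here
```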
Rejection sampling – challenges
The proposal distribution needs to have heavier tails than the target distribution.
The acceptance rate can decrease exponentially as the dimensionality increases.
Importance sampling
All samples are retained.
Importance weights: rl = p(z(l)) / q(z(l)), giving E[f] ≈ (1/L) Σl rl f(z(l)).
Estimating the (ratio of) normalisation constants
Often we can evaluate a distribution up to a normalising constant, but it is hard to know what that constant is.
Using the same samples, the ratio of normalisation constants Zp/Zq can be estimated as the average of the unnormalised weights r̃l = p̃(z(l)) / q̃(z(l)).

Importance sampling – comments
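Putting the two ideas together gives self-normalised importance sampling: the same unnormalised weights estimate both the expectation and the normalising constant, so the constant cancels. The unnormalised N(0, 1) target, the wider N(0, 2²) proposal, and f(z) = z² below are illustrative choices:

```python
import math, random

def snis_expectation(f, p_tilde, q_pdf, q_sample, n, seed=0):
    """Self-normalised importance sampling:
    E_p[f] ≈ sum(w_l * f(z_l)) / sum(w_l), with w_l = p_tilde(z_l)/q(z_l), z_l ~ q."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        z = q_sample(rng)
        w = p_tilde(z) / q_pdf(z)
        num += w * f(z)
        den += w
    return num / den

# target: unnormalised N(0, 1); proposal: the heavier-tailed N(0, 2^2)
p_tilde = lambda z: math.exp(-0.5 * z * z)
q_pdf = lambda z: math.exp(-0.5 * (z / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))
q_sample = lambda rng: rng.gauss(0.0, 2.0)
est = snis_expectation(lambda z: z * z, p_tilde, q_pdf, q_sample, 100_000)  # ≈ 1
```

Note the proposal has heavier tails than the target, as the rejection-sampling discussion also requires; a too-narrow proposal would make a few huge weights dominate the estimate.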
Sampling and the EM algorithm
The E-step expectation Q(θ, θold) = ∫ p(Z | X, θold) ln p(Z, X | θ) dZ can be intractable. Sample this: draw Z(l) ~ p(Z | X, θold) and approximate Q by the sample average of ln p(Z(l), X | θ).
Monte Carlo EM: EM with this sampled E-step.
Being Bayesian: sampling the parameters as well as the latent variables gives a fully Bayesian treatment.
Sampling methods
Motivation: why sampling? Sampling and ML
Basic sampling algorithms
● Sampling standard distributions from U(0, 1)
● Rejection sampling
● Importance sampling
Markov chain Monte Carlo (MCMC)
● Markov chains
● Metropolis-Hastings sampling
● Gibbs sampling
Markov chain Monte Carlo (MCMC)
Motivation:
● Cover high-probability regions of p(z): can we move towards them?
● Scale better with the dimensionality of the sample space.
Metropolis Algorithm
Metropolis Algorithm – illustration
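A minimal random-walk Metropolis sketch: propose z* from a symmetric Gaussian around the current state and accept with probability min(1, p̃(z*)/p̃(z)). The unnormalised N(0, 1) target and the step size are illustrative assumptions:

```python
import math, random

def metropolis(p_tilde, z0, step, n, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal.
    Accept z* with probability min(1, p_tilde(z*) / p_tilde(z))."""
    rng = random.Random(seed)
    z, chain = z0, []
    for _ in range(n):
        z_star = z + rng.gauss(0.0, step)
        if rng.random() < min(1.0, p_tilde(z_star) / p_tilde(z)):
            z = z_star                 # accept; otherwise keep the current state
        chain.append(z)                # the (possibly repeated) state is always recorded
    return chain

# target: unnormalised N(0, 1)
chain = metropolis(lambda z: math.exp(-0.5 * z * z), z0=0.0, step=1.0, n=50_000)
mean = sum(chain) / len(chain)                            # ≈ 0
var = sum(z * z for z in chain) / len(chain) - mean ** 2  # ≈ 1
```

Note that only the ratio of p̃ values appears, so the normalising constant is never needed, and a rejected proposal still contributes a (repeated) sample to the chain.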
Random walk as a Markov Chain
After τ steps, the random walk has only travelled a distance that on average is proportional to the square root of τ.
→ Random walks are very inefficient at exploring the state space.
Design goal of MCMC algorithms: avoid random-walk behaviour.
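The √τ behaviour is easy to check empirically for a ±1 random walk, where E[distance²] = τ exactly:

```python
import math, random

def rms_distance(tau, n_walks, seed=0):
    """Root-mean-square displacement of a +/-1 random walk after tau steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        pos = sum(rng.choice((-1, 1)) for _ in range(tau))
        total += pos * pos
    return math.sqrt(total / n_walks)

# RMS displacement grows like sqrt(tau): roughly 10 after 100 steps, 20 after 400
```

So quadrupling the number of steps only doubles the distance covered, which is why plain random-walk exploration scales poorly.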
First-order Markov chain
A series of random variables z(1), …, z(M) such that p(z(m+1) | z(1), …, z(m)) = p(z(m+1) | z(m)).
MCMC – why it works
MCMC – why it works
The Metropolis-Hastings algorithm
Metropolis-Hastings: why does it work?
Gibbs sampling
Gibbs sampling: why does it work?
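A standard Gibbs-sampling sketch for a bivariate Gaussian with unit variances and correlation ρ, where each full conditional is the known closed form zi | zj ~ N(ρ zj, 1 − ρ²); the value ρ = 0.8 is an illustrative choice:

```python
import math, random

def gibbs_bivariate_gaussian(rho, n, seed=0):
    """Gibbs sampling for a bivariate Gaussian with unit variances and
    correlation rho: alternately resample each coordinate from its
    full conditional z_i | z_j ~ N(rho * z_j, 1 - rho^2)."""
    rng = random.Random(seed)
    z1 = z2 = 0.0
    sd = math.sqrt(1.0 - rho * rho)
    chain = []
    for _ in range(n):
        z1 = rng.gauss(rho * z2, sd)   # sample z1 from p(z1 | z2)
        z2 = rng.gauss(rho * z1, sd)   # sample z2 from p(z2 | z1)
        chain.append((z1, z2))
    return chain

chain = gibbs_bivariate_gaussian(0.8, 50_000)
corr = sum(a * b for a, b in chain) / len(chain)   # ≈ rho = 0.8
```

Every proposal is accepted (Gibbs is Metropolis-Hastings with acceptance probability 1), but strong correlation between the coordinates still slows the chain's exploration.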
Sampling methods
Motivation: why sampling? Sampling and ML
Basic sampling algorithms
● Sampling standard distributions from U(0, 1)
● Rejection sampling
● Importance sampling
Markov chain Monte Carlo (MCMC)
● Markov chains
● Metropolis-Hastings sampling
● Gibbs sampling