Probabilistic modelling
How could this data have been generated?
Can you fit models to the data?
Gaussian mixture models (GMMs)
§ Assume data was generated by a set of Gaussian distributions.
§ The probability density is a mixture of them.
§ Find the parameters of the Gaussian distributions and how much each distribution contributes to the data.
§ This is a mixture of Gaussians model.
Visualizing GMMs – 1D Gaussians
If you fit one Gaussian
Now we try a GMM with 2 Gaussians
(each contributes 50%)
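The comparison between one Gaussian and a two-component mixture can be sketched numerically. This is a minimal illustration, not the lecture's own figure: the component means (-2 and 2), the standard deviations (0.5) and the helper names `normal_pdf`, `sample_mixture` and `mixture_pdf` are all assumptions chosen for the example.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def sample_mixture(n, weights, mus, sigmas, rng):
    """Draw n points: pick a component by its weight, then sample from it."""
    data = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        data.append(rng.gauss(mus[k], sigmas[k]))
    return data

rng = random.Random(0)
# Assumed illustrative parameters: two well-separated components, 50% each.
weights, mus, sigmas = [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5]
data = sample_mixture(2000, weights, mus, sigmas, rng)

# A single Gaussian fitted by maximum likelihood: sample mean and std.
mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

def mixture_pdf(x):
    """Density of the two-component mixture that generated the data."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

# Compare both densities between the clusters (x = 0) and on a cluster (x = 2).
single_at_0 = normal_pdf(0.0, mean, std)
mixture_at_0 = mixture_pdf(0.0)
single_at_2 = normal_pdf(2.0, mean, std)
mixture_at_2 = mixture_pdf(2.0)
```

The single Gaussian places its peak between the two clusters, where almost no data lies, while the mixture concentrates its mass on the clusters themselves.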
Generative Models
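§ In supervised learning, we model the joint distribution $p(x, z) = p(x \mid z)\, p(z)$.
§ In unsupervised learning, we do not have labels z, so we model the marginal $p(x) = \sum_z p(x, z) = \sum_z p(x \mid z)\, p(z)$.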
Hidden/Latent variables
Gaussian mixture models (GMMs)
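§ A GMM represents a distribution as $p(x) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)$,
§ with $\pi_k$ the mixing coefficients, where $\sum_{k=1}^{K} \pi_k = 1$ and $\pi_k \ge 0$ for all $k$.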
• GMM is a density estimator.
• GMMs are universal approximators of densities (given enough Gaussians).
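To make "density estimator" concrete, the sketch below evaluates a 1D GMM density and checks numerically that it integrates to about 1 when the mixing coefficients are non-negative and sum to 1. The three components and the function names `normal_pdf` and `gmm_pdf` are hypothetical, picked only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, pis, mus, sigmas):
    """Mixture density: weighted sum of the component densities."""
    if any(p < 0 for p in pis) or abs(sum(pis) - 1.0) > 1e-9:
        raise ValueError("mixing coefficients must be non-negative and sum to 1")
    return sum(p * normal_pdf(x, m, s) for p, m, s in zip(pis, mus, sigmas))

# Hypothetical three-component mixture.
pis = [0.2, 0.5, 0.3]
mus = [-3.0, 0.0, 4.0]
sigmas = [1.0, 0.5, 2.0]

# Riemann sum over a wide grid approximates the integral of the density.
step = 0.01
grid = [-20.0 + i * step for i in range(4001)]
area = sum(gmm_pdf(x, pis, mus, sigmas) for x in grid) * step
```

The constraint on the mixing coefficients is what keeps the weighted sum a valid density.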
Fitting GMMs: Maximum likelihood and EM
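§ To make the model fit the data as well as possible, we maximize the (log) likelihood $\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k)$.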
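§ Expectation
If we knew $\pi_k$, $\mu_k$ and $\Sigma_k$, we could compute the "soft" assignments $P(z^{(n)} = k \mid x^{(n)})$, the responsibilities.
§ Maximization
If we knew the responsibilities, we could estimate $\pi_k$, $\mu_k$ and $\Sigma_k$.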
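The quantity being maximized can be computed directly. The sketch below evaluates the log-likelihood of a 1D GMM, assuming scalar standard deviations in place of covariance matrices and using the log-sum-exp trick for numerical stability. The tiny data set and the two parameter settings are made up for the example.

```python
import math

def log_normal_pdf(x, mu, sigma):
    """Log-density of a 1D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_likelihood(data, pis, mus, sigmas):
    """Sum over points of log(sum_k pi_k * N(x | mu_k, sigma_k))."""
    total = 0.0
    for x in data:
        terms = [math.log(p) + log_normal_pdf(x, m, s)
                 for p, m, s in zip(pis, mus, sigmas)]
        top = max(terms)  # log-sum-exp: factor out the largest term
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total

# Made-up data with two obvious groups around -2 and 2.
data = [-2.1, -1.9, -2.0, 1.8, 2.0, 2.2]

# Parameters centred on the groups score higher than misplaced ones.
good = log_likelihood(data, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5])
poor = log_likelihood(data, [0.5, 0.5], [-0.5, 0.5], [0.5, 0.5])
```

Because the logarithm sits outside the sum over components, there is no closed-form maximizer, which is why the iterative procedure is needed.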
Expectation-Maximization (EM Algorithm)
An optimization process that alternates between 2 steps:
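1. E-step: compute the posterior probability over z given the current model, i.e. the responsibility $\gamma_k^{(n)} = p(z^{(n)} = k \mid x^{(n)}) = \frac{\pi_k\, \mathcal{N}(x^{(n)} \mid \mu_k, \Sigma_k)}{\sum_{j} \pi_j\, \mathcal{N}(x^{(n)} \mid \mu_j, \Sigma_j)}$.
Ø Which Gaussian generated each data point, and with what probability?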
2. M-step: Assuming data was really generated this way, change the parameters of each Gaussian to maximize the probability that it would generate the data it is currently responsible for.
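One iteration of the two steps can be sketched as follows for a 1D mixture. This is a simplified illustration under stated assumptions: scalar standard deviations instead of covariance matrices, the standard responsibility-weighted mean and variance updates for the M-step, and hypothetical function names `e_step` and `m_step`. The data and starting parameters are made up.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def e_step(data, pis, mus, sigmas):
    """Responsibilities resp[n][k] = P(z_n = k | x_n) under the current model."""
    resp = []
    for x in data:
        weighted = [p * normal_pdf(x, m, s) for p, m, s in zip(pis, mus, sigmas)]
        total = sum(weighted)
        resp.append([w / total for w in weighted])
    return resp

def m_step(data, resp):
    """Re-estimate each Gaussian from the points it is responsible for."""
    n, k_count = len(data), len(resp[0])
    pis, mus, sigmas = [], [], []
    for k in range(k_count):
        nk = sum(r[k] for r in resp)  # effective number of points for component k
        mu = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / nk
        pis.append(nk / n)
        mus.append(mu)
        sigmas.append(math.sqrt(var))
    return pis, mus, sigmas

# Made-up data with groups near -2 and 2, and a deliberately rough start.
data = [-2.1, -1.9, -2.0, 1.8, 2.0, 2.2]
pis, mus, sigmas = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

resp = e_step(data, pis, mus, sigmas)
new_pis, new_mus, new_sigmas = m_step(data, resp)
```

After this single iteration the two means have already moved from -1 and 1 towards the two groups of points.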
§ A general algorithm for optimizing many latent variable models (not just for GMMs).
§ Iteratively computes a lower bound then optimizes it.
§ Converges, but possibly only to a local optimum (a local maximum of the likelihood).
§ Multiple restarts from different initializations can help.
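The restart strategy can be sketched as a loop that runs EM from several random initializations and keeps the run with the highest final log-likelihood. Everything here is an assumption made for illustration: 1D data, means initialized at randomly chosen data points, a small floor on the standard deviation to avoid degenerate components, fixed iteration counts, and the names `em_step` and `fit_gmm`.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, pis, mus, sigmas):
    # Direct form; fine for this small example, less stable than log-sum-exp.
    return sum(math.log(sum(p * normal_pdf(x, m, s)
                            for p, m, s in zip(pis, mus, sigmas))) for x in data)

def em_step(data, pis, mus, sigmas, min_sigma=1e-3):
    """One E-step followed by one M-step for a 1D mixture."""
    resp = []
    for x in data:
        w = [p * normal_pdf(x, m, s) for p, m, s in zip(pis, mus, sigmas)]
        total = sum(w)
        resp.append([v / total for v in w])
    new_pis, new_mus, new_sigmas = [], [], []
    for k in range(len(pis)):
        nk = sum(r[k] for r in resp)
        mu = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / nk
        new_pis.append(nk / len(data))
        new_mus.append(mu)
        new_sigmas.append(max(math.sqrt(var), min_sigma))  # assumed safeguard
    return new_pis, new_mus, new_sigmas

def fit_gmm(data, k, rng, iters=60):
    """Run EM from one random start; return parameters and likelihood history."""
    mean = sum(data) / len(data)
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
    pis, mus, sigmas = [1.0 / k] * k, rng.sample(data, k), [std] * k
    history = [log_likelihood(data, pis, mus, sigmas)]
    for _ in range(iters):
        pis, mus, sigmas = em_step(data, pis, mus, sigmas)
        history.append(log_likelihood(data, pis, mus, sigmas))
    return (pis, mus, sigmas), history

rng = random.Random(0)
# Synthetic data: two clusters around -2 and 2 (assumed for the example).
data = ([rng.gauss(-2.0, 0.5) for _ in range(100)]
        + [rng.gauss(2.0, 0.5) for _ in range(100)])

# Several restarts; keep the one with the highest final log-likelihood.
runs = [fit_gmm(data, 2, rng) for _ in range(10)]
best_params, best_history = max(runs, key=lambda run: run[1][-1])
```

Within each run the log-likelihood history should never decrease, which reflects the lower-bound view of EM; the restarts only guard against an unlucky starting point.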
Elements of Statistical Learning (2nd edition)
§ Clustering
• Groups similar data points.
• Needs a distance measure.
§ Agglomerative hierarchical clustering
• Successively merges similar groups of points.
• Builds a dendrogram (binary tree).
• There are different ways to measure the distance between clusters.
§ GMM using EM
• Builds a generative model based on Gaussian distributions.
• Needs k (the number of clusters) to be defined in advance.
• Uses EM to find the best fit of the model.
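The agglomerative procedure summarized above can be sketched in a few lines. This is a naive illustration for 1D points, not an efficient implementation: the two cluster distances shown (single and complete linkage) are common choices assumed here, and the function names and data are hypothetical. The returned list of merges plays the role of the dendrogram.

```python
def single_linkage(a, b):
    """Cluster distance: smallest distance between any pair of points."""
    return min(abs(x - y) for x in a for y in b)

def complete_linkage(a, b):
    """Cluster distance: largest distance between any pair of points."""
    return max(abs(x - y) for x in a for y in b)

def agglomerate(points, linkage):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

# Made-up 1D points.
points = [1.0, 1.5, 5.0, 5.2, 9.0]
single_merges = agglomerate(points, single_linkage)
complete_merges = agglomerate(points, complete_linkage)
```

On these points the two linkages agree on the first two merges but differ on the third: single linkage joins the two pairs, while complete linkage attaches the isolated point 9.0 to the pair around 5 first.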