
Data Mining and Machine Learning
Statistical Modelling of Sequences (2)
Peter Jančovič
Slide 1

Objectives
• So far, we have introduced Markov models
• Hidden Markov models (HMMs)
• Calculating the probability of an observation sequence
• The Forward Probability calculation
• HMM training
Slide 2

Hidden Markov Models (HMMs)

Slide 3

HMMs (continued)
• In an HMM we assume that the current item purchased depends only on the current state (shop), and not on items previously purchased or shops previously visited
• Suppose xt is the state and yt is the item purchased at time t
• The diagram indicates the dependencies:
[Dependency diagram: the hidden state sequence {xt} generates the visible observation sequence {yt}]
Slide 4

Markov Model
[State transition diagram: a four-state Markov model with self-transition probabilities a11, a22, a33, a44 and transitions a12, a23, a24, a34; X = {xt} is the state sequence and Y = {yt} the observation sequence]
Slide 5

Hidden Markov Model
• In a hidden Markov model, the relationship between the observation sequence and the state sequence is ambiguous.
[Hidden Markov model diagram: the same four-state structure (self-transitions a11, a22, a33, a44; transitions a12, a23, a24, a34), now with hidden state sequence X = {xt} generating the observed sequence Y = {yt}]
Slide 6

HMMs Continued

Slide 7

Formal definition of an HMM
• An N-state HMM with observation symbols {I1, …, IM} comprises:
  • An underlying N-state Markov model, defined by an initial state probability vector P0 and an N×N state transition probability matrix A, where:
    – P0(n) = P(x1 = Sn)
    – Anm = P(xt = Sm | xt-1 = Sn)
  • An N×M state output probability matrix B, where:
    – Bnm = Bn(m) = P(yt = Im | xt = Sn)
Slide 8
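As a concrete illustration of this definition, here is a minimal sketch (not from the slides) of how P0, A and B for a small HMM might be held as NumPy arrays, together with the stochasticity checks the definition implies; the numbers are invented purely for illustration.

```python
import numpy as np

# Hypothetical 2-state HMM with M = 2 observation symbols (I1, I2).
# These values are illustrative only; they are not taken from the slides.
P0 = np.array([0.6, 0.4])              # P0(n) = P(x1 = Sn)
A  = np.array([[0.7, 0.3],             # Anm = P(xt = Sm | xt-1 = Sn)
               [0.2, 0.8]])
B  = np.array([[0.9, 0.1],             # Bn(m) = P(yt = Im | xt = Sn)
               [0.3, 0.7]])

# P0, each row of A, and each row of B must be a probability distribution.
assert np.isclose(P0.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```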

Example HMM Probability Calculation

0.5 0.5 0.2 0.3 P  0.2, A  0.1 0.1 0.8
0.3 0.4 0 0.6 
0
Slide 9
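A minimal sketch of the Slide 9 parameters in code, assuming the reconstruction of P0 and A shown above; the state output matrix B and the observed sequence for this example are not present in this extract, so only the visible quantities are entered and checked here.

```python
import numpy as np

# Initial state probabilities and transition matrix from Slide 9.
P0 = np.array([0.5, 0.2, 0.3])
A  = np.array([[0.5, 0.2, 0.3],
               [0.1, 0.1, 0.8],
               [0.4, 0.0, 0.6]])

# Sanity check: P0 and every row of A sum to 1.
assert np.isclose(P0.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
```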

Example (Continued)

Slide 10

Example (Continued)

Slide 11

Example (Continued)

Slide 12

Calculating the probability of an observed sequence
• Even in our simple example, with 3 states and 2 observations, there are 3^2 = 9 terms in the summation
• In general, if the Markov model is fully connected and has N states, and we have T observations, then the number of state sequences (and therefore the number of terms in the summation) is N^T. This makes direct calculation of the probability of the observed sequence computationally impractical (see the sketch below).
• However, there is an efficient solution…
Slide 13
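To make the N^T cost concrete, the sketch below sums the joint probability over every possible state sequence for a 3-state model and T = 2 observations (the nine terms mentioned above). Only P0 and A are taken from Slide 9; the output matrix B and the observation sequence y are invented for illustration.

```python
import itertools
import numpy as np

P0 = np.array([0.5, 0.2, 0.3])                 # from Slide 9
A  = np.array([[0.5, 0.2, 0.3],
               [0.1, 0.1, 0.8],
               [0.4, 0.0, 0.6]])
B  = np.array([[0.8, 0.2],                     # hypothetical output probabilities
               [0.4, 0.6],                     # (not from the slides)
               [0.1, 0.9]])
y  = [0, 1]                                    # hypothetical observation sequence (T = 2)

N, T = len(P0), len(y)
total = 0.0
# Enumerate all N**T state sequences: 3**2 = 9 terms in this example.
for states in itertools.product(range(N), repeat=T):
    # Joint probability of this state sequence and the observations:
    # P0(x1) * B_x1(y1) * A_x1x2 * B_x2(y2) * ...
    p = P0[states[0]] * B[states[0], y[0]]
    for t in range(1, T):
        p *= A[states[t - 1], states[t]] * B[states[t], y[t]]
    total += p

print("P(y) by direct summation:", total)
```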

The Forward Probability calculation

Slide 14
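The worked forward-probability slides are not recoverable from this extract, so the following is a sketch of the standard forward (alpha) recursion, which computes P(y) in O(N²T) operations rather than O(N^T). It reuses the Slide 9 values for P0 and A, while B and y remain the illustrative placeholders used above; the result matches the direct summation.

```python
import numpy as np

def forward_probability(P0, A, B, y):
    """Standard forward (alpha) recursion: returns P(y | model)."""
    N, T = len(P0), len(y)
    alpha = np.zeros((T, N))
    alpha[0] = P0 * B[:, y[0]]                 # alpha_1(n) = P0(n) * Bn(y1)
    for t in range(1, T):
        # alpha_t(m) = [ sum_n alpha_{t-1}(n) * Anm ] * Bm(yt)
        alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]
    return alpha[-1].sum()                     # P(y) = sum_n alpha_T(n)

# Same illustrative parameters as before (B and y are hypothetical).
P0 = np.array([0.5, 0.2, 0.3])
A  = np.array([[0.5, 0.2, 0.3],
               [0.1, 0.1, 0.8],
               [0.4, 0.0, 0.6]])
B  = np.array([[0.8, 0.2],
               [0.4, 0.6],
               [0.1, 0.9]])
print("P(y) by the forward algorithm:", forward_probability(P0, A, B, [0, 1]))
```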

Slide 15


Slide 16

Example
Slide 17

Example
Slide 18

Example
Slide 19

Example
Slide 20

Example
Slide 21

HMM Parameter Estimation
• Given an HMM and a sequence y, we can calculate P(y)
• But where does the HMM come from? In other words, how do we estimate the HMM's parameters?
• This is done from data, using an algorithm similar to the E-M algorithm for estimating the parameters of a GMM
• The HMM training algorithm is called the Baum-Welch algorithm
• Like the E-M algorithm, it involves making an initial estimate and then iteratively improving the estimate until convergence. Hence it is only locally optimal
Slide 22

HMM training
1. Make an initial estimate of the HMM – M0
2. Obtain a large set of training data Y
3. Set i=1
4. Apply the Baum-Welch algorithm to Y and Mi-1 to get a new model Mi such that P(Y|Mi) ≥P(Y|Mi-1)
5. If |P(Y|Mi) – P(Y|Mi-1)| ≤ ε then stop, otherwise:
   1. Set i = i + 1
   2. Go back to step 4 (a code sketch of this loop is given below)
Slide 23
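A minimal sketch of steps 1-5 as a training loop, assuming hypothetical callables baum_welch_update (one re-estimation pass over the training data) and log_likelihood (computes log P(Y|M), for example via the forward algorithm sketched earlier); these names are placeholders for this handout, not a specific library API.

```python
def train_hmm(model0, Y, baum_welch_update, log_likelihood, eps=1e-4, max_iters=100):
    """Iterative HMM training (steps 1-5): re-estimate the model until the
    likelihood improvement falls below eps. baum_welch_update and
    log_likelihood are placeholders for the re-estimation and P(Y|M)
    computations described in the lecture."""
    model = model0                               # step 1: initial estimate M0
    prev_ll = log_likelihood(model, Y)           # step 2: training data Y is given
    for i in range(1, max_iters + 1):            # steps 3-5
        model = baum_welch_update(model, Y)      # Baum-Welch guarantees P(Y|Mi) >= P(Y|Mi-1)
        ll = log_likelihood(model, Y)
        if abs(ll - prev_ll) <= eps:             # convergence test: only locally optimal
            break
        prev_ll = ll
    return model
```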

Local optimality
[Plot of P(Y|M) against the model M: the sequence of estimates M0, M1, …, Mn climbs towards a local maximum of P(Y|M), which need not be the global maximum]
Slide 24

Summary
• Hidden Markov Models
• Calculating the probability of an observation sequence
• The forward probability calculation
• HMM training
Slide 25