
Data Mining and Machine Learning
Statistical Modelling of Sequences (1) Peter Jančovič

Objectives
• Extension of dynamic programming to statistical modelling of sequences
• Introduction to Markov models through example
• Calculation of probability of a state sequence
• State distribution
• Relationship to Page Rank

Sequence retrieval using DP
Corpus of sequential data
…… AAGDTDTDTDD AABBCBDAAAAAAA BABABABBCCDF GGGGDDGDGDGDGDTDTD DGDGDGDGD AABCDTAABCDTAABCDTAAB CDCDCDTGGG GGAACDTGGGGGAAA …….
‘query’ sequence Q
…BBCCDDDGDGDGDCDTCDTTDCCC…
Calculate a Dynamic Programming distance d(S, Q) for each sequence S in the corpus, and retrieve the closest match:
  Ŝ = argmin_S d(S, Q)
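The retrieval rule above can be sketched in code. The lecture does not fix the distance d(S, Q); edit (Levenshtein) distance, computed by DP, is assumed here as one concrete choice:

```python
# Sketch of DP-based sequence retrieval. d(S, Q) is taken to be the
# Levenshtein (edit) distance, computed row by row with dynamic programming.
def edit_distance(s, q):
    """Minimum number of insertions/deletions/substitutions turning s into q."""
    prev = list(range(len(q) + 1))          # DP row for the empty prefix of s
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, cq in enumerate(q, 1):
            cur.append(min(prev[j] + 1,                  # delete from s
                           cur[j - 1] + 1,               # insert into s
                           prev[j - 1] + (cs != cq)))    # substitute / match
        prev = cur
    return prev[-1]

def retrieve(corpus, q):
    """Return the corpus sequence S minimising d(S, Q)."""
    return min(corpus, key=lambda s: edit_distance(s, q))

corpus = ["AAGDTDTDTDD", "AABBCBDAAAAAAA", "BABABABBCCDF"]
print(retrieve(corpus, "AABBCBDAAAA"))   # → "AABBCBDAAAAAAA" (3 edits)
```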

Limitations of ‘template matching’
• This type of analysis is sometimes referred to as template matching
• The ‘templates’ are the sequences in the corpus
• Can think of each template as representing a ‘class’
• Problem: determine which class best fits the query
• Performance will depend on precisely which template is used to represent the class

Markov Chains
• Discussed briefly in the lecture on Page Rank
• Suppose that you want to understand the habits of shoppers on a High Street
• Suppose that:
– There are N shops: S1, S2, …, SN
– The probability that the next shop a shopper visits is Sj depends only on the shop Si that the shopper is currently visiting – this is the Markov Property

Markov Chains (continued)
• In other words, if xn is the nth shop visited:
  P(xn = Sj | xn−1 = Si, xn−2, xn−3, …, x0) = P(xn = Sj | xn−1 = Si)
• In a Markov chain S1, …, SN are called states
• The behaviour of the shopper is completely described by two factors: the initial state probability vector P0 and the state transition probability matrix A:
  P0 = (P0(1), P0(2), …, P0(N))ᵀ
  A = | a11 a12 … a1N |
      | a21 a22 … a2N |
      |  :   :      :  |
      | aN1 aN2 … aNN |
  where aij = P(xn = Sj | xn−1 = Si)
Markov Chains (continued)


Example
• Suppose N = 3,
  P0 = | 0.5 |     A = | 0.5 0.2 0.3 |
       | 0.2 |         | 0.1 0.1 0.8 |
       | 0.3 |         | 0.4 0.0 0.6 |
• This means (for example):
– Half of all shoppers start in shop S1
– Half of all shoppers return to S1 immediately after leaving it (!?)
– Shoppers never visit S2 immediately after S3

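The example's parameters can be written down directly. A minimal Python sketch (states S1–S3 indexed 0–2, an assumed encoding) that checks they are valid probability distributions and reads the example's claims off the entries:

```python
# Sketch: the 3-shop Markov chain from the example, stored as plain lists.
# States S1, S2, S3 are indexed 0, 1, 2.
P0 = [0.5, 0.2, 0.3]           # P0[i] = P(first shop is S_{i+1})
A = [[0.5, 0.2, 0.3],          # A[i][j] = P(next shop S_{j+1} | current S_{i+1})
     [0.1, 0.1, 0.8],
     [0.4, 0.0, 0.6]]

# Sanity checks: P0 and every row of A must be probability distributions.
assert abs(sum(P0) - 1.0) < 1e-9
assert all(abs(sum(row) - 1.0) < 1e-9 for row in A)

# The example's claims can be read straight off the parameters:
print(P0[0])      # half of all shoppers start in S1 -> 0.5
print(A[0][0])    # half return to S1 immediately after leaving it -> 0.5
print(A[2][1])    # S2 is never visited immediately after S3 -> 0.0
```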
Transition network representation
[Figure: transition network for the example. Nodes S1, S2, S3; each arc is labelled with the corresponding entry of A: S1→S1 0.5, S1→S2 0.2, S1→S3 0.3; S2→S1 0.1, S2→S2 0.1, S2→S3 0.8; S3→S1 0.4, S3→S3 0.6.]

A simple probability calculation
• Suppose that x = x1, …, xT is a sequence of states (shops)
• Then the probability P(x) is given by:
  P(x) = P(xT | xT−1) × P(xT−1 | xT−2) × ⋯ × P(x2 | x1) × P(x1)
       = a(xT−1, xT) × a(xT−2, xT−1) × ⋯ × a(x1, x2) × P0(x1)
• Example: given the previous 3-state Markov model, suppose
  x = S1 S3 S1 S1 S2
  P(x) = a12 × a11 × a31 × a13 × P0(1) = 0.2 × 0.5 × 0.4 × 0.3 × 0.5 = 0.006
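The same calculation can be sketched in Python, reusing the example's P0 and A (states S1–S3 indexed 0–2, an assumed encoding):

```python
# Sketch: probability of a state sequence under the lecture's 3-shop
# Markov chain.
P0 = [0.5, 0.2, 0.3]
A = [[0.5, 0.2, 0.3],
     [0.1, 0.1, 0.8],
     [0.4, 0.0, 0.6]]

def sequence_probability(x, P0, A):
    """P(x) = P0(x1) times the product of transition probabilities along x."""
    p = P0[x[0]]
    for prev, cur in zip(x, x[1:]):
        p *= A[prev][cur]
    return p

x = [0, 2, 0, 0, 1]                      # S1 S3 S1 S1 S2
print(sequence_probability(x, P0, A))    # ≈ 0.006
```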

State distribution at time t
• Let Pt denote the state distribution at time t, so that Pt(i) = P(xt = Si)
• Summing over the possible previous states: Pt(j) = Σi Pt−1(i) aij, i.e. Pt = AᵀPt−1

State distribution at time t (cont.)
• In particular, P1 = AᵀP0
• Hence P2 = AᵀP1 = Aᵀ(AᵀP0) = (Aᵀ)²P0
• In general, Pt = (Aᵀ)ᵗP0
• Example: for our 3-state Markov model,
  P0 = | 0.5 |     P1 = AᵀP0 = | 0.5 0.1 0.4 | | 0.5 |   | 0.39 |
       | 0.2 |                 | 0.2 0.1 0.0 | | 0.2 | = | 0.12 | , etc.
       | 0.3 |                 | 0.3 0.8 0.6 | | 0.3 |   | 0.49 |

Page Rank revisited

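The connection noted in the objectives: PageRank scores a page by its stationary probability under a "random surfer" Markov chain over pages, i.e. the limit of Pt as t grows (real PageRank also adds a damping/teleport term). A sketch of that idea, running the power iteration on the shop example in place of a web graph:

```python
# Sketch: the stationary distribution of a Markov chain by power iteration,
# i.e. the PageRank idea applied to the lecture's 3-shop chain.
P0 = [0.5, 0.2, 0.3]
A = [[0.5, 0.2, 0.3],
     [0.1, 0.1, 0.8],
     [0.4, 0.0, 0.6]]

def step(P, A):
    """One update Pt = A-transpose times P(t-1)."""
    n = len(P)
    return [sum(P[i] * A[i][j] for i in range(n)) for j in range(n)]

pi = P0
for _ in range(200):             # iterate Pt = (A^T)^t P0 until convergence
    pi = step(pi, A)

# pi is now (numerically) the stationary distribution: pi = A^T pi.
# A "PageRank" for the shops would rank them by these probabilities.
print([round(p, 3) for p in pi])   # ≈ [0.414, 0.092, 0.494]
```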

Hidden Markov Models (HMMs)


Summary
• Markov processes and Markov models
• Hidden Markov models