
Data Mining and Machine Learning
Speech Recognition using HMMs – Viterbi Decoding
Peter Jančovič

Viterbi Decoding
 Viterbi decoding is the algorithm used to find the sequence of HMM states (or HMMs) that is most likely to have generated a given observation sequence
 The computation is similar to the forward probability calculation, but with the sum over predecessor states replaced by a maximum

Viterbi Decoding (1)
[Figure: HMM M aligned with the observation sequence y1 y2 … yt … yT]
Q: How can M have generated Y? A: Via a state sequence of length T

Basic Probability Calculation
[Figure: one path through a 5-state HMM aligned with observations y1 … yT, annotated with the state-output probabilities b1(y1), b1(y2), b1(y3), b2(y4), b2(y5), …, b2(yt-1), b3(yt), …, b5(yT) and the transition probabilities a11, a12, a22, a23, a35 along the path]

Viterbi Decoding (2)
 Construction of ‘state-time trellis’
[Figure: state-time trellis over the observations y1 y2 … yt … yT]

Viterbi Decoding (3)
 Let X ={x1,…,xT} be a state sequence of length T
 The joint probability of Y and X is given by:
p(Y,X)b (y)T a b (y)
x1
1 t2t1tt
xxxt
 i.e. the product of the state-output and state transition probabilities along the state sequence
 The optimal state sequence is the sequence X such that p(Y,X) is maximized
 p(Y) is the sum of p(Y,X) over all sequences X
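The product above can be computed directly. A minimal sketch, using a small hypothetical 3-state discrete-output HMM (the transition and output probabilities are illustrative, not from the slides):

```python
# Hypothetical toy HMM: 3 states, 2 discrete observation symbols.
# A[i][j] = transition probability from state i to state j;
# B[i][k] = probability that state i emits symbol k.
A = [[0.6, 0.4, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
B = [[0.9, 0.1],
     [0.2, 0.8],
     [0.5, 0.5]]

def joint_prob(Y, X):
    """p(Y,X) = b_{x1}(y1) * prod_{t=2}^{T} a_{x(t-1) x(t)} * b_{x(t)}(yt)."""
    p = B[X[0]][Y[0]]
    for t in range(1, len(Y)):
        p *= A[X[t - 1]][X[t]] * B[X[t]][Y[t]]
    return p

Y = [0, 1, 1, 0]    # observation sequence
X = [0, 0, 1, 2]    # one candidate state sequence
print(joint_prob(Y, X))
```

In practice these products underflow quickly, so real decoders work with log probabilities and sums instead.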

Viterbi Decoding (4)
y1 y2 y3 y4 y5 yt-1 yt yT
(i,t)
αt(i) = Prob(y1, …, yt, optimal state sequence ending at (i,t))
αt(i) = max{ αt−1(i−1)·ai−1,i , αt−1(i)·ai,i } · bi(yt)
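A sketch of this recursion for a strictly left-to-right model (each state reachable only from itself or from state i−1, exactly as in the formula above), computed in the log domain to avoid underflow. The model parameters are hypothetical:

```python
import math

def lg(p):
    """Log of a probability, with log 0 -> -inf."""
    return math.log(p) if p > 0 else -math.inf

def viterbi_left_to_right(A, B, Y):
    """alpha_t(i) = max(alpha_{t-1}(i-1) + a_{i-1,i},
                        alpha_{t-1}(i)   + a_{i,i}) + b_i(y_t)
    in the log domain; also records back-pointers for trace-back."""
    N, T = len(B), len(Y)
    alpha = [[-math.inf] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    alpha[0][0] = B[0][Y[0]]          # paths must start in state 0
    for t in range(1, T):
        for i in range(N):
            stay = alpha[t - 1][i] + A[i][i]
            move = alpha[t - 1][i - 1] + A[i - 1][i] if i > 0 else -math.inf
            alpha[t][i] = max(stay, move) + B[i][Y[t]]
            back[t][i] = i if stay >= move else i - 1
    return alpha, back

# Hypothetical 3-state model with 2 discrete output symbols (log domain)
A = [[lg(0.6), lg(0.4), lg(0.0)],
     [lg(0.0), lg(0.7), lg(0.3)],
     [lg(0.0), lg(0.0), lg(1.0)]]
B = [[lg(0.9), lg(0.1)],
     [lg(0.2), lg(0.8)],
     [lg(0.5), lg(0.5)]]
alpha, back = viterbi_left_to_right(A, B, [0, 1, 1, 0])
print(math.exp(alpha[3][2]))   # probability of the best path ending in the final state
```

Each trellis node thus stores the best partial-path score and a pointer to the predecessor that achieved it; the pointers are what the trace-back step follows.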

Viterbi Decoding – example

[Figure sequence: the Viterbi recursion filling the state-time trellis column by column over the observations y1 y2 y3 y4 y5 y6 y7 y8, keeping at each node the best partial-path score and a pointer to its best predecessor]

Viterbi Decoding – Trace-back

[Figure sequence: trace-back through the trellis, following the stored best-predecessor pointers from the final state at t = 8 back to t = 1 to recover the optimal state sequence]
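The trace-back itself is a simple backward walk over the back-pointer table filled in during the recursion. A sketch, with a hand-filled hypothetical table for a 3-state, left-to-right model:

```python
# Hypothetical back-pointer table back[t][i]: best predecessor of state i at time t.
# (In a left-to-right model this is either i itself or i - 1.)
back = [
    [0, 0, 0],   # t = 0 (unused)
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 1],   # t = T - 1
]

def trace_back(back, final_state):
    """Walk the back-pointers from (T-1, final_state) down to t = 0."""
    path = [final_state]
    for t in range(len(back) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

print(trace_back(back, 2))
```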

Isolated Speech Recognition
[Figure: state-time trellis of a single word model over the observations y1 y2 … yt … yT]

Connected Speech Recognition
New transitions connect the end of every model to the start of every model
[Figure: state-time trellis for the connected models over the observations y1 y2 … yt … yT]

Connected Speech Recognition
[Figure: state-time trellis over the observations y1 … yT; the recognised word sequence is recovered by trace-back from the best (minimum) score at time T]

Partial Traceback (1)
 In continuous speech recognition, cannot trace-back from the end of the utterance (there is no end!)
 Instead partial traceback operates as follows:
– For each time t and state i, a word link record describes the sequence of words on the best path to (t, i)
– At regular intervals, all active paths are traced back to see whether they converge at some time s in the past
– If they do, the best path up to time s can no longer change, so the word sequence up to s can be output immediately
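The convergence test can be sketched as follows, representing each active path simply as its word sequence so far (real recognisers use linked word link records rather than explicit lists, and the vocabulary here is hypothetical):

```python
def committed_prefix(active_paths):
    """Return the longest common prefix of all active word hypotheses.
    Once every active path agrees on this prefix, those words can never
    change, so they can be output immediately."""
    prefix = []
    for words in zip(*active_paths):       # walk the paths in parallel
        if all(w == words[0] for w in words):
            prefix.append(words[0])
        else:
            break                          # paths diverge from here on
    return prefix

paths = [["the", "cat", "sat", "on"],
         ["the", "cat", "sat", "in"],
         ["the", "cat", "sang"]]
print(committed_prefix(paths))   # ['the', 'cat']
```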

Partial Traceback (2)
[Figure: active partial paths at time t converge at an earlier time s; the word sequence up to that point has already been output]

Beam Pruning
 Choose threshold T
 The active states at time t-1 are those whose partial path scores are within T of the best path score at time t-1
[Figure: trellis columns at times t−1 and t, marking the ‘highest probability’ state, the active states within the beam, and the states to process at time t]
 At time t, only process those states which link back to an active state
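A minimal sketch of the pruning step, assuming partial-path scores are kept as log probabilities (the state indices, scores, and beam width are hypothetical):

```python
def beam_prune(scores, beam_width):
    """Keep only the states whose partial-path log score at time t-1
    is within beam_width of the best score at time t-1."""
    best = max(scores.values())
    return {s for s, v in scores.items() if v >= best - beam_width}

# scores: state -> partial-path log score at time t-1
scores = {0: -12.0, 1: -5.0, 2: -5.5, 3: -30.0}
print(beam_prune(scores, 3.0))   # {1, 2}
```

At time t the decoder then extends only states reachable from this active set, which is where the computational saving comes from.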

Partial Traceback, Beam Pruning & Recognition ‘Speed’
 Partial traceback introduces a ‘lag’ into the recognition process – this lag is inherent to the algorithm, not due to inadequate processor speed
 The lag is worse when the models are poor
 Beam pruning is less effective for ambiguous input
 Severe beam pruning will degrade recognition performance
 Proper management of partial traceback and beam pruning is essential for optimal performance

Summary
 Speech recognition using HMMs
– Viterbi decoding
– Isolated & Connected/Continuous speech recognition
– Beam pruning
– Partial traceback