
Data Mining and Machine Learning
HMMs for Automatic Speech Recognition
Peter Jančovič
Slide 1

Objectives
To understand
 Application of HMMs for automatic speech recognition
 HMM assumptions
Slide 2

Pattern Recognition
 Suppose we have a finite number of classes w1,…,wC, and the goal is to decide which class has given rise to the measurement x
 The probability of class w given that the measurement x has been observed is called the posterior probability of class w – denoted by P(w|x)
Slide 3

Bayes’ Theorem
 The form of Bayes’ Theorem which we need for pattern recognition is:
P(w|x) = p(x|w) · P(w) / p(x)
where p(x|w) is the class-conditional density, P(w) the prior probability, and P(w|x) the posterior probability
Slide 4

Automatic Speech Recognition
 Given a sequence of acoustic feature vectors Y = {y1,…,yT}, we want to find the sequence of words W = {w1,…,wL} such that the probability P(W|Y) is maximized
 If M = {M1,…,MK} is the sequence of HMMs which represents W, then P(W|Y) = P(M|Y)
Slide 5

Bayes’ Theorem
 Computation of the probability P(M|Y) is made possible using Bayes’ Theorem:
P(W|Y) = p(Y|W) · P(W) / p(Y)
 P(W) is the “language model probability”
 p( Y | W ) is the “acoustic model probability”
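As a hedged illustration of how this decomposition is used in decoding (my own sketch, with invented candidate sentences and log-domain scores), the recogniser picks the word sequence W maximising log p(Y|W) + log P(W); p(Y) can be dropped because it is the same for every candidate:

```python
# Hypothetical candidate word sequences with invented log-domain scores:
# log p(Y|W) from the acoustic models, log P(W) from the language model.
candidates = {
    "recognise speech":   {"log_acoustic": -120.4, "log_lm": -4.1},
    "wreck a nice beach": {"log_acoustic": -118.9, "log_lm": -9.7},
}

def decode(candidates, lm_weight=1.0):
    """Return argmax over W of log p(Y|W) + lm_weight * log P(W).
    p(Y) is constant over W, so it drops out of the maximisation."""
    return max(
        candidates,
        key=lambda w: candidates[w]["log_acoustic"]
        + lm_weight * candidates[w]["log_lm"],
    )

print(decode(candidates))  # -> "recognise speech"; the LM penalises the homophone
```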
Slide 6

Mathematical Modelling
 Mathematical modelling for speech recognition
 Two conflicting requirements:
– A faithful model of human speech production/perception
– Mathematically tractable & computationally useful
 HMMs are one of the best compromises currently available
[Diagram: HMMs at the intersection of Mathematics & Computing and Speech Science]
Slide 7

‘Divide and Conquer’
 One possible approach to ASR is sequential ‘divide and conquer’, e.g.
– classify speech vectors as ‘acoustic features’
– classify sequences of acoustic features as phonemes
– classify sequences of phonemes as words
– classify sequences of words …
DISASTER!!
Slide 8

Delayed Decision Making
 Another name for this might be non-recoverable error propagation!
 Better to avoid all classification decisions until all sources of information are available. Then perform classification as a single, integrated process – delayed decision making
 Delayed Decision Making underlies HMM success
Slide 9

The ‘HMM Compromise’
Assume that:
 A spoken utterance is a time-varying sequence which moves through a sequence of ‘segments’ – (yes)
 Underlying structure of the segments is constant w.r.t time – (no!)
 Durations of segments vary – (yes)
 All variations between different realizations of the segments are random – (no!)
Slide 10

Hidden Markov Model
 In a hidden Markov model, the relationship between the observation sequence and the state sequence is ambiguous.
[Figure: 4-state left-to-right HMM with self-loop transitions a11, a22, a33, a44, forward transitions a12, a23, a34, and skip transition a24; hidden state sequence X = {xt}, observation sequence Y = {yt}]
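A minimal sketch (my own construction, not from the slides) of the transition matrix for the 4-state left-to-right topology shown in the figure: self-loops a11..a44, forward transitions a12, a23, a34, and the skip transition a24. The probability values are invented:

```python
import numpy as np

N = 4
A = np.zeros((N, N))

# Self-loops a11..a44 and forward transitions a12, a23, a34 (invented values).
for i in range(N):
    A[i, i] = 0.6
for i in range(N - 1):
    A[i, i + 1] = 0.4

# Skip transition a24: redistribute some mass out of state 2 (index 1).
A[1, 2] = 0.3
A[1, 3] = 0.1                  # a24

A[N - 1, N - 1] = 1.0          # final state absorbs (no exit modelled here)
assert np.allclose(A.sum(axis=1), 1.0)   # each row is a probability distribution
print(A)
```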
Slide 11

Hidden Markov Models
A HMM consists of
 A set of states S = {s1,…,sN}
 A state transition probability matrix A = [aij], i,j = 1,…,N, where aij = Prob(sj at time t | si at time t−1)
 For each state si, a PDF bi defined on the set of possible observations O s.t.
bi(o) = Prob(yt = o | xt = si)
 bi is called the state output PDF for state i (or the i-th state output PDF)
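To illustrate these definitions, here is a toy Python sketch (invented parameters, not any particular toolkit's API) packaging the state set S, the transition matrix A and the state output PDFs bi for a discrete observation set O:

```python
# Toy HMM with N = 2 states and a discrete observation set O = {"a", "b"}.
states = ["s1", "s2"]

# A[i][j] = Prob(s_j at time t | s_i at time t-1); each row sums to 1.
A = {
    "s1": {"s1": 0.7, "s2": 0.3},
    "s2": {"s1": 0.0, "s2": 1.0},
}

# State output PDFs b_i(o) = Prob(y_t = o | x_t = s_i); here discrete tables.
b = {
    "s1": {"a": 0.9, "b": 0.1},
    "s2": {"a": 0.2, "b": 0.8},
}

print(b["s1"]["a"])        # b_1("a") = 0.9
print(A["s1"]["s2"])       # a_12 = 0.3
```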
Slide 12

10-state HMM of the digit ‘zero’
Slide 13

6-state HMM of the digit ‘zero’
Slide 14

HMM Assumptions
 Temporal Independence – the observation yt depends on the state xt but is otherwise independent of the rest of the observation sequence Y = {yt}!
… so, the position of the vocal tract at time t is independent of its position at time t-1!
 Piecewise stationarity – the underlying structure of speech is a sequence of stationary segments
 Random variability – variations from this underlying structure are random
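The temporal-independence assumption is what makes the likelihood of an observation sequence along a given state path factorise into per-frame terms. A minimal sketch, reusing the form of the toy discrete model from the earlier snippet (parameters invented; the path is assumed to start in the entry state with probability 1):

```python
import math

# Toy parameters (same form as the earlier snippet; values invented).
A = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.0, "s2": 1.0}}
b = {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}

def log_joint(path, obs):
    """log P(X, Y) = sum over t of [log a_{x(t-1)x(t)} + log b_{x(t)}(y_t)];
    each y_t depends only on x_t (temporal independence).
    Assumes the path starts in the model's entry state with probability 1."""
    logp = math.log(b[path[0]][obs[0]])              # initial output term
    for t in range(1, len(obs)):
        logp += math.log(A[path[t - 1]][path[t]])    # transition term
        logp += math.log(b[path[t]][obs[t]])         # output term for frame t
    return logp

# State path s1 -> s1 -> s2 generating observations "a", "a", "b".
print(log_joint(["s1", "s1", "s2"], ["a", "a", "b"]))
```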
Slide 15

HMM State Duration Model
 Constant segments correspond to the HMM states
[Figure: Pi(D) plotted against duration D – a geometric distribution with initial value (1 − aii), decaying by factor aii]
 Probability of state duration D is given by Pi(D) = (1 − aii) · aii^(D−1)
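A quick numerical check (my own, with an invented self-loop probability) that this geometric duration distribution is valid: Pi(D) sums to 1 over D ≥ 1, and its mean is 1/(1 − aii):

```python
aii = 0.8   # invented self-loop probability

def duration_prob(D, aii):
    """P_i(D) = (1 - aii) * aii**(D - 1): stay D-1 times, then leave."""
    return (1.0 - aii) * aii ** (D - 1)

probs = [duration_prob(D, aii) for D in range(1, 200)]
print(sum(probs))                                             # ~1.0: a valid distribution
print(sum(D * duration_prob(D, aii) for D in range(1, 200)))  # ~1/(1-aii) = 5.0
```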
Slide 16

Summary
 Introduction to the application of HMMs for speech recognition
– HMM assumptions
Slide 17