
Data Mining and Machine Learning
Decision Trees
Peter Jančovič
Slide 1

Decision Trees
Outline of lecture
 Introduction to Decision Trees (DTs)
– A third approach to partitioning data
– Basic principles
– Types of question
– Decision tree issues
– Automatic construction of DTs from data
– Example from Speech Recognition: Phone Decision Trees
Slide 2

A simple binary decision tree
[Figure: a hand-crafted binary tree over 2-D points. Y/N branches on questions such as “x > 2.5?”, “x > -1.5?”, “y > -3?” and “y > x + 1.5?” partition the plane into regions 1–5.]
Slide 3

Binary decision trees
 A binary decision tree consists of:
– A binary tree (each non-terminal node has 2 ‘children’)
– For each non-terminal node s:
– A “yes/no” question qs which can be applied to any data sample
– A rule that determines which link is traversed if the answer to qs is “YES” (the other link is traversed if the answer is “NO”)
 Questions need not be numerical
Slide 4
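The structure just described can be sketched directly in code. Everything below is illustrative rather than from the lecture: the `Node` class, the `classify` method and the region labels are invented names, and the usage example wires up only a fragment of the hand-crafted tree from the earlier slide.

```python
# Minimal sketch of a binary decision tree. Each non-terminal node
# holds a yes/no question; "YES" answers follow one child, "NO"
# answers the other, until a leaf (terminal node) is reached.

class Node:
    def __init__(self, question=None, yes=None, no=None, label=None):
        self.question = question  # callable: sample -> bool (None at a leaf)
        self.yes = yes            # child followed when the answer is "YES"
        self.no = no              # child followed when the answer is "NO"
        self.label = label        # leaf label (region number, model id, ...)

    def classify(self, sample):
        """Traverse the tree until a leaf is reached."""
        node = self
        while node.question is not None:
            node = node.yes if node.question(sample) else node.no
        return node.label

# Usage: a fragment of the hand-crafted tree, with questions on the
# x and y coordinates of a 2-D point.
tree = Node(question=lambda p: p[0] > 2.5,           # "x > 2.5?"
            yes=Node(label=1),
            no=Node(question=lambda p: p[1] > -3,    # "y > -3?"
                    yes=Node(label=2),
                    no=Node(label=3)))

print(tree.classify((4.0, 0.0)))   # x > 2.5, so leaf 1
```

Note that the questions need not be numerical thresholds: any callable returning a yes/no answer fits the same traversal loop.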

Binary decision trees
 The decision tree on the previous slide is a simple ‘hand-crafted’ example, but what if the data were in a much higher-dimensional vector space?
 How do we decide the best question to associate with each node of the tree?
 How do we know when the tree is complete?
 What are the alternatives to numerical questions?
Slide 5

Example from speech recognition
 In speech recognition a word is modelled as a sequence of phones: e.g. “six” = /s/ /I/ /k/ /s/
 Each phone is modelled as a statistical Hidden Markov Model (HMM). Each model has many parameters which must be estimated from data.
 In English there are approximately 45 phones
 The fundamental problem is “co-articulation” – the acoustic signal corresponding to a particular phone depends on the preceding and following phones
Slide 6

Example (continued)
 Positions of articulators when a phone is produced depend on positions before and after it is produced
 Context-sensitive phone modelling
 Triphones – a separate phone model for each combination of preceding and following phone (so the /I/ in /s/ /I/ /k/ /s/ would be a model of /I/ preceded by /s/ and followed by /k/)
 Potentially huge model set (45³ = 91,125 models)
 Phone decision trees combine contexts that result in similar coarticulation
Slide 7
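The model-count arithmetic, and the way triphones are read off a phone string, can be checked in a few lines. The left−centre+right label notation below is a common convention in HTK-style systems, not something defined in the lecture, and the "-" word-boundary marker is illustrative.

```python
# With ~45 phones, a separate model for every
# (left context, phone, right context) triple gives 45^3 models.
phones_in_english = 45
triphones = phones_in_english ** 3
print(triphones)  # 91125

# Triphones for "six" = /s/ /I/ /k/ /s/, written left-centre+right
# ("-" marks a word boundary; the notation is illustrative):
word = ["s", "I", "k", "s"]
padded = ["-"] + word + ["-"]
tris = [f"{l}-{c}+{r}" for l, c, r in zip(padded, padded[1:], padded[2:])]
print(tris)  # ['--s+I', 's-I+k', 'I-k+s', 'k-s+-']
```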

Phone decision trees
 Collect very large set of acoustic patterns, each corresponding to a phone in context (triphones)
– “Acoustic pattern” means a sequence of feature vectors that capture the salient short-term properties of the speech signal
– Vectors are high-dimensional – typically ~40D
 This is a “big data” problem. Identifying contexts that induce similar coarticulation is a clustering problem.
Slide 8

Decision tree “questions”
 Each node associated with a “question”
 Knowledge from phonetics and linguistics defines a set of potential decision-tree node “questions”
 For example:
– Q: “Is the left context one of {/p/, /b/, /t/, /d/}?”
– Q: “Is the right context {/m/}?”
 Call these questions q1,…, qN
Slide 9
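A hedged sketch of how such questions might be represented: each one is a set-membership predicate over a (left, centre, right) context triple. The `make_question` helper and the tuple encoding are assumptions made for illustration; only the two example phone sets come from the slide.

```python
# Each decision-tree "question" asks whether the left or right context
# of a triphone falls in a given phone set (the sets come from
# phonetic/linguistic knowledge).

def make_question(side, phone_set):
    """Return a predicate over (left, centre, right) triphone tuples."""
    idx = {"left": 0, "right": 2}[side]  # position of the context phone
    return lambda triphone: triphone[idx] in phone_set

q1 = make_question("left", {"p", "b", "t", "d"})   # "Is the left context one of {/p/,/b/,/t/,/d/}?"
q2 = make_question("right", {"m"})                 # "Is the right context {/m/}?"

print(q1(("t", "I", "k")))   # True: left context /t/ is in the set
print(q2(("t", "I", "k")))   # False: right context is /k/, not /m/
```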

Phone decision tree construction
 Associate complete set of phones-in-context, S0, with root node
 Each question qn partitions S0 into two subsets:
– The set S0nY of samples for which qn is true, and
– The set S0nN of samples for which qn is false
 Intuitively, if qn is a “good question” then:
– Samples in S0nY will be similar to each other, and
– Samples in S0nN will be similar to each other
 Let P0n = P0nY + P0nN where P0nA measures similarity between samples in S0nA (A = Y or N)
Slide 10

Phone decision tree construction
 P0nY could be a measure of how well S0nY is modelled by a single Gaussian PDF
 Calculate P0n for every question qn
 Let q0 be the question qn for which P0n is maximised
[Diagram: the root set S0 is split by question q0 into S0,N (the “N” branch) and S0,Y (the “Y” branch).]
 Repeat whole process for S0,N and S0,Y, and so on
Slide 11
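One way to make the scoring concrete, following the slide's suggestion that P0nY measures how well S0nY is modelled by a single Gaussian: score each side of a split by the log-likelihood of its samples under a maximum-likelihood Gaussian fit, and add the two sides. The sketch below uses 1-D data for brevity (real acoustic vectors are ~40-D), and the function names are illustrative.

```python
# P_0n = P_0nY + P_0nN, where each term is the log-likelihood of the
# subset under a single ML-fitted Gaussian. A question that separates
# the data into tight clusters scores higher than one that mixes them.
import math

def gaussian_loglik(samples):
    """Log-likelihood of 1-D samples under their own ML Gaussian fit."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    var = max(var, 1e-6)                      # floor to avoid log(0)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def split_score(samples, question):
    yes = [s for s in samples if question(s)]
    no = [s for s in samples if not question(s)]
    if not yes or not no:                     # degenerate split
        return float("-inf")
    return gaussian_loglik(yes) + gaussian_loglik(no)

# Two well-separated clusters: the question that separates them
# scores higher than one that cuts through a cluster.
data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
good = lambda x: x > 2.5
bad = lambda x: x > 0.05
print(split_score(data, good) > split_score(data, bad))  # True
```

At each node the construction evaluates `split_score` for every question qn and keeps the maximiser.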

Phone decision tree construction
[Diagram: root S0 splits on q0 into S1 = S0,Y and S2 = S0,N; question q1 splits S1 into S3 = S1,N and S4 = S1,Y; S3 splits further into S5 = S3,N and S6 = S3,Y.]
 Stop if too few samples in Sn, or benefit of splitting Sn is too small
Slide 12
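The recursion with both stopping rules can be sketched as follows. The thresholds `min_samples` and `min_gain`, the dict representation of the tree, and the 1-D Gaussian score are all illustrative choices, not the lecture's actual settings.

```python
# Recursive tree growing with the slide's two stopping rules: stop when
# a node has too few samples, or when the best split's benefit (gain in
# Gaussian log-likelihood) is too small.
import math

def loglik(xs):
    """Log-likelihood of xs under a single ML-fitted Gaussian."""
    n, mean = len(xs), sum(xs) / len(xs)
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def grow(samples, questions, min_samples=4, min_gain=1.0):
    if len(samples) < min_samples:
        return {"leaf": samples}              # too few samples: stop
    splits = []
    for q in questions:
        yes = [x for x in samples if q(x)]
        no = [x for x in samples if not q(x)]
        if yes and no:
            gain = loglik(yes) + loglik(no) - loglik(samples)
            splits.append((gain, q, yes, no))
    if not splits:
        return {"leaf": samples}              # no question splits the set
    gain, q, yes, no = max(splits, key=lambda t: t[0])
    if gain < min_gain:
        return {"leaf": samples}              # splitting barely helps: stop
    return {"question": q,
            "yes": grow(yes, questions, min_samples, min_gain),
            "no": grow(no, questions, min_samples, min_gain)}

data = [0.0, 0.1, -0.1, 0.2, 5.0, 5.1, 4.9, 5.2]
qs = [lambda x, t=t: x > t for t in (-1.0, 2.5, 6.0)]
tree = grow(data, qs)
print("question" in tree)   # True: the x > 2.5 split is kept
```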

Phone decision trees
 Separate HMM built for each terminal (leaf) node
 Given a new phone-in-context, traverse the tree according to the answers to the questions until a terminal node is reached – the corresponding model is the one to use for this phone-in-context
 Phone decision trees illustrate two important points:
– Example of DT with non-numerical questions
– Illustration of how DT can be constructed automatically from data
Slide 13

Summary
 Introduction to Decision Trees
 Decision Tree issues
 Example: Phone Decision Trees
Slide 14