COMP90042
Workshop
Week 4
23 March
Table of Contents
Part-of-Speech Tagging
Hidden Markov Model
Part-of-Speech Tagging
Definition
What is a Part-of-Speech (POS) tag?
A label for a word's grammatical (primarily syntactic) properties in the sentence.
POS Tag Set
Penn Treebank Tag Set
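The full Penn Treebank set has 45 tags; those used in the exercise below include NNP (proper noun), NN (noun), NNS (plural noun), VB (verb, base form), MD (modal), JJ (adjective), DT (determiner), IN (preposition), CD (cardinal number), and the punctuation tags , and .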
POS Tagging
POS tag (by hand) the following sentence:
Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
POS Tagging
Pierre/NNP Vinken/NNP ,/, 61/CD years/NNS old/JJ ,/, will/MD join/VB the/DT board/NN as/IN a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD ./.
POS Tagging
The tag set may vary across datasets!
POS Tagging
What are some common approaches to POS tagging?
Unigram: use the most common tag observed for each token (see the sketch after this list).
N-gram: use the most common tag for the token given the surrounding sequence of tokens.
Rule-based: write rules that disambiguate unigram tags.
Sequential: learn a Hidden Markov Model from a tagged corpus.
Classifier: treat tagging as a supervised machine learning problem.
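As a minimal sketch of the unigram approach (illustrative Python, not from the workshop; the toy corpus and the NN fallback are assumptions):

    from collections import Counter, defaultdict

    # Toy tagged corpus of (word, tag) pairs -- hypothetical example data
    corpus = [("wheels", "NNS"), ("turn", "VBP"), ("right", "JJ"),
              ("turn", "VBP"), ("right", "RB"), ("turn", "NN")]

    # Count how often each tag is observed for each word
    tag_counts = defaultdict(Counter)
    for word, tag in corpus:
        tag_counts[word][tag] += 1

    def unigram_tag(word):
        # Most common observed tag; fall back to NN for unseen words
        if word in tag_counts:
            return tag_counts[word].most_common(1)[0][0]
        return "NN"

    print(unigram_tag("turn"))    # VBP (seen twice, vs NN once)
    print(unigram_tag("silver"))  # NN (unseen word, fallback)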
Hidden Markov Model
Hidden Markov Model
What are the assumptions in a Hidden Markov Model (HMM)?
1. Markov assumption
The likelihood of transitioning into a given state depends only on the current state, and not on the previous state(s) (or output(s)).
2. Output independence assumption
The likelihood of a state producing a certain word (as output) does not depend on the preceding (or following) state(s) (or output(s)).
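Together, these two assumptions let the HMM factorise the probability of a tag sequence t₁…tₙ with words w₁…wₙ as (the notation anticipates the parameters defined on the next slide):

    P(t₁…tₙ, w₁…wₙ) = π(t₁) · B(t₁, w₁) · ∏ᵢ₌₂…ₙ A(tᵢ₋₁, tᵢ) · B(tᵢ, wᵢ)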
Hidden Markov Training
What parameters do we need to learn when training an HMM?
1. Initial state probabilities (π)
Record the distribution of tags for the first token of each sentence.
2. Transition probabilities (A)
Record the distribution of tags of the immediately following token.
3. Emission probabilities (B)
Record the distribution of tokens emitted by each tag.
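All three are maximum-likelihood estimates, i.e. relative frequencies; a sketch (this notation is assumed, not from the slides):

    π(t) = count(sentences starting with t) / count(sentences)
    A(t → t′) = count(t immediately followed by t′) / count(t followed by any tag)
    B(t → w) = count(w tagged as t) / count(t)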
Hidden Markov Training
Estimate π, A, and B for POS tagging, based on the following corpus:
silver-JJ wheels-NNS turn-VBP
wheels-NNS turn-VBP right-JJ
right-JJ wheels-NNS turn-VBP
Initial state probabilities
Record the distribution of tags for the first token of each sentence.
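The three sentences begin with JJ, NNS, and JJ respectively, so: π(JJ) = 2/3, π(NNS) = 1/3, π(VBP) = 0.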
Hidden Markov Training
Estimate π, A, and B for POS tagging, based on the following corpus:
silver-JJ wheels-NNS turn-VBP
wheels-NNS turn-VBP right-JJ
right-JJ wheels-NNS turn-VBP
Transition probabilities
Record the distribution of tags of the immediately following token.

(from \ to)   JJ    NNS   VBP
JJ            0     1     0
NNS           0     0     1
VBP           1     0     0
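These values follow from the bigram counts: JJ is followed by NNS in 2 of its 2 non-final occurrences, NNS by VBP in 3 of 3, and VBP by JJ in its 1 non-final occurrence, so A(JJ → NNS) = A(NNS → VBP) = A(VBP → JJ) = 1 and all other transitions are 0.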
Hidden Markov Training
Estimate π, A, and B for POS tagging, based on the following corpus:
silver-JJ wheels-NNS turn-VBP
wheels-NNS turn-VBP right-JJ
right-JJ wheels-NNS turn-VBP
Emission probabilities
Record the distribution of tokens emitted by each tag.

(tag \ word)  right  silver  turn  wheels
JJ            2/3    1/3     0     0
NNS           0      0       0     1
VBP           0      0       1     0
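Again by relative frequency: JJ tags right twice and silver once out of its 3 occurrences, so B(JJ → right) = 2/3 and B(JJ → silver) = 1/3; NNS always emits wheels and VBP always emits turn, so B(NNS → wheels) = B(VBP → turn) = 1.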
Hidden Markov Inference
Use the following Hidden Markov Model to tag the sentence: silver wheels turn
Visualize the HMM
[Diagram: a trellis of the unseen tag states over the observed words silver, wheels, turn]
Hidden Markov Inference
Use the Viterbi algorithm with the given parameters below to find the optimal tag sequence.
What is the Viterbi algorithm?
A dynamic programming algorithm for finding the most likely sequence of hidden states.
Hidden Markov Inference
Position 1: α(1, t) = π(t) · B(t, w₁) for every possible tag t.
Position i:
1. At each cell (i, t), for every possible previous tag t′, compute α(i−1, t′) · A(t′, t) · B(t, wᵢ).
2. Choose α(i, t) = max over t′ of these products.
Only optimal sub-sequences are stored, with a back pointer to the previous tag of this cell.
Repeat the previous step until the end of the sentence is reached, then follow the back pointers from the best final tag.
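As a concrete sketch, here is a minimal Viterbi implementation in Python (illustrative code, not the workshop's reference solution) using the π, A, and B estimated from the corpus above:

    # Parameters estimated from the three-sentence training corpus above
    tags = ["JJ", "NNS", "VBP"]
    pi = {"JJ": 2/3, "NNS": 1/3, "VBP": 0.0}
    A = {"JJ":  {"JJ": 0, "NNS": 1, "VBP": 0},
         "NNS": {"JJ": 0, "NNS": 0, "VBP": 1},
         "VBP": {"JJ": 1, "NNS": 0, "VBP": 0}}
    B = {"JJ":  {"right": 2/3, "silver": 1/3, "turn": 0, "wheels": 0},
         "NNS": {"right": 0, "silver": 0, "turn": 0, "wheels": 1},
         "VBP": {"right": 0, "silver": 0, "turn": 1, "wheels": 0}}

    def viterbi(words):
        # alpha[i][t]: probability of the best tag sub-sequence ending in tag t at position i
        alpha = [{t: pi[t] * B[t][words[0]] for t in tags}]
        back = []  # back[i-1][t]: best previous tag for cell (i, t)
        for i in range(1, len(words)):
            alpha.append({})
            back.append({})
            for t in tags:
                # Score extending each previous tag's best path, keep only the max
                prev = max(tags, key=lambda tp: alpha[i - 1][tp] * A[tp][t])
                alpha[i][t] = alpha[i - 1][prev] * A[prev][t] * B[t][words[i]]
                back[i - 1][t] = prev
        # Backtrack from the best final tag via the stored back pointers
        best = max(tags, key=lambda t: alpha[-1][t])
        seq = [best]
        for bp in reversed(back):
            seq.append(bp[seq[-1]])
        return list(reversed(seq))

    print(viterbi(["silver", "wheels", "turn"]))  # ['JJ', 'NNS', 'VBP']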
Hidden Markov Inference
What is the time complexity of the Viterbi algorithm?
For each of the W words in the sentence, T possible previous tags and T current tags are considered, giving O(T² · W) overall.
Takeaways
Part-of-Speech tagging
– Definition (including common POS tag labels)
– Approaches
Hidden Markov Model (HMM)
– Assumptions
– Training (MLE)
– Inference (Viterbi)
Thank you