COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE
COMP90042 Natural Language Processing
Lecture 5: Part of Speech Tagging
Semester 1 2021, Week 3
Jey Han Lau
What is Part of Speech (POS)?
• AKA word classes, morphological classes, syntactic
categories
• Nouns, verbs, adjectives, etc.
• POS tells us quite a bit about a word and its
neighbours:
‣ nouns are often preceded by determiners
‣ verbs are often preceded by nouns
‣ content as a noun is pronounced CONtent
‣ content as an adjective is pronounced conTENT
Information Extraction
• Given this:
‣ “Brasilia, the Brazilian capital, was founded in 1960.”
• Obtain this:
‣ capital(Brazil, Brasilia)
‣ founded(Brasilia, 1960)
• Many steps involved but first need to know nouns
(Brasilia, capital), adjectives (Brazilian), verbs
(founded) and numbers (1960).
Outline
• Parts of speech
• Tagsets
• Automatic Tagging
POS Open Classes
Open vs closed classes: how readily do POS
categories take on new words? Just a few open
classes:
• Nouns
‣ Proper (Australia) versus common (wombat)
‣ Mass (rice) versus count (bowls)
• Verbs
‣ Rich inflection (go/goes/going/gone/went)
‣ Auxiliary verbs (be, have, and do in English)
‣ Transitivity (wait versus hit versus give)
— number of arguments
POS Open Classes
• Adjectives
‣ Gradable (happy) versus non-gradable (computational)
• Adverbs
‣ Manner (slowly)
‣ Locative (here)
‣ Degree (really)
‣ Temporal (today)
POS Closed Classes (English)
• Prepositions (in, on, with, for, of, over,…)
‣ on the table
• Particles
‣ brushed himself off
• Determiners
‣ Articles (a, an, the)
‣ Demonstratives (this, that, these, those)
‣ Quantifiers (each, every, some, two,…)
• Pronouns
‣ Personal (I, me, she,…)
‣ Possessive (my, our,…)
‣ Interrogative or Wh (who, what, …)
POS Closed Classes (English)
• Conjunctions
‣ Coordinating (and, or, but)
‣ Subordinating (if, although, that, …)
• Modal verbs
‣ Ability (can, could)
‣ Permission (can, may)
‣ Possibility (may, might, could, will)
‣ Necessity (must)
• And some more…
‣ negatives, politeness markers, etc
Is POS universal? What open classes are seen in all languages?
• Noun
• Verb
• Adjective
• Adverb
PollEv.com/jeyhanlau569
Ambiguity
• Many word types belong to multiple classes
• POS depends on context
• Compare:
‣ Time flies like an arrow
‣ Fruit flies like a banana
Time/noun flies/verb like/preposition an/determiner arrow/noun
Fruit/noun flies/noun like/verb a/determiner banana/noun
POS Ambiguity in News Headlines
• British Left Waffles on Falkland Islands
‣ [British Left] [Waffles] [on] [Falkland Islands]
• Juvenile Court to Try Shooting Defendant
‣ [Juvenile Court] [to] [Try] [Shooting Defendant]
• Teachers Strike Idle Kids
‣ [Teachers Strike] [Idle Kids]
• Eye Drops Off Shelf
‣ [Eye Drops] [Off Shelf]
Tagsets
• A compact representation of POS information
‣ Usually ≤ 4 capitalized characters (e.g. NN = noun)
‣ Often includes inflectional distinctions
• Major English tagsets
‣ Brown (87 tags)
‣ Penn Treebank (45 tags)
‣ CLAWS/BNC (61 tags)
‣ “Universal” (12 tags)
• At least one tagset for all major languages
Major Penn Treebank Tags
NN = noun
VB = verb
JJ = adjective
RB = adverb
DT = determiner
CD = cardinal number
IN = preposition
PRP = personal pronoun
MD = modal
CC = coordinating conjunction
RP = particle
WP = wh-pronoun
TO = to
Derived Tags (Open Class)
• NN (noun singular, wombat)
‣ NNS (plural, wombats)
‣ NNP (proper, Australia)
‣ NNPS (proper plural, Australians)
• VB (verb infinitive, eat)
‣ VBP (1st/2nd person present, eat)
‣ VBZ (3rd person singular, eats)
‣ VBD (past tense, ate)
‣ VBG (gerund, eating)
‣ VBN (past participle, eaten)
Derived Tags (Open Class)
• JJ (adjective, nice)
‣ JJR (comparative, nicer)
‣ JJS (superlative, nicest)
• RB (adverb, fast)
‣ RBR (comparative, faster)
‣ RBS (superlative, fastest)
Derived Tags (Closed Class)
• PRP (pronoun personal, I)
‣ PRP$ (possessive, my)
• WP (wh-pronoun, what)
‣ WP$ (possessive, whose)
‣ WDT (wh-determiner, which)
‣ WRB (wh-adverb, where)
Tagged Text Example
The/DT limits/NNS to/TO legal/JJ absurdity/NN stretched/VBD another/DT notch/NN this/DT week/NN
when/WRB the/DT Supreme/NNP Court/NNP refused/VBD to/TO hear/VB an/DT appeal/NN from/IN a/DT case/NN that/WDT says/VBZ corporate/JJ defendants/NNS must/MD pay/VB damages/NNS
even/RB after/IN proving/VBG that/IN they/PRP could/MD not/RB possibly/RB have/VB caused/VBN the/DT harm/NN ./.
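Tagged output like the above can be produced with an off-the-shelf tagger. A minimal sketch using NLTK's default English tagger, which outputs Penn Treebank tags (this assumes the NLTK tokenizer and tagger models have been downloaded; the exact tags assigned are model-dependent, so they may not match the gold-standard tags above exactly):

import nltk
# One-off downloads (resource names can vary slightly across NLTK versions):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

sentence = "The Supreme Court refused to hear an appeal from a case."
tokens = nltk.word_tokenize(sentence)   # tokenise first
print(nltk.pos_tag(tokens))             # list of (word, Penn Treebank tag) pairs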
Tag the following sentence with Penn
Treebank’s POS tagset:
CATS SHOULD CATCH MICE EASILY
PollEv.com/jeyhanlau569
Automatic Tagging
Why Automatically POS tag?
• Important for morphological analysis, e.g. lemmatisation
• For some applications, we want to focus on certain POS
‣ E.g. nouns are important for information retrieval, adjectives for sentiment analysis (see the sketch after this list)
• Very useful features for certain classification tasks
‣ E.g. genre attribution (fiction vs. non-fiction)
• POS tags can offer word sense disambiguation
‣ E.g. cross/NN vs cross/VB vs cross/JJ
• Can use them to create larger structures (parsing; lectures 14–16)
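Several of these uses come down to selecting words by their tag; a minimal sketch below uses a small hand-tagged input written inline (in practice the pairs would come from an automatic tagger):

# Hand-tagged toy input; in practice this would be the output of a POS tagger.
tagged = [("the", "DT"), ("acting", "NN"), ("was", "VBD"),
          ("surprisingly", "RB"), ("good", "JJ")]

nouns = [w for w, t in tagged if t.startswith("NN")]       # e.g. candidate index terms
adjectives = [w for w, t in tagged if t.startswith("JJ")]  # e.g. sentiment-bearing words
print(nouns, adjectives)   # ['acting'] ['good']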
Automatic Taggers
• Rule-based taggers
• Statistical taggers
‣ Unigram tagger
‣ Classifier-based taggers
‣ Hidden Markov Model (HMM) taggers
Rule-based tagging
• Typically starts with a list of possible tags for each
word
‣ From a lexical resource, or a corpus
• Often includes other lexical information, e.g. verb
subcategorisation (its arguments)
• Apply rules to narrow down to a single tag
‣ E.g. if a DT comes before the word, eliminate VB (see the sketch after this list)
‣ Relies on some unambiguous contexts
• Large systems have 1000s of constraints
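A minimal sketch of this style of tagging, with a hypothetical two-word lexicon and a single hand-written constraint (real systems use large lexicons and thousands of constraints):

# Hypothetical lexicon: each word starts with the set of all its possible tags.
LEXICON = {"the": {"DT"}, "can": {"MD", "VB", "NN"}}

def rule_tag(words):
    # Copy each tag set so the lexicon itself is not modified.
    candidates = [set(LEXICON.get(w, {"NN"})) for w in words]
    for i in range(1, len(words)):
        # Constraint: if a DT comes before the word, eliminate verb readings.
        if candidates[i - 1] == {"DT"}:
            candidates[i] -= {"VB", "MD"}
    return list(zip(words, candidates))

print(rule_tag(["the", "can"]))   # 'can' is narrowed down to {'NN'}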
Unigram tagger
• Assign most common tag to each word type
• Requires a corpus of tagged words
• “Model” is just a look-up table
• But actually quite good, ~90% accuracy
‣ Correctly resolves about 75% of ambiguity
• Often considered the baseline for more complex
approaches
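A minimal sketch of a unigram tagger, trained on a tiny hand-tagged toy corpus (a real tagger would be trained on a large tagged corpus such as a treebank):

from collections import Counter, defaultdict

# Tiny toy training corpus of (word, tag) pairs; a real one would be far larger.
train = [("the", "DT"), ("dog", "NN"), ("runs", "VBZ"),
         ("the", "DT"), ("run", "NN"), ("was", "VBD"),
         ("dogs", "NNS"), ("run", "VBP")]

counts = defaultdict(Counter)
for word, tag in train:
    counts[word.lower()][tag] += 1

# The "model" is just a look-up table: each word type maps to its most common tag.
unigram = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(words, default="NN"):   # back off to NN for unseen words
    return [(w, unigram.get(w.lower(), default)) for w in words]

print(tag(["the", "dogs", "run"]))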
Classifier-Based Tagging
• Use a standard discriminative classifier (e.g.
logistic regression, neural network), with features:
‣ Target word
‣ Lexical context around the word
‣ Tags already predicted for earlier words in the sentence
• But can suffer from error propagation: wrong
predictions from previous steps affect the next
ones
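A minimal sketch of a classifier-based tagger using scikit-learn's logistic regression, with a tiny toy corpus and a hypothetical feature function; tagging left to right with previously predicted tags as features also shows where error propagation comes from:

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny toy training corpus; a real tagger would be trained on a large treebank.
TRAIN = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
         [("a", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
         [("dogs", "NNS"), ("sleep", "VBP")]]

def features(words, i, prev_tag):
    # Target word, lexical context around it, and the previously assigned tag.
    return {"word": words[i].lower(),
            "prev_word": words[i - 1].lower() if i > 0 else "<s>",
            "prev_tag": prev_tag,
            "suffix2": words[i][-2:].lower()}

X, y = [], []
for sent in TRAIN:
    words, prev = [w for w, _ in sent], "<s>"
    for i, (_, t) in enumerate(sent):
        X.append(features(words, i, prev)); y.append(t); prev = t

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X), y)

def greedy_tag(words):
    tags, prev = [], "<s>"
    for i in range(len(words)):
        t = clf.predict(vec.transform([features(words, i, prev)]))[0]
        tags.append(t)
        prev = t   # a wrong prediction here feeds into the next word's features
    return list(zip(words, tags))

print(greedy_tag(["the", "cat", "barks"]))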
Hidden Markov Models
• A basic sequential (or structured) model
• Like sequential classifiers, uses both the previous tag and lexical evidence
• Unlike classifiers, considers all possible previous tags
• Unlike classifiers, treats previous-tag evidence and lexical evidence as independent of each other
‣ Less sparsity
‣ Fast algorithms for sequential prediction, i.e. finding the best
tagging of entire word sequence
• Next lecture!
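As a preview, these independence assumptions amount to factorising the joint probability of a tag sequence and a word sequence into a lexical (emission) term and a previous-tag (transition) term:

P(t1 … tn, w1 … wn) = ∏i P(wi | ti) P(ti | ti−1)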
Unknown Words
• Huge problem in morphologically rich languages
(e.g. Turkish)
• Can use words we’ve seen only once (hapax legomena) to make a best guess for words we’ve never seen before
‣ Tend to be nouns, followed by verbs
‣ Unlikely to be determiners
• Can use sub-word representations to capture
morphology (look for common affixes)
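A minimal sketch of affix-based guessing for unknown words, using a few hypothetical hand-written suffix rules (a trained tagger would instead learn such patterns, e.g. from the hapax legomena in its training data):

# Hypothetical hand-written affix rules for guessing the tag of an unseen word.
def guess_unknown_tag(word):
    if word[:1].isupper():
        return "NNP"    # capitalised: likely a proper noun
    if word.endswith("ing"):
        return "VBG"
    if word.endswith("ed"):
        return "VBN"
    if word.endswith("ly"):
        return "RB"
    if word.endswith("s"):
        return "NNS"
    return "NN"         # by default: unknown words tend to be nouns

print(guess_unknown_tag("blorples"), guess_unknown_tag("frobbing"))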
A Final Word
• Part of speech is a fundamental intersection
between linguistics and automatic text analysis
• A fundamental task in NLP that provides useful information for many other applications
• Methods applied to it are typical of language tasks
in general, e.g. probabilistic, sequential machine
learning
Reading
• JM3 Ch. 8-8.2