Part of Speech Tagging
COMP90042
Natural Language Processing
Lecture 5
Semester 1 2021 Week 3 Jey Han Lau
COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE
What is Part of Speech (POS)?
• AKA word classes, morphological classes, syntactic categories
• Nouns, verbs, adjectives, etc.
• POS tells us quite a bit about a word and its neighbours:
‣ nouns are often preceded by determiners
‣ verbs are often preceded by nouns
‣ content as a noun is pronounced CONtent
‣ content as an adjective is pronounced conTENT
Information Extraction
• Given this:
‣ “Brasilia, the Brazilian capital, was founded in 1960.”
• Obtain this:
‣ capital(Brazil, Brasilia)
‣ founded(Brasilia, 1960)
• Many steps involved, but first we need to know the nouns (Brasilia, capital), adjectives (Brazilian), verbs (founded) and numbers (1960).
Outline
• Parts of speech
• Tagsets
• Automatic Tagging
POS Open Classes
Open vs closed classes: how readily do POS categories take on new words? Just a few open classes:
• Nouns
‣ Proper (Australia) versus common (wombat)
‣ Mass (rice) versus count (bowls)
• Verbs
‣ Rich inflection (go/goes/going/gone/went)
‣ Auxiliary verbs (be, have, and do in English)
‣ Transitivity (wait versus hit versus give), i.e. the number of arguments
POS Open Classes
• Adjectives
‣ Gradable (happy) versus non-gradable (computational)
• Adverbs
‣ Manner (slowly)
‣ Locative (here)
‣ Degree (really)
‣ Temporal (today)
POS Closed Classes (English)
• Prepositions (in, on, with, for, of, over, …)
‣ on the table
• Particles
‣ brushed himself off
• Determiners
‣ Articles (a, an, the)
‣ Demonstratives (this, that, these, those)
‣ Quantifiers (each, every, some, two, …)
• Pronouns
‣ Personal (I, me, she, …)
‣ Possessive (my, our, …)
‣ Interrogative or Wh (who, what, …)
POS Closed Classes (English)
• Modal verbs
‣ Ability (can, could)
‣ Permission (can, may)
‣ Possibility (may, might, could, will)
‣ Necessity (must)
• Conjunctions
‣ Coordinating (and, or, but)
‣ Subordinating (if, although, that, …)
• And some more…
‣ negatives, politeness markers, etc.
Is POS Universal?
What open classes are seen in all languages?
• Noun
• Verb
• Adjective
• Adverb
PollEv.com/jeyhanlau569
Ambiguity
• Many word types belong to multiple classes
• POS depends on context
• Compare:
‣ Time flies like an arrow
→ Time/noun flies/verb like/preposition an/determiner arrow/noun
‣ Fruit flies like a banana
→ Fruit/noun flies/noun like/verb a/determiner banana/noun
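The two readings above can be made concrete with a context-free tag lookup. A minimal sketch, assuming an illustrative hand-built tag inventory (not a real lexicon), shows why lookup alone cannot resolve the ambiguity:

```python
# Possible tags per word type (illustrative, not from a real lexicon).
POSSIBLE_TAGS = {
    "Time": {"noun", "verb"},
    "Fruit": {"noun"},
    "flies": {"noun", "verb"},
    "like": {"verb", "preposition"},
    "an": {"determiner"},
    "a": {"determiner"},
    "arrow": {"noun"},
    "banana": {"noun"},
}

def ambiguous_words(sentence):
    """Return the words a context-free lookup cannot resolve."""
    return [w for w in sentence.split()
            if len(POSSIBLE_TAGS.get(w, set())) > 1]

print(ambiguous_words("Time flies like an arrow"))   # ['Time', 'flies', 'like']
print(ambiguous_words("Fruit flies like a banana"))  # ['flies', 'like']
```

Only context (the surrounding words and their tags) picks the correct reading, which is exactly what the taggers later in the lecture exploit.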
POS Ambiguity in News Headlines
• British Left Waffles on Falkland Islands
‣ [British Left] [Waffles] [on] [Falkland Islands]
• Juvenile Court to Try Shooting Defendant
‣ [Juvenile Court] [to] [Try] [Shooting Defendant]
• Teachers Strike Idle Kids
‣ [Teachers Strike] [Idle Kids]
• Eye Drops Off Shelf
‣ [Eye Drops] [Off Shelf]
Tagsets
Tagsets
• A compact representation of POS information
‣ Usually ≤ 4 capitalized characters (e.g. NN = noun)
‣ Often includes inflectional distinctions
• At least one tagset exists for all major languages
• Major English tagsets:
‣ Brown (87 tags)
‣ Penn Treebank (45 tags)
‣ CLAWS/BNC (61 tags)
‣ “Universal” (12 tags)
Major Penn Treebank Tags
NN noun
JJ adjective
VB verb
RB adverb
DT determiner
IN preposition
MD modal
RP particle
TO to
CD cardinal number
PRP personal pronoun
CC coordinating conjunction
WH wh-pronoun
Derived Tags (Open Class)
• NN (noun singular, wombat)
‣ NNS (plural, wombats)
‣ NNP (proper, Australia)
‣ NNPS (proper plural, Australians)
• VB (verb infinitive, eat)
‣ VBP (1st/2nd person present, eat)
‣ VBZ (3rd person singular, eats)
‣ VBD (past tense, ate)
‣ VBG (gerund, eating)
‣ VBN (past participle, eaten)
Derived Tags (Open Class)
• JJ (adjective, nice)
‣ JJR (comparative, nicer)
‣ JJS (superlative, nicest)
• RB (adverb, fast)
‣ RBR (comparative, faster)
‣ RBS (superlative, fastest)
Derived Tags (Closed Class)
• PRP (pronoun personal, I)
‣ PRP$ (possessive, my)
• WP (wh-pronoun, what)
‣ WP$ (possessive, whose)
‣ WDT (wh-determiner, which)
‣ WRB (wh-adverb, where)
Tagged Text Example
The/DT limits/NNS to/TO legal/JJ absurdity/NN stretched/VBD another/DT notch/NN this/DT week/NN when/WRB the/DT Supreme/NNP Court/NNP refused/VBD to/TO hear/VB an/DT appeal/NN from/IN a/DT case/NN that/WDT says/VBZ corporate/JJ defendants/NNS must/MD pay/VB damages/NNS even/RB after/IN proving/VBG that/IN they/PRP could/MD not/RB possibly/RB have/VB caused/VBN the/DT harm/NN ./.
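The word/TAG format shown above is easy to read into (word, tag) pairs. A minimal sketch (the `rsplit` ensures a token like `./.` splits correctly on its last slash):

```python
def parse_tagged(text):
    """Split 'word/TAG' tokens into (word, tag) pairs.
    rsplit on the last '/' so the token './.' parses as ('.', '.')."""
    pairs = []
    for token in text.split():
        word, tag = token.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs

tagged = "The/DT limits/NNS to/TO legal/JJ absurdity/NN ./."
print(parse_tagged(tagged))
# [('The', 'DT'), ('limits', 'NNS'), ('to', 'TO'),
#  ('legal', 'JJ'), ('absurdity', 'NN'), ('.', '.')]
```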
Tag the following sentence with Penn Treebank’s POS tagset:
CATS SHOULD CATCH MICE EASILY
PollEv.com/jeyhanlau569
Automatic Tagging
Why Automatically POS Tag?
• Important for morphological analysis, e.g. lemmatisation
• For some applications, we want to focus on certain POS
‣ E.g. nouns are important for information retrieval, adjectives for sentiment analysis
• Very useful features for certain classification tasks
‣ E.g. genre attribution (fiction vs. non-fiction)
• POS tags can offer word sense disambiguation
‣ E.g. cross/NN vs cross/VB vs cross/JJ
• Can use them to create larger structures (parsing; lectures 14–16)
Automatic Taggers
• Rule-based taggers
• Statistical taggers
‣ Unigram tagger
‣ Classifier-based taggers
‣ Hidden Markov Model (HMM) taggers
Rule-based Tagging
• Typically starts with a list of possible tags for each word
‣ From a lexical resource, or a corpus
• Often includes other lexical information, e.g. verb subcategorisation (its arguments)
• Apply rules to narrow down to a single tag
‣ E.g. if DT comes before the word, then eliminate VB
‣ Relies on some unambiguous contexts
• Large systems have thousands of constraints
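The narrowing-down idea can be sketched in a few lines. This is a toy illustration, assuming a hand-built lexicon and a single elimination rule (real systems use thousands of such constraints):

```python
# Toy lexicon of possible tags per word (illustrative).
LEXICON = {
    "the": {"DT"},
    "flies": {"NNS", "VBZ"},
    "can": {"MD", "NN", "VB"},
}

def rule_based_tag(words):
    """Start from all possible tags per word, then apply
    elimination rules that use unambiguous context."""
    candidates = [set(LEXICON.get(w, {"NN"})) for w in words]
    for i in range(1, len(words)):
        # Rule: a verb cannot immediately follow a determiner.
        if "DT" in candidates[i - 1] and len(candidates[i]) > 1:
            candidates[i] -= {"VB", "VBZ"}
    return [sorted(c) for c in candidates]

print(rule_based_tag(["the", "flies"]))  # [['DT'], ['NNS']]
```

Because "the" is unambiguously DT, the rule eliminates the verb reading of "flies", leaving the single correct tag NNS.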
Unigram Tagger
• Assigns the most common tag to each word type
• Requires a corpus of tagged words
• “Model” is just a look-up table
• But actually quite good: ~90% accuracy
‣ Correctly resolves about 75% of ambiguity
• Often considered the baseline for more complex approaches
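The look-up table is simple to build. A minimal sketch, assuming a tiny illustrative corpus and an NN fallback for unseen words:

```python
from collections import Counter, defaultdict

def train_unigram_tagger(tagged_corpus):
    """Build the look-up table: each word type -> its most common tag."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0]
            for word, tags in counts.items()}

corpus = [("the", "DT"), ("dog", "NN"), ("runs", "VBZ"),
          ("the", "DT"), ("run", "NN"), ("run", "VB"), ("run", "VB")]
table = train_unigram_tagger(corpus)

def tag(words):
    # Unseen words fall back to NN, a common default for open-class words.
    return [table.get(w, "NN") for w in words]

print(tag(["the", "run"]))  # ['DT', 'VB']
```

Note that "run" is always tagged VB (its most frequent tag in the corpus) even in contexts where NN would be correct, which is exactly the ~25% of ambiguity a unigram tagger cannot resolve.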
Classifier-Based Tagging
• Use a standard discriminative classifier (e.g. logistic regression, neural network), with features:
‣ Target word
‣ Lexical context around the word
‣ Already classified tags in the sentence
• But can suffer from error propagation: wrong predictions from previous steps affect the next ones
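The three feature types listed above can be sketched as a feature extractor. The particular feature names and the suffix feature are illustrative choices, not a fixed recipe:

```python
def extract_features(words, i, prev_tags):
    """Features for tagging words[i]: the word itself, its lexical
    context, and tags already predicted to its left."""
    return {
        "word": words[i],
        "suffix3": words[i][-3:],                              # target word shape
        "prev_word": words[i - 1] if i > 0 else "<s>",         # lexical context
        "next_word": words[i + 1] if i < len(words) - 1 else "</s>",
        "prev_tag": prev_tags[-1] if prev_tags else "<s>",     # earlier decisions
    }

feats = extract_features(["the", "dog", "barks"], 1, ["DT"])
print(feats)
# {'word': 'dog', 'suffix3': 'dog', 'prev_word': 'the',
#  'next_word': 'barks', 'prev_tag': 'DT'}
```

The `prev_tag` feature is where error propagation enters: if the tagger mislabels "the", that wrong tag is fed as a feature when classifying "dog".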
Hidden Markov Models
• A basic sequential (or structured) model
• Like sequential classifiers, uses both previous tag and lexical evidence
• Unlike classifiers, considers all possibilities of the previous tag
• Unlike classifiers, treats previous tag evidence and lexical evidence as independent from each other
‣ Less sparsity
‣ Fast algorithms for sequential prediction, i.e. finding the best tagging of the entire word sequence
• Next lecture!
Unknown Words
• Huge problem in morphologically rich languages (e.g. Turkish)
• Can use things we’ve seen only once (hapax legomena) to best guess for things we’ve never seen before
‣ Tend to be nouns, followed by verbs
‣ Unlikely to be determiners
• Can use sub-word representations to capture morphology (look for common affixes)
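The affix idea can be sketched as a back-off heuristic for unseen words. The suffix list below is illustrative and English-specific, not exhaustive:

```python
def guess_unknown_tag(word):
    """Heuristic tag guess for an unseen word from common English affixes.
    Falls back to NN, since unknown words tend to be nouns."""
    if word[:1].isupper():
        return "NNP"              # capitalised: likely a proper noun
    if word.endswith("ing"):
        return "VBG"              # gerund
    if word.endswith("ed"):
        return "VBD"              # past tense
    if word.endswith("ly"):
        return "RB"               # adverb
    if word.endswith("s"):
        return "NNS"              # plural noun
    return "NN"

print(guess_unknown_tag("blorfing"))  # VBG
print(guess_unknown_tag("blorfs"))    # NNS
```

A statistical tagger would learn such suffix-to-tag preferences from the hapax legomena in its training corpus rather than hard-coding them.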
A Final Word
• Part of speech is a fundamental intersection between linguistics and automatic text analysis
• A fundamental task in NLP; provides useful information for many other applications
• Methods applied to it are typical of language tasks in general, e.g. probabilistic, sequential machine learning
Reading
• JM3 Ch. 8–8.2