12-wordnet.pptx
Ling 131A
Introduction to NLP with Python
WordNet
Marc Verhagen, Fall 2018
Today
• Assignment 3 – questions?
• Assignment 2 – feedback
• Word Lists and WordNet
WordNet
• A lexical knowledge base based on conceptual
lookup
• Organizing concepts in a semantic network
• Organize lexical information in terms of word
meaning, rather than word form
– WordNet can be used as a thesaurus
• http://wordnet.princeton.edu
WordNet
• It’s big
– 155,287 words and 117,659 synonym sets.
• It’s free.
• Originally designed as a model of human
semantic memory (Miller, 1985)
• Widely used in NLP
Synonymy
One of the main guiding principles in building WordNet
Distribution principle:
Words A and B are called ‘synonyms’ if their distribution is identical in a
corpus. That means they can replace each other in any context. (Strong
requirement – ideal)
Pure synonym:
If A and B are synonyms in all contexts (each can replace the other everywhere), they are
pure synonyms. Pure synonyms have proven very difficult to find.
Question: How to ensure replaceability in
– Syntax
– Semantics
Lexical Matrix
Word        Word Forms
Meanings    F1      F2      F3      …       Fn
M1          E1,1    E1,2
M2                  E2,2
M3                          E3,3
…
Mm                                          Em,n
Two entries in the same row: synonymous words
Two entries in the same column: polysemous word
Lexical Matrix
Word        Word Forms
Meanings    F1               F2             F3             …    Fn
M1          E1,1 (depend)    E1,2 (bank)    E1,3 (rely)
M2                           E2,2 (bank)
M3                           E3,2 (bank)
…
Mm                                                              Em,n
Each row (one meaning) is a synset; each entry in it is a lemma.
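The matrix can be sketched as a small Python structure. This is only an illustration: the labels M1 to M3 and the three word forms come from the slide, while the variable names are made up here and nothing is read from WordNet itself.

```python
from collections import defaultdict

# Toy lexical matrix: each (meaning, form) pair is one filled cell E[m, f].
# Illustrative data only, copied from the slide; not loaded from WordNet.
entries = [
    ("M1", "depend"),
    ("M1", "bank"),
    ("M1", "rely"),
    ("M2", "bank"),
    ("M3", "bank"),
]

forms_by_meaning = defaultdict(list)   # row view: meaning -> word forms
meanings_by_form = defaultdict(list)   # column view: word form -> meanings
for meaning, form in entries:
    forms_by_meaning[meaning].append(form)
    meanings_by_form[form].append(meaning)

# A row with more than one entry is a set of synonyms.
synonym_rows = {m: fs for m, fs in forms_by_meaning.items() if len(fs) > 1}

# A column with more than one entry is a polysemous word form.
polysemous = {f: ms for f, ms in meanings_by_form.items() if len(ms) > 1}

print(synonym_rows)  # {'M1': ['depend', 'bank', 'rely']}
print(polysemous)    # {'bank': ['M1', 'M2', 'M3']}
```

Reading the same cells by row gives synonymy, and reading them by column gives polysemy.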
Psycholinguistic Theory
• Human lexical memory for nouns as a
hierarchy.
– Can a canary sing? – Pretty fast response.
– Can a canary fly? – Slower response.
– Does a canary have skin? – Slowest response.
Animal (can move, has skin)
Bird (can fly)
Canary (can sing)
WordNet is a lexical reference system
based on psycholinguistic theories of
human lexical memory.
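One way to picture the response-time pattern is to count how many levels must be climbed before a property is found. The sketch below assumes each concept stores only its own properties, as in the slide; the function name and the data layout are hypothetical.

```python
# Toy hierarchy from the slide: each concept stores only its own properties
# and a link to its parent (None at the top of the hierarchy).
PARENT = {"canary": "bird", "bird": "animal", "animal": None}
PROPERTIES = {
    "canary": {"can sing"},
    "bird": {"can fly"},
    "animal": {"can move", "has skin"},
}

def levels_to_property(concept, prop):
    """Return how many levels above `concept` the property is stored, or None."""
    levels = 0
    while concept is not None:
        if prop in PROPERTIES[concept]:
            return levels
        concept = PARENT[concept]   # climb one level up
        levels += 1
    return None

for prop in ("can sing", "can fly", "has skin"):
    print(prop, levels_to_property("canary", prop))
# can sing 0
# can fly 1
# has skin 2
```

More levels climbed corresponds to the slower responses listed above.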
Synsets
• Synset ID: a unique number identifying a synset
• Category: POS category of the words
• Name: name of the synset
• Definition: definition of the synset
• Example: one or more examples of the words in the
synset being used in sentences
• Lemmas: the set of synonymous words contained in the
synset
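The fields above map naturally onto a small record type. This is a sketch, not WordNet's actual storage format: the class, the ID values, and the sense numbers in the names are placeholders, while the definitions and examples are taken from the "house" entries in these slides.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    synset_id: int                 # unique number identifying the synset
    pos: str                       # part-of-speech category, e.g. "n"
    name: str                      # name of the synset
    definition: str
    examples: list = field(default_factory=list)
    lemmas: list = field(default_factory=list)

# IDs and sense numbers below are placeholders, not real WordNet values.
dwelling = Synset(
    synset_id=1,
    pos="n",
    name="house.n.1",
    definition="a dwelling that serves as living quarters for one or more families",
    examples=["he has a house on Cape Cod"],
    lemmas=["house"],
)
household = Synset(
    synset_id=2,
    pos="n",
    name="family.n.1",
    definition="a social unit living together",
    examples=["he moved his family to Virginia"],
    lemmas=["family", "household", "house", "home", "menage"],
)

def synsets_for(lemma, inventory):
    """Return every synset in `inventory` that lists `lemma`."""
    return [s for s in inventory if lemma in s.lemmas]

print([s.name for s in synsets_for("house", [dwelling, household])])
# ['house.n.1', 'family.n.1']
```

In NLTK, comparable information is exposed through `nltk.corpus.wordnet`, for example via `synsets()`, `definition()`, `examples()` and `lemma_names()`, provided the WordNet data has been downloaded.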
Synsets
{house} is ambiguous.
{house, home}
has the sense of a social unit living together.
Is this the minimal unit?
{family, household, house, home, menage}
makes the unit completely unambiguous.
Ordered according to frequency.
House – all nouns
1. (n) house (a dwelling that serves as living quarters for one or more families) “he has a house on
Cape Cod”; “she felt she had to get out of the house”
2. (n) firm, house, business firm (the members of a business organization that owns or operates
one or more establishments) “he worked for a brokerage house”
3. (n) house (the members of a religious community living together)
4. (n) house (the audience gathered together in a theatre or cinema) “the house applauded”; “he
counted the house”
5. (n) house (an official assembly having legislative powers) “a bicameral legislature has two
houses”
6. (n) house (aristocratic family line) “the House of York”
7. (n) house (play in which children take the roles of father or mother or children and pretend to
interact like adults) “the children were playing house”
8. (n) sign of the zodiac, star sign, sign, mansion, house, planetary house ((astrology) one of 12
equal areas into which the zodiac is divided)
9. (n) house (the management of a gambling house or casino) “the house gets a percentage of
every bet”
10. (n) family, household, house, home, menage (a social unit living together) “he moved his family
to Virginia”; “It was a good Christian household”; “I waited until the whole house was asleep”;
“the teacher asked how many people made up his home”; “the family refused to accept his will”
11. (n) theater, theatre, house (a building where theatrical performances or motion-picture shows
can be presented) “the house was full”
12. (n) house (a building in which something is sheltered or located) “they had a large carriage
house”
Semantic relations in WordNet
1. Synonymy (equality)
2. Hypernymy / Hyponymy (super/sub)
3. Antonymy (opposites)
4. Meronymy / Holonymy (part/whole)
5. Entailment (if-then)
6. Troponymy (manner)
Relations 1 and 3 are lexical (lemma to lemma); the others
are semantic (synset to synset).
Semantic Relations
• Hypernymy and Hyponymy
– Relation between word senses (synsets)
– X is a hyponym of Y if X is a kind of Y
– Hyponymy is transitive and asymmetrical
– Hypernymy is the inverse of hyponymy
– Path: (lion → animal → animate entity → entity)
– Distance between synsets is often used to
determine how closely related two concepts are
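The path, the transitivity of hyponymy, and a simple distance measure can be sketched over a toy hierarchy. The lion path follows the slide; the canary and bird links are added here purely for illustration, and the edge-counting distance is one simple choice among several possible measures.

```python
# Toy hypernym links (child -> parent). Not loaded from WordNet.
HYPERNYM = {
    "lion": "animal",
    "canary": "bird",
    "bird": "animal",
    "animal": "animate entity",
    "animate entity": "entity",
}

def hypernym_path(synset):
    """Follow hypernym links from `synset` up to the root."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def is_hyponym_of(x, y):
    """Transitive and asymmetric: x is a kind of y if y lies above x."""
    return x != y and y in hypernym_path(x)

def path_distance(a, b):
    """Number of edges between a and b via their lowest shared hypernym."""
    path_b = hypernym_path(b)
    for steps_a, node in enumerate(hypernym_path(a)):
        if node in path_b:
            return steps_a + path_b.index(node)
    return None

print(hypernym_path("lion"))            # ['lion', 'animal', 'animate entity', 'entity']
print(is_hyponym_of("lion", "entity"))  # True
print(is_hyponym_of("entity", "lion"))  # False
print(path_distance("lion", "canary"))  # 3
```

A smaller distance means the two concepts meet lower in the hierarchy.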
Semantic Relations (continued)
• Meronymy and Holonymy
– Part-whole relation: a branch is a part of a tree
– X is a meronym of Y if X is a part of Y
– Holonymy is the inverse relation of meronymy
{kitchen} is a part of {house}
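Because holonymy is just the inverse of meronymy, one table can be derived from the other. A minimal sketch with toy data follows; the dictionary names are made up, and the pairs are the ones mentioned in these slides.

```python
from collections import defaultdict

# Meronymy as part -> wholes (toy data, not loaded from WordNet).
PART_OF = {
    "kitchen": ["house"],
    "bedroom": ["house"],
    "branch": ["tree"],
}

# Holonymy is obtained by inverting the mapping: whole -> parts.
HAS_PART = defaultdict(list)
for part, wholes in PART_OF.items():
    for whole in wholes:
        HAS_PART[whole].append(part)

print(HAS_PART["house"])  # ['kitchen', 'bedroom']
print(HAS_PART["tree"])   # ['branch']
```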
Kinds of Meronymy
Component-object     Head – Body
Stuff-object         Wood – Table
Member-collection    Tree – Forest
Feature-activity     Speech – Conference
Place-area           Palo Alto – California
Phase-state          Youth – Life
Resource-process     Pen – Writing
Actor-act            Physician – Treatment
Lexical Relation
• Antonymy
– Opposites in meaning
– Relation between word forms
– Often determined by phonetics, word length etc.
– ({rise, ascend} vs. {fall, descend})
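The difference between a relation on word forms and a relation on synsets can be sketched with the example above. The synset labels "up" and "down" and both helper functions are hypothetical; only the four verbs and their pairing come from the slide.

```python
# Which synset each lemma belongs to (labels are made up for this sketch).
SYNSET_OF = {"rise": "up", "ascend": "up", "fall": "down", "descend": "down"}

# Antonymy is stored between individual word forms (lemma to lemma).
ANTONYM = {"rise": "fall", "fall": "rise", "ascend": "descend", "descend": "ascend"}

def are_antonyms(a, b):
    """Lexical relation: true only for the specific paired word forms."""
    return ANTONYM.get(a) == b

def are_conceptually_opposed(a, b):
    """True if some lemma in a's synset has an antonym in b's synset."""
    return any(
        SYNSET_OF[x] == SYNSET_OF[a] and SYNSET_OF[y] == SYNSET_OF[b]
        for x, y in ANTONYM.items()
    )

print(are_antonyms("rise", "fall"))                 # True
print(are_antonyms("rise", "descend"))              # False
print(are_conceptually_opposed("rise", "descend"))  # True
```

"rise" and "descend" belong to opposed synsets, yet only "rise"/"fall" and "ascend"/"descend" are antonym pairs.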
Kinds of Antonymy
Size           Small – Big
Quality        Good – Bad
State          Warm – Cool
Personality    Dr. Jekyll – Mr. Hyde
Direction      East – West
Action         Buy – Sell
Amount         Little – A lot
Place          Far – Near
Time           Day – Night
Gender         Boy – Girl
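Entailment
• Snoring entails sleeping.
• Buying entails paying.
• Both are cases of proper temporal inclusion.
• The inclusion can go in either direction.
– Sleeping temporally includes snoring (the entailed activity includes the entailing one).
– Buying temporally includes paying (the entailing activity includes the entailed one).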
WordNet Sub-Graph (English)
[Figure: sub-graph centered on {house, home}]
• Gloss: a place that serves as the living quarters of one or more families
• Hypernym: {dwelling, abode}
• Hyponyms: hermitage, cottage
• Meronyms: study, bedroom, kitchen, guestroom, veranda, backyard
WordNet goes global
• Princeton WordNet
– The first wordnet in the world, for English, was
developed at Princeton over 15 years (Miller
1995, Fellbaum 1998).
• EuroWordNet
– A linked structure of European-language wordnets,
built in 1998 over 3 years with funding from
the EC.
• Global WordNet:
– Building on Princeton WordNet and EuroWordNet
– http://www.globalwordnet.org