COMP90042
Workshop
Week 6
25 April
Lexical Semantics
Distributional Semantics
Table of Contents
Give illustrative examples that show the difference between:
(a) Synonyms and hypernyms
(b) Hyponyms and meronyms
The relationships between word meanings
Question 1
Synonyms: words that share (mostly) the same meaning
snake and serpent
Hypernyms: One word is a hypernym of a second word when it is a more general instance (“higher up” in the hierarchy) of the latter
reptile is the hypernym of snake (in its animal sense)
Hyponyms: One word is a hyponym of a second word when it is a more specific instance (“lower down” in the hierarchy) of the latter
snake is a hyponym of reptile (in its animal sense)
Meronyms: One word is a meronym of a second word when it is a part of the whole defined by the latter
scales (the skin structure) is a meronym of reptile.
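The four relations above can be sketched as lookups over a tiny hand-built taxonomy. The dictionaries below are illustrative only, not real WordNet data:

```python
# Tiny hand-built relation store (illustrative; not real WordNet data).
hypernym_of = {"snake": "reptile", "reptile": "animal"}  # child -> parent
meronym_of = {"scales": "reptile"}                       # part  -> whole
synonym_pairs = {frozenset({"snake", "serpent"})}

def is_hypernym(general, specific):
    """True if `general` lies above `specific` in the hierarchy."""
    node = specific
    while node in hypernym_of:
        node = hypernym_of[node]
        if node == general:
            return True
    return False

print(is_hypernym("reptile", "snake"))   # → True
print(is_hypernym("snake", "reptile"))   # → False: snake is the hyponym
print(frozenset({"snake", "serpent"}) in synonym_pairs)  # → True
print(meronym_of["scales"])              # → reptile
```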
Exercise
Movie and Film
Hand and Finger
Furniture and Table
Question 2
WordNet
Wu & Palmer similarity: sim(a, b) = 2 × Depth(LCS(a, b)) / (Depth(a) + Depth(b))
LCS: lowest common subsumer (the deepest node that is an ancestor of both words)
Depth: path length from the node to the root
Word Similarity
Choose the first sense of each word:
Depth(information) = 5
Depth(retrieval) = 8
LCS(information, retrieval) = entity
Depth(LCS) = Depth(entity) = 1
sim(information, retrieval) = 2 × 1 / (5 + 8) ≈ 0.154
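Given those depths, the Wu & Palmer score is direct arithmetic. The function below is a minimal sketch using the depth values above, not a WordNet lookup (NLTK's wordnet interface exposes `wup_similarity` on synsets, though its depth convention can differ slightly):

```python
def wup_similarity(depth_a, depth_b, depth_lcs):
    # Wu & Palmer: 2 * depth(LCS) / (depth(a) + depth(b))
    return 2 * depth_lcs / (depth_a + depth_b)

# information (depth 5) and retrieval (depth 8), LCS = entity (depth 1)
print(round(wup_similarity(5, 8, 1), 3))  # → 0.154
```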
Similarity
Try to calculate the similarity of information and science yourself
The maximum similarity is 0.727
sim(information, science) > sim(information, retrieval)
Does this mesh with your intuition?
Question 3
Words can have multiple senses
Word sense disambiguation
automatically determining which sense (usually, a WordNet synset) of a word is intended for a given token instance within a document
Question 4
Pointwise Mutual Information (PMI)
measures how much more (or less) often two events co-occur than they would if they were independent
PMI(x, y) = log2( p(x, y) / (p(x) p(y)) )
p(x, y): joint probability of x and y = count(x, y) / Σ
p(x): marginal probability of x = Σ_y count(x, y) / Σ
p(y): marginal probability of y = Σ_x count(x, y) / Σ
Total number of instances (Σ): 55 + 225 + 315 + 1405 = 2000
P(world) = (55 + 225) / 2000 = 0.14
P(cup) = (55 + 315) / 2000 = 0.185
P(world, cup) = 55 / 2000 = 0.0275
PMI(world, cup) = log2( P(world, cup) / (P(world) × P(cup)) )
= log2(0.0275 / (0.14 × 0.185))
≈ 0.0865
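The arithmetic can be checked in a few lines of Python, using the counts from the question's contingency table:

```python
import math

# Contingency counts from the worked example:
#               cup     not-cup
# world          55        225
# not-world     315       1405
n = 55 + 225 + 315 + 1405          # Σ = 2000

p_world = (55 + 225) / n           # 0.14
p_cup = (55 + 315) / n             # 0.185
p_joint = 55 / n                   # 0.0275

pmi = math.log2(p_joint / (p_world * p_cup))
print(round(pmi, 4))  # → 0.0865
```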
Distributional similarity?
PMI(world, cup) ≈ 0.0865: slightly positive
world and cup occur together slightly more often than they would purely by chance
World Cup!
Question 5
Singular Value Decomposition
identify the most important characteristics of each word
throw away the less important characteristics
create a dense matrix
save time and storage
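A minimal sketch of truncated SVD on a word-context count matrix; the matrix below is made up for illustration, not taken from the workshop:

```python
import numpy as np

# Toy word-context count matrix (rows = words, columns = contexts).
# The numbers are illustrative only.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [1.0, 0.0, 2.0, 0.0],
])

# Full SVD: A = U @ diag(s) @ Vt, singular values sorted largest first.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k components (the most important characteristics)
# and throw the rest away, giving dense k-dimensional word vectors.
k = 2
word_vectors = U[:, :k] * s[:k]

print(word_vectors.shape)  # → (3, 2)
```

Keeping only the top-k singular values turns each sparse count row into a dense k-dimensional vector, which is what saves time and storage downstream.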
Question 6
Word embeddings:
Representation of words in a vector space
Capture semantic and syntactic relationships between words
broadly the same as what we expect from distributional similarity
SG and CBOW
[Diagram: the two word2vec architectures. Which is SG and which is CBOW?]
Skip-gram (SG) predicts the context words from the centre word; CBOW predicts the centre word from its context words.
Taking the dot product of the relevant vectors
Marginalising: normalising the dot products over the full vocabulary
Negative samples: approximating this normalisation with a few sampled words
Example of SG:
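A minimal skip-gram-with-negative-sampling sketch in NumPy; the corpus, vector dimension, and hyperparameters are all illustrative assumptions, not from the workshop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative corpus (made up for this sketch).
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 8

# Skip-gram keeps two tables: centre-word vectors and context vectors.
W_centre = rng.normal(scale=0.1, size=(V, d))
W_context = rng.normal(scale=0.1, size=(V, d))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, window, k_neg = 0.05, 2, 3
for _ in range(50):                      # a few passes over the corpus
    for pos, word in enumerate(corpus):
        t = idx[word]
        for off in range(-window, window + 1):
            if off == 0 or not 0 <= pos + off < len(corpus):
                continue
            c = idx[corpus[pos + off]]
            # One positive (centre, context) pair plus k sampled negatives.
            pairs = [(c, 1.0)] + [(int(n), 0.0)
                                  for n in rng.integers(0, V, size=k_neg)]
            grad_t = np.zeros(d)
            for ctx, label in pairs:
                # Dot product of the relevant vectors -> probability.
                g = sigmoid(W_centre[t] @ W_context[ctx]) - label
                grad_t += g * W_context[ctx]
                W_context[ctx] -= lr * g * W_centre[t]
            W_centre[t] -= lr * grad_t

print(W_centre.shape)  # one dense vector per vocabulary word → (7, 8)
```

Note how the negative samples stand in for marginalising over the full vocabulary: each update touches only one positive and k sampled context words instead of all V.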
Training