
Probabilistic Context-Free Grammar
COMP90042
Natural Language Processing
Lecture 15
Semester 1 2021 Week 8 Jey Han Lau
COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE


Ambiguity In Parsing

• Context-free grammars assign hierarchical structure to language:
‣ formulated as generating all strings in the language
‣ predicting the structure(s) for a given string
• This raises the problem of ambiguity: which parse is better?
• Probabilistic CFG!

Outline

• Basics of Probabilistic CFGs (PCFGs)
• PCFG parsing
• Limitations of CFG

Basics of PCFGs
Basics of PCFGs
• Same symbol set:
‣ terminals: words such as book
‣ non-terminals: syntactic labels such as NP or NN
• Same productions (rules):
‣ LHS non-terminal → ordered list of RHS symbols
• In addition, store a probability with each production:
‣ NP → DT NN [p = 0.45]
‣ NN → cat [p = 0.02]
‣ NN → leprechaun [p = 0.00001]
‣ …

Basics of PCFGs
• Probability values denote a conditional:
‣ P(LHS → RHS)
‣ i.e., P(RHS | LHS)
• Consequently they:
‣ must be positive values, between 0 and 1
‣ must sum to one for a given LHS
• E.g.,
‣ NN → aardvark [p = 0.0003]
‣ NN → cat [p = 0.02]
‣ NN → leprechaun [p = 0.0001]
‣ ∑_x P(NN → x) = 1
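As a quick sanity check, these constraints on P(RHS | LHS) can be verified programmatically. A minimal sketch in Python; the NP probabilities below are made up for illustration, and the NN entries are the (incomplete) fragment from the slide, so they deliberately fail the check:

```python
# Toy PCFG stored as {LHS: {RHS tuple: probability}}.
grammar = {
    "NP": {("DT", "NN"): 0.45, ("PRP",): 0.55},                      # sums to 1
    "NN": {("aardvark",): 0.0003, ("cat",): 0.02, ("leprechaun",): 0.0001},
}

def is_valid_distribution(grammar, lhs, tol=1e-9):
    """P(RHS | LHS) must lie in [0, 1] and sum to one over all RHS for this LHS."""
    probs = list(grammar[lhs].values())
    return all(0.0 <= p <= 1.0 for p in probs) and abs(sum(probs) - 1.0) < tol

print(is_valid_distribution(grammar, "NP"))  # True
print(is_valid_distribution(grammar, "NN"))  # False: fragment does not sum to 1
```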

Stochastic Generation with PCFGs
Almost the same as for a CFG, with one twist:
1. Start with S, the sentence symbol
2. Choose a rule with S as the LHS
‣ randomly select a RHS according to P(RHS | LHS), e.g., S → VP
‣ apply this rule, e.g., substitute VP for S
3. Repeat step 2 for each non-terminal in the string (here, VP)
4. Stop when no non-terminals remain

Gives us a tree, as before, with a sentence as the yield
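The generation procedure above can be sketched in a few lines of Python. The grammar is the one used in the parsing example later in this lecture; since the slides give no expansion for MD, a lexical rule MD → can is assumed here purely so that generation can terminate:

```python
import random

# PCFG as {LHS: [(RHS tuple, P(RHS | LHS)), ...]}; a symbol with no rules
# is treated as a terminal (a word).
RULES = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("NP", "PP"), 0.125), (("we",), 0.25),
           (("sushi",), 0.125), (("chopsticks",), 0.5)],
    "PP": [(("IN", "NP"), 1.0)],
    "IN": [(("with",), 1.0)],
    "VP": [(("V", "NP"), 0.5), (("VP", "PP"), 0.25), (("MD", "V"), 0.25)],
    "V":  [(("eat",), 1.0)],
    "MD": [(("can",), 1.0)],   # assumed: the slide grammar has no MD expansion
}

def generate(symbol="S"):
    """Sample a derivation top-down: pick a RHS with P(RHS | LHS), then recurse."""
    if symbol not in RULES:              # no rule has this symbol as LHS,
        return [symbol]                  # so it is a terminal
    rhs_list, probs = zip(*RULES[symbol])
    rhs = random.choices(rhs_list, weights=probs, k=1)[0]
    return [word for child in rhs for word in generate(child)]

print(" ".join(generate()))              # e.g. "we can eat"
```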



How Likely Is a Tree?

Given a tree, we can compute its probability:
‣ it decomposes into the probability of each production

P(tree) = P(S → VP)
        × P(VP → Verb NP)
        × P(Verb → Book)
        × P(NP → Det Nominal)
        × P(Det → the)
        × P(Nominal → Nominal Noun)
        × P(Nominal → Noun)
        × P(Noun → dinner)
        × P(Noun → flight)

How Likely Is a Tree?

P(tree) = P(S → VP) × P(VP → Verb NP) × P(Verb → Book) ×
          P(NP → Det Nominal) × P(Det → the) × P(Nominal → Nominal Noun) ×
          P(Nominal → Noun) × P(Noun → dinner) × P(Noun → flight)
        = 0.05 × 0.20 × 0.30 × 0.20 × 0.60 × 0.20 × 0.75 × 0.10 × 0.40
        = 2.2 × 10⁻⁶
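The calculation is just a product over the productions used in the tree, which makes it easy to reproduce. A sketch using the rule probabilities given on the slide:

```python
from math import prod

# The nine productions in the "Book the dinner flight" tree, with the
# probabilities from the slide.
productions = [
    ("S → VP",                 0.05),
    ("VP → Verb NP",           0.20),
    ("Verb → Book",            0.30),
    ("NP → Det Nominal",       0.20),
    ("Det → the",              0.60),
    ("Nominal → Nominal Noun", 0.20),
    ("Nominal → Noun",         0.75),
    ("Noun → dinner",          0.10),
    ("Noun → flight",          0.40),
]

p_tree = prod(p for _, p in productions)
print(f"P(tree) = {p_tree:.2e}")   # P(tree) = 2.16e-06, i.e. the slide's 2.2 × 10⁻⁶
```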

Resolving Parse Ambiguity

• Can select between different trees based on P(T)
• P(T_left) = 2.2 × 10⁻⁶, P(T_right) = 6.1 × 10⁻⁷

PCFG Parsing

Parsing PCFGs

• Before, we looked at:
‣ CYK
‣ for unweighted grammars (CFGs)
‣ finds all possible trees
• But there are often thousands of trees, many completely nonsensical
• Can we solve for the most probable tree?

CYK for PCFGs

• CYK finds all trees for a sentence; we want the best tree
• Probabilistic CYK follows a similar process to standard CYK
• Convert the grammar to Chomsky Normal Form (CNF):
‣ VP → Verb NP NP [0.10]
   becomes
‣ VP → Verb NP+NP [0.10]
   NP+NP → NP NP [1.0]
‣ where NP+NP is a new symbol
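The conversion of an n-ary rule into binary rules can be done mechanically; giving every introduced glue rule probability 1.0 means the probability of any derivation is unchanged. A sketch, using the slide's `+`-joined naming convention for the new symbols:

```python
def binarise(lhs, rhs, prob):
    """Split an n-ary rule into binary rules.

    The original probability stays on the first rule; each glue rule for a
    new symbol (e.g. NP+NP) gets p = 1.0, leaving tree probabilities intact.
    """
    rules = []
    while len(rhs) > 2:
        new_sym = "+".join(rhs[1:])                # e.g. "NP+NP"
        rules.append((lhs, (rhs[0], new_sym), prob))
        lhs, rhs, prob = new_sym, rhs[1:], 1.0
    rules.append((lhs, tuple(rhs), prob))
    return rules

print(binarise("VP", ("Verb", "NP", "NP"), 0.10))
# [('VP', ('Verb', 'NP+NP'), 0.1), ('NP+NP', ('NP', 'NP'), 1.0)]
```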


Prob. CYK example: parse “we eat sushi with chopsticks”, with grammar:

S  → NP VP      [1]
NP → NP PP      [1/8]
NP → we         [1/4]
NP → sushi      [1/8]
NP → chopsticks [1/2]
PP → IN NP      [1]
IN → with       [1]
VP → V NP       [1/2]
VP → VP PP      [1/4]
VP → MD V       [1/4]
V  → eat        [1]

Start with an empty chart of cells [0,1] … [4,5] over the sentence.

‣ Fill the lexical cells: NP 1/4 in [0,1], V 1 in [1,2], NP 1/8 in [2,3], IN 1 in [3,4], NP 1/2 in [4,5]

‣ [0,2]: no rule combines NP and V, so Ø

‣ [1,3]: VP → V NP gives VP 1/16 (1/2 × 1 × 1/8)

‣ [0,3]: S → NP VP gives S 1/64 (1 × 1/4 × 1/16)

‣ [0,4], [1,4], [2,4]: no valid combinations, so Ø

‣ [3,5]: PP → IN NP gives PP 1/2 (1 × 1 × 1/2)

‣ [2,5]: NP → NP PP gives NP 1/128 (1/8 × 1/8 × 1/2)

‣ [1,5]: what goes in this cell? (PollEv.com/jeyhanlau569)

‣ [1,5] via VP → V NP: VP 1/256 (1/2 × 1 × 1/128)

‣ [1,5] via VP → VP PP: VP 1/128 (1/4 × 1/16 × 1/2)

‣ 1/128 > 1/256, so VP 1/128 is kept in [1,5]

‣ [0,5]: S → NP VP gives S 1/512 (1 × 1/4 × 1/128)

‣ S in [0,5]: success!

Prob. CYK: Retrieving the Parses

• S in the top-right corner of the parse table indicates success
• Retain a back-pointer to the best analysis
• To get the parse(s), follow pointers back for each match
• Convert back from CNF by removing the new non-terminals

Final chart (best probability per cell):

          we        eat       sushi      with      chopsticks
[0,·]   NP 1/4      Ø        S 1/64      Ø        S 1/512
[1,·]             V 1        VP 1/16     Ø        VP 1/128
[2,·]                        NP 1/8      Ø        NP 1/128
[3,·]                                  IN 1       PP 1/2
[4,·]                                             NP 1/2

P(T) = 1/512

Prob. CYK

[Probabilistic CYK algorithm pseudocode; source: JM3 Ch. 14]

Probabilistic CYK

• Both CYK and prob. CYK store all possible NTs
• Instead of storing a set of symbols, store the probability of the best-scoring tree fragment covering span [i,j] with root symbol A
• The validity test now looks to see that the child chart cells have non-zero probability
• Overwrite the lower-scoring analysis if this one is better, and record the best production

Complexity in terms of sentence length n?
PollEv.com/jeyhanlau569
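The algorithm can be sketched end-to-end on the worked example. A minimal Python version: only the best score per cell is kept (no back-pointers), and the VP → MD V rule is omitted since MD cannot occur in this sentence:

```python
from collections import defaultdict

# CNF grammar from the worked example: binary rules and lexical rules.
BINARY = [("S", "NP", "VP", 1.0), ("NP", "NP", "PP", 0.125),
          ("PP", "IN", "NP", 1.0), ("VP", "V", "NP", 0.5),
          ("VP", "VP", "PP", 0.25)]
LEX = {"we": [("NP", 0.25)], "sushi": [("NP", 0.125)],
       "chopsticks": [("NP", 0.5)], "with": [("IN", 1.0)],
       "eat": [("V", 1.0)]}

def prob_cyk(words):
    """chart[i, j][A] = probability of the best tree rooted in A over span [i, j)."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                 # fill lexical cells
        for sym, p in LEX[w]:
            chart[i, i + 1][sym] = p
    for span in range(2, n + 1):                  # widening spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # split point
                for lhs, b, c, p in BINARY:
                    if b in chart[i, k] and c in chart[k, j]:
                        score = p * chart[i, k][b] * chart[k, j][c]
                        if score > chart[i, j].get(lhs, 0.0):  # keep best analysis
                            chart[i, j][lhs] = score
    return chart[0, n].get("S", 0.0)

print(prob_cyk("we eat sushi with chopsticks".split()))  # 0.001953125, i.e. 1/512
```

The nested loops over i, j and the split point k make the runtime cubic in sentence length n, times a grammar-dependent constant.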

Limitations of CFG
CFG Problem 1: Poor Independence Assumptions

• Rewrite decisions are made independently, whereas inter-dependence is often needed to capture global structure
• NP → DT NN [0.28]
• NP → PRP [0.25]
• The probability of a rule is independent of the rest of the tree
• No way to represent these contextual differences in PCFG probabilities

(NP statistics in the Switchboard corpus)

Poor Independence Assumptions

• NP statistics in the Switchboard corpus vs. PCFG probabilities estimated from the Switchboard corpus:
‣ NP → PRP should go up to 0.91 as a subject
‣ NP → DT NN should be 0.66 as an object
• Solution: add a condition to denote whether the NP is a subject or object

Solution: Parent Conditioning

• Make non-terminals more explicit by incorporating the parent symbol into each symbol
• NP^S represents subject position (left)
• NP^VP denotes object position (right)
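Parent conditioning is a simple tree transformation, applied to the treebank before rule probabilities are estimated. A sketch, assuming trees are represented as (label, children) tuples with words as plain strings:

```python
def parent_annotate(tree, parent=None):
    """Rewrite each non-terminal label as label^parent, e.g. NP under S
    becomes NP^S; words (strings) are left unchanged."""
    label, children = tree
    new_label = f"{label}^{parent}" if parent else label
    new_children = [child if isinstance(child, str)
                    else parent_annotate(child, label)
                    for child in children]
    return (new_label, new_children)

tree = ("S", [("NP", ["we"]),
              ("VP", [("V", ["eat"]), ("NP", ["sushi"])])])
print(parent_annotate(tree))
# ('S', [('NP^S', ['we']), ('VP^S', [('V^VP', ['eat']), ('NP^VP', ['sushi'])])])
```

Rule probabilities estimated from the annotated trees can then distinguish NP^S (subject) expansions from NP^VP (object) expansions.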

CFG Problem 2: Lack of Lexical Conditioning

• Lack of sensitivity to words in the tree
• Prepositional phrase (PP) attachment ambiguity:
‣ “Worker dumped sacks into a bin”

PP Attachment Ambiguity

• VP attachment: “into a bin” describes the resulting location of the sacks
• NP attachment: the sacks to be dumped are the ones which are “into a bin”

Coordination Ambiguity

• “dogs in houses and cats”
• dogs is semantically a better conjunct for cats than houses is (dogs can’t fit into cats!)

Solution: Head Lexicalisation

• Record the head word with parent symbols
‣ the head is the most salient child of a constituent: usually the noun in an NP, the verb in a VP, etc.
‣ VP → VBD NP PP ⇒ VP(dumped) → VBD(dumped) NP(sacks) PP(into)
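Percolating head words up the tree can be sketched as below. The head-finding rules here are deliberately simplified and assumed for illustration only; real lexicalised parsers use much richer head tables (e.g. the Collins/Magerman rules):

```python
# Assumed, simplified head rules: which child label supplies the head word.
HEAD_RULES = {"S": "VP", "VP": "VBD", "NP": "NNS", "PP": "IN"}

def lexicalise(tree):
    """Tree is (label, children); a preterminal's children field is its word.
    Returns (annotated_tree, head_word), with labels rewritten as label(head)."""
    label, children = tree
    if isinstance(children, str):                 # preterminal: the word is the head
        return (f"{label}({children})", children), children
    new_children, head_of = [], {}
    for child in children:
        annotated, head = lexicalise(child)
        new_children.append(annotated)
        head_of[child[0]] = head                  # index heads by child label
    head = head_of[HEAD_RULES[label]]             # pick the designated child's head
    return (f"{label}({head})", new_children), head

tree = ("VP", [("VBD", "dumped"),
               ("NP", [("NNS", "sacks")]),
               ("PP", [("IN", "into")])])
annotated, head = lexicalise(tree)
print(annotated[0], head)   # VP(dumped) dumped
```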

Head Lexicalisation

• Incorporate head words into productions, to capture the most important links between words
‣ captures correlations between head words of phrases
‣ e.g., PP(into): VP(dumped) vs. NP(sacks)
• The grammar symbol inventory expands massively!
‣ many of the productions are too specific, rarely seen
‣ learning is more involved, to avoid sparsity problems (e.g., zero probabilities)

A Final Word

• PCFGs are widely used, and there are efficient parsers available
‣ Collins parser, Berkeley parser, Stanford parser
‣ all use some form of lexicalisation
• But there are other grammar formalisms:
‣ Lexical Functional Grammar
‣ Head-driven Phrase Structure Grammar
‣ next lecture: dependency grammar!

Required Reading

JM3 Ch. 13-13.2