7a_Grammars_Parsing.dvi
COMP9414 Grammars and Parsing 1
This Lecture
� Overview of Natural Languages
� Syntax and Grammar for Natural Languages
� (Simple) Semantics for Natural Languages
� Pragmatics for Natural Languages
UNSW ©W. Wobcke et al. 2019–2021
COMP9414: Artificial Intelligence
Lecture 7a: Grammars and Parsing
Wayne Wobcke
e-mail:w. .au
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 3
Natural Language Processing
� Syntax
◮ Linguistic Knowledge
◮ Grammars and Parsing
◮ Probabilistic Parsing
� Semantics
◮ Semantic Interpretation and Logical Form
� Pragmatics
◮ Discourse Processing
◮ Speech Act Theory
◮ (Spoken) Dialogue Systems
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 2
Linguistics Landscape
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 5
NLP Applications
� Chatbots
◮ Customer service, e.g. CBA, Amtrak, Lyft, Spotify, Whole Foods
� Personal Assistants
◮ Siri, Alexa, Google Assistant
� Information Extraction
◮ Financial reports, news articles
� Machine (Assisted) Translation
◮ Weather reports, EU contracts, Canada Hansard
� Social Robotics
◮ Home care robots
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 4
Related Disciplines
� Linguistics
◮ Study of language in the abstract and particular languages
� Psycholinguistics
◮ Psychological models of human language processing
� Neurolinguistics
◮ Neural models of human language processing
� Logic
◮ Study of formal reasoning
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 7
Structural Ambiguity
“John saw Mary with a telescope”
� Different interpretation → different representation
“John sold a car to Mary” and “Mary was sold a car by John”
� Same interpretation → same representation
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 6
Central Problem – Ambiguity
� Natural languages exhibit ambiguity
“The fisherman went to the bank” (lexical)
“The boy saw a girl with a telescope” (structural)
“Every student took an exam” (semantic)
“The table won’t fit through the doorway because it is too
[wide/narrow]” (pragmatic)
� Ambiguity makes it difficult to interpret meaning of phrases/sentences
◮ But also makes inference harder to define and compute
� Resolve ambiguity by mapping to unambiguous representation
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 9
Framework (Chomsky)
� descriptive vs prescriptive
◮ Goal not to dictate use of language, but describe how language,
especially spoken language, is actually used
� sentence vs utterance
◮ Consider sentences (as an abstraction over utterances)
� competence vs performance
◮ Focus on underlying linguistic knowledge
� descriptive vs explanatory adequacy
◮ Aim to explain how linguistic knowledge is acquired
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 8
Syntax
� Linguistic Knowledge and Grammars
� Context Free Grammars
� Parsing
◮ Top Down Parsing
◮ Bottom Up Parsing
◮ Chart Parsing
◮ Deterministic Parsing
◮ Probabilistic Parsing
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 11
Lexical Items (Basic Words)
� Open class
◮ Nouns: denote objects (e.g. cat, John, justice)
◮ Verbs: denote actions, events (e.g. buy, break, believe)
◮ Adjectives: denote properties of objects (e.g. red, large)
◮ Adverbs: denote properties of events (e.g. quickly)
� Closed class (function words)
◮ Prepositions: at, in, of, on, . . .
◮ Articles: the, a, an
◮ Conjunctions: and, or, if, then, than, . . .
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 10
Methodology
� Autonomy of Syntax
John promised to work
∗John persuaded to work
vs
∗John was promised to work (by someone else)
John was persuaded to work (by someone else)
� Shows ‘promise’ and ‘persuade’ have different properties
∗ means ungrammatical
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 13
Noun Phrases
� Distribution over sentences
◮ Noun phrases: occur as “subject” with a range of “predicates”
• 〈noun phrase〉 ate the bone
• 〈noun phrase〉 saw the bird in the sky
• 〈noun phrase〉 believes that 2 + 2 = 4
◮ Examples
John, The dog, The big ugly dog,
The man in the red car,
The oldest man in the world with a beard,
The oldest man who lives in China, . . .
� Sentences need not “make sense”
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 12
Sentence Forms
� Declarative (indicative)
◮ Bart is listening.
� Yes/No question (interrogative)
◮ Is Bart listening?
� Wh-question (interrogative)
◮ When is Bart listening?
� Imperative
◮ Listen, Bart!
� Subjunctive
◮ If Bart were listening, he might hear something useful.
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 15
Inside Noun Phrases
� Within noun phrase
◮ Main item (the head of the phrase): noun
◮ Optional specifiers
• Determiners (articles, demonstratives, quantifiers)
• Adjectives and other nouns
◮ Mandatory arguments
• Depend on head (e.g. capital 〈of France〉)
◮ Optional modifiers
• Adjectival phrases (e.g. larger than Spain)
• Prepositional phrases (e.g. in the park)
• Relative clauses (e.g. who likes beer)
◮ Order specifiers, head, modifiers in English
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 14
Verb Phrases
� Distribution over sentences
◮ Verb phrases: occur as “predicate” with a range of “subjects”
John 〈verb phrase〉
The dog 〈verb phrase〉
Any noun phrase 〈verb phrase〉
◮ Examples
. . .
� Notice 〈verb phrase〉 depends on 〈noun phrase〉
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 17
Prepositional Phrases
� Within prepositional phrase
◮ Main item (the head of the phrase): preposition
◮ Mandatory arguments
• 〈noun phrase〉 (e.g. in the park)
� Nouns, verbs, etc. are just the heads of phrases
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 16
Inside Verb Phrases
� Within verb phrase
◮ Main item (the head of the phrase): verb
◮ Optional specifiers
• Auxiliary verbs (e.g. do, does, will, might, . . .)
• Adverbs (e.g. quickly)
◮ Mandatory arguments
• depend on head (e.g. bought 〈Henry〉 〈a book〉)
◮ Optional modifiers
• Adverbial phrases (e.g. more quickly than Henry)
◮ Notice similar structure to noun phrases
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 19
Typical (Small) Grammar
S → NP VP
NP → [Det] Adj∗ N [AP | PP | Rel Clause]∗
VP → V [NP] [NP] PP∗
AP → Adj PP
PP → P NP
Det → a | an | the | . . .
N → John | park | telescope | . . .
V → saw | likes | believes | . . .
Adj → hot | hotter | . . .
P → in | . . .
Special notation: ∗ is “0 or more”; [ . . ] is “optional”
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 18
Context Free Grammars
� Nonterminal symbols (grammatical categories)
� Terminal symbols (lexical items)
� Start symbol (a nonterminal) e.g. 〈sentence〉
� Rewrite rules
◮ nonterminal → sequence of nonterminals, terminals
e.g. 〈sentence〉 → 〈noun phrase〉 〈verb phrase〉
� Open question: is English context free?
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 21
(Leftmost) Derivation of Example
S
⇒ NP VP
⇒ N VP
⇒ John VP
⇒ John V NP PP
⇒ John saw NP PP
⇒ John saw N PP
⇒ John saw Mary PP
⇒ John saw Mary P NP
⇒ John saw Mary with NP
⇒ John saw Mary with Det N
⇒ John saw Mary with a N
⇒ John saw Mary with a telescope
⇒ means “rewrites as”
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 20
Syntactic Structure
Syntactically ambiguous = more than one parse tree
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 23
Parsing
� Aim is to compute a derivation of a sentence (hence tree)
� Methods
◮ Top down
• Start with S, apply rewrite rules until sentence reached
◮ Bottom up
• Start with sentence, apply rewrite rules “in reverse” until S is
reached
◮ Chart parsing
• Chart records parsed fragments and hypotheses
• Can mix top down and bottom up strategies
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 22
Rightmost Derivation
S
⇒ NP VP
⇒ NP V NP PP
⇒ NP V NP P NP
⇒ NP V NP P Det N
⇒ NP V NP P Det telescope
⇒ NP V NP P a telescope
⇒ . . .
⇒ . . .
⇒ . . .
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 25
Example
STACK INPUT
S John saw Mary with a telescope
VP NP John saw Mary with a telescope
VP N John saw Mary with a telescope
VP John John saw Mary with a telescope
VP saw Mary with a telescope
PP NP V saw Mary with a telescope
PP NP saw saw Mary with a telescope
PP NP Mary with a telescope
. . . . . .
. . . . . .
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 24
Top Down Parsing
� Use a stack to record working hypothesis
� Start with S as only symbol on stack
� At each step
◮ Rewrite top of stack T using grammar rule T → RHS
i.e. replace T by RHS (in reverse order), OR
◮ Match word on top of stack to next word in sentence
� Apply backtracking on failure
� Accept sentence when stack is empty and all words in sentence
matched; reject sentence when no rules to try
� Produces leftmost derivation
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 27
Example
STACK INPUT
John saw Mary with a telescope
John saw Mary with a telescope
N saw Mary with a telescope
NP saw Mary with a telescope
NP saw Mary with a telescope
NP V Mary with a telescope
NP V Mary with a telescope
NP V N with a telescope
. . . . . .
. . . . . .
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 26
Bottom Up Parsing
� Use a stack to record parsed (left-right) fragment
� Start with stack empty
� At each step
◮ Rewrite sequence at top of stack using rule T → RHS i.e. replace
RHS (in reverse) by T, OR
◮ Move word from input to stack
� Apply backtracking on failure
� Accept sentence when input empty and stack contains S; reject
sentence when no more rules to try
� Produces rightmost derivation (in reverse)
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 29
Fundamental Rule
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 28
Chart Parsing
� Use a chart to record parsed fragments and hypotheses
� Hypotheses N → α • β where N → αβ is a grammar rule means
“trying to parse N as αβ and have so far parsed α”
� One node in chart for each word gap, start and end
� One arc in chart for each hypothesis
� At each step, apply fundamental rule
◮ If chart has N → α•Bβ from n1 to n2 and B → γ• from n2 to n3
add N → αB•β from n1 to n3
� Accept sentence when S → α• is added from start to end
� Can produce any sort of derivation
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 31
Top Down Chart Parsing
� Start with S → •α from start node to start node for all rules S → α
where S is start symbol
� When adding N → α•Bγ from n1 to n2
◮ Also add B →•β from n2 to n2 for each rule B → β
◮ Including the special case when α is empty and n1 = n2
� Exercise: Trace top down chart parser on example
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 30
Example Chart
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 33
Compare and Contrast
� Top Down Parsing
◮ Simple, Memory efficient
◮ Much repeated work, may loop infinitely
� Bottom Up Parsing
◮ Less repeated work, harder to control
� Chart Parsing
◮ Memory inefficient (especially with features)
◮ No repeated work, difficult to control
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 32
Bottom Up Chart Parsing
� Start with arcs for each lexical item
� When adding C → α• from n1 to n2
◮ Also add B →•Cγ from n1 to n1 for each rule B → Cγ
� Exercise: Trace bottom up down chart parser on example
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 35
Heuristics
Minimal Attachment
� Minimize size of parse tree
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 34
Deterministic Parsing
� Motivation
◮ People don’t notice ambiguity . . .
◮ But sometimes have trouble
“The horse raced past the barn fell”
“We painted all the walls with cracks”
“The man kept the dog in the house”
� Can we do what the “human parser” does?
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 37
Heuristics
Lexical Preference
� Try to fill most common subcategorization frame
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 36
Heuristics
Right Association
� Always attach to rightmost (lower) nodes
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 39
Probabilistic Chart Parsing
� Start with probabilities calculated by part of speech tagger
� Multiply probabilities when applying fundamental rule
� Best-First Chart Parsing
◮ Examine most likely constituents first (priority queue)
• Various ways to estimate these probabilities!
◮ When adding A → αB•β, try to extend to A → αBβ1 •β2
◮ Never constructs constituents with lower probability than parse
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 38
Probabilistic Context Free Grammars
� Associate probabilities with grammar rules
◮ Requires parsed corpus (e.g. Penn Treebank)
◮ Count number of times rule used in parsing corpus sentences
� Probability of parse tree
◮ Πr probability rule r * Πw probability of word w given category
◮ Assuming independence (again)
UNSW ©W. Wobcke et al. 2019–2021
COMP9414 Grammars and Parsing 40
Summary
� Syntactic Knowledge
◮ Grammatical categories defined by distribution
◮ Much determined by properties of lexical items
� Context Free Grammars
◮ Useful and powerful formalism
◮ Relatively efficient parsers
◮ Limited when dealing with complex phenomena
� Parsing
◮ Top down method is easy to understand, but not efficient
◮ Bottom up method is more efficient
UNSW ©W. Wobcke et al. 2019–2021