Computational Linguistics
Computational
Linguistics
Copyright © 2017 Suzanne
Stevenson, Graeme Hirst and
Gerald Penn. All rights reserved.
2
2. Introduction to
syntax and parsing
Gerald Penn
Department of Computer Science, University of Toronto
CSC 2501 / 485
Fall 2018
Reading: Jurafsky & Martin: 5.0–1, 12.0–12.3.3, 12.3.7,
[13.1–2]. Bird et al: 8.0–4.
2
Syntactic structure 1
• Syntax:
• The combinatorial structure of words.
• How words can be linearly organized:
left/right precedence, and contiguity.
• How words can be hierarchically organized
into phrases and sentences.
3
Syntactic structure 2
The cat hunted the squirrel living in the tree with
persistence.
[ [The cat]
[hunted [the squirrel [living [in [the tree] ] ] ]
[with [persistence] ] ] ]
4
Syntactic structure 2
The cat hunted the squirrel living in the tree with
persistence.
hunted
the squirrel
The cat
in
the tree
with
persistence
living
5
Syntactic structure 2
The cat hunted the squirrel living in the tree with
persistence.
S
NP VP
V NP
NP
PP
P NP
hunted
the squirrel
The cat
in
the tree
PP
P NP
with
persistence
S
V
living N
DET N
DET N
DET N
6
Syntactic structure 3
• Goal: meaning, interpretation, semantics.
• So why do we care about syntax?
7
• Grammar:
• Formal specification of allowable structures.
• Knowledge
• Representation
• Parsing:
• Analysis of string of words to determine the
structure assigned by grammar.
• Algorithm
• Process
Grammars and parsing
8
• Main issues:
• Which words are grouped together into phrases.
• How words within a phrase project the properties
of a single, common word (the head of the
phrase).
• How different phrases relate to each other.
• Grammar encodes these relations. Some
grammars interpret these relations with
respect to meaning.
Using grammar to capture structure
Good and bad grammars
• There are many possible grammars for any
natural language.
• Some are better than others.
• Desiderata:
• Faithfulness to (vastly divergent) details about
language.
• Economy of description.
• Fidelity to some prevailing linguistic intuition.
• Efficiency of parsing.
9
X
NineX SevenX
Twenty-three
10
• Primitives: lexical categories or parts of
speech.
• Each word-type is a member of one or more.
• Each word-token is an instance of exactly one.
e.g. The cat in the hat sat.
• Categories are open or closed to new words.
• Eight main categories, many subcategories.
Elements of grammar
X SevenX
Twenty-three
11
• Primitives: lexical categories or parts of
speech.
• Each word-type is a member of one or more.
• Each word-token is an instance of exactly one.
• Categories are open or closed to new words.
• Eight main categories, many subcategories.
• The categories might possibly be language-
specific as well.
Elements of grammar
X
Nine
12
• Nouns: denote an object, a concept, a place,
…
• Count nouns: dog, spleen, Band-Aid, …
• Mass nouns: water, wheat, …
• Proper nouns: Fred, New York City, …
• Pronouns: he, she, you, I, they, …
• Adjectives: denote an attribute of the
denotation of a noun.
• Intersective: pink, furry, …
• Measure: big, …
• Intensional: former, alleged, …
Parts of speech 1
13
• Determiners, articles: specify certain
attributes of the denotation of a noun that are
grammatically relevant.
• the, a, some, …
• Verbs: predicates, denote an action or a
state. Numerous distinctions, e.g. transitivity:
• Intransitive: sleep, die, …
• Transitive: eat, kiss, …
• Ditransitive: give, sell, …
• Copula: be, feel, become, …
Parts of speech 2
14
• Adverbs: denote an attribute of the
denotation of a predicate.
• Time and place: today, there, now, …
• Manner: happily, furtively, …
• Degree: much, very, …
• Prepositions: relate two phrases with a
location, direction, manner, etc.
• up, at, with, in front of, before, …
Parts of speech 3
15
• Conjunctions: combine two clauses or
phrases:
• Coordinating conjunctions: and, or, but
• Subordinating conjunctions: because, while,…
• Interjections: stand-alone emotive
expressions:
• um, wow, oh dear, balderdash, crikey, …
Parts of speech 4
16
• Combinations:
• Phrase: a hierarchical grouping of words and/or
phrases.
• Clause: a phrase consisting of a verb and
(almost) all of its dependents.
• Sentence: a clause that is syntactically
independent of other clauses.
• Can be represented by tree (or a labelled
bracketing).
• Terminology: A constituent is a well-formed
phrase with overtones of semantic and/or
psychological significance.
Elements of grammar
17
• Noun phrase (NP):
• a mouse
• mice
• Mickey
• the handsome marmot
• the handsome marmot on the roof
• the handsome marmot whom I adore
• Verb phrase (VP):
• laughed loudly
• quickly gave the book to Mary
Types of phrase 1
18
• Adjective phrase (AP):
• green
• proud of Kyle
• very happy that you went
• Prepositional phrase (PP):
• in the sink
• without feathers
• astride the donkey
Types of phrase 2
19
• Clauses:
• Ross remarked upon Nadia’s dexterity
• to become a millionaire by the age of 30
• that her mother had lent her for the banquet
• Sentences:
• Ross remarked upon Nadia’s dexterity.
• Nathan wants to become a millionaire by the age
of 30.
• Nadia rode the donkey that her mother had lent
her for the banquet.
• The handsome marmot on the roof [in dialogue].
Clauses and sentences 1
20
• Clauses may act as noun phrases:
• To become a millionaire by the age of 30 is what
Ross wants.
• Nadia riding her donkey is a spectacular sight.
• Ross discovered that Nadia had been feeding his
truffles to the donkey.
Clauses and sentences 2
21
The structure of an idealized
phrase
XP → ZP X YP
XP
XZP YPsubject or
pre-modifier
object, complement or
post-modifier, adjunct
head
word
xxxx
22
Example phrases
very that you wenthappy
AP
AADV S
quickly with Mayago
VP
VADV PPPP
to the store
S
NP VPAUX
Kim gowill
23
• A context-free grammar is a quadruple
G = (Vt,Vn, P, S), where
• Vt is a finite set of terminal symbols.
• Vn is a finite set of non-terminal symbols.
• P is a finite set of production rules of the form
A → α
where A ∈ Vn and α is a sequence of symbols in
(Vn ∪ Vt)*.
• S ∈ Vn is the start symbol.
Formal definition of a CFG
24
Lexical
categories:
NT’s that rewrite
as a single T.
S = S,
A very simple grammar
S
NP
NP
NP
VP
VP
PP
Det
Adj
N
V
P
→ NP VP
→ Det N
→ Det Adj N
→ NP PP
→ V
→ V NP
→ P NP
→ the | a | an
→ old | red | happy | …
→ dog | park | statue | contumely | run | …
→ saw | ate | run | disdained | …
→ in | to | on | under | with | …
P = {
}
Vt and Vn can be
inferred from the
production rules.
The lexicon:
In practice, a sep-
arate data structure
25
• Non-terminal (NT):
A symbol that occurs on the left-hand side (LHS) of
some rule.
• Pre-terminal: a kind of non-terminal located on the
LHS of a lexical entry.
• Terminal (T):
A symbol that never occurs on the LHS of a rule.
• Start symbol:
A specially designated NT that must be the root of
any tree derived from the grammar.
In our grammars, it is usually S for sentence.
Terminology
27
• Parsing: Determining the structure of a
sequence of words, given a grammar.
• Which grammar rules should be used?
• To which symbols (words / terminals and nodes /
non-terminals) should each rule apply?
Parsing 1
28
• Input:
• A context-free grammar.
• A sequence of words
Time flies like an arrow
or, more precisely, of sets of parts of speech.
{noun,verb} {noun,verb} {verb,prep} {det} {noun}
• Process:
• Working from left to right, guess how each word
fits in.
Parsing 2
29
• If a guess leads to failure (parse is stymied),
back up to a choice point and try a different
guess.
• Backtracking, non-determinism.
• At each guess, must save state of parse on a
stack.
• (Or, explore in parallel.)
• Want to guess right:
• Order of preference for rules.
Parsing 3
31
• Top-down or rule-directed parsing:
“Can I take these rules and match them to this
input?”
• Initial goal is an S.
• Repeatedly look for rules that decompose /expand
current goals and give new goals.
E.g., goal of S may decompose to goals NP and VP.
• Eventually get to goals that look at input.
E.g., goal of NP may decompose to det noun.
• Succeed iff entire input stream is accounted for as S.
Top-down parsing 1
Top-down parsing 2
• Example: A recursive descent parser.
>>> nltk.app.rdparser()
• Operations on leftmost frontier node:
• Expand it.
• Match it to the next input word.
32
33
Top-down parsing 3
• Choice of next operation (in NLTK demo):
• If it’s a terminal, try matching it to input.
• If it’s a non-terminal, try expanding with first-listed
untried rule for that non-terminal.
34
Bottom-up parsing 1
35
• Bottom-up or data-directed parsing:
“Can I take this input and match it to these
rules?”
• Try to find rules that match a possible PoS of the
input words …
• … and then rules that match the constituents so
formed.
• Succeed iff the entire input is eventually matched
to an S.
Bottom-up parsing 2
• Example: A shift–reduce parser.
>>> nltk.app.srparser()
• Operations:
• Shift next input word onto stack.
• Match the top n elements of stack to RHS of rule,
reduce them to LHS.
36
37
Bottom-up parsing 3
• Choice of next operation (in NLTK demo):
• Always prefer reduction to shifting.
• Choose the first-listed reduction that applies.
• Choice of next operation (in real life):
• Always prefer reduction to shifting for words, but
not necessarily for larger constituents.
38
39
• Neither top-down nor bottom-up search
exploits useful idiosyncrasies that CFG rules,
alone or together, often have.
• Problems:
• Recomputation of constituents.
• Recomputation of common prefixes.
• Solution: Keep track of:
• Completed constituents.
• Partial matches of rules.
Problems