Syntax and Parsing 1: Context-Free Grammar
Strings, Languages and Grammars: The Basics
This time:
Strings, languages and grammars Constituents and phrase structure
Phrase structure trees
Context-Free Grammar (CFG)
CFGs and NLP
CFGs vs. finite state models of language CFG and natural languages
Data Science Group (Informatics) NLE/ANLP
Strings and Constituents
Intuitively, NL sentences have structure:
the cat on the mat caught a mouse
We can identify constituent phrases:
the cat
on the mat
the cat on the mat caught a mouse …
but not, for example:
the cat on mat caught a
Autumn 2015
1 / 25
We may consider a language to be a collection of strings:
the cat sat
the cat sat on the mat
the cat on the mat caught a mouse etc., etc.
A grammar is a way of describing a language:
Defines what strings belong to the language (and what not)
May provide useful structural information about strings There are many different approaches to grammar
— this lecture is mainly about context-free grammar
Data Science Group (Informatics) NLE/ANLP Autumn 2015 2 / 25
Types of Constituent Phrase
Constituent phrases come in different types. For example:
noun phrase/noun group (NP):
the cat
Felix
three cats with sharp claws and teeth they
verb phrase/verb group (VP):
caught a mouse
jumped
was chased by a dog
should have known better than to try to swim
Also: prepositional phrase (PP), adjectival phrase (AP), etc., etc.
Data Science Group (Informatics)
NLE/ANLP
Autumn 2015
3 / 25
Data Science Group (Informatics) NLE/ANLP Autumn 2015 4 / 25
Phrase Structure
Phrase Structure Trees
One way of indicating phrasal constituents is by bracketing:
[[ the cat [ on [ the mat ]]] [caught [a mouse]]] May also label constituents according to their type:
[S [NP the cat [PP on [NP the mat ]]] [VP caught [NP a mouse]]]
Can indicate PoS as well as phrasal information:
[S [NP [ART the] [N cat]] [VP [V caught] [NP [ART a] [N mouse]]]]
Alternatively, show structure as phrase structure diagram or parse tree:
S
NP
ART N V PP
the
Data Science Group (Informatics)
VP
cat sat P NP onART N
the
mat
Autumn 2015
Data Science Group (Informatics) NLE/ANLP
Phrase Structure Trees
What do they contain?
hierarchical information about constituents dominance information
what is each constituent made from?
temporal ordering information about constituents linear precedence information
what precedes what?
labelling information
Phrasal and PoS type information e.g. S, NP, ART, N. etc.
Autumn 2015
5 / 25
NLE/ANLP
6 / 25
What’s Phrase Structure Good For?
A step towards identifying meaning Structure provides clues about semantics
Who is doing what to whom?
Some NLP tasks may require phrase structure information:
Information extraction (finding entities and relations in document) Question answering
Opinion mining?
Other NLP tasks may not need phrase structure information: Document classification/retrieval
Data Science Group (Informatics) NLE/ANLP
Autumn 2015
7 / 25
Data Science Group (Informatics) NLE/ANLP Autumn 2015
8 / 25
Context-Free Grammar
Context-Free Grammar
A Context-Free Grammar (CFG) provides a means of associating a sentence with hierarchical phrase structure
A CFG G consists of a finite set of rules of the form: A→X1X2…Xk wherek ≥0
A is a non-terminal: e.g. S, N, NP, VP, ART
each Xi is either a non-terminal or a terminal (a word)
→ can be read: “has the immediate constituents” or “consists of”
A context-free grammar (CFG) can be thought of as:
1 A means of characterising a language
A given CFG G characterises a language (string set) L(G)
A language is context-free if it can be characterised by a CFG
2 A means of checking membership of a language Checking whether s ∈ L(G) is the recognition problem
3 A means of assigning phrase structure to a given string The process of assigning structure is known as parsing
Data Science Group (Informatics) NLE/ANLP
Autumn 2015
9 / 25
Data Science Group (Informatics) NLE/ANLP
Structure Assigned by a CFG
It will assign phrase structure to each of these sentences.
Autumn 2015
10 / 25
Example
Here’s an example of a simple CFG: 1) S→NPVP
2) NP→John 3) NP→Mary 4) NP→ARTN 5) ART→a
7) VP→VNP
8) VP→VNPPP 9) VP→VPP
10) V→sang 11) PP→PNP 12) P→to
NP
John
S
V
sang
VP
ART
a
6) N→song
This grammar will ‘generate’ simple sentences such as:
NP
John sang a song
John sang a song to Mary John sang to Mary
N
song
Data Science Group (Informatics) NLE/ANLP
Autumn 2015
11 / 25
Data Science Group (Informatics)
NLE/ANLP
Autumn 2015
12 / 25
Some Questions about CFGs
Comparing CFGs and FSAs
How do CFGs compare with other ways of characterising languages?
e.g. what about finite state models of language?
is one type of grammar more powerful than another?
How do we use a CFG G for recognition/generation? Can we decide whether a string belongs to L(G)?
Can we use G to generate strings of the language?
How do we use CFGs to parse?
i.e. how do we assign phrase structure to strings in L(G)?
Finite State Automata (FSAs)
Recognition is efficient
Time linear in the length n of the string
O(n)
But FSA formalism is not very expressive:
e.g. can’t deal with nested structures
structure that may be assigned to strings often inadequate for NLP
Data Science Group (Informatics) NLE/ANLP Autumn 2015 14 / 25
Descriptive Power of CFG
Any language describable with a FSA is describable with a CFG
There are languages that can be described with a CFG that cannot be described with a FSA
Data Science Group (Informatics) NLE/ANLP
Comparing CFGs and FSAs
Context-Free Grammar
Brute-force parsing strategies are not efficient – exponential time
We can do better using dynamic programming — CKY algorithm
— chart parsing
CFGs are more expressive
— can deal with nested structures
— constituent structure that is useful for NLP tasks
Autumn 2015
13 / 25
finite regular languages ⊂ languages ⊂
context- free languages
context- unrestricted ⊂ sensitive ⊂ languages
languages
Data Science Group (Informatics) NLE/ANLP
Autumn 2015
15 / 25
Data Science Group (Informatics)
NLE/ANLP
Autumn 2015 16 / 25
Requirements for Natural Language
CFGs and Natural Language
Perhaps of more direct interest:
There is general agreement that NLs are not regular languages i.e. cannot adequately be described with FSAs
Much of the syntax of the world’s NLs seems to be context-free i.e. can be adequately described with CFGs
However, in some NL applications FSA may still be preferred — benefits of CFG analysis may not justify ‘cost’
Are natural languages ≃ context-free languages?
Mostly, it would seem that the world’s languages are CFLs
But some NLs are arguably not context-free (e.g. Swiss German) Descriptive awkwardness:
Agreement Subcategorization
Relationships between different sentence types — active versus passive sentences
Relationships between arbitrarily distant phrases
Data Science Group (Informatics) NLE/ANLP
Agreement phenomena
Many natural languages require ‘agreement’:
Autumn 2015
17 / 25
Data Science Group (Informatics)
Subcategorization
Different verbs may require VP→V1
VP→V2 VP→V3 VP→V4 VP→V5 VP→V6 …
NLE/ANLP
Autumn 2015
18 / 25
NP
the cat the cats
VP
chase the dog(s)
singular plural
different kinds of following phrases: (die)
Can be encoded by a CFG, but at cost of redundancy:
NP NP NP NP PP NPS S
(love) (give) (put ) (tell ) (believe)
S → NP.s VP.s
S → NP.p VP.p NP.s → ART.s N.s NP.p → ART.p N.p ART.s → the ART.p → the
N.s → cat
N.p → cats
VP.s → V.s NP.s VP.s → V.s NP.p VP.p → V.p NP.s VP.p → V.p NP.p V.s → chases V.p → chase N.s → dog
N.p → dogs
If we split up verbs into sub-types there is no way to refer to verbs in general
Data Science Group (Informatics)
NLE/ANLP
Autumn 2015
19 / 25
Data Science Group (Informatics) NLE/ANLP Autumn 2015
20 / 25
chases the dog(s)
Related Sentence Types
Unbounded Dependency Constructions
Some kinds of sentences are related:
‘Wh’-questions
Who did Chris see ε?
Who did Robin say that Alex saw ε?
Who did Terry believe Jo said that Lee saw ε ? etc.
Interpretation of who depends on arbitrarily distant structure
Data Science Group (Informatics) NLE/ANLP Autumn 2015
Grammar Formalisms and Natural Language
Generative Power:
Can the formalism describe the language at all?
Notational expressivity:
Can the formalism capture the appropriate generalizations?
Computational effectiveness:
Does the formalism have a sensible, practical computational
interpretation?
CFGs score well on first and third point, but less well on the second
Kim will sing Sandy saw Pat
⇒ Will Kim sing?
⇒ Pat was seen by Sandy
No way to express this relatedness with CFGs
Data Science Group (Informatics) NLE/ANLP
Swiss German: Cross-Serial Dependencies
An example of a language phenomenon that is not context-free:
21 / 25
22 / 25
Autumn 2015
… mer d’chind em Hans lönd hälfe aastriiche … we the children Hans let help paint
… we let the children help Hans paint
NP NP
Crossing dependencies
NP V
Nested dependencies
V
V
Shieber, S. (1985). Evidence against the context-freeness of natural language. Linguistics and Philosophy 8, pp. 333-343.
Data Science Group (Informatics) NLE/ANLP Autumn 2015
23 / 25
Data Science Group (Informatics) NLE/ANLP Autumn 2015
24 / 25
Next Topic: Dependency Parsing
Syntactic analysis and dependencies Phrase structure vs. dependency structure Usefulness of dependency parsing Dependency parsing in practice
Data Science Group (Informatics) NLE/ANLP Autumn 2015 25 / 25