Syntax and Parsing 2: Dependency Parsing
Andrew D. Robertson Informatics Autumn 2015
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 1 / 35
Lecture Contents
1 Syntactic Analysis with Dependency Parsing
2 Phrase Structure versus Dependency Structure
Usefulness of Dependency Parsing
3 More Dependency Structure Details Challenges in Representation Projectivity
4 Transition-based Dependency Parsing
5 Evaluation
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 2 / 35
Syntactic Analysis
Determining the syntactic structure of a sentence
Dependency parsing: an approach to automatic syntactic analysis
Style of analysis is inspired by what’s known as “dependency grammar”
Phrase Structure
NP
JJ NNS VBZ JJ NN PU
Hungry badgers like strawberry jam .
S
VP
NP
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 3 / 35
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 4 / 35
Dependency Structure
root
amod
nsubj
badgers
like
punct
dobj amod
strawberry jam .
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 5 / 35
Dependency Structure Explained
Relation function
Head Dependant
Describes sentences in terms of binary, asymmetric, labelled relations called dependencies (/arcs/relations).
Binary: relations occurs between two words only Asymmentric: directionality in the relation, head to dependant Dependencies labelled with the function of the relation
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 6 / 35
Dependency Structure Explained (cont.)
Notion of “dependant” varies according to function: argument
modifier
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 7 / 35
Head-dependant rules
Predicate-Argument structure:
Modifiers:
nsubj dobj
badgers like jam like(badgers,jam)
mod
angry hungry smelly badgers
mod
mod
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 8 / 35
Explicit Encoding
Dependency structures encode explicitly: head-dependant relations: the arcs between words function types: the labels on the arcs
Phrase structures encode explicitly:
phrases: the nonterminal nodes
types of structure: the nonterminal labels
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 9 / 35
Conversion
Difference in explicit encoding, but describing same underlying idea
Each head represents a phrase composed of itself and its dependants
Functional relations in phrase structures identified by structural rules
Conversion remains non-trivial
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 10 / 35
Phrase Structure
NP
JJ NNS VBZ JJ NN PU
Hungry badgers like strawberry jam .
S
VP
NP
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 11 / 35
Dependency Structure
root
amod
nsubj
badgers
like
punct
dobj amod
strawberry jam .
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 12 / 35
Usefulness of Dependency Parsing
Learn how the elements within a sentence interact Examples of usefulness within:
Opinion mining Machine Translation Information Extraction
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 13 / 35
Opinion Mining
Given a set of DVD reviews, find reviewer opinions about particular DVD topics, such as “plot”, “characters”, “dialogue” and “cinematography”.
Example sentence: “The plot was awful but there were certainly some interesting characters.”
Dependency analysis shows that “awful” is a description of the plot, and that “interesting” is only relevant to the “characters”.
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 14 / 35
Machine Translation
La melon mangˆis la viro
Direct word substitution: the badger ate the man
The “n” in “melon” is a clue for a dependency parser that the sentence is differently structured
In fact: the man ate the badger
dobj nsubj La melon manĝis la viro
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 15 / 35
Information Extraction
Automatically extract structured information from unstructured data
Given sentences annotated with entities (i.e. after applying Named Entity Recognition)
Populate database of knowledge from news texts
In a document, 2 companies mentioned, and the word “bought” Does “bought” relate to the companies?
Which did the buying?
So we can assert “bought(company1,company2)”
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 16 / 35
Grammar-based versus Data-Driven
Grammar-based: given rule set, determine if sentence is well-formed according to the grammar
Data-driven:
Train classifier to assign most plausible structure to sentence Trained on sentences annotated with correct dependencies
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 17 / 35
More Dependency Structure Details
Sentences of tokens (pre-tokenized), all with unique IDs Tokens can be morphemes, whole words, punctuation “Mal-san-ul-ejo” = “hospital”
“en-iri” = “to enter”
Each token has a single head
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 18 / 35
Projectivity
wn-m … wn wn+1 wn+2 wn+3 wn wn+1 wn+2 wn+3 Can restrict class of well-formed trees to only those which are
projective
An arc (relation) is projective if there is a directed path from the head word to all words between the endpoints of the arc
A dependency tree is projective if all of its arcs are projective
Majority of English sentences are projective, not the case with freer word order languages
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 19 / 35
Text File Representation
ID Form
1 Hungry
2 badgers
3 like
4 strawberry
5jam
6.
PoS Head DepRel
JJ 2 amod NNS 3 nsubj VBZ 0 root JJ 5 amod NN3dobj PU3punct
root
amod
nsubj
badgers
like
punct
dobj amod
strawberry jam .
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 20 / 35
Interactive Parser
Copy parser output into the “Plain” panel and press SHIFT+ENTER to
visualise the tree. See upcoming lab session for instructions on running
the parser.
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 21 / 35
Two Main Styles of Dependency Parsing
Graph-based (for details, see further reading)
Transition-based
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 22 / 35
Transition-based Parsing
Based on transitions of an abstract machine
Greedily transition between states choosing most probable transition (errors could propagate) given what it sees in the sentence
Committed to transitions, so can look at history
Can create rich feature representations based on current state too
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 23 / 35
Transition-based Algorithm (setup)
([root]stack, [squirrels,enjoy,tea, .]buffer, {}arcset) NNS VBP NN .
Transitions:
Shift: remove word from buffer, put on stack
precondition: buffer non-empty
Left-arc: add arc from buffer item to stack item preconditions: buffer & stack non-empty, stack item not root aftereffect: pop the stack
Right-arc: add arc from stack item to buffer item preconditions: buffer & stack non-empty)
aftereffect: pop stack item, replace buffer head with it
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 24 / 35
Transition-based Algorithm (setup cont.)
([root]stack, [squirrels,enjoy,tea, .]buffer, {}arcset) NNS VBP NN .
Train a classifier (e.g. support vector machine) to look at feature vector and pick the next transition
Represent parser state with a vector of features “buffer[0]:POS:NNS”=4, “buffer[1]:FORM:enjoy=10” feature_vector = {4:1,10:1}
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 25 / 35
Transition-based Algorithm (step 1)
([root]stack, [squirrels,enjoy,tea, .]buffer, {}arcset) NNS VBP NN .
Precondition for left-arc not met
Between right-arc and shift, most probable is shift (root to NNS unlikely)
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 26 / 35
Transition-based Algorithm (step 2)
([root,squirrels]stack, [enjoy,tea, .]buffer, {}arcset) NNS VBP NN .
All transitions possible
If relation between stack head and buffer unlikely, then shift
Most likely that the NNS is an argument to the VBP, so a dependant. Therefore left-arc
Position information and lack of auxiliary verb makes “nsubj” most likely.
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 27 / 35
Transition-based Algorithm (step 3)
nsubj
A=
([root]stack, [enjoy,tea, .]buffer, Aarcset) VBP NN .
“squirrels” popped from the stack left-arc preconditions not met (root)
Shift or right-arc? If we right-arc now, “enjoy” is discarded! And it looks like it has more dependants.
So shift!
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 28 / 35
Transition-based Algorithm (step 4)
([root,enjoy]stack, [tea, .]buffer, Aarcset) VBP NN .
The NN after this VBP is likely to be another argument (dependant), so right-arc
Position information suggests “dobj”
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 29 / 35
Transition-based Algorithm (step 5)
nsubj dobj
A=
([root]stack, [enjoy, .]buffer, Aarcset) VBP
“enjoy” replaces “tea” on the buffer head Left-arc preconditions not met (root)
Again, right-arc would discard “enjoy” which is likely to have another dependant
So shift!
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 30 / 35
Transition-based Algorithm (step 6)
([root, enjoy ]stack , [ . ]buffer , Aarcset ) VBP
Punctuation is unlikely to be the head of a verb, so no left-arc If we shift, we reach terminal state with several items on stack A right-arc is most suitable
“punct” relation for punctuation
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 31 / 35
Transition-based Algorithm (step 7)
punct nsubj dobj
A=
([root]stack, [enjoy]buffer, Aarcset) VBP
“enjoy” replaces “.” on the buffer head Left-arc precondition not met (root) Shift reaches terminal state
Right-arc likely between root and VBP, especially if last on buffer, and root has no dependants
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 32 / 35
Transition-based Algorithm (step 8)
punct nsubj dobj
A=
root
([]stack, [root]buffer, Aarcset)
“root” replaces “enjoy” on the buffer head
Stack empty, so no arc possible
So shift! This produces an empty buffer, and we’re done!
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 33 / 35
Algorithm Notes
Linear time complexity: O(n) Creates projective dependency trees
Requires further modification to deal with non-projectivity (complexity hit)
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 34 / 35
Evaluation
Treebank of data marked up with dependencies
Exact match
Unlabelled attachment score and labelled attachment score
Precision: percentage correct of particular dependency type in parser output
Recall: percentage identified of particular dependency type in parser output
F-measure: combination of precision and recall.
Andrew D. Robertson (Informatics)
NLE/ANLP
Autumn 2015 35 / 35