Computational Linguistics

Copyright © 2017 Suzanne Stevenson, Graeme Hirst and Gerald Penn. All rights reserved.

4. Extending grammars with features

Gerald Penn

Department of Computer Science, University of Toronto

CSC 2501 / 485

Fall 2018

Reading: Jurafsky & Martin: 12.3.4–6, 15.0–3; [Allen: 4.1–5]; Bird et al.: 9.

• Problem: Agreement phenomena.

Nadia {washes/*wash} the dog.

The boys {*washes/wash} the dog.

You {*washes/wash} the dog.

• Morphological inflection of verb must
match subject noun in person and number.

Agreement and inflection

Subject–verb agreement 1

Present tense

Person   Singular            Plural
1        I wash              we wash
2        you wash            you wash
3        he/she/it washes    they wash

1        I am                we are
2        you are             you are
3        he/she/it is        they are

Subject–verb agreement 2

Past tense

Person   Singular            Plural
1        I washed            we washed
2        you washed          you washed
3        he/she/it washed    they washed

1        I was               we were
2        you were            you were
3        he/she/it was       they were

Agreement features 1

• English agreement rules are fairly simple.

• Subject and verb agree w.r.t. person and number.

• No agreement required between verb and object.

• Many languages have other agreements.

• E.g., German: Article and adjective ending
depends on noun gender and case:

Agreement features 2

Nominative Case (Subject Case)

Definite article:
Gender/Number   Article   Example               Gloss
Masculine       der       der neue Wagen        the new car
Feminine        die       die schöne Stadt      the beautiful city
Neuter          das       das alte Auto         the old car
Plural          die       die neuen Bücher      the new books

Indefinite article:
Gender/Number   Article   Example               Gloss
Masculine       ein       ein neuer Wagen       a new car
Feminine        eine      eine schöne Stadt     a beautiful city
Neuter          ein       ein altes Auto        an old car
Plural          keine     keine neuen Bücher    no new books

Ask about.com: German language: Adjective endings I and II. http://german.about.com/library/weekly/aa030298.htm and aa033098.htm

Agreement features 2

Accusative Case (Direct Object)

Definite article:
Gender/Number   Article   Example               Gloss
Masculine       den       den neuen Wagen       the new car
Feminine        die       die schöne Stadt      the beautiful city
Neuter          das       das alte Auto         the old car
Plural          die       die neuen Bücher      the new books

Indefinite article:
Gender/Number   Article   Example               Gloss
Masculine       einen     einen neuen Wagen     a new car
Feminine        eine      eine schöne Stadt     a beautiful city
Neuter          ein       ein altes Auto        an old car
Plural          keine     keine neuen Bücher    no new books

Agreement features 3

E.g., Chinese: Numeral classifiers, often based on
shape, aggregation, …:

两条鱼 liang tiao yu ‘two CLASSIF-LONG-ROPELIKE fish’
两条河 liang tiao he ‘two CLASSIF-LONG-ROPELIKE rivers’
两条腿 liang tiao tui ‘two CLASSIF-LONG-ROPELIKE legs’
两条裤子 liang tiao kuzi ‘two CLASSIF-LONG-ROPELIKE pants’
两只胳膊 liang zhi gebo ‘two CLASSIF-GENERAL arms’
两件上衣 liang jian shangyi ‘two CLASSIF-CLOTHES-ABOVE-WAIST tops’
两套西装 liang tao xizhuang ‘two CLASSIF-SET suits’

Zhang, Hong (2007). Numeral classifiers in Mandarin Chinese. Journal of East Asian Linguistics, 16(1), 43–59. Thanks also to Tong Wang, Vanessa Wei Feng, and Helena Hong Gao.

Agreement features 1

• English agreement rules are fairly simple.

• Many languages have other agreements.

• Some languages have multiple grammatical
genders.

• E.g. Chichewa has genders for men, women,
bridges, houses, diminutives, men inside houses,
etc. Between 12 and 18 in total.

• Some languages overtly realize many of
these distinctions.

• E.g. some Hungarian verbs have as many as 4096
inflected forms.

Inflectional morphology
• Word may be inflected …

• … to indicate paradigmatic properties, e.g.
singular / plural, past / present, …

• … to indicate some (other) semantic properties

• … to agree with inflection of other words.

• Each (open-class) word-type has a base
form / stem / lemma.

• Each occurrence of a word includes inflection
by a (possibly null) morphological change.

• Problem: How to account for this in grammar.

• Possible solution: Replace all NPs, Vs, and
VPs throughout the grammar.

Original grammar:

S → NP VP
NP → you, dog, dogs, bear, bears, …
VP → V NP
V → washes, wash, washed, is, was, …

Grammar with agreement encoded in the category names:

S → NP3s VP3s
S → NP3p VP3p
S → NP2 VP2
S → NP1s VP1s
S → NP1p VP1p

NP3s → dog, bear, …
NP3p → dogs, bears
NP2 → you

VP3s → V3s NP

V3s → is, was, washes, washed, …
V3p → are, were, wash, washed, …
V1s → am, was, wash, washed, …

Rule proliferation 1

• Drawback 1: the result is big … really big.

• Drawback 2: Losing the generalization:

• All these Ss, NPs, VPs have the same structure.

• Doesn’t depend on particular verb, noun, and
number.

• CF rules collapse together structural and
featural information.

• All information must be completely and
directly specified.

• E.g., can’t just say that values must be equal for
some feature without saying exactly what values.
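
As a rough illustration of the blow-up, the sketch below copies a rule once per agreement value. The names AGR_VALUES and expand_rule are hypothetical, and the sketch assumes six person/number values, whereas the grammar above merges second person into a single NP2.

```python
from itertools import product

# Assumed agreement values: persons 1-3 crossed with singular/plural.
AGR_VALUES = [p + n for p, n in product("123", "sp")]


def expand_rule(lhs, rhs, agreeing):
    """Copy a rule once per agreement value, suffixing the agreeing categories."""
    rules = []
    for agr in AGR_VALUES:
        def tag(cat):
            # Only categories that take part in agreement get the suffix.
            return cat + agr if cat in agreeing else cat
        rules.append((tag(lhs), [tag(c) for c in rhs]))
    return rules


s_rules = expand_rule("S", ["NP", "VP"], agreeing={"NP", "VP"})
vp_rules = expand_rule("VP", ["V", "NP"], agreeing={"VP", "V"})

for lhs, rhs in s_rules + vp_rules:
    print(lhs, "->", " ".join(rhs))
print(len(s_rules) + len(vp_rules), "rules replace the original 2")
```

Every further agreeing feature (gender, case, and so on) multiplies the count again, which is the generalization that plain CF rules cannot state.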

Rule proliferation 2

• Solution: Separate feature information from
syntactic, structural, and lexical information.

• A feature structure is a list of pairs:
[feature-name feature-value]

• Feature-values may be atoms or feature
structures.

• Can consider syntactic category or word to be
bundle of features too.

• Can represent syntactic structure.

Feature structures 1

Feature structures 2

[Cat N, Num s, Pers 3, Lex dog]

[Cat N, Agr [Num s, Pers 3], Lex dog]

Feature paths: features of features; e.g., (Agr Pers 3)

Other notations for the same information:

N [Num s, Pers 3, Lex dog]

dog [Cat N, Num s, Pers 3]

N/dog [Num s, Pers 3]

• Drawback: many equivalent notations.
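
One possible concrete encoding, offered only as a sketch: a feature structure as a nested Python dict, with a feature path followed by successive lookups. The helper name get_path is hypothetical.

```python
# The entry for "dog" with agreement features grouped under Agr.
dog = {"Cat": "N", "Agr": {"Num": "s", "Pers": 3}, "Lex": "dog"}


def get_path(fs, path):
    """Follow a feature path such as ("Agr", "Pers"); return None if undefined."""
    for feature in path:
        if not isinstance(fs, dict) or feature not in fs:
            return None
        fs = fs[feature]
    return fs


print(get_path(dog, ("Agr", "Pers")))
print(get_path(dog, ("Agr", "Gndr")))
```

Atomic values are plain strings or numbers here, and nested feature structures are nested dicts, which mirrors the statement that feature values may be atoms or feature structures.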

Feature structures 3

[Cat Det, Num s, Pers 3, Lex a]

[Cat N, Num s, Pers 3, Lex dog]

NP formed from Det and N.
Feature values in components become feature names in new constituent.

[Cat NP,
 Num s,
 Det [Num s, Pers 3, Lex a],
 N [Num s, Pers 3, Lex dog]]

Components of feature use

• 1. Lexical specification:

Description of properties of a word:
morphological, syntactic, semantic, …

dog: [Cat N, Agr 3s]
dogs: [Cat N, Agr 3p]
sleeps: [Cat V, Agr 3s]
sleep: [Cat V, Agr {1s,2s,1p,2p,3p}]

Or:

N → dog
(N Agr) = 3s

N → dogs
(N Agr) = 3p

V → sleeps
(V Agr) = 3s

V → sleep
(V Agr) = {1s,2s,1p,2p,3p}

• 2. Agreement:

• Constraints on co-occurrence in a rule — within
or across phrases.

• Typically are equational constraints.

Components of feature use

NP → Det N

(Det Num) = (N Num)

S → NP VP

(NP Agr) = (VP Agr)

• 3. Projection:

• Sharing of features between the head of a
phrase and the phrase itself.

• Head features:

• Agr is typical, but so is the head-word itself as
a feature.
(Common enough that there’s usually a mechanism for “declaring” head
features and omitting them from rules.)

Components of feature use

VP → V . . .

(VP Agr) = (V Agr)

• What does it mean for two features to be
“equal”?

• A copy of the value or feature structure, or
a pointer to the same value or feature structure
(re-entrancy, shared feature paths).

Constraints on feature values 1

[Cat N, Agr ➀ [Num s, Pers 3], Lex sky]

[Cat N, Agr ➀, Lex dog]

Copy vs. pointer: the tag ➀ marks a value shared by both structures.
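
The difference between a copy and a pointer can be sketched with Python dicts, where sharing one dict object plays the role of the tag ➀. The added Gndr value is a hypothetical piece of later information, not something from the slide.

```python
import copy

agr = {"Num": "s", "Pers": 3}
sky = {"Cat": "N", "Agr": agr, "Lex": "sky"}

# Copy: dog gets its own Agr structure with the same contents.
dog_copy = {"Cat": "N", "Agr": copy.deepcopy(agr), "Lex": "dog"}

# Pointer (re-entrancy): dog's Agr is the very same object as sky's Agr.
dog_shared = {"Cat": "N", "Agr": agr, "Lex": "dog"}

# Information added later to sky's Agr (hypothetical feature) ...
sky["Agr"]["Gndr"] = "N"

# ... is visible through the pointer but not through the copy.
print("Gndr" in dog_copy["Agr"])
print("Gndr" in dog_shared["Agr"])
```

With a pointer, anything learned about one occurrence of the value is automatically true of the other, which is what an equational constraint in a rule is meant to guarantee.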

Constraints on feature values 2

• But: It may be sufficient that two features are
not equal, just compatible: that they can be
unified.

• E.g., [Cat N, Pers 3, Num s] and [Cat N, Pers 3, Gndr F]

• Feature structure X subsumes feature structure
Y if Y is consistent with, and at least as specific
as X.

• Also say that Y extends X.
Y can add (non-contradictory) features to those
in X.

• Definition: X subsumes Y (X ⊑ Y) iff there is a
simulation of X inside Y, i.e., a function sim such that:

• sim(X) = Y

• If X is atomic, so is Y, and X = Y

• Otherwise, for every feature f defined in X: Y.f is defined,
and sim simulates X.f inside Y.f.
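
A minimal sketch of this definition over nested Python dicts follows. It is an assumption-laden simplification: atoms are plain values, feature structures are dicts, and re-entrancy (shared values) is ignored, so it checks only the feature-by-feature condition. The function name subsumes is hypothetical.

```python
def subsumes(x, y):
    """Return True if x subsumes y (x ⊑ y), ignoring re-entrancy."""
    if not isinstance(x, dict):
        # x is atomic: y must be the same atom.
        return not isinstance(y, dict) and x == y
    if not isinstance(y, dict):
        return False
    # Every feature of x must be defined in y with a value that x's value subsumes.
    return all(f in y and subsumes(v, y[f]) for f, v in x.items())


general = {"Cat": "N", "Pers": 3}
with_gender = {"Cat": "N", "Pers": 3, "Gndr": "F"}
with_number = {"Cat": "N", "Pers": 3, "Num": "s"}

print(subsumes(general, with_gender))
print(subsumes(with_number, with_gender))
```

The empty structure subsumes everything under this sketch, since it places no requirements on the other structure.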

Subsumption of feature structures 1

Subsumption of feature structures 2

• Examples:

[Cat N, Pers 3] ⊑ [Cat N, Pers 3, Gndr F]

but

[Cat N, Pers 3, Num s] ⋢ [Cat N, Pers 3, Gndr F]

[Cat VP, Agr ➀, Subj [Agr ➀]] ⊑ [Cat VP, Agr ➀, Subj [Agr ➀ [Pers 3, Num s]]]

Third example from Jurafsky & Martin, p. 496

• The unification of X and Y (X ⨆ Y) is the most
general feature structure Z that is subsumed
by both X and Y.

• Z is the smallest feature structure that extends
both X and Y.

• Unification is a constructive operation.

• If any feature values in X and Y are incompatible,
it fails.

• Else it produces a feature structure that includes
all the features in X and all the features in Y.

Unification 1

Unification 2

[Cat N, Pers 3, Num s] ⨆ [Cat N, Pers 3, Gndr F] = [Cat N, Pers 3, Num s, Gndr F]
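
Unification over nested Python dicts can be sketched as below. This is a simplified, non-destructive version under stated assumptions: atoms are plain values, failure is signalled by None, and re-entrancy is not modelled. The function name unify is hypothetical.

```python
def unify(x, y):
    """Return the unification of x and y as a new structure, or None on failure."""
    if isinstance(x, dict) and isinstance(y, dict):
        result = dict(x)                      # start from the features of x
        for feature, value in y.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:            # incompatible values: fail
                    return None
                result[feature] = merged
            else:
                result[feature] = value       # feature only in y: just add it
        return result
    # Atoms (or an atom against a structure) unify only if they are equal.
    return x if x == y else None


a = {"Cat": "N", "Pers": 3, "Num": "s"}
b = {"Cat": "N", "Pers": 3, "Gndr": "F"}
print(unify(a, b))
print(unify({"Num": "s"}, {"Num": "p"}))
```

The result contains every feature of both inputs and nothing else, so it is the smallest structure extending both.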

• Each constituent has an associated feature
structure.

• Constituents with children have a feature structure
for each child.

• Arc addition:

• The feature structure of the new arc is initialized
with all known constraints.

• Arc extension:

• The feature structure of the predicted constituent
must unify with that of the completed constituent
extending the arc.

Features in chart parsing

Sample grammar fragment

S → NP VP
(NP Agr) = (VP Agr)

NP → Det N
(NP Agr) = (N Agr)
(Det Agr) = (N Agr)

VP → V
(VP Agr) = (V Agr)

Det → a       [Agr 3s]
Det → all     [Agr 3p]
Det → the     [Agr {3s,3p}]

N → dog       [Agr 3s]
N → dogs      [Agr 3p]

V → sleep     [Agr ^3s]
V → sleeps    [Agr 3s]

(^3s: any agreement value except 3s.)

Mismatched features fail

a dog sleep

Det [Agr 3s]    N [Agr 3s]    V [Agr ^3s]

NP [Agr ①, Det [Agr ①], N [Agr ① 3s]]

VP [Agr ②, V [Agr ② ^3s]]

S: [Agr ①] ⨆ [Agr ②]    FAIL

Unifiable features succeed

a dog sleeps

Det [Agr 3s]    N [Agr 3s]    V [Agr 3s]

NP [Agr ①, Det [Agr ①], N [Agr ① 3s]]

VP [Agr ②, V [Agr ② 3s]]

S: [Agr ①] ⨆ [Agr ②]    SUCCEED
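
The two outcomes can be reproduced with a small sketch that treats each Agr value as a set of possibilities and unification as set intersection. This is not a chart parser: it hard-codes the three rules of the sample fragment for a Det N V sentence, and it assumes that ^3s means every value except 3s. The names LEXICON, unify_agr and parse_s are hypothetical.

```python
ALL_AGR = {"1s", "2s", "3s", "1p", "2p", "3p"}

LEXICON = {
    "a":      ("Det", {"3s"}),
    "all":    ("Det", {"3p"}),
    "the":    ("Det", {"3s", "3p"}),
    "dog":    ("N", {"3s"}),
    "dogs":   ("N", {"3p"}),
    "sleeps": ("V", {"3s"}),
    "sleep":  ("V", ALL_AGR - {"3s"}),   # assumed reading of ^3s
}


def unify_agr(a, b):
    """Unify two set-valued Agr features; an empty intersection means failure."""
    common = a & b
    return common or None


def parse_s(det, noun, verb):
    """Apply NP -> Det N, VP -> V, S -> NP VP; return S's Agr or None on failure."""
    (dcat, dagr), (ncat, nagr), (vcat, vagr) = LEXICON[det], LEXICON[noun], LEXICON[verb]
    if (dcat, ncat, vcat) != ("Det", "N", "V"):
        return None
    np_agr = unify_agr(dagr, nagr)       # (Det Agr) = (N Agr), (NP Agr) = (N Agr)
    if np_agr is None:
        return None
    return unify_agr(np_agr, vagr)       # (NP Agr) = (VP Agr), (VP Agr) = (V Agr)


print(parse_s("a", "dog", "sleep"))
print(parse_s("a", "dog", "sleeps"))
print(parse_s("the", "dogs", "sleep"))
```

Note how the underspecified determiner "the" is compatible with either noun, and the noun then narrows the NP's Agr to a single value.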

• Distinguishes structure from “functional” info.

• Allows for economy of specification:

• Equations in rules:
S → NP VP

(NP Agr) = (VP Agr)

• Sets of values in lexicon:
N → fish

(N Agr) = {3s, 3p}

• Allows for indirect specification and transfer of
information, e.g., head features.

Advantages of this approach

Must unify with

Features and the lexicon
• Lexicon may contain each inflected form.

• Feature values and base form listed.

• Lexicon may contain only base forms.

• Process of morphological analysis maps inflected
form to base form plus feature values.

• Time–space trade-off, varies by language.

• Lexicon may contain semantics for each
form.

Morphological analysis
• Morphological analysis is simple in English.

• Reverse the rules for inflections, including spelling
changes.

• Irregular forms will always have to be explicitly
listed in lexicon.

dogs → dog [Agr 3p]
dog → dog [Agr 3s]
berries → berry [Agr 3p]
buses → bus [Agr 3p]
eats → eat [Agr 3s, Tns pres]
ripped → rip [Tns past]
tarried → tarry [Tns past]
running → run [Tns pp]

children → child [Agr 3p]
sang → sing [Tns past]
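
The idea of reversing inflection rules, with irregular forms listed explicitly, can be sketched as follows. The stem lexicon STEMS, the rule table RULES and the helper names are hypothetical and cover only the example words; a real analyser would need far more rules and a way to resolve ambiguity.

```python
# Irregular forms must be listed explicitly.
IRREGULAR = {
    "children": ("child", {"Agr": "3p"}),
    "sang": ("sing", {"Tns": "past"}),
}

# A tiny, assumed lexicon of base forms per category.
STEMS = {"N": {"dog", "berry", "bus"}, "V": {"eat", "rip", "tarry", "run"}}

# (category, suffix to strip, replacement, features), tried in order.
RULES = [
    ("N", "ies", "y", {"Agr": "3p"}),
    ("N", "es", "", {"Agr": "3p"}),
    ("N", "s", "", {"Agr": "3p"}),
    ("N", "", "", {"Agr": "3s"}),
    ("V", "s", "", {"Agr": "3s", "Tns": "pres"}),
    ("V", "ied", "y", {"Tns": "past"}),
    ("V", "ed", "", {"Tns": "past"}),
    ("V", "ing", "", {"Tns": "pp"}),
]


def undouble(stem):
    """Undo consonant doubling, e.g. ripp -> rip, runn -> run."""
    if len(stem) >= 3 and stem[-1] == stem[-2]:
        return stem[:-1]
    return stem


def analyse(word):
    """Return a list of (base form, features) analyses for an inflected word."""
    if word in IRREGULAR:
        return [IRREGULAR[word]]
    analyses = []
    for cat, suffix, replacement, feats in RULES:
        if not word.endswith(suffix):
            continue
        base = word[:len(word) - len(suffix)] + replacement
        for candidate in (base, undouble(base)):
            analysis = (candidate, feats)
            if candidate in STEMS[cat] and analysis not in analyses:
                analyses.append(analysis)
    return analyses


for w in ["dogs", "berries", "buses", "eats", "ripped", "tarried", "running", "sang"]:
    print(w, "->", analyse(w))
```

Checking candidates against a stem lexicon is one simple way to keep over-eager suffix stripping (for example "buses" to "buse") from producing spurious analyses.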

Morphology in other languages

• Rules may be more complex in other (even
European) languages.

• Languages with compounding (e.g., German)
or agglutination (e.g., Finnish) require
more-sophisticated methods.

• E.g., Verdauungsspaziergang, a stroll that one
takes after a meal to assist in digestion.

Semantics as a lexical feature

• Add a Sem feature:

[Cat N, Num s, Pers 3, Lex dog, Sem dog]

• The meaning of dog is dog.
The meanings of chien and Hund are both dog.
The meaning of dog is G52790.

Typewriter font is used for semantic objects.

Goal of parsing

• A representation of properties relevant to
meaning and interpretation:

• Things: entities (e.g., in a knowledge base).

• Predicates (events).

• Roles: relations between things and predicates.

• Syntactic structure helps in:

• Determining things and predicates.

• Determining mapping of things to roles of
predicates.

Example

The goalie kicked the ball.

Event: kicked

Thing: the goalie    Role: Agent (doer)
Thing: the ball      Role: Theme (thing affected)

kick (agent=goalie, theme=ball)


• Mapping from structure to objects of
interpretation
• Things: NPs, Ss

• Predicates: verbs, preps, APs

• Roles: ?? (thematic roles)

• What are the roles in these examples?

Sara left.
Joan found the treasure in the garage.
Ken put the ball in the garage.
Tim cut the wire with a pair of scissors.
Melissa visited Ottawa with Nadia.
Andrew felt like a failure.

Syntax ↔ interpretation

Grammatical function vs. thematic roles

• Mapping is more or less regular:

Subject ≈ Agent / Experiencer
Object ≈ Theme
Object of preposition ≈ Goal / Location / Recipient / Instrument

• This mapping is used to determine
appropriate semantic representation.

Verb subcategorization 1
• Problem: Constraints on verbs and their complements.

Nadia told / instructed / *said / *informed Ross to sit down.
Nadia *told / *instructed / said / *informed to sit down.
Nadia told / *instructed / *said / informed Ross of the requirement to sit down.

Nadia gave / donated her painting to the museum.
Nadia gave / *donated the museum her painting.

Nadia put / ate the cake in the kitchen.
Nadia *put / ate the cake.

Verb subcategorization 2
• VPs are much more complex than just V with optional NP and/or PP.

• Can include more than one NP.

• Can include clauses of various types:
that Ross fed the marmoset
to pay him the money

• Subcat: A feature on a verb indicating the
kinds of verb phrase it allows:
_np, _np_np, _inf, _np_inf, …
(Written this way to distinguish them from constituents.)
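
A Subcat feature can be sketched as a lexicon that maps each verb to the set of complement frames it allows. The frame sets below are read off the grammaticality judgements in the subcategorization examples; the frame name _np_pp and the helper names are assumptions added for illustration, and locative modifiers such as "in the kitchen" after "ate the cake" are treated as adjuncts rather than complements.

```python
# Verb -> allowed complement frames (assumed from the example judgements).
SUBCAT = {
    "told": {"_np_inf", "_np_pp"},
    "instructed": {"_np_inf"},
    "said": {"_inf"},
    "informed": {"_np_pp"},
    "gave": {"_np_pp", "_np_np"},
    "donated": {"_np_pp"},
    "put": {"_np_pp"},
    "ate": {"_np"},
}


def frame(complements):
    """Turn a list of complement categories into a frame name, e.g. _np_inf."""
    return "_" + "_".join(c.lower() for c in complements)


def licensed(verb, complements):
    """True if the verb's Subcat feature allows this sequence of complements."""
    return frame(complements) in SUBCAT.get(verb, set())


print(licensed("told", ["NP", "INF"]))     # Nadia told Ross to sit down.
print(licensed("said", ["NP", "INF"]))     # *Nadia said Ross to sit down.
print(licensed("donated", ["NP", "NP"]))   # *Nadia donated the museum her painting.
```

In a feature grammar the same check falls out of a constraint such as (V Subcat) = _np on the VP rule, rather than being coded as a separate function.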

• Tense and aspect markings on verb:

• Locate the event in time (relative to another time).

• Mark the event as complete/finished or in progress.

Nadia rides the horse. — In progress now.

Nadia rode the horse. — Completed before now.

Nadia had ridden the horse. — Completed before before now.

Nadia was riding the horse. — In progress before now.

Verb tense and aspect 1

Verb tense and aspect 2
• Tense: past or present

• Aspect: simple, progressive, or perfect

Nadia … the horse

           Simple    Progressive    Perfect
Present    rides     is riding      has ridden
Past       rode      was riding     had ridden

(is, was, has, had: auxiliary verbs)

Verb tense and aspect 3
• Tense: past or present

• Aspect: simple, progressive, or perfect

Nadia … the horse

           Simple    Perfect progressive (continuous)
Present    rides     has been riding
Past       rode      had been riding

(has been, had been: auxiliary verbs)

Modal verbs
• Modal verbs: Auxiliary verbs that express
degrees of certainty, obligation, possibility,
prediction, etc.

Nadia {could, should, must, ought to, might, will, …} {ride, be riding, have ridden, have been riding} the horse.


• Structure (so far):
[MODAL] [HAVE] [BE] MAIN-VERB

• General pattern:

VP → AUX VP

AUX → MODAL | HAVE | BE

• Use features to capture necessary agreements.
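
One way the necessary agreements could be captured is sketched below: each auxiliary records its class, its own form, and the form it requires of the verb that follows. The tables AUX and MAIN, the form labels (fin, base, pastprt, ing) and the function well_formed are hypothetical, and only the active pattern [MODAL] [HAVE] [BE] MAIN-VERB is covered.

```python
# word -> (class, form of this word, form required of the next verb)
AUX = {
    "could": ("MODAL", "fin", "base"),
    "will":  ("MODAL", "fin", "base"),
    "has":   ("HAVE", "fin", "pastprt"),
    "had":   ("HAVE", "fin", "pastprt"),
    "have":  ("HAVE", "base", "pastprt"),
    "is":    ("BE", "fin", "ing"),
    "was":   ("BE", "fin", "ing"),
    "be":    ("BE", "base", "ing"),
    "been":  ("BE", "pastprt", "ing"),
}
MAIN = {"rides": "fin", "rode": "fin", "ride": "base",
        "riding": "ing", "ridden": "pastprt"}
ORDER = {"MODAL": 0, "HAVE": 1, "BE": 2}


def well_formed(verbs):
    """Check a verb group such as ["could", "have", "been", "riding"]."""
    *auxes, main = verbs
    if main not in MAIN or any(a not in AUX for a in auxes):
        return False
    required = "fin"        # the first verb of the group must be finite
    last_rank = -1
    for aux in auxes:
        cls, form, comp_form = AUX[aux]
        # The auxiliary must have the form its predecessor demands
        # and must come later in the MODAL < HAVE < BE order.
        if form != required or ORDER[cls] <= last_rank:
            return False
        required, last_rank = comp_form, ORDER[cls]
    return MAIN[main] == required


print(well_formed(["could", "have", "been", "riding"]))
print(well_formed(["has", "riding"]))
```

The per-word "required form" plays the role of a feature on the auxiliary that must unify with the form feature of its VP complement in the rule VP → AUX VP.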

English auxiliary system

• Voice: System of assigning thematic roles to
syntactic positions.

• English has active and passive voices.

• Passive expressed with be + past participle.
Other auxiliaries may also apply, including progressive be.

• Nadia was kissed. Nadia was being kissed.
Nadia had been kissed. Nadia had been being kissed.
Nadia could be kissed. Nadia could have been being kissed.

• Structure:
[MODAL] [HAVE] [BE1] [BE2] MAIN-VERB

Voice 1

Voice 2

The goalie kicked the ball. (ACTIVE)

Event: kicked

Thing: the goalie    Role: Agent (doer)
Thing: the ball      Role: Theme (thing affected)

kick (agent=goalie, theme=ball)


Voice 3

The ball was kicked. (PASSIVE)

Event: kicked

Thing: the ball      Role: Theme (thing affected)

kick (agent=?, theme=ball)


Voice 4

The ball was kicked by the goalie. (PASSIVE)

Event: kicked

Thing: the ball      Role: Theme (thing affected)
Thing: the goalie    Role: Agent (doer)

kick (agent=goalie, theme=ball)


Passive as Diathetic alternation

the goalie kicked the ball

the ball was kicked by the goalie

• the ball: from object position in VP to subject position in S

• the goalie: from subject position in S to PP in VP

But the semantic representation doesn’t change
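
That the semantic representation stays the same can be sketched with a function that assigns thematic roles according to voice. The function interpret and its parameter names are hypothetical; it assumes the syntactic positions have already been identified by the parser.

```python
def interpret(verb_lex, voice, subject, obj=None, by_phrase=None):
    """Map syntactic positions to thematic roles according to voice."""
    if voice == "active":
        # Subject is the agent, object is the theme.
        return {"pred": verb_lex, "agent": subject, "theme": obj}
    if voice == "passive":
        # Subject is the theme; the agent, if any, sits in the by-phrase.
        return {"pred": verb_lex, "agent": by_phrase, "theme": subject}
    raise ValueError(f"unknown voice: {voice}")


active = interpret("kick", "active", subject="goalie", obj="ball")
passive = interpret("kick", "passive", subject="ball", by_phrase="goalie")
agentless = interpret("kick", "passive", subject="ball")

print(active == passive)
print(agentless)
```

The agentless passive leaves the agent role unfilled, matching kick (agent=?, theme=ball).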

Some useful features
• VForm: The tense/aspect form of a verb: passive, pastprt, …

• CompForm: The tense/aspect form of the
complement of an auxiliary.


Augmenting rules for passive voice

• For all rules of the form:

VP → V NP X
(V Subcat) = _y

ADD:

VP → V X
(V Subcat) = _y
(V VForm) = passive
(VP VForm) = passive

(Metarule to ease grammar coding)

• Augment Aux+VP rules:

VP → AUX VP
(AUX Root) = Be2
(AUX CompForm) = (VP2 VForm)
(VP2 VForm) = passive
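
The metarule can be sketched as a function over a rule list: for each rule VP → V NP X it adds a rule VP → V X carrying the same Subcat constraint plus the two passive VForm constraints. The rule encoding (left-hand side, right-hand side list, list of constraint strings) and the name passive_metarule are assumptions made for this illustration.

```python
def passive_metarule(rules):
    """Return the rules plus a passive variant for every VP -> V NP X rule."""
    added = []
    for lhs, rhs, constraints in rules:
        if lhs == "VP" and rhs[:2] == ["V", "NP"]:
            passive_rhs = ["V"] + rhs[2:]          # drop the object NP
            passive_constraints = constraints + [
                "(V VForm) = passive",
                "(VP VForm) = passive",
            ]
            added.append(("VP", passive_rhs, passive_constraints))
    return rules + added


active_rules = [
    ("VP", ["V", "NP"], ["(V Subcat) = _np"]),
    ("VP", ["V", "NP", "PP"], ["(V Subcat) = _np_pp"]),
    ("VP", ["V"], ["(V Subcat) = _none"]),
]

for lhs, rhs, constraints in passive_metarule(active_rules):
    print(lhs, "->", " ".join(rhs), constraints)
```

Only rules whose right-hand side starts with V NP are affected, so intransitive rules pass through unchanged.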

The GAP feature for passive voice
S → NP VP

(NP Agr) = (VP Agr)
(VP VForm) = passive
(VP Gap Cat) = NP
(VP Gap Agr) = (NP Agr)
(VP Gap Sem) = (NP Sem)

VP → AUX VP
(VP1 Agr) = (AUX Agr)
(VP1 VForm) = (VP2 VForm)
(VP1 Gap) = (VP2 Gap)
(AUX Lex) = be2
(VP2 VForm) = passive

V → kicked
(V VForm) = {pastprt, passive}
(V Subcat) = _np
(V Lex) = kick
(V Sem) = kick

VP → V NP
(VP VForm) = (V VForm)
(VP Gap) = (NP Gap)
(V Subcat) = _np

NP → ε
(NP Gap Cat) = NP
(NP Gap Agr) = (NP Agr)
(NP Gap Sem) = (NP Sem)

NP → cans
(NP Agr) = 3p
(NP Lex) = can
(NP Sem) = cans

AUX → were
(AUX Agr) = 3p
(AUX Lex) = be2

(ε denotes the empty string.)


cans were kicked ε

Lexical entries:

cans:    NP (Agr 3p
             Sem cans)

were:    AUX (Agr 3p
              Lex be2)

kicked:  V (VForm {passive, pastprt}
            Subcat _np
            Sem kick)

ε:       NP (Agr ➀
             Sem ➁
             Gap (Cat NP
                  Agr ➀
                  Sem ➁))

After VP → V NP:

VP (VForm ➂
    Gap ➃
    V (VForm ➂ {passive, pastprt}
       Subcat _np
       Sem kick)
    NP (Agr ➀
        Sem ➁
        Gap ➃ (Cat NP
               Agr ➀
               Sem ➁)))

After VP → AUX VP:

VP (Agr ➄
    VForm ➂
    Gap ➃
    AUX (Agr ➄ 3p
         Lex be2)
    VP (VForm ➂
        Gap ➃
        V (VForm ➂ passive
           Subcat _np
           Sem kick)
        NP (Agr ➀
            Sem ➁
            Gap ➃ (Cat NP
                   Agr ➀
                   Sem ➁))))

After S → NP VP:

S (NP (Agr ➊ 3p
       Sem ➋ cans)
   VP (Agr ➊
       VForm ➂
       Gap ➃
       AUX (Agr ➊ 3p
            Lex be2)
       VP (VForm ➂
           Gap ➃
           V (VForm ➂ passive
              Subcat _np
              Sem kick)
           NP (Agr ➀
               Sem ➋
               Gap ➃ (Cat NP
                      Agr ➀
                      Sem ➋)))))


Note: The ➊’s of the S were ➄’s until the 4th constraint of the rule S → NP VP was applied. The 5th constraint fills in the Sem of the Gap, ➋.


• Other constructions involve NPs in syntactic
configurations where they would not get the
right thematic roles using linear order alone.

Nadia seems to like Ross.
Nadia seems to be liked.
Nadia is easy to like.
Who did Nadia like?
I fed the dog that Nadia likes to walk.

• Can use grammar rules with gap features to
ensure correct structure/interpretation of
these as well.

Other cases of gap percolation

• Features help capture syntactic constructions
in a general and elegant grammar.

• Features can encode the compositional
semantics of a sentence as you parse it.

• Features can accomplish mapping functions
between syntax and semantics that simplify
the interpretation process.

Summary