
COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE

COMP90042
Natural Language Processing

Lecture 12
Semester 1 2021 Week 6

Jey Han Lau

Discourse


Discourse

• Most tasks/models we have covered so far operate at the
  word or sentence level:

‣ POS tagging

‣ Language models

‣ Lexical/distributional semantics

• But NLP often deals with documents

• Discourse: understanding how sentences relate
to each other in a document


Outline

• Discourse segmentation

• Discourse parsing

• Anaphora resolution


Discourse Segmentation


Discourse Segmentation

• A document can be viewed as a sequence of
segments

• A segment: a span of cohesive text

• Cohesion: organised around a topic or function


• Wikipedia biographies: early years, major events,
impact on others


• Scientific articles: introduction, related work,
experiments


Unsupervised Approaches

• TextTiling algorithm: looks for points of low lexical
  cohesion between sentences

• For each sentence gap:

  ‣ Create two BOW vectors consisting of words from the k
    sentences on either side of the gap

  ‣ Use cosine to get a similarity score (sim) for the two
    vectors

  ‣ For gap i, calculate a depth score; insert a boundary
    when the depth is greater than some threshold t

    depth(gap_i) = (sim_{i-1} − sim_i) + (sim_{i+1} − sim_i)
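Below is a minimal sketch of TextTiling's gap scoring in Python, assuming
sentences are already tokenised into word lists; k and t are the window
size and depth threshold described above.

# Minimal TextTiling sketch: score every sentence gap, insert a boundary
# wherever the depth score exceeds the threshold t.
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def texttiling(sentences, k=1, t=0.9):
    """Return gap indices (boundary before sentence i) where depth > t."""
    # Similarity at each gap, comparing the k sentences on either side.
    sims = []
    for gap in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, gap - k):gap] for w in s)
        right = Counter(w for s in sentences[gap:gap + k] for w in s)
        sims.append(cosine(left, right))

    boundaries = []
    for i, s in enumerate(sims):
        # The first and last gaps only have one neighbouring similarity.
        left_term = sims[i - 1] - s if i > 0 else 0.0
        right_term = sims[i + 1] - s if i + 1 < len(sims) else 0.0
        if left_term + right_term > t:
            boundaries.append(i + 1)   # boundary falls before sentence i+1
    return boundaries

sents = ["he walked 15 minutes to the tram stop".split(),
         "then he waited for another 20 minutes".split(),
         "the tram drivers were on strike that morning".split()]
print(texttiling(sents, k=1, t=0.9))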


Text Tiling Example (k=1, t=0.9)

Sentences:

1. He walked 15 minutes to the tram stop.
2. Then he waited for another 20 minutes, but the tram didn't come.
3. The tram drivers were on strike that morning.
4. So he walked home and got his bike out of the garage.
5. He started riding but quickly discovered he had a flat tire.
6. He walked his bike back home.
7. He looked around but his wife had cleaned the garage and he
   couldn't find the bike pump.

Gap similarities and depth scores, using
depth(gap_i) = (sim_{i-1} − sim_i) + (sim_{i+1} − sim_i)
(the first and last gaps have only one neighbouring similarity,
so only one term is used):

gap 1 (sentences 1-2): sim = 0.9, depth = 0.7 − 0.9 = −0.2
gap 2 (sentences 2-3): sim = 0.7, depth = (0.9 − 0.7) + (0.1 − 0.7) = −0.4
gap 3 (sentences 3-4): sim = 0.1, depth = (0.7 − 0.1) + (0.5 − 0.1) = 1.0
gap 4 (sentences 4-5): sim = 0.5, depth = (0.1 − 0.5) + (0.8 − 0.5) = −0.1
gap 5 (sentences 5-6): sim = 0.8, depth = (0.5 − 0.8) + (0.5 − 0.8) = −0.6
gap 6 (sentences 6-7): sim = 0.5, depth = 0.8 − 0.5 = 0.3

Only gap 3 has depth greater than t (1.0 > 0.9), so a boundary is
inserted between sentences 3 and 4.


Supervised Approaches

• Get labelled data from easy sources

  ‣ Scientific publications

  ‣ Wikipedia articles


Supervised Discourse Segmenter

• Apply a binary classifier to identify boundaries
  (a minimal sketch follows this list)

• Or use sequential classifiers

• Potentially include classification of section types
  (introduction, conclusion, etc.)

• Integrate a wider range of features, including:

  ‣ distributional semantics

  ‣ discourse markers (therefore, and, etc.)
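A minimal sketch of the supervised idea, assuming scikit-learn is
available; the gap features and marker list here are illustrative
choices, not the features of any particular published segmenter.

# Supervised segmentation sketch: a binary classifier over sentence gaps.
from sklearn.linear_model import LogisticRegression

DISCOURSE_MARKERS = {"therefore", "however", "thus", "so", "and", "but"}

def gap_features(prev_sent, next_sent):
    """Features for the gap between two adjacent tokenised sentences."""
    prev_words, next_words = set(prev_sent), set(next_sent)
    overlap = len(prev_words & next_words) / max(1, len(prev_words | next_words))
    marker = 1.0 if next_sent and next_sent[0].lower() in DISCOURSE_MARKERS else 0.0
    return [overlap, marker, float(len(next_sent))]

def train_segmenter(gaps, labels):
    """gaps: list of (prev_sent, next_sent) pairs; labels: 1 = boundary."""
    X = [gap_features(p, n) for p, n in gaps]
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, labels)
    return clf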


Discourse Parsing


Discourse Analysis

• Identify discourse units, and the relations that
hold between them

• Rhetorical Structure Theory (RST) is a
framework to do hierarchical analysis of discourse
structure in documents


Discourse Units

• Typically clauses of a sentence

• DUs do not cross sentence boundaries

• [It does have beautiful scenery,] [some of the best
  since Lord of the Rings.]

• Two DUs can be merged to form a larger, composite DU


Discourse Relations

• Relations between discourse units:

  ‣ conjunction, justify, concession, elaboration, etc.

  ‣ [It does have beautiful scenery,]
       ↑ (elaboration)
    [some of the best since Lord of the Rings.]


Nucleus vs. Satellite

• Within a discourse relation, one argument is the nucleus
  (the primary argument)

• The supporting argument is the satellite

  ‣ [It does have beautiful scenery,]nucleus
       ↑ (elaboration)
    [some of the best since Lord of the Rings.]satellite

• Some relations are equal (e.g. conjunction), and so both
  arguments are nuclei

  ‣ [He was a likable chap,]nucleus
       ↑ (conjunction)
    [and I hated to see him die.]nucleus

Example document:

1A: [It could have been a great movie.]
1B: [It does have beautiful scenery,]
1C: [some of the best since Lord of the Rings.]
1D: [The acting is well done,]
1E: [and I really liked the son of the leader of the Samurai.]
1F: [He was a likable chap,]
1G: [and I hated to see him die.]
1H: [But, other than all that, this movie is nothing more than
    hidden rip-offs.]

RST Tree

• An RST relation combines two or more DUs into composite
  DUs

• The process of combining DUs is repeated, creating an
  RST tree (a minimal data-structure sketch follows)
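One possible way to represent such a subtree in code; the RSTNode class
below is an illustrative sketch, not part of any RST toolkit, using the
1B/1C elaboration example from the earlier slide.

# A tiny representation of one RST subtree: 1B is the nucleus,
# 1C its elaborating satellite. Composite DUs nest recursively.
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class RSTNode:
    relation: str
    nucleus: Union["RSTNode", str]
    satellite: Optional[Union["RSTNode", str]] = None

subtree = RSTNode(
    relation="elaboration",
    nucleus="It does have beautiful scenery,",
    satellite="some of the best since Lord of the Rings.",
)
print(subtree.relation, "->", subtree.nucleus)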


RST Parsing

• Task: given a document, recover the RST tree

  ‣ Rule-based parsing

  ‣ Bottom-up approach

  ‣ Top-down approach



Parsing Using Discourse Markers

• Some discourse markers (cue phrases) explicitly
  indicate relations

  ‣ although, but, for example, in other words, so,
    because, in conclusion, …

• Can be used to build a simple rule-based parser
  (a toy sketch follows this list)

• However:

  ‣ Many relations are not marked by a discourse marker

  ‣ Many discourse markers are ambiguous (e.g. and)
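A toy illustration of the cue-phrase idea; the marker-to-relation mapping
below is a small illustrative subset, not a complete rule set.

# Toy cue-phrase rules: map an explicit discourse marker at the start of
# the second DU to a relation label; return None when no marker is found,
# which is exactly where rule-based parsing falls short.
MARKER_TO_RELATION = {
    "although": "concession",
    "but": "contrast",
    "because": "justify",
    "for example": "elaboration",
    "in other words": "restatement",
    "so": "result",
}

def relation_from_marker(du1, du2):
    du2_lower = du2.lower()
    for marker, relation in MARKER_TO_RELATION.items():
        if du2_lower.startswith(marker):
            return relation
    return None   # unmarked relation: a classifier is needed

print(relation_from_marker("He was late for work.",
                           "Because his car wouldn't start."))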


Parsing Using Machine Learning

• RST Discourse Treebank

‣ 300+ documents annotated with RST trees

• Basic idea:

‣ Segment document into DUs

‣ Combine adjacent DUs into composite DUs
iteratively to create the full RST tree (bottom-up
parsing)


Bottom-Up Parsing

• Transition-based parsing (lecture 16):

  ‣ Greedy, uses the shift-reduce algorithm (a minimal
    sketch follows this list)

• CYK/chart parsing algorithm (lecture 14):

  ‣ Global, but some constraints prevent CYK from finding
    the globally optimal tree for discourse parsing
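A minimal sketch of greedy shift-reduce parsing over DUs; the
should_reduce policy stands in for a trained classifier, and the single
relation label is a simplification, so both are illustrative assumptions.

# Greedy shift-reduce sketch for bottom-up discourse parsing.
def should_reduce(stack, buffer):
    # Placeholder policy: always reduce when two items are available.
    return len(stack) >= 2

def shift_reduce_parse(dus, relation="elaboration"):
    """Build a binary discourse tree (nested tuples) over a list of DUs."""
    stack, buffer = [], list(dus)
    while buffer or len(stack) > 1:
        # SHIFT when forced to (fewer than two stack items) or when the
        # policy prefers it; otherwise REDUCE the top two items.
        if buffer and (len(stack) < 2 or not should_reduce(stack, buffer)):
            stack.append(buffer.pop(0))
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append((relation, left, right))
    return stack[0]

print(shift_reduce_parse(["1A", "1B", "1C", "1D"]))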


Top-Down Parsing

1. Segment the document into DUs

2. Decide a boundary to split the document into 2 segments

3. For each segment, repeat step 2 (a recursive sketch follows)
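A minimal recursive sketch of the top-down procedure; split_score stands
in for a trained boundary-scoring model, and the middle-preferring
placeholder and the "span" label are assumptions.

# Top-down sketch: recursively pick the best split point in a span of DUs.
def split_score(dus, i):
    # Placeholder: prefer splits near the middle of the span.
    return -abs(i - len(dus) / 2)

def topdown_parse(dus, relation="span"):
    """Recursively split a list of DUs into a binary tree of nested tuples."""
    if len(dus) == 1:
        return dus[0]
    best = max(range(1, len(dus)), key=lambda i: split_score(dus, i))
    return (relation, topdown_parse(dus[:best]), topdown_parse(dus[best:]))

print(topdown_parse(["1A", "1B", "1C", "1D", "1E"]))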

(The original slide illustrates this on the example document 1A-1H
from earlier, with numbers 1-7 marking the successive split points.)

Discourse Parsing Features

• Bag of words

• Discourse markers

• Starting/ending n-grams

• Location in the text

• Syntax features

• Lexical and distributional similarities


Applications of Discourse Parsing?


• Summarisation

• Sentiment analysis

• Argumentation

• Authorship attribution

• Essay scoring




Anaphora Resolution


Anaphors

• Anaphor: a linguistic expression that refers back to an
  earlier element in the text

• Anaphors have an antecedent in the discourse, often but
  not always a noun phrase

  ‣ Yesterday, Ted was late for work. It all started when
    his car wouldn't start.

• Pronouns are the most common anaphors

• But there are various others

  ‣ Demonstratives (that problem)


Motivation

• Essential for deep semantic analysis

‣ Very useful for QA, e.g., reading comprehension

Ted’s car broke down. So he went over to Bill’s
house to borrow his car. Bill said that was fine.

Whose car is borrowed?


Antecedent Restrictions

• Pronouns must agree in number with their antecedents

  ‣ His coworkers were leaving for lunch when Ted arrived.
    They invited him, but he said no.

• Pronouns must agree in gender with their antecedents

  ‣ Sue was leaving for lunch when Ted arrived. She invited
    him, but he said no.

• Pronouns whose antecedents are the subject of the same
  syntactic clause must be reflexive (…self)

  ‣ Ted was angry at him. [him ≠ Ted]

  ‣ Ted was angry at himself. [himself = Ted]


Antecedent Preferences

• The antecedents of pronouns should be recent

  ‣ He waited for another 20 minutes, but the tram didn't
    come. So he walked home and got his bike out of the
    garage. He started riding it to work.

• The antecedent should be salient, as determined by
  grammatical position

  ‣ Subject > object > argument of preposition

  ‣ Ted usually rode to work with Bill. He was never late.


Entities and Reference

• Discourse 16.1 (shown on the left of the original slide) is
  the more coherent of the two example discourses

• Its pronouns all refer consistently to John, the protagonist


Centering Theory

• A unified account of relationship between
discourse structure and entity reference

• Every utterance in the discourse is characterised
by a set of entities, known as centers

• Explains why certain entities are preferred when resolving
  ambiguous pronouns


For an Utterance Un

• Forward-looking centers:

  ‣ All entities in Un: Cf(Un) = [e1, e2, …]

  ‣ Cf(16.1a) = [John, music store, piano]

  ‣ Ordered by syntactic prominence: subjects > objects

• Backward-looking center Cb(Un):

  ‣ The highest-ranked forward-looking center of the previous
    utterance (Cf(Un-1)) that is also in the current utterance (Un)

  ‣ Candidate entities in 16.1b = [John, music store]

  ‣ Cb(16.1b) = [John]

  ‣ Not music store, because John has a higher rank in the
    previous utterance's forward-looking centers Cf(Un-1)

  (a small sketch of this rule follows)
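A small sketch of the backward-looking-center rule, assuming each
utterance has already been reduced to its ranked entity list (the Cf
ordering); the example values come from the slides.

# Cb: the highest-ranked entity of Cf(Un-1) that also appears in Un.
def backward_center(cf_prev, entities_current):
    for entity in cf_prev:             # cf_prev is ranked: subjects > objects > ...
        if entity in entities_current:
            return entity
    return None

cf_16_1a = ["John", "music store", "piano"]        # Cf(16.1a)
entities_16_1b = {"John", "music store"}           # entities mentioned in 16.1b
print(backward_center(cf_16_1a, entities_16_1b))   # -> John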


Centering Algorithm

• When resolving an entity for anaphora resolution, choose the
  entity such that the top forward-looking center matches the
  backward-looking center

• Why? Because the text reads more fluently when this condition
  is satisfied


In discourse 16.1, the text is coherent because the top
forward-looking center matches the backward-looking center for
each utterance:

  top forward-looking center = John
  backward-looking center = John

In discourse 16.2, this is not quite the case:

  Cf(16.2b) = [music store, John]
  Cb(16.2b) = [John]

  Cf(16.2d) = [music store, John]
  Cb(16.2d) = [John]


Supervised Anaphor Resolution

• Build a binary classifier for anaphor/antecedent pairs
  (a feature sketch follows this list)

• Convert restrictions and preferences into features

  ‣ Binary features for number/gender compatibility

  ‣ Position of antecedent in text

  ‣ Features about the type of antecedent

• With enough data, this can approximate the centering algorithm

• But it is also easy to include other potentially helpful features

  ‣ e.g. words around the anaphor/antecedent
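A minimal sketch of features for one anaphor/antecedent pair; the mention
attributes (number, gender, position, is_subject) are assumed to come
from earlier preprocessing, and the field names are illustrative.

# Features for one (anaphor, candidate antecedent) pair.
def pair_features(anaphor, antecedent):
    return [
        1.0 if anaphor["number"] == antecedent["number"] else 0.0,  # number agreement
        1.0 if anaphor["gender"] == antecedent["gender"] else 0.0,  # gender agreement
        float(anaphor["position"] - antecedent["position"]),        # distance (recency)
        1.0 if antecedent["is_subject"] else 0.0,                   # salience of antecedent
    ]

he = {"number": "sg", "gender": "m", "position": 12, "is_subject": True}
ted = {"number": "sg", "gender": "m", "position": 3, "is_subject": True}
print(pair_features(he, ted))   # candidate pair: (he, Ted)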


Anaphora Resolution Tools

• Stanford CoreNLP includes pronoun coreference models

  ‣ the rule-based system isn't too bad

  ‣ and it is considerably faster than the neural models

  System         Language   Preprocessing time   Coref time   Total time   F1 score
  Deterministic  English    3.87s                0.11s        3.98s        49.5
  Statistical    English    0.48s                1.23s        1.71s        56.2
  Neural         English    3.22s                4.96s        8.18s        60.0
  Deterministic  Chinese    0.39s                0.16s        0.55s        47.5
  Neural         Chinese    0.42s                7.02s        7.44s        53.9

Source: https://stanfordnlp.github.io/CoreNLP/coref.html

Evaluated on CoNLL 2012 task.



A Final Word

• For many tasks, it is important to consider context larger
  than individual sentences

• Traditionally, many popular NLP applications have been
  sentence-focused (e.g. machine translation), but that is
  beginning to change…


Further Reading

• E18, Ch 16