
COMP90042
Workshop
Week 9

18 May

1


Probabilistic context-free grammar 

Probabilistic CYK parsing vs. CYK parsing

Dependency parsing

(transition-based, probabilistic) dependency parsing vs. (probabilistic) CYK parsing
2
Table of Contents

What is a probabilistic context-free grammar and what problem does it attempt to solve?

A probabilistic context-free grammar (PCFG) adds a probability to each production rule of a context-free grammar.

Provides a “language model” to describe the likely sentences in a language.
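As a concrete illustration (not from the slides), here is a minimal sketch of a toy PCFG using NLTK, assuming nltk is installed; the grammar itself is invented for the example sentence used later in this workshop. Probabilities of rules sharing a left-hand side must sum to 1.

```python
import nltk

# A toy PCFG: each production carries a probability in brackets.
grammar = nltk.PCFG.fromstring("""
    S   -> NP VP     [1.0]
    NP  -> 'Donald'  [0.5]
    NP  -> JJ NNS    [0.5]
    VP  -> VBZ NP    [1.0]
    VBZ -> 'has'     [1.0]
    JJ  -> 'small'   [1.0]
    NNS -> 'hands'   [1.0]
""")

# As a "language model", the probability of a parse is the product
# of the probabilities of all rules used in its derivation.
parser = nltk.ViterbiParser(grammar)
for tree in parser.parse("Donald has small hands".split()):
    print(tree.prob(), tree)  # 0.25 = 1.0 * 0.5 * 1.0 * 1.0 * 0.5 * 1.0 * 1.0
```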
3
Probabilistic context-free grammar

What differentiates probabilistic CYK parsing from CYK parsing? Why is this important? How does this affect the algorithms used for parsing?
Parsing: the process of identifying the structure(s) of a sentence, according to a grammar of the language.
Dynamic programming: necessary because the space of possible parses is large.
Chart:
CYK: partial parses (licensed structures) for various spans (sequences of tokens);
Probabilistic CYK: additionally, probabilities of production rules.
Important: lets us differentiate likely parses from unlikely ones.
Effect: the chart must track the maximum-probability analysis for each span, rather than just the set of grammar symbols (see the sketch below).
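To make the contrast concrete, here is a minimal sketch (not the workshop's code; the function and dictionary names are illustrative) of the probabilistic CYK recurrence over a grammar in Chomsky Normal Form. Where plain CYK records which symbols can cover a span, the probabilistic version records the best probability per symbol:

```python
from collections import defaultdict

def pcyk(words, unary, binary):
    """Probabilistic CYK: for each span, keep the maximum-probability
    analysis per symbol (plain CYK keeps only the set of symbols).

    unary:  dict mapping (A, word) -> P(A -> word)
    binary: dict mapping (A, B, C) -> P(A -> B C)
    The grammar is assumed to be in Chomsky Normal Form.
    """
    n = len(words)
    # best[i, j] maps symbol -> max probability of deriving words[i:j]
    best = defaultdict(dict)
    for i, w in enumerate(words):
        for (A, word), p in unary.items():
            if word == w:
                best[i, i + 1][A] = max(best[i, i + 1].get(A, 0.0), p)
    for length in range(2, n + 1):           # span length
        for i in range(n - length + 1):      # span start
            j = i + length
            for k in range(i + 1, j):        # split point
                for (A, B, C), p in binary.items():
                    if B in best[i, k] and C in best[k, j]:
                        cand = p * best[i, k][B] * best[k, j][C]
                        if cand > best[i, j].get(A, 0.0):
                            best[i, j][A] = cand
    return best[0, n].get('S', 0.0)  # probability of the best parse rooted in S
```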
4
Probabilistic CYK parsing vs CYK parsing


5
HMM to CFG
A hidden Markov model assigns a tag to each word in a sentence, e.g.,
       Donald/NNP has/VBZ small/JJ hands/NNS
The probability of the sequence is based on the tag-word pairs, and the pairs of adjacent tags. Show how this process can be framed as a CFG, and how various probabilities (e.g., observation, transition, and initial state) can be assigned to productions. What are the similarities and differences between CYK parsing with this grammar, and HMM’s Viterbi algorithm for finding the best scoring state sequence?

6
Revisiting Hidden Markov Models
What parameters do we need to learn when training an HMM?

1. Initial state probabilities
record the distribution of tags for the first token of each sentence

2. Transition probabilities
record, for each tag, the distribution of the tag of the immediately following token

3. Emission probabilities
record, for each tag, the distribution of the tokens it emits
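All three sets of parameters can be estimated by counting over a tagged corpus. A minimal sketch (names are illustrative; it returns raw counts, which would then be normalised into probabilities):

```python
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """Estimate HMM parameters by counting over a tagged corpus.

    tagged_sentences: list of sentences, each a list of (word, tag)
    pairs, e.g. [[("Donald", "NNP"), ("has", "VBZ"), ...], ...].
    """
    initial = Counter()                # pi[tag]: tag of the first token
    transition = defaultdict(Counter)  # A[prev_tag][tag]
    emission = defaultdict(Counter)    # O[tag][word]
    for sent in tagged_sentences:
        initial[sent[0][1]] += 1
        for word, tag in sent:
            emission[tag][word] += 1
        for (_, prev), (_, tag) in zip(sent, sent[1:]):
            transition[prev][tag] += 1
    return initial, transition, emission
```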

7
HMM

Donald/NNP has/VBZ small/JJ hands/NNS

Visualize the HMM:
[Figure: the hidden tag sequence NNP → VBZ → JJ → NNS as states, with the observed words Donald, has, small, and hands emitted from the corresponding states]

8
HMM to CFG
1. The first step is to write down the probability assigned by the HMM to the tagged sentence.
2. This can be drawn as a chain with the tag-tag transitions as the "spine" and each observation as a leaf. The initial/transition/observation probabilities can be attached to the edges, resulting in a right-branching tree.
3. Treating this as a "parse tree", each clique in the tree (a parent and its direct children) can be read off as a CFG rule of the form parent → left-child right-child, e.g., NNP → Donald VBZ.
4. The score for this rule is A_{NNP,VBZ} × O_{NNP,Donald} (see the sketch below). Note that we could use different grammar symbols, such that each production maps to a single HMM component.
5. CYK parsing under this grammar is the same as Viterbi in the HMM, with extra wasted effort assigning analyses to all word spans in the sentence, whereas Viterbi only considers spans that start at the first word of the sentence.
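A sketch of steps 3 and 4 as code (illustrative names; pi, A and O are the HMM parameter tables from the previous slide). Each production's weight is assembled exactly as described above, so probabilistic CYK over these rules recomputes what Viterbi computes:

```python
def hmm_to_pcfg_rules(pi, A, O, tags, vocab):
    """Convert HMM parameters into weighted CFG productions that
    generate right-branching trees.

    pi[t]   : initial-state probability of tag t
    A[t][u] : transition probability from tag t to tag u
    O[t][w] : observation probability of word w under tag t
    """
    rules = []
    for t in tags:
        rules.append(("S", (t,), pi[t]))          # S -> T, initial prob
        for w in vocab:
            rules.append((t, (w,), O[t][w]))      # T -> w, final tag
            for u in tags:
                # T -> w U, e.g. NNP -> Donald VBZ,
                # scored A_{T,U} * O_{T,w}
                rules.append((t, (w, u), A[t][u] * O[t][w]))
    return rules
```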

Using typical dependency types, construct (by hand) a dependency parse for the following sentence: Yesterday, I shot an elephant in my pyjamas. Check your work against the output of the online GUI for the Stanford Parser: https://corenlp.run/

9
Dependency Parsing

ID   Token       Head   Relation
1    Yesterday   4      obl:tmod
2    ,           4      punct
3    I           4      nsubj
4    shot        0      root
5    an          6      det
6    elephant    4      dobj
7    in          9      case
8    my          9      poss
9    pyjamas     4      obl
10   .           4      punct

For a description of the relations, see https://universaldependencies.org/u/dep/
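As an alternative to the online GUI, the parse can also be checked programmatically. The sketch below uses spaCy rather than the Stanford Parser, so labels may differ slightly between tools and UD versions:

```python
import spacy

# Assumes the en_core_web_sm model has been downloaded.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Yesterday, I shot an elephant in my pyjamas.")
for tok in doc:
    # Print 1-based token IDs, with 0 as the head of the root,
    # to match the table above.
    head = 0 if tok.dep_ == "ROOT" else tok.head.i + 1
    print(tok.i + 1, tok.text, head, tok.dep_.lower())
```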

10
Dependency Parsing

In what ways is (transition-based, probabilistic) dependency parsing similar to (probabilistic) CYK parsing? In what ways is it different?

Similarities:
(1) both determine the structure of a sentence;
(2) both attempt to disambiguate among the (perhaps many) possible structures licensed by the grammar, using a probabilistic model;
(3) both process the tokens in the sentence one by one, left to right.
11
(probabilistic) dependency parsing vs. (probabilistic) CYK parsing

In what ways is (transition-based, probabilistic) dependency parsing similar to (probabilistic) CYK parsing? In what ways is it different?
Differences:
(1) although POS tags are implicitly used in constructing the "oracle" (for training), the dependency parser does not explicitly tag the sentence;
(2) the transition-based dependency parser can potentially take into account other (non-local) relations in the sentence, whereas CYK's probabilities depend only on the (local) sub-tree;
(3) CYK adds numerous fragments to the chart that are never used in the final parse, whereas the transition-based dependency parser only adds edges used in the final structure (see the sketch below).
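To illustrate point (3), here is a minimal sketch of the arc-standard transition system (not from the slides; the function name and the choose_transition classifier are illustrative placeholders). Unlike CYK's chart, the arcs list only ever contains edges of the final structure:

```python
def arc_standard_parse(words, choose_transition):
    """Sketch of arc-standard transition-based parsing.

    choose_transition(stack, buffer) returns "SHIFT", "LEFT-ARC" or
    "RIGHT-ARC"; in a probabilistic parser this decision comes from a
    trained classifier over the current configuration, and is assumed
    here to always return a valid action.
    """
    # Index 0 is the artificial ROOT node; tokens are numbered from 1.
    stack, buffer, arcs = [0], list(range(1, len(words) + 1)), []
    while buffer or len(stack) > 1:
        action = choose_transition(stack, buffer)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":        # second-top becomes dependent of top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))  # (head, dependent)
        elif action == "RIGHT-ARC":       # top becomes dependent of second-top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs  # only the edges of the final structure, unlike CYK's chart
```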
12
(probabilistic) dependency parsing vs. (probabilistic) CYK parsing

Probabilistic context-free grammar 

Probabilistic CYK parsing vs. CYK parsing

Dependency parsing

(transition-based, probabilistic) dependency parsing vs. (probabilistic) CYK parsing
13
Takeaways
