1. Introduction
Natural Language
This module explores the relationship of AI to natural (i.e., ordinary) language. NL has always been part of AI, but two things have recently made NL a key technology, if not part of our lives. The first is the recognition that machine learning is extraordinarily helpful for this field; the second (not entirely independent) is the fact that people have grown accustomed to speaking to devices.
NLP Learning Goals
Frame the NLP Enterprise
Understand the Role of Grammars
Use Tools
Explore Logic Approaches
Leverage Neural Net Approaches
Natural Language
Introduction
Grammars
Tools
Logic Approaches
Neural Net Approaches
We begin with discussions of analytical approaches to NL and cap this with the application of neural nets.
NL Input vs. Output
Input: Requires comprehension by the program
Output: ML generation
Our focus
Speech Act
Adapted from Russell
Assume text
It can be useful to think of natural language in terms of “speech acts”—something that involves the source, the content, and the destination.
Speech Acts
Speaker → Utterance → Hearer
Example:
From an AI course, you can learn how to create intelligent applications. “Intelligent” means that the application’s output is informed by human understanding.
There is no preconceived limit to the content size of a speech act.
Some Types of Speech Acts
—achieve speaker’s goals
Inform: “There’s a pit in front of you”
Query: Can you see the gold?
Command: Pick it up
Promise: I’ll share the gold with you
Acknowledge: OK
Adapted from Russell & Norvig
There are several kinds of speech acts as shown in the figure. And although they all contain words, they are very different.
Every speech act exists in some context; the same speech act can mean something entirely different in a different context. A sequence of words obeys syntactic conventions, as in programming languages: it must have a recognized format. At the same time, it must convey meaning: its semantics.
Requires knowledge of …
Situation
Semantic and syntactic conventions
Speaker’s (hearer’s?) …
goals,
knowledge base, and
rationality
Adapted from Russell & Norvig
In addition to the context, semantics, and syntax, the better we understand the goals, knowledge, and reasoning of the hearer and of the speaker, the better we can interpret and process an utterance.
Stages of Informing: Speaker
Adapted from Russell & Norvig
Stages of Informing: Hearer
Adapted from Russell & Norvig
The figure shows the roles of the speaker (S) and the hearer (H).
How Relations Are Expressed in English
from Russell & Norvig
Note analysis
Continuing this analytic approach, we can measure how often a verb, for example, occurs, or a noun/preposition sequence such as “here of.”
Natural Language
Introduction
Grammars
Tools
Logic Approaches
Neural Net Approaches
This section reviews grammar in natural and formal language—its structure.
Grammars
Adapted from Russell & Norvig
Example: grammar G={S, p, q}, with productions
S → pSp,
S → qSq,
S → ε.
Typical derivation:
S → pSp → ppSpp → ppqSqpp → ppqqpp.
A grammar is defined by a vocabulary—e.g., {S, p, q}—and a set of rewrite rules, as shown in the figure.
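As an illustration (a sketch, not from the slides): the productions S → pSp, S → qSq, S → ε generate exactly the even-length palindromes over {p, q}, which a short recursive recognizer can check.

```python
def derivable(s):
    """Check whether s is derivable from S via S -> pSp | qSq | epsilon,
    i.e., whether s is an even-length palindrome over {p, q}."""
    if s == "":
        return True  # S -> epsilon
    # S -> pSp or S -> qSq: first and last symbols must match
    return len(s) >= 2 and s[0] == s[-1] and s[0] in "pq" and derivable(s[1:-1])

print(derivable("ppqqpp"))  # the derivation shown above -> True
```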
Grammars
Adapted from Russell & Norvig
(Backus-Naur Form)
The figure shows a common rewrite rule. It says that every sentence consists of a noun phrase followed by a verb phrase. The symbols used for these are sufficient at this level (they are expressive, but you could actually use any symbols you want).
Example*
* https://en.wikipedia.org/wiki/Context-sensitive_grammar#Examples
Grammar for { aⁿbⁿcⁿ : n ≥ 1 }:
e.g., a³b³c³:
The figure shows the set of rewrite rules that specify expressions of the form aⁿbⁿcⁿ.
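The language itself (though not the context-sensitive derivation) is easy to check directly; a minimal membership test, as a sketch:

```python
def in_language(s):
    """Membership test for { a^n b^n c^n : n >= 1 }."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

print(in_language("aaabbbccc"))  # a^3 b^3 c^3 -> True
```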
Example: Wumpus World Lexicon
Adapted from Russell & Norvig
Continued
These figures show the complete grammar for Wumpus World. The first shows the terminals and the second the nonterminals.
Wumpus NL Parse Tree Example
Adapted from Russell & Norvig
Russell and Norvig developed a simple language for Wumpus World, typified by the figure.
Wumpus World Grammar
Adapted from Russell & Norvig
Grammar ctd.
Example
Comparing Formal (L1) & Natural (L2) Language
Adapted from Russell & Norvig
The figure shows typical problems with using a grammar for natural language. There are natural language utterances which make good sense to humans but which the grammar can’t parse (false negatives); conversely, there are expressions which the grammar can parse but which make no sense in the real world (false positives).
Syntax vs. Meaning in NL?
Adapted from Russell & Norvig
It is one thing to recognize the legality of an utterance according to a grammar but that is just the beginning: we still need to assign meaning to the utterance. In other words, the utterance must belong within the real world. That typically means connecting it with concepts already known to users (readers, listeners etc.).
Recall the Wide Variety of Contexts!
Adapted from Russell & Norvig
The figure reminds us that there can be several—perhaps many—contexts for an utterance. Parsing does not distinguish them well—if at all.
Natural Language
Introduction
Grammars
Tools
Logic Approaches
Neural Net Approaches
Tools for dealing with natural language have multiplied on the Web.
Entities in an Utterance (Google)
https://cloud.google.com/natural-language/
A good example of a tool is from Google, where the figures show a typical analysis session. It goes well beyond syntax, and provides semantic options.
Analyses
Google N-L: Parsing Example—First Part
This figure shows its enhanced parsing.
Google N-L: Parsing Example—Second Part
Sentiment Analysis for a Document
score: overall emotion, from -1.0 (negative) to 1.0 (positive)
magnitude: strength of emotion (both positive and negative), from 0.0 to +inf; often proportional to length
Adapted from https://cloud.google.com/natural-language/docs/basics#interpreting_sentiment_analysis_values
To handle semantics in the real world, it is often useful to understand how the user is feeling. This is partially captured by sentiment analysis.
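The two values are meant to be read together: a long document with mixed emotions can have a near-zero score but a high magnitude. A small sketch of how one might bucket them (the thresholds here are assumptions for illustration, not part of the Google documentation):

```python
def interpret(score, magnitude, threshold=0.25):
    """Combine score (-1.0..1.0) and magnitude (0.0..inf) into a rough label.
    The cutoffs are illustrative only."""
    if magnitude < 0.5:
        return "neutral"            # little emotional content either way
    if score > threshold:
        return "clearly positive"
    if score < -threshold:
        return "clearly negative"
    return "mixed"                  # strong emotion, but it cancels out

print(interpret(0.1, 6.0))  # long document, emotions cancel -> "mixed"
```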
Google N-L: Sentiment Analysis
The color coding etc. can be translated into useful APIs, enabling a degree of natural language understanding to be integrated into applications.
Categorization
Tools like this are a reminder that there are many aspects to semantics. For example, simply categorizing an utterance can be much of what is needed for a given application; word-by-word semantic analysis may be unnecessary.
Application: TEXTRUNNER
Adapted from Russell & Norvig
Achieves precision of 88% and recall of 45% (F1 of 60%) on a large Web corpus.
Has extracted hundreds of millions of facts from a corpus of a half-billion Web pages.
E.g., even though it has no predefined medical knowledge, it has extracted over 2000 answers to what kills bacteria. Correct answers include antibiotics, ozone, chlorine, Cipro, and broccoli sprouts. Questionable answers include “water,” which came from the sentence “Boiling water for at least 10 minutes will kill bacteria.”
https://openie.allenai.org/
There are many natural language applications, TEXTRUNNER being one. The utility of a natural language analyzer depends heavily on what will be done with the results: for example, on whether the application can tolerate errors such as the extracted claim that water kills bacteria.
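The F1 figure quoted above is simply the harmonic mean of precision and recall; a quick check of the arithmetic:

```python
def f1(precision, recall):
    # harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.88, 0.45), 2))  # -> 0.6, matching the 60% in the slide
```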
How Can This Go Wrong?
Adapted from Russell & Norvig
As mentioned above, there are many contexts for utterances, and natural language systems may not recognize them. Humans can be good at detecting sincerity or ambiguity, whereas natural language systems are often believed to have a harder time with this.
Example API: TextRazor
https://www.textrazor.com/
Adapted from Russell & Norvig
TextRazor is an example of a natural language API. An example follows.
TextRazor Example: Input
Barclays misled shareholders and the public about one of the biggest investments in the bank’s history, a BBC Panorama investigation has found.
The bank announced in 2008 that Manchester City owner Sheikh Mansour had agreed to invest more than £3bn.
But
…
Neither Sheikh Mansour nor IPIC responded to questions raised by Panorama.
In August last year, the UK’s Serious Fraud Office said it had started an investigation into commercial arrangements between the bank and Qatar Holding LLC, part of sovereign wealth fund Qatar Investment Authority.
https://www.textrazor.com/demo
TextRazor Example Output: Categories
0.93  economy, business and finance>economy>macro economics>investments
0.72  economy, business and finance>business information>business finance>shareholder
0.70  economy, business and finance>economy
0.64  economy, business and finance>market and exchange>securities
0.49  crime, law and justice>law
…
https://www.textrazor.com/demo
TextRazor Example Output: Topics
1.00  Barclays
1.00  Mansour bin Zayed Al Nahyan
1.00  Qatar Investment Authority
1.00  Finance
1.00  Economy
1.00  …
https://www.textrazor.com/demo
TextRazor Example Output: Meaning
Contextual entailment “captures relations of consequence, resolution, presupposition, and dependency which hold not purely logically, but against the background of a specific context.”
https://link.springer.com/article/10.1007/s11229-016-1221-y
Python NLTK for NL Text Processing
http://text-processing.com/demo/
Sentiment Analysis:
e.g., “I hate Churchill”
Tokenizing
Stemming
Tagging
The natural language toolkit http://text-processing.com/demo/ can be tried out online.
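As a rough sketch of what the tokenizing and stemming steps do (these are toy stand-ins written for illustration; NLTK's word_tokenize, PorterStemmer, and pos_tag are the production versions):

```python
import re

def tokenize(text):
    # split into words and punctuation (toy stand-in for nltk.word_tokenize)
    return re.findall(r"\w+|[^\w\s]", text)

def stem(word):
    # crude suffix stripping (toy stand-in for a real stemmer)
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(tokenize("I hate Churchill."))  # ['I', 'hate', 'Churchill', '.']
print([stem(w) for w in ["boiling", "killed", "bacteria"]])
```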
Natural Language
Introduction
Grammars
Tools
Logic Approaches
Neural Net Approaches
Before we get to machine learning approaches to NLP, we discuss one more family of approaches, based on formal logic.
Natural Language, Realistically
Adapted from Russell & Norvig
Anaphora: the use of a word referring to or replacing a word used earlier in a sentence
Indexicals: words, such as “I” or “here”, that can have different meanings depending on who is saying them
Metonymy: substitution of the name of an attribute or adjunct for that of the thing meant, e.g., suit for business executive, or the track for horse racing.
We’ve discussed some of the characteristics of natural language that are not found in pure logic. The figure shows more of them.
Need More Than Backus-Naur Form
Adapted from Russell & Norvig
The rewrite rules that we’ve been using are known as Backus-Naur form.
Need More Than Backus-Naur Form
Adapted from Russell & Norvig
One approach is to use classical logic instead of production rules. In this case we can be more flexible: for example, instead of saying “produces X” we can say “can be interpreted as an X”. In other words, we’re using predicates (of which we will say more later).
Augmented Rules using FOL
Adapted from Russell & Norvig
If s1 (e.g., Raymond) is interpreted as a noun phrase
and s2 (e.g., was eating) is interpreted as a verb phrase
then s1 who s2 (Raymond who was eating) can be interpreted as a noun phrase.
This example says: if s1 is interpreted as a noun phrase, and s2 as a verb phrase, then s1 who s2 can be interpreted as a noun phrase. But notice that we can add further implications for other interpretations of s1.
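The rule above can be sketched as a recognizer (the tiny vocabulary here is an assumption for illustration):

```python
# Toy vocabulary (assumed for illustration)
NOUN_PHRASES = {"Raymond"}
VERB_PHRASES = {"was eating"}

def is_vp(s):
    return s in VERB_PHRASES

def is_np(s):
    if s in NOUN_PHRASES:
        return True
    # the augmented rule: NP(s1) and VP(s2) imply NP("s1 who s2")
    if " who " in s:
        s1, s2 = s.split(" who ", 1)
        return is_np(s1) and is_vp(s2)
    return False

print(is_np("Raymond who was eating"))  # True
```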
Querying the Knowledge Base (KB)
Adapted from Russell & Norvig
The idea is to leverage automated logic processing, which has been quite successful in recent years.
Natural Language
Introduction
Grammars
Tools
Logic Approaches
Neural Net Approaches
NLP via Neural Nets
Yoav Goldberg
“… neural network models started to be applied also to textual natural language signals, again with very promising results. … input encoding for natural language tasks, feed-forward networks, …”
https://www.jair.org/index.php/jair/article/view/11030/26198
It is ironic that after years of natural language analysis, a simpler but massive alternative has turned out to be remarkably successful: using machine learning and a corpus of utterances in various contexts.
Neural Nets: What?
A problem-solving technique that simulates neurons and their interaction.
https://en.wikipedia.org/wiki/Neuron
Neural nets are based on aspects of the brain. A neural net consists of neurons: cells that take input from, and provide output to, other neurons. Importantly, a single neuron does not seem to encode knowledge as we understand it; knowledge is encoded by the set of connections between neurons.
Modelling Neuronal I/O
[Figure: a neuron receives outputs o and o′ over connections with weights w and w′; the combined input wo + w′o′ passes through a transfer function f, so the output is f(wo + w′o′).]
In software, we model each connection with a number (the weight) that reflects its relative strength.
Each neuron in a neural net takes as input the sum of the outputs of other neurons, weighted by the connection strengths. It then applies a function (a transfer function) to this quantity. The output of this function becomes an input to other neurons (after weighting), or else it is the output of the whole neural net.
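A minimal sketch of this input/output behavior (tanh is an assumed choice of transfer function, not one named in the slides):

```python
import math

def neuron_output(upstream_outputs, weights, f=math.tanh):
    # weighted sum of the upstream neurons' outputs...
    total = sum(w * o for w, o in zip(weights, upstream_outputs))
    # ...passed through the transfer function
    return f(total)

print(neuron_output([1.0, 0.5], [0.2, 0.4], f=lambda x: x))  # 0.2*1.0 + 0.4*0.5 = 0.4
```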
Modelling Neuronal I/O
[Figure: a simulated neuron takes the outputs o and o′ of other neurons, weighted by w and w′, as its input wo + w′o′.]
This figure shows how a (simulated) neuron interacts with other neurons.
As with (what we know about) biological neural networks, learning consists mainly of modifying weights.
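A sketch of one such weight update (a perceptron-style rule with an assumed learning rate; modern networks instead use backpropagated gradients):

```python
def update_weights(weights, inputs, target, output, lr=0.1):
    # nudge each weight in proportion to its input and the output error
    return [w + lr * (target - output) * x for w, x in zip(weights, inputs)]

print(update_weights([0.0, 0.0], [1.0, 0.5], target=1.0, output=0.0))
# weights move toward producing the target
```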
Google Translate Technical Architecture
Neural machine translation (NMT) … machine translation that uses a large artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model.
Deep neural machine … multiple neural network layers …
https://en.m.wikipedia.org/wiki/Neural_machine_translation
One success, for example, has been Google Translate. This was a natural first target because there is a large corpus of input-output data.
The original paper describes the use of fixed-length vectors.
“NEURAL MACHINE TRANSLATION …*”
… encode a source sentence into a fixed-length vector from which a decoder generates a translation.
… automatically search for parts of a source sentence that are relevant to predicting a target word,
… achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. …
* https://arxiv.org/pdf/1409.0473.pdf Bahdanau, Cho, and Bengio
A Neural Net Architecture for Natural Language: First Convert Words*
* (partial) http://www.jair.org/index.php/jair/article/view/11030
Neural nets deal with numbers (within vectors), so the first order of business is to convert words into numerical codes.
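The simplest such code is a one-hot vector (a toy scheme, assumed here for illustration; production systems use learned embeddings):

```python
# Assumed toy vocabulary for illustration
vocab = ["now", "is", "the", "winter", "of", "our", "discontent"]

def one_hot(word):
    # a vector with a single 1.0 at the word's index in the vocabulary
    v = [0.0] * len(vocab)
    v[vocab.index(word)] = 1.0
    return v

print(one_hot("winter"))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```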
A NN Architecture for Natural Language*
* (partial) http://www.jair.org/index.php/jair/article/view/11030
Neural nets typically have multiple layers. The example shown has 7 hidden layers (neither input nor output). It shows the type of activation at each layer.
Google Translate Example
Now is the winter of our discontent.
Made glorious summer by this sun of York
English → Dutch (for example) → English
A breakthrough occurred when Google Translate was used to translate a passage from language A to language B, and the result back to language A. Even for a nontrivial passage, the two versions were surprisingly comparable. An example is shown in the figures.
Example: Google Translate
Example: Google Translate
NLP Summary
Substantial enterprise
Grammars have role
Tools prevalent
Logic approaches may help
Neural net approaches ascendant
To summarize: natural language has become a major application area of AI in the real world. Although grammars continue to have a role, APIs have become prevalent, and the neural net approach, leveraging existing text, is the dominant approach.