COMP30026 Models of Computation – Context-Free Languages
COMP30026 Models of Computation
Context-Free Languages
Bach Le / Anna Kalenkova
Lecture Week 8 Part 2
Semester 2, 2021
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 1 / 14
A Bit of History
Finite-state machines go back to McCulloch and Pitts (1943), who
wanted to model the working of neurons and synapses.
The formalism that we use today is from Moore (1956).
Kleene (1956) established the connections between regular
expressions and finite-state automata.
We now turn to context-free grammars.
These go back to Post’s “productions” and Chomsky’s grammar
formalism (1956).
Chomsky, a linguist, proposed a range of formalisms in grammar form
for the description of natural language syntax.
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 2 / 14
Machines vs Languages
Regular
Context-free
Context
sensitive
Decidable
Turing recognisable
Finite
automata
Pushdown
automata
Linear-
bounded
automata
Halting
Turing
machines
Turing machines
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 3 / 14
Context-Free Grammars in Computer Science
In the 60’s, computer scientists started adopting context-free free
grammars to describe syntax of programming languages.
They are frequently referred to as Backus-Naur Formalism (BNF).
Standard tools for parsing owe much to this formalism, which
indirectly has helped make parsing a routine task.
It is extensively used to specify syntax of programming languages,
data formats (XML, JSON), etc.
Pushdown automata are to context-free grammars what finite-state
automata are to regular languages.
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 4 / 14
Context-Free Grammars
We have already used the formalism of context-free grammars. To
specify the syntax of regular expressions we gave a grammar, much
like
R → 0
R → 1
R → eps
R → empty
R → R ∪ R
R → R R
R → R∗
Hence a grammar is a set of substitution rules, or productions. We
have the shorthand notation
R → 0 | 1 | eps | empty | R ∪ R | R R | R∗
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 5 / 14
Derivations, Sentences and Sentential Forms
A simpler example is this grammar G :
A → 0 A 0
A → 1 A 1
A → ǫ
Using the two rules as a rewrite system, we get derivations such as
A ⇒ 0A0
⇒ 00A00
⇒ 001A100
⇒ 00100100
A is called a variable. Other symbols (here 0 and 1) are terminals.
We refer to a valid string of terminals (such as 00100100) as a
sentence. The intermediate strings that mix variables and terminals
are sentential forms.
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 6 / 14
Context-Free Languages
Clearly a grammar determines a formal language.
The language of G is written L(G ).
L(G ) = {wwR | w ∈ {0, 1}∗}
A language which can be generated by some context-free grammar is
a context-free language (CFL).
It should be clear that some of the languages that we found not to be
regular are context-free, for example
{0n1n | n ≥ 1}
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 7 / 14
Context-Free Grammars Formally
A context-free grammar (CFG) G is a 4-tuple (V ,Σ,R , S), where
1 V is a finite set of variables,
2 Σ is a finite set of terminals,
3 R is a finite set of rules, each consisting of a variable (the
left-hand side) and a string in (V ∪ Σ)∗ (the right-hand side),
4 S is the start variable.
The binary relation ⇒ on (V ∪ Σ)∗ is defined as follows.
Let u, v ,w ∈ (V ∪ Σ)∗. Then uAw ⇒ uvw iff A → v is a rule in R .
That is, ⇒ captures a single derivation step.
Let
∗
⇒ be the reflexive transitive closure of ⇒.
L(G ) = {s ∈ Σ∗ | S
∗
⇒ s}
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 8 / 14
Right/Left Regular Grammars (Not Examinable)
Right regular grammar:
A → a A
A → ǫ
A → b B
B → b B
B → ǫ
Left regular grammar:
A → A b
A → ǫ
A → B a
B → B a
B → ǫ
A B
a
b
b
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 9 / 14
A Context-Free Grammar for Numeric Expressions
Here is a grammar with three variables, 14 terminals, and 15 rules:
E → T | T + E
T → F | F ∗ T
F → 0 | 1 | . . . | 9 | ( E )
When the start variable is unspecified, it is assumed to be the
variable of the first rule.
An example sentence in the language is
(3 + 7) * 2
The grammar ensures that * binds tighter than +.
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 10 / 14
Parse Trees
E
T
F
( E
T
F
3
+ E
T
F
7
)
* T
F
2
A parse tree for
(3 + 7) * 2
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 11 / 14
Parse Trees
There are different derivations leading to the sentence (3 + 7) * 2,
all corresponding to the parse tree above. They differ in the order in
which we choose to replace variables. Here is the leftmost derivation:
E ⇒ T
⇒ F ∗ T
⇒ ( E ) ∗ T
⇒ ( T + E ) ∗ T
⇒ ( F + E ) ∗ T
⇒ ( 3 + E ) ∗ T
⇒ ( 3 + T ) ∗ T
⇒ ( 3 + F ) ∗ T
⇒ ( 3 + 7 ) ∗ T
⇒ ( 3 + 7 ) ∗ F
⇒ ( 3 + 7 ) ∗ 2
E → T | T + E
T → F | F ∗ T
F → 0 | 1 | . . . | 9 | ( E )
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 12 / 14
Ambiguity
Consider the grammar
E → E + E | E ∗ E | ( E ) | 0 | 1 | . . . | 9
This grammar allows not only different derivations, but different
parse trees for 3 + 7 * 2:
E
E
3
+ E
E
7
* E
2
E
E
E
3
+ E
7
* E
2
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 13 / 14
Accidental vs Inherent Ambiguity
A grammar that has different parse trees for some sentence is
ambiguous.
Sometimes we can find a better grammar (as in our example) which
is not ambiguous, and which generates the same language.
Models of Computation (Sem 2, 2021) Context-Free Languages © University of Melbourne 14 / 14