CS代写 flex Context Free Languages AI Grammars and

Grammars and
COMP1600 / COMP6260
Australian National University
Semester 2, 2021
1/59

Formal Languages – A Reminder of Terminology
The alphabet or vocabulary of a formal language is a set of tokens (or letters). It is usually denoted Σ.
A string over Σ is a sequence of tokens, or the null-string ε.
􏰎 sequence may be empty, giving empty string ε
􏰎 ababc is a string over Σ = {a,b,c}
A language with alphabet Σ is some set of strings over Σ.
􏰎 For example, the set of all strings Σ∗
􏰎 or the set of all strings of even length, {w ∈ Σ∗ | w has even length}
Notation.
Σ∗ is the set of all strings over Σ.
Therefore, every language with alphabet Σ is some subset of Σ∗.
1/59
1/59

Specifying Languages
Languages can be given . . .
as a finite enumeration, e.g. L = {ε, a, ab, abb}
as a set, by giving a predicate, e.g. L = {w ∈ Σ∗ | P(w)} for some alphabet Σ
algebraically by regular expressions, e.g. L = L(r) for regexp r by an automaton, e.g. L = L(A) for some FSA A
by a grammar (this lecture)
Grammar.
a concept that has been invented in linguistics to describe natural languages
describes how strings are constructed rather than how membership can be checked (e.g. by an automaton)
the main tool to describe syntax.
2/59
2/59

Grammars in general
Formal Definition. A grammar is a quadruple ⟨Vt,Vn,S,P⟩ where Vt is a finite set of terminal symbols (the alphabet)
Vn is a finite set of non-terminal symbols disjoint from Vt (Notation: V = Vt ∪ Vn)
S is a distinguished non-terminal symbol called the start symbol P is a set of productions, written
α→β
where
􏰎 α ∈ V ∗VnV ∗ (i.e. at least one non-terminal in α)
􏰎 β ∈ V∗ (i.e. β is any list of symbols)
3/59
3/59

Example
The grammar
G =⟨{a,b}, {S,A}, S, {S →aAb, aA→aaAb, A→ε}⟩
has the following components:
Terminals: {a,b} Non-terminals: {S,A} Start symbol: S
Notation.
Productions:
S → aAb aA → aaAb
A→ε
Often, we just list the productions P, as all other components can be inferred (S is the standard notation for the start symbol)
The notation α → β1 | · · · | βn abbreviates the set of productions
α→β1, α→β2, …, α→βn (like for inductive data types)
4/59
4/59

Derivations
Intuition.
A production α → β tells you what you can “make” if you have α: you can turn it into β.
The production α → β allows us to re-write any string γαρ to γβρ. Notation: γαρ ⇒ γβρ
Derivations.
α ⇒∗ β if α can be re-written to β in 0 or more steps so ⇒∗ is the reflexive transitive closure of ⇒.
Language of a grammar.
informally: all strings of terminal symbols that can be generated from the start symbol S
formally: L(G)={w ∈Vt∗ |S ⇒∗ w}
Sentential Forms of a grammar.
informally: all strings (may contain non-terminals) that can be generated from S
formally: S(G)={w ∈V∗ |S ⇒∗ w}.
5/59
5/59

Example
Productions of the grammar G.
S →aAb, aA→aaAb, A→ε.
Example Derivation.
S ⇒ aAb ⇒ aaAbb ⇒ aaaAbbb ⇒ aaabbb
last string aaabbb is a sentence, others are sentential forms
Language of grammar G.
L(G)={anbn |n∈N,n≥1}
Alternative Grammar for the same language
S → aSb, S → ab.
(Grammars and languages are not in 1-1 correspondence)
6/59
6/59

The Chomsky Hierarchy
By Noam Chomsky (a linguist!), according to the form of productions: Unrestricted: (type 0) no constraints.
Context-sensitive: (type 1) the length of the left hand side of each production must not exceed the length of the right (with one exception).
Context-free: (type 2) the left of each production must be a single non-terminal.
Regular: (type 3) As for type 2, and the right of each production is also constrained (details to come).
(There are lots of intermediate types, too.)
7/59
7/59

Classification of Languages
Definition. A language is type n if it can be generated by a type n grammar.
Immediate Fact.
Every language of type n + 1 is also of type n.
Establishing that a language is of type n
give a grammar of type n that generates the language usually the easier task
Disproving that a language is of type n
must show that no type n-grammar generates the language usually a difficult problem
8/59
8/59

Example — language {anbn | n ∈ N, n ≥ 1} Different grammars for this language
Unrestricted (type 0):
Context-free (type 2):
S → aAb aA → aaAb
A→ε
S → ab S → aSb
Recall. We know from last week that there is no DFA accepting L We will see that this means that there’s no regular grammar so the language is context-free, but not regular.
9/59
9/59

Regular (Type 3) Grammars
Definition. A grammar is regular if all its productions are either right-linear, i.e. of the form
A → aB or A → a or A → ε or left-linear, i.e. of the form
A → Ba or A → a or A → ε.
right and left linear grammars are equivalent: they generate the same
languages
we focus on right linear (for no deep reason)
i.e. one symbol is generated at a time (cf. DFA/NFA!) termination with terminal symbols or ε
Next Goal. Regular Grammars generate precisely all regular languages.
10/59
10/59

Regular Languages – Many Views
Theorem. Let L be a language. Then the following are equivalent: L is the language generated by a right-linear grammar;
L is the language generated by a left-linear grammar;
L is the language accepted by some DFA;
L is the language accepted by some NFA;
L is the language specified by a regular expression.
So far.
have seen that NFAs and DFAs generate the same languages (subset construction)
have hinted at regular expressions and NFAs generate the same languages
Goal. Show that NFAs and right-linear grammars generate the same languages.
11/59
11/59

From NFAs to Right-linear Grammars
Given. Take an NFA A = (Σ,S,s0,F,R).
alphabet, state set, initial state, final states, transition relation
Construction of a right-linear grammar
terminal symbols are elements of the alphabet Σ; non-terminal symbols are the states S;
start symbol is the start state s0;
productions are constructed as follows:
gives production
T → aU
gives production
(Formally, a transition T −→ U means (T,a,U) ∈ R.)
Observation. The grammar so generated is right-linear, and hence regular.
12/59
Each transition a
T −→ U Each final state
T∈F T→ε a
12/59

NFAs to Right-linear Grammars – Example
Given. A non-deterministic automaton
􏰇􏰄 􏰇􏰄 􏰇􏰄 􏰇􏰄
ESaES1 ES2cES3
􏰆 􏰅 􏰆 􏰅b 􏰆 􏰅 􏰆􏰔 􏰓􏰅
aTbTcT 􏰑􏰐 􏰑􏰐 􏰑􏰐
Equivalent Grammar obtained by construction
􏰕􏰒
S → aS
S → aS1 S1 → bS1 S1 → bS2
S2 → cS2 S2 → cS3 S3 → ε
Exercise. Convince yourself that the NFA accepts precisely the words that the grammar generates.
13/59
13/59

From Right-linear Grammars to NFAs
Given. Right-linear grammar (Vt,Vn,S,P)
terminals, non-terminals, start symbol, productions
Construction of an equivalent NFA has alphabet is the terminal symbols Vt;
states are the non-terminal symbols Vn together with new state Sf (for final);
start state is the start symbol S;
final states are Sf and all non-terminals T such that there exists a
production T → ε;
transition relation is constructed as follows:
Each production
T → aU
Each transition
T → a
gives transition a
T −→ U gives transition
a
T −→ Sf
14/59
14/59

Right-linear Grammars to NFAs – Example
Given. Grammar G with the productions S → 0 S → 1T
T → ε T → 0T T → 1T (generates binary strings without leading zeros)
Equivalent Automaton obtained by construction.
Sf
0
S
the grammar generates.
15/59
1
T 0,1
Exercise. Convince yourself that the NFA accepts precisely the words that
15/59

Context-Free (Type 2) Grammars (CFGs)
Recall. A grammar is type-2 or context free if all productions have the form
A→ω
where A ∈ Vn is a non-terminal, and ω ∈ V ∗ is an (arbitrary) string. left side is non-terminal
right side can be anything
independent of context, replace LHS with RHS.
In Contrast. Context-Sensitive grammars may have productions αAβ → αωβ
may only replace A by ω if A appears in context α β
16/59
16/59

Example
Goal. Design a CFG for the language L={ambncm−n |m≥n≥0}
Strategy. Every word ω ∈ L can be split ω=am−n |anbn |cm−n
and hence L = {akanbnck | n,k ≥ 0}
convenient to not have comparison between n and m
generate ak …ck, i.e. same number of leading as and trailing cs fill … in the middle by anbn, i.e. same number of as and bs
use different non-terminals for both phases of the construction
Resulting Grammar. (productions only)
S → aSc | T T → aTb|ε
17/59
17/59

Example ctd.
Grammar
Example Derivation. of aaabbc:
S ⇒ aSc ⇒ aTc
⇒ aaTbc ⇒ aaaTbbc ⇒ aaabbc
S → aSc | T T → aTb|ε
18/59
18/59

Parse Trees
Idea. Represent derivation as tree rather than as list of rule applications describes where and how productions have been applied
generated word can be collected at the leaves
Example for the grammar that we have just constructed S
aSc T
a a
ε
Tb Tb
19/59
19/59

The Power of Context-Free Grammars
A fun example:
http://pdos.csail.mit.edu/scigen
20/59
20/59

Parse Trees Carry Semantics
Take the code
if e1 then if e2 then s1 else s2
where e1, e2 are boolean expressions and s1, s2 are subprograms. Two Readings.
if e1 then ( if e2 then s1 else s2 )
and
if e1 then ( if e2 then s1 ) else s2
Goal. unambiguous interpretation of the code leading to determined and
clear program execution.
21/59
21/59

Ambiguity
Recall that we can present CFG derivations as parse trees.
Until now this was mere pretty presentation; now it will become important.
A context-free grammar G is unambiguous iff every string can be derived by at most one parse tree.
G is ambiguous iff there exists any word w ∈ L(G ) derivable by more than one parse trees.
22/59
22/59

Example: If-Then and If-Then-Else
Consider the CFG
S → if bexpthenS | if bexpthenS elseS | prog
where bexp and prog stand for boolean expressions and (if-statement free) programs respectively, defined elsewhere.
The string if e1 then if e2 then s1 else s2 then has two parse trees:
SS
xx “”
if e1 then S
tt || 􏰣􏰣 “”
if e2 then S else S
xx 􏰣􏰣 “” ((
if e1 then S xx 􏰣􏰣
if e2 then S
else
S
􏰣􏰣 s2
􏰣􏰣 􏰣􏰣 􏰣􏰣 s1 s2 s1
23/59
23/59

Example: If-Then and If-Then-Else
That grammar was ambiguous. But here’s a grammar accepting the exact same language that is unambiguous:
S→ ifbexpthenS | T
T → if bexpthenT elseS | prog
There is now only one parse for if e1 then if e2 then s1 else s2. This is given on the next slide:
24/59
24/59

Example: If-Then and If-Then-Else
S
xx if e1 then
“”
􏰣􏰣
S T
if e2 then
tt || 􏰣􏰣 “” T else
􏰣􏰣 s1 T
􏰣􏰣 s2
You cannot parse this string as if e1 then ( if e2 then s1 ) else s2.
S
􏰣􏰣
25/59
25/59

Reflecting on This Example
Observation.
more than one grammar for a language some are ambiguous, others are not ambiguity is a property of grammars
Grammars for Programs.
ambiguity is bad: don’t know how program will execute! replace ambiguous grammar with unambiguous one
Choices for converting ambiguous grammars to unambiguous ones decide on just one parse tree
e.g. if e1 then ( if e2 then s1 ) else s2 vs if e1 then ( if e2 then s1 else s2 )
in example: we have chosen if e1 then ( if e2 then s1 else s2 )
26/59
26/59

What Ambiguity Isn’t
Question. Is the grammar with the following production ambiguous? T → ifbexpthenTelseS
Reasoning.
Suppose that the above production was used we can then expand either T or S first.
A. This is not ambiguity.
both options give rise to the same parse tree
indeed, for context-free languages it doesn’t matter what production is applied first.
thinking about parse trees, both expansions happen in parallel.
Main Message. Parse trees provide a better representation of syntax than
derivations. 27/59 27/59

Inherently Ambiguous Languages
Q. Can we always remove ambiguity?
Example. Language L = {aibjck | i = j or j = k}
Q. Why is this context free?
A. Note that L = {ai bi ck } ∪ {ai bj cj }
idea: start with production that “splits” between the union S → T | W where T is “left” and W is “right”
Complete Grammar.
T → UV
U → aUb | ε V → cV | ε
W → XY
X → aX | ε Y → bYc | ε
S→T|W
28/59
28/59

Inherently Ambiguous Languages
Problem. Both left part ai bi ck and right part ai bj cj has non-empty intersection: aibici
Ambiguity where a, b and c are equi-numerous: SS
􏰣􏰣 􏰣􏰣 TW
}} !! || “” UVXY
~~ 􏰣􏰣 !! 􏰣􏰣 !! ~~ 􏰣􏰣 || 􏰣􏰣 aUbcVaXbYc
􏰣􏰣 􏰣􏰣 􏰣􏰣 􏰣􏰣 εεεε
Fact. There is no unambiguous grammar for this language (we don’t prove this)
29/59
29/59

The Bad News
Q. Can we compute an unambiguous grammar whenever one exists? Q. Can we even determine whether an unambiguous grammar exists?
A. If we interpret “compute” and “determine” as “by means of a program”, then no.
There is no program that solves this problem for all grammars input: CFG G, output: ambiguous or not. This problem is
undecidable
(More undecidable problems next week!)
30/59
30/59

Example: Subtraction
Example.
S → S−S|int int stands for integers
the intended meaning of − is subtraction Ambiguity.
SS
􏰦􏰦 􏰣􏰣 􏰤􏰤 􏰢􏰢 􏰣􏰣 􏰥􏰥 S−15−S
􏰢􏰢 􏰣􏰣 􏰥􏰥 􏰦􏰦 􏰣􏰣 􏰤􏰤 5−3 3−1
Evaluation.
left parse tree evaluates to 1 right parse tree evaluates to 3 so ambiguity matters!
31/59
31/59

Technique 1: Associativity
Idea for ambiguity induced by binary operator (think: −) prescribe “implicit parentheses”, e.g. a − b − c ≡ (a − b) − c make operator associate to the left or the right
Left Associativity.
S → S−int|int 5−3−1 can only be read as (5−3)−1
this is left associativity Right Associativity.
S → int − S | int
Idea. Break the symmetry
one side of operator forced to lower level here: force right hand side of i to lower level
Result.
32/59
32/59

Example: Multiplication and Addition
Example. Grammar for addition and multiplication S → S∗S|S+S|int
Ambiguity.
1+2∗3 can be read as (1+2)∗3 and 1+(2∗3) with different results also 1 + 2 + 3 is ambiguous – but this doesn’t matter here.
Take 1. The trick we have just seen
strictly evaluate from left to right
but this gives 1 + 2 ∗ 3 ≡ (1 + 2) ∗ 3, not intended!
Goal. Want ∗ to have higher precedence than +
33/59
33/59

Technique 2: Precedence
Example Grammar giving ∗ higher precedence: S→S+T|T
T → T ∗ int | int
Given e.g. 1+2∗3 or 2∗3+1
forced to expand + first: otherwise only ∗ so + will be last operation evaluated
Example. Derivation of 1 + 2 ∗ 3
suppose we start with S ⇒ T ⇒ T ∗ int stuck, as cannot generate 1 + 2 from T
Idea. Forcing operation with higher priority to lower level
three levels: S, (highest), T (middle) and integers lowest-priority operation generated by highest-level nonterminal
34/59
34/59

Example: Basic Arithmetic
Repeated use of + and ∗:
S→S+T|S−T|T T → T∗U | T/U | U U → (S) | int
Main Differences.
have parentheses to break operator priorities, e.g. (1 + 2) ∗ 3 parentheses at lowest level, so highest priority
lower-priority operator can be inside parentheses
expressions of arbitrary complexity (no nesting in previous examples)
35/59
35/59

Example: Balanced Brackets
S → ε|(S)|SS
associativity: create brackets from left or from right (as before)
…twowaysofgenerating(): S⇒SS⇒S⇒(S)⇒() indeed, any expression has infinitely many parse trees
Reason. More than one way to derive ε.
Ambiguity.
36/59
36/59

Technique 3: Controlling ε
Alternative Grammar with only one way to derive ε:
S→ε|T
T → TU | U U→ () | (T)
ε can only be derived from S
all other derivations go through T
here: combined with multiple level technique ambiguity with ε can be easy to miss!
37/59
37/59

From Grammars to Automata
So Far.
regular languages correspond to regular grammars.
Q. What automata correspond to context free grammars?
38/59
38/59

General Structure of Automata
a0
read head
a1
a2

. . . .
an input tape
Auxiliary Memory
Finite
State Control
input tape is a set of symbols
finite state control is just like for DFAs / NFAs symbols are processed and head advances
new aspect: auxiliary memory
Auxiliary Memory classifies languages and grammars
no auxiliary memory: NFAs / DFAs: regular languages stack: push-down automata : context free languages unbounded tape: Turing machines: all languages
39/59
39/59

Push-down Automata — PDAs
a0
a1
a2

. .. .
an
read input tape head
zk
z2
z1
Finite
State Control
stack memory
40/59
40/59

PDAs ctd.
Actions of a push-down automaton change of internal state pushing or popping the stack advance to next input symbol
Action dependencies. Actions generally depend on current state (of finite state control)
input symbol
symbol at the top of the stack
Acceptance. The machine accepts if input string is fully read
machine is in accepting state stack is empty
Variations.
acceptance with empty stack: input fully read, stack empty
acceptance with final state: input fully read, machine in final state 41/59 41/59

Example
Language (that cannot be recognised by a DFA) L={anbn |n≥1}
cannot be recognised by a DFA
can be generated by a context free grammar can be recognised by a PDA
PDA design. (ad hoc, but showcases the idea)
phase 1: (state S1) push a’s from the input onto the stack
phase 2: (state S2) pop a’s from the stack, if there is a b on input
finalise: if the stack is empty and the input is exhausted in the final state (S3), accept the string.
42/59
42/59

Deterministic PDA – Definition
Definition. A deterministic PDA has the form (S,s0,F, Σ, Γ,Z, δ), where
S isthesetofstates,s0 ∈S istheinitialstateandF ⊆S arethe final states;
Σ is the alphabet, or set of input symbols;
Γ is the set of stack symbols, and Z ∈ Γ is the initial stack symbol; δ is a (partial) transition function
δ : S×(Σ∪{ε})×Γ 􏰡 S×Γ∗
δ : (state, input token or ε, top-of-stack) 􏰡 (new state, stack string)
Additional Requirement to ensure determinism:
if δ(s, ε, γ) is defined, then δ(s, a, γ) is undefined for all a ∈ Σ ensures that automaton has at most one execution
43/59
43/59

Notation
Given. Deterministic PDA with transition function
δ : S×(Σ∪{ε})×Γ 􏰡 S×Γ∗
δ : (state, input token or ε, top-of-stack) 􏰡 (new state, stack string)
Notation.
write δ(s, a, γ) = s′/σ
σ is a string that replaces top stack symbol final states are usually underlined (s)
Rationale.
replacing top stack symbol gives just one operation for push and pop pop: δ(s, a, γ) = s′/ε
push: δ(s,a,γ) = s′/wγ
44/59
44/59

Two types of PDA transition
Input consuming transitions
δ contains (s1,x,γ) 􏰈→ s2/σ automaton reads symbol x symbol x is consumed
Non-consuming transitions
δ contains (s1, ε, γ) 􏰈→ s2/σ
independent of input symbol
can happen any time and does not consume input symbol
45/59
45/59

Example ctd.
Language L = {anbn | n ≥ 1}
Push-down automaton
starts with Z (initial stack symbol) on stack final state is S3 (underlined)
transition function (partial) given by
δ(S0,a,Z) δ(S1, a, a) δ(S1, b, a) δ(S2, b, a) δ(S2,ε,Z)
􏰈→ S1 /aZ 􏰈→ S1 /aa 􏰈→ S2 /ε 􏰈→ S2/ε 􏰈→ S3/ε
push first a
push further a’s start popping a’s pop further a’s accept
(δ is partial, i.e. undefined for many arguments)
46/59
46/59

Example ctd. – Diagrammatically
start
a, a/aa
a,Z/aZ b,a/ε S0 S1
b, a/ε
S2 ε,Z/ε
S3
47/59
47/59

Example ctd. — PDA Trace
PDA configurations
triples: (state, remaining input, stack) top of stack on the left (by convention)
Example Execution.
(S0,aaabbb,Z) ⇒
⇒ (S1,abbb,aaZ)
⇒ (S1,bbb,aaaZ)
⇒ (S2,bb,aaZ)
⇒ (S2,b,aZ)
⇒ (S2,ε,Z)
(push first a) (push further a’s) (push further a’s) (start popping a’s) (pop further a’s) (pop further a’s) (accept)
(S1,aabbb,aZ)
⇒ (S3 , ε, ε)
Accepting execution. Ends in final state, input exhausted, empty stack.
48/59
48/59

Example ctd. — Rejection
PDA execution.
(S0,aaba,Z) ⇒ (S1,aba,aZ) ⇒ (S1,ba,aaZ)
⇒ (S2,a,aZ) ⇒ ???
Non-accepting execution.
No transition possible, stuck without reaching final state
rejection happens when transition function is undefined for current configuration (state, input, top of stack)
49/59
49/59

Example: Palindromes with ‘Centre Mark’
Example Language.
L={wcwR |w∈{a,b}∗∧wRiswreversed}
Deterministic PDA that accepts L
Push a’s and b’s onto the stack as we seem them;
When we see c, change state;
Now try to match the tokens we are reading with the tokens on top of the stack, popping as we go;
If the top of the stack is the empty stack symbol Z, pop it and enter the final state via an ε-transition. Hopefully our input has been used up too!
Exercise. Define this formally!
50/59
50/59

Non-Deterministic PDAs
Deterministic PDAs
transitions are a partial function
δ : S×(Σ∪{ε})×Γ 􏰡 S×Γ∗
δ : (state, input token or ε, top-of-stack) 􏰡 (new state, stack string) side condition about ε-transitions
Non-Deterministic PDAs
transitions given by relation
δ ⊆ S×(Σ∪{ε})×Γ × S×Γ∗
no side condition (at all).
Main differences
for deterministic PDA: at most one transition possible
for non-deterministic PDA: zero or more transitions possible
51/59
51/59

Non-Deterministic PDAs ctd.
Finite Automata
non-determinism is convenient
but doesn’t give extra power (subset construction) can convert every NFA to an equivalent DFA
Push-down automata.
non-determinism gives extra power
cannot convert every non-deterministic PDA to deterministic PDA
there are context free languages that can only be recognised by non-deterministic PDA
intuition: non-determinism allows “guessing”
Grammar / Automata correspondence
non-deterministic PDAs are more important they correspond to context-free languages
52/59
52/59

Example: Even-Length Palindromes
Palindromes of even length, without centre-marks L={wwR |w∈{a,b}∗∧wRiswreversed}
this is a context-free language
cannot be recognised by deterministic PDA
intuitive reason: no centre-mark, so don’t know when first half of word is read
Non-deterministic PDA for L has the transition δ(s,ε,γ) = r/x
x ∈ {a, b, Z }, s is the ‘push’ state and r the ‘match and pop’ state.
Intuition
“guess” (non-deterministically) whether we need to enter “match-and-pop”-state
automaton gets stuck if guess is not correct (no harm done) automaton accepts if guess is correct
53/59
53/59

Grammars and PDAs
Theorem. Context-free languages and non-deterministic PDAs are equivalent
for every CFL L there exists a PDA that accepts L
if L is accepted by a non-deterministic PDA, then L is a CFL.
Proof. We only do one direction: construct PDA from CFL. this is the “interesting” direction for parser generators other direction quite complex.
54/59
54/59

From CFG to PDA
Given. Context-Free Grammar G = (Vt , Vn , S , P )
Construct non-deterministic PDA A = (Q,Q0,F,Σ,Γ,Z,δ) States. Q0 (initial state), Q1 (working state) and Q2 (final state). Alphabet. Σ = Vt , terminal symbols of the grammar
Stack Alphabet. Γ = Vt ∪ Vn ∪ {Z}
Initialisation.
push start symbol S onto stack, enter working state Q1 δ(Q0,ε,Z) 􏰈→ Q1/SZ
Termination.
if the stack is empty (i.e. just contains Z), terminate δ(Q1,ε,Z) 􏰈→ Q2/ε
55/59
55/59

From CFGs to PDAs: working state
Idea.
build the derivation on the stack by expanding non-terminals according to productions productions
if a terminal appears that matches the input, pop it terminate, if the entire input has been consumed
Expand Non-Terminals.
non-terminals on the stack are replaced by right hand side of productions
δ(Q1, ε, A) 􏰈→ Q1/α for all productions A → α
Pop Terminals.
terminals on the stack are popped if they match the input δ(Q1, t, t) 􏰈→ Q1/ε for all terminals t
Result of Construction. Non-deterministic PDA
may have more than one production for a non-terminal
56/59
56/59

Example — Derive a PDA for a CFG
Arithmetic Expressions as a grammar:
S→S+T|T T→T∗U|U U → (S) | int
1. Initialise:
δ(Q0,ε,Z) 􏰈→ Q1/SZ 2. Expand non-terminals:
δ(Q1,ε,S) δ(Q1, ε, S) δ(Q1,ε,T)
􏰈→ Q1/S +T 􏰈→ Q1/T
􏰈→ Q1/T ∗U
δ(Q1,ε,T) δ(Q1, ε, U)
δ(Q1,ε,U)
􏰈→ Q1/U 􏰈→ Q1/(S) 􏰈→ Q1/int
57/59
57/59

CFG to PDA ctd.
3. Match and pop terminals:
4. Terminate:
δ(Q1, +, +) 􏰈→ δ(Q1, ∗, ∗) 􏰈→ δ(Q1, int, int) 􏰈→ δ(Q1, (, () 􏰈→ δ(Q1, ), )) 􏰈→
Q1 /ε Q1 /ε Q1 /ε Q1 /ε Q1 /ε
δ(Q1,ε,Z) 􏰈→ Q2/ε
58/59
58/59

Example Trace
(q0, int∗int, Z)
⇒(Q1, ⇒(Q1, ⇒(Q1, ⇒(Q1,
⇒ (Q1 ,
⇒ (Q1 ,
⇒ (Q1 ,
⇒ (Q1 ,
⇒ (Q1 ,
⇒ (Q2 , ⇒
int ∗ int, int ∗ int, int∗int, int∗int, int ∗ int,
∗int, int, int, ε, ε, accept
SZ) TZ) T∗UZ) U∗UZ) int ∗ UZ) ∗UZ ) UZ) intZ ) Z) ε)
59/59
59/59