Slide 1
Algorithms and Decision Procedures for
Context-Free Languages
Chapter 14
Decision Procedures for CFLs
Membership: Given a language L and a string w, is w in L?
Two approaches:
● If L is context-free, then there exists some context-free
grammar G that generates it. Try derivations in G and see
whether any of them generates w.
Problem: S S T | a Try to derive aaa
● If L is context-free, then there exists some PDA M that
accepts it. Run M on w.
Problem:
Normal Forms for Grammars
Chomsky Normal Form, in which all rules are of one of the following two forms:
● X a, where a , or
● X BC, where B and C are elements of V – .
Advantages:
● Parsers can use binary trees.
● Exact length of derivations is known:
S
A B
A A B B
a a b B B
b b
Conversion to Chomsky Normal Form
For any context-free grammar G, there exists a context-free grammar G’ in Chomsky normal form such that L(G) = L(G) – {}.
Algorithm:
1. Remove all -rules, using the algorithm removeEps. (We have seen this before.)
2. Remove all unit productions (rules of the form A B).
3. Remove all rules whose right hand sides have length
greater than 1 and include a terminal:
(e.g., A aB or A BaC)
4. Remove all rules whose right hand sides have length
greater than 2:
(e.g., A BCDE)
Definition: a rule is modifiable iff it is of the form:
P R, for some nullable R, P
removeEps(G: cfg) =
1. Let G = G.
2. Find the set N of nullable variables in G.
3. For each modifiable rule P R of G do
Add the rule P .
4. Delete from G all rules of the form X .
5. Return G.
L(G) = L(G) – {}
Removing -Productions
Example:
S aACa
A B | a
B C | c
C cC |
Example
S aACa | aAa | aCa | aa
A B | a
B C | c
C cC | c
Nullable: A,B,C
Unit Productions
A unit production is a rule A B (right-hand side consists of a single nonterminal symbol)
removeUnits(G)
1. Let G = G.
2. Until no unit productions remain in G do:
2.1 Choose some unit production X Y.
2.2 Remove it from G.
2.3 Consider only rules that still remain.
For every rule Y , where V*, do:
Add to G the rule X unless it is a rule
that has already been removed once.
3. Return G.
Example
Remove A B. Add A C | c.
Remove B C. Add B cC (B c, already there).
Remove A C. Add A cC (A c, already there).
So removeUnits returns:
S aACa | aAa | aCa | aa
A a | c | cC
B c | cC
C cC | c
S aACa | aAa | aCa | aa
A B | a
B C | c
C cC | c
Mixed Rules
removeMixed(G) =
1. Let G = G.
2. Create a new nonterminal Ta for each terminal a in .
3. Modify each rule whose right-hand side has length greater
than 1 and that contains a terminal symbol by substituting
Ta for each occurrence of the terminal a.
4. Add to G, for each Ta, the rule Ta a.
5. Return G.
Example
S aACa | aAa | aCa | aa
A a | c | cC
B c | cC
C cC | c
removeMixed returns:
S TaACTa | TaATa | TaCTa | TaTa
A a | c | TcC
B c | TcC
C TcC | c
Ta a
Tc c
Long Rules
removeLong(G) =
1. Let G = G.
2. For each rule r of the form:
A N1N2N3N4…Nn, n > 2
create new nonterminals M2, M3, … Mn-1.
3. Replace r with the rule A N1M2.
4. Add the rules:
M2 N2M3,
M3 N3M4,
…
Mn-1 Nn-1Nn.
5. Return G.
Example
S TaACTa | TaATa | TaCTa | TaTa
A a | c | TcC
B c | TcC
C TcC | c
Ta a
Tc c
removeLong returns:
S TaS1 S TaS3 S TaS4 S TaTa
S1 AS2 S3 ATa S4 CTa
S2 CTa
A a | c | TcC
B c | TcC
C TcC | c
Ta a
Tc c
Using a Grammar
decideCFLusingGrammar(L: CFL, w: string) =
1. If given a PDA, build G so that L(G) = L(M).
2. If w = then if SG is nullable then accept, else reject.
3. If w then:
3.1 Construct G in Chomsky normal form such that
L(G) = L(G) – {}.
3.2 If G derives w, it does so in 2|w| – 1 steps. Try all
derivations in G of 2|w| – 1 steps. If one of them
derives w, accept. Otherwise reject.
Emptiness
Given a context-free language L, is L = ?
decideCFLempty(G: context-free grammar) =
1. Let G = removeunproductive(G).
2. If S is not present in G then return True
else return False.
Finiteness
Given a context-free language L, is L infinite?
decideCFLinfinite(G: context-free grammar) =
1. Lexicographically enumerate all strings in * of length
greater than k and less than or equal to 2k.
2. If, for any such string w, decideCFL(L, w) returns True
then return True. L is infinite.
3. If, for all such strings w, decideCFL(L, w) returns False
then return False. L is not infinite.
decideCFLempty2(G: context-free grammar)
Check for strings of length up to k.
The Undecidable Questions about CFLs
● Is L = *?
● Is the complement of L context-free?
● Is L regular?
● Is L1 = L2?
● Is L1 L2?
● Is L1 L2 = ?
● Is L inherently ambiguous?
● Is G ambiguous?