Grammars
CS106 — Compiler Principles and Construction
Fall 2011
Lecture 5: Regular Languages, and Beyond
MUST FIT
Dr. Liang
Regular Languages, and Beyond Dr. Zhiyao Liang
1
Chapter highlights
Regular grammar
Three equivalent ways of describing a regular language
Properties of regular language
Limitation of regular language
Beyond regular language
The big picture of automata theory
Common tasks in the study of theory of computation
Regular Languages, and Beyond Dr. Zhiyao Liang
2
Most general grammar
(recall lecture 1)
A grammar G = (V, T, S, P) consists of the following quadruple:
a set V of variables (non-terminal symbols), including a starting symbol S NT
a set T of terminals (same as an alphabet, )
A start symbol S V
a set P of production rules
A grammar derives (or generates) a language.
Example of rules in a general grammar :
teh the
S SA | λ
A bA | λ
Regular Languages, and Beyond Dr. Zhiyao Liang
3
Context-free grammar
A grammar is said to be context-free if every rule has a single non-terminal on the left-hand side
Example: rules of one context-free grammar:
S aSb | λ
This means you can apply the rule in any context. More complicated languages (such as English) have context-dependent rules. A language generated from a context-free grammar is called a context-free language
Regular Languages, and Beyond Dr. Zhiyao Liang
4
Regular grammar
A grammar is said to be right-linear if all productions are of the form AxB or Ax, where A and B are variables and x is a string of terminals
A grammar is said to be left-linear if all productions are of the form ABx or Ax
A regular grammar is either right-linear or left-linear.
Regular Languages, and Beyond Dr. Zhiyao Liang
5
Linear grammar
A grammar can be linear without being right- or left-linear.
A linear grammar is a grammar in which at most one variable can occur on the right side of any production rule, without any restriction on the position of the variable.
Example:
S aS | A
A Ab | λ
Regular Languages, and Beyond Dr. Zhiyao Liang
6
Another formalism for regular languages
Every regular grammar generates a regular language, and every regular language can be generated by a regular grammar.
A regular grammar is a simpler, special-case of a context-free grammar
The regular languages are a proper subset of the context-free languages
Regular Languages, and Beyond Dr. Zhiyao Liang
7
Exercises
Find a regular grammar that generates the language on = {a,b} consisting of all strings with no more than three a’s
S bS | aA | λ
A bA | aB | λ
B bB | aC | λ
C bC | λ
Regular Languages, and Beyond Dr. Zhiyao Liang
8
Exercises
Find a regular grammar that generates the language consisting of even-length strings over {a,b}
S aaS | abS | baS | bbS | λ
Regular Languages, and Beyond Dr. Zhiyao Liang
9
Exercise
What language is generated by the following
context-free grammar?
S aSa | bSb | a | b | λ
The odd/even palindrome language:
L = {w(a+b+λ)wR}
Regular Languages, and Beyond Dr. Zhiyao Liang
10
Exercise
Give a regular (right-linear) grammar for the language consisting of all strings over {a, b, c} that begin with a, contain exactly two b’s, and end with cc
S aA
A aA | cA| bB
B aB | cB| bC|
C aC | cC | cc
S aA
A bB | aA | cA
B bC | aB | cB
C aC | cC | cD
D c
Regular Languages, and Beyond Dr. Zhiyao Liang
11
Theorem 3.3
Every language generated by a right-linear grammar is regular.
Proof:
Specify a procedure for automatically constructing an NFA that mimics the derivations of a right-linear grammar.
Regular Languages, and Beyond Dr. Zhiyao Liang
12
Theorem 3.4
Every regular language can be generated by a right-linear grammar.
Proof:
Generate a DFA for the language.
Specify a procedure for automatically constructing a right-linear grammar from the DFA.
Regular Languages, and Beyond Dr. Zhiyao Liang
13
Theorem 3.4
Every regular language can be generated by a right-linear grammar.
Proof:
Generate a DFA for the language.
Specify a procedure for automatically constructing a right-linear grammar from the DFA.
Regular Languages, and Beyond Dr. Zhiyao Liang
14
3 ways of specifying regular languages
Regular expressions
DFA NFA
Regular grammars
Regular languages
describe
accept
generate
Regular Languages, and Beyond Dr. Zhiyao Liang
15
Closure properties of regular languages
Theorem 4.1 [Linz] : The class of regular languages is closed under the following operations (that is, performing these operations on regular languages creates other regular languages)
Union
Concatenation
Kleene star
Complementation
Intersection
Difference
Regular Languages, and Beyond Dr. Zhiyao Liang
16
The membership question
Given a language L and a string w, is w L?
A method for answering the membership question is called a membership algorithm. Is there a membership algorithm for regular languages?
Regular Languages, and Beyond Dr. Zhiyao Liang
17
The membership question
Theorem 4.5 [Linz]: Given a standard representation (i.e., a finite automaton, a regular expression, or a regular grammar) of any regular language L on and w *, there exists an algorithm for determining whether w is in L.
Proof: Here is the algorithm:
If the standard representation of L is in the form of a regular expression, or a regular grammar, construct an equivalent FA.
Test w to see if it is accepted by the FA.
Regular Languages, and Beyond Dr. Zhiyao Liang
18
The finiteness question
Theorem 4.6 [Linz]:
Given a standard representation (i.e., a finite automaton, a regular expression, or a regular grammar) of any regular language L on , there exists an algorithm for determining whether L is empty, finite, or infinite.
Regular Languages, and Beyond Dr. Zhiyao Liang
19
The finiteness question
Proof: Here is the algorithm:
If the standard representation of L is in the form of a regular expression, or a regular grammar, construct an equivalent FA.
If there is a simple path from the initial vertex to any final vertex, then the language is not empty.
Find all the vertices that are the base of a cycle. If any of these vertices is on a path from the initial to a final vertex, the language is infinite; otherwise, it is finite.
Regular Languages, and Beyond Dr. Zhiyao Liang
20
The “does L1 = L2” question
Theorem 4.7:
Given standard representations of two regular languages L1 and L2, there exists an algorithm for determining whether or not L1 = L2.
Regular Languages, and Beyond Dr. Zhiyao Liang
21
The “does L1 = L2” question
Proof: Here is the algorithm:
Define a new language
L3 = (L1 ~L2) (~L1 L2)
L3 is regular (see previous closure proofs)
Therefore, we can find a DFA that accepts L3.
Use theorem 4.6 to decide if L3 is empty.
L3 = iff L1 = L2 (exercise 8 in section 1.1 in the Linz textbook).
So L1 = L2 if L3 = ; otherwise, L1 L2
Regular Languages, and Beyond Dr. Zhiyao Liang
22
Regular vs. context-free
Are regular languages context-free?
Yes, because context-free means that there is a single variable on the left side of each grammar rule. All regular languages are generated by grammars that have a single variable on the left side of each grammar rule.
But, as we have seen, not all context-free grammars are regular. So regular languages are a proper subset of the class of context-free languages.
Regular Languages, and Beyond Dr. Zhiyao Liang
23
The limitation of Regular Language
Is this a regular language ?
L = {anbn| n >=0 }
It is impossible to construct a DFA for L.
Why?
Suppose there is a DFA M with N states (N is a finite number) that recognize L.
Consider a string aN+1bN+1 , which is a string in L, then M should accept it.
Lets run aN+1bN+1 with M.
Reading the first N+1 symbols will go though N+1 edges of M.
Then there must be a loop, some state is reached twice.
Let the loop contain k edges,
Then aN+1+kbN+1 will also be accepted by M
Contradiction .
The above reasoning is called the pumping lemma for regular language.
Regular Languages, and Beyond Dr. Zhiyao Liang
24
A hierarchy of grammars
General Grammar
> (> means contains)
Context-free Gramma
>
Linear Grammar
>
Regular Grammar
=
Left-linear Grammar Right-linear Grammar
Regular Languages, and Beyond Dr. Zhiyao Liang
25
Push Down Automaton (PDA)
Language Acceptor Model for CFLs
It is an NFA with a stack.
Finite
State
control
Input
Stack
Accept/Reject
Regular Languages, and Beyond Dr. Zhiyao Liang
26
Definition 7.1: Pushdown Automaton
A nondeterministic pushdown automaton (NPDA) is a 7-tuple
M = (Q, S, G, q0, d, z, F), where
Q is a finite set of states
S is the input alphabet (a finite set)
G is the stack alphabet (a finite set)
d : Q (S {l}) G (finite subsets of Q G*)
is the transition function
q0 Q is the start state
z G is the initial stack symbol
F Q is the set of accepting states
Regular Languages, and Beyond Dr. Zhiyao Liang
27
PDA and CFG (further reading)
A language L can be generated by a CFG if and only if there is a (deterministic) PDA that can accepts L.
L = {aibici | i 1} is not context-free.
There is a pumping lemma for context-free language.
Regular Languages, and Beyond Dr. Zhiyao Liang
28
Automata –
differed by memory devices
All automata (FA, PDA, TM) have finite states
A Finite Automaton has only its (finite) states to serve as memory
No auxiliary memory device
A Pushdown Automaton has a set of states
It has a stack (unbounded memory)
We can create an automaton with more powerful memory devices
Motivation : the language anbncn
PDA with two stacks
Automaton with a queue
…
29
Regular Languages, and Beyond Dr. Zhiyao Liang
29
Automata differ in the “auxiliary memory” they have and how it functions.
DFA
NFA
DPDA
PDA
Turing machine
finite memory
Unbounded stack memory
Unbounded tape memory
(tape can be read forwards and
backwards without erasing)
30
Difference between
memory devices
Regular Languages, and Beyond Dr. Zhiyao Liang
Hierarchy of automata
DFA,
NFA
DPDA
PDA
LBA
DTM, NTM
31
Regular Languages, and Beyond Dr. Zhiyao Liang
Beyond TM
There is a language that is not recognized by any Turing machine.
Some problem is undecidable.
Regular Languages, and Beyond Dr. Zhiyao Liang
32
/docProps/thumbnail.jpeg