CS计算机代考程序代写 Java flex Context Free Languages Slide 1

Slide 1

Context-Free Grammars
Chapter 11

Languages and Machines

Context-free Grammars, Languages, and PDAs
Context-free Language
Context-free Grammar
PDA

L
Accepts

Context-Free Grammars

A context-free grammar G is a quadruple,
(V, , R, S), where:

V is the rule alphabet
, a subset of V, the set of terminals
V – , the set of nonterminals
R, a finite subset of (V – ) V*, the set of rules
S, an element of V – , the start symbol

Example:

({S, a, b}, {a, b}, {S  a S b, S  }, S)

Derivations

x G y iff x =  A 

and A    R

y =   

w0 G w1 G w2 G . . . G wn is a derivation in G.

Let G* be the reflexive, transitive closure of G.
Then the language generated by G, denoted L(G), is:
L(G) = {w  * : S G* w}.

An Example Derivation

Example:

Let G = ({S, a, b}, {a, b}, {S  a S b, S  }, S)

S  a S b  aa S bb  aaa S bbb  aaabbb

S * aaabbb

Definition of a Context-Free Grammar

A language L is context-free iff it is generated by some
context-free grammar G.

AnBn
S  
S  aSb

Balanced Parentheses
S  
S  SS
S  (S)

Recursive Grammar Rules
A rule is recursive iff it is X  w1Yw2, where:

Y * w3Xw4 for some w1, w2, w3, and w4 in V*.

A grammar is recursive iff it contains at least one recursive rule.

Examples: S  (S) S  (T)

T  (S)

Self-Embedding Grammar Rules
A rule in a grammar G is self-embedding iff it is :

X  w1Yw2, where Y * w3Xw4 and
both w1w3 and w4w2 are in +.

A grammar is self-embedding iff it contains at least one self-embedding rule.

Example: S  aSa is self-embedding

S  aS is recursive but not self-
embedding
S  aT
T  Sa is self-embedding

Where Context-Free Grammars Get Their Power
If a grammar G is not self-embedding then L(G) is regular.

If a language L has the property that every grammar that defines it is self-embedding, then L is not regular.

PalEven = {wwR : w  {a, b}*}
G = {{S, a, b}, {a, b}, R, S}, where:

R = { S  aSa
S  bSb
S   }.

Equal Numbers of a’s and b’s
Let L = {w  {a, b}*: #a(w) = #b(w)}.
G = {{S, a, b}, {a, b}, R, S}, where:

R = { S  aSb
S  bSa
S  SS
S   }.

Arithmetic Expressions
G = (V, , R, E), where
V = {+, *, (, ), id, E},
 = {+, *, (, ), id},
R = {
E  E + E
E  E  E
E  (E)
E  id }

Backus-Naur Form (BNF)
The symbol | should be read as “or”.

Example: S  aSb | bSa | SS | 

‘ ‘ replaced by “: : -” (read as “can be”)

Allow a nonterminal symbol to be any sequence of characters surrounded by angle brackets.

Examples of nonterminals:

A notation for writing practical context-free grammars

BNF for a Java Fragment
::= {} | {}
::= |
::= | while () |
if () |
do while (); |
; |
return | return |
;

HTML

Item 1, which will include a sublist

First item in sublist
Second item in sublist

Item 2

A grammar:

HTMLtext  Element HTMLtext | 

Element  UL | LI | … (and other kinds of elements that
are allowed in the body of an HTML document)

UL 

HTMLtext

LI 

HTMLtext

Designing Context-Free Grammars
● Generate related regions together.

AnBn

● Generate concatenated regions:

A  BC

● Generate outside in:

A  aAb

Concatenating Independent Languages
Let L = {anbncm : n, m  0}.

The cm portion of any string in L is completely independent of the anbn portion, so we should generate the two portions separately and concatenate them together.
G = ({S, N, C, a, b, c}, {a, b, c}, R, S} where:
R = { S  NC
N  aNb
N  
C  cC
C   }.

L = { : k  0 and i (ni  0)}

Examples of strings in L: , abab, aabbaaabbbabab

Note that L = {anbn : n  0}*.
G = ({S, M, a, b}, {a, b}, R, S} where:

R = { S  MS
S  
M  aMb
M  }.

Unequal a’s and b’s
L = {anbm : n  m}

G = (V, , R, S), where
V = {a, b, S, A, B},
 = {a, b},
R =
S  A /* more a’s than b’s
S  B /* more b’s than a’s
A  a /* at least one extra a generated
A  aA
A  aAb
B  b /* at least one extra b generated
B  Bb
B  aBb

Context free languages:

We care about structure.

E + E

id E * E

3 id id

5 7
Structure

To capture structure, we must capture the path we took through the grammar. Derivations do that.

Example:

S  
S  SS
S  (S)

1 2 3 4 5 6
S  SS  (S)S  ((S))S  (())S  (())(S)  (())()
S  SS  (S)S  ((S))S  ((S))(S)  (())(S)  (())()
1 2 3 5 4 6

But the order of rule application doesn’t matter.
Derivations

Parse trees capture essential structure:

1 2 3 4 5 6
S  SS  (S)S  ((S))S  (())S  (())(S)  (())()
S  SS  (S)S  ((S))S  ((S))(S)  (())(S)  (())()
1 2 3 5 4 6

S S

( S ) ( S )

( S ) 


Derivations

Parse Trees
A parse tree, derived by a grammar G = (V, , R, S), is a rooted, ordered tree in which:

● Every leaf node is labeled with an element of   {},

● The root node is labeled S,

● Every other node is labeled with some element of:
V – , and

● If m is a nonleaf node labeled X and the children of m
are labeled x1, x2, …, xn, then R contains the rule
X  x1x2…xn.

Ambiguity
A grammar is ambiguous iff there is at least one string in L(G) for which G produces more than one parse tree.

For most applications of context-free grammars, this is a problem.

An Arithmetic Expression Grammar
E  E + E id + id * id
E  E  E
E  (E)
E  id

Even a Very Simple Grammar Can be Highly Ambiguous
S  (())()
S  SS
S  (S)

Inherent Ambiguity
Some languages have the property that every grammar for them is ambiguous. We call such languages inherently ambiguous.

Example:

L = {anbncm: n, m  0}  {anbmcm: n, m  0}.

It can be proved that L is inherently ambiguous.

We can generate anbncm and anbmcm unambiguously but anbncn will be generated in two ways.

Inherent Ambiguity
Both of the following problems are undecidable:

Given a context-free grammar G, is G ambiguous?

Given a context-free language L, is L inherently

ambiguous?

But We Can Often Reduce Ambiguity
We can get rid of:

●  rules like S  ,

● rules with symmetric right-hand sides, e.g.,

S  SS
E  E + E

● rule sets that lead to ambiguous attachment of
optional postfixes.

A Highly Ambiguous Grammar
S 
S  SS
S  (S)

Resolving the Ambiguity with a Different Grammar
The biggest problem is the  rule.

A different grammar for the language of balanced parentheses:

S*  
S*  S
S  SS
S  (S)
S  ()

Nullable Variables
A variable X is nullable iff either:
(1) there is a rule X  , or
(2) there is a rule X  PQR… and P, Q, R, …
are all nullable.

So compute N, the set of nullable variables, as follows:

1. Set N to the set of variables that satisfy (1).
2. Repeat until no change
Add variables satisfying (2)

A General Technique for Getting Rid of -Rules
Definition: a rule is modifiable iff it is of the form:

P  R, for some nullable R, P    

removeEps(G: cfg) =
1. Let G = G.
2. Find the set N of nullable variables in G.
3. For each modifiable rule P  R of G do
Add the rule P  .
4. Delete from G all rules of the form X  .
5. Return G.

L(G) = L(G) – {}

An Example
G = {{S, T, A, B, C, a, b, c}, {a, b, c}, R, S), R =
{ S  aTa
T  ABC
A  aA | C
B  Bb | C
C  c |  }
Nullable variables = {A, B, C, T}

G: add : remove:
S  aa T  B C  
T  BC T  C
T  AC T  A
T  AB A  a
B  b

What If   L?
atmostoneEps(G: cfg) =
1. G = removeEps(G).
2. If SG is nullable then /* i. e.,   L(G)
2.1 Create in G a new start symbol S*.
2.2 Add to RG the two rules:
S*  
S*  SG.
3. Return G.

But There is Still Ambiguity
S*   What about ()()() ?
S*  S
S  SS
S  (S)
S  ()

Eliminating Symmetric Recursive Rules
S*  
S*  S
S  SS
S  (S)
S  ()

Replace S  SS with one of:

S  SS1 /* force branching to the left
S  S1S /* force branching to the right

So we get:
S*   S  SS1
S*  S S  S1
S1  (S)
S1  ()

Eliminating Symmetric Recursive Rules
So we get:
S*  
S*  S
S  SS1
S  S1
S1  (S)
S1  ()
S*

S S1

( ) ( ) ( )

Arithmetic Expressions
E  E + E
E  E  E
E  (E)
E  id
E E

E E E E

id  id  id id  id  id
Problem 1: Associativity

Arithmetic Expressions
E  E + E
E  E  E
E  (E)
E  id
E E

E E E E

id  id + id id  id + id
Problem 2: Precedence

Arithmetic Expressions – A Better Way
E  E + T
E  T
T  T * F
T  F
F  (E)
F  id

Example:

id + id * id

Dangling “else”
The dangling else problem:

::= if then
::= if then else

Consider:

if cond1 then if cond2 then stmt1 else stmt2

cond2

if then
if then else
cond1
stmt1
stmt2
cond2

if then else
if then
cond1
stmt2
stmt1
Dangling “else” ambiguity

Dangling “else” solution
::=
|
::= if then else
|
::= if then
| if then else

cond2

if then
if then else
cond1
stmt1
stmt2

other_stmt
other_stmt
Dangling “else” solution

k
k
n
n
n
n
n
n
b
a
b
a
b
a
…
2
2
1
1

Related Posts