MAKING A GRAMMAR LL(1)
PRINCIPLES OF PROGRAMMING LANGUAGES
Norbert Zeh Winter 2021
Dalhousie University
1/7
ROAD MAP
• Parsing: Transform (tokenized) program text into parse tree
• Modelling programming languages: Context-free grammars and languages
• Capturing the syntactic structure of a program: Parse trees
• Types of parsers and types of grammars they can parse
• Grammars that describe programming languages and can be parsed
efficiently
• Construction of an LL(1) grammar
• Parsing LL(1) languages
• Push-down automata
2/7
SOME FACTS ABOUT LL(1) LANGUAGES
There exist context-free languages that do not have LL(1) grammars.
3/7
SOME FACTS ABOUT LL(1) LANGUAGES
There exist context-free languages that do not have LL(1) grammars.
There is no known algorithm to determine whether a language is LL(1) (but there is an algorithm to decide whether a grammar is LL(1)).
3/7
SOME FACTS ABOUT LL(1) LANGUAGES
There exist context-free languages that do not have LL(1) grammars.
There is no known algorithm to determine whether a language is LL(1) (but there is an algorithm to decide whether a grammar is LL(1)).
The “obvious” grammar for most programming languages is usually not LL(1).
3/7
SOME FACTS ABOUT LL(1) LANGUAGES
There exist context-free languages that do not have LL(1) grammars.
There is no known algorithm to determine whether a language is LL(1) (but there is an algorithm to decide whether a grammar is LL(1)).
The “obvious” grammar for most programming languages is usually not LL(1).
In many situations, a non-LL(1) grammar can be transformed into an LL(1) grammar for the same language.
3/7
CONVERTING A GRAMMAR TO LL(1)
Two common reasons why a grammar is not LL(1) are “left-recursion” and “common prefixes”. Both can be eliminated by modifying the grammar.
4/7
CONVERTING A GRAMMAR TO LL(1)
Two common reasons why a grammar is not LL(1) are “left-recursion” and “common prefixes”. Both can be eliminated by modifying the grammar.
Left-recursion:
A → α|Aβ
4/7
CONVERTING A GRAMMAR TO LL(1)
Two common reasons why a grammar is not LL(1) are “left-recursion” and “common prefixes”. Both can be eliminated by modifying the
grammar.
Left-recursion:
A appears as the first symbol on the right-hand side of a production for A.
A → α|Aβ
4/7
CONVERTING A GRAMMAR TO LL(1)
Two common reasons why a grammar is not LL(1) are “left-recursion” and “common prefixes”. Both can be eliminated by modifying the grammar.
Left-recursion: Common prefix:
A → α|Aβ A → αβ|αγ
4/7
CONVERTING A GRAMMAR TO LL(1)
Two common reasons why a grammar is not LL(1) are “left-recursion” and “common prefixes”. Both can be eliminated by modifying the grammar.
Left-recursion:
Common prefix:
Example of a common prefix:
A → α|Aβ A → αβ|αγ
Expr → Term
Expr → Term + Expr
4/7
REMOVING LEFT-RECURSION
Left-recursion can be replaced with right-recursion:
A → α|Aβ ⇓
A → αA′ A′ → ε|βA′
5/7
REMOVING LEFT-RECURSION
Left-recursion can be replaced with right-recursion:
A → α|Aβ ⇓
A→αA′ A′ → ε|βA′
Using left-recursion:
A⇒ Aβ ⇒Aββ
⇒∗ Aββ…β⇒αββ…β
Using right-recursion:
A⇒ αA′ ⇒αβA′
⇒∗ αββ…βA′ ⇒αββ…β
5/7
REMOVING LEFT-RECURSION
Left-recursion can be replaced with right-recursion:
Caveat:
A → α|Aβ ⇓
A → αA′ A′ → ε|βA′
• Left-recursion is often used intentionally to capture the structure of the language (e.g., associativity of operators in arithmetic expressions).
• The above conversion discards this information.
5/7
REMOVING COMMON PREFIXES
Common prefixes can be removed using left-factoring:
A → αβ|αγ ⇓
A → αA′ A′ → β|γ
6/7
CONVERTING A GRAMMAR TO LL(1): EXAMPLE
Rule R
S → E$ E→EAT
E→T T→TMF
PREDICT(R)
{n, (} {n, (}
{n, (} {n, (}
{n, (} F→(E) {(}
A→+ {+} A→− {−}
M→∗ {∗} M→/ {/}
T→F
F→n {n}
7/7
CONVERTING A GRAMMAR TO LL(1): EXAMPLE
Rule R
S → E$ E→EAT
E→T T→TMF
PREDICT(R)
{n, (} {n, (}
{n, (} {n, (}
Rule R PREDICT(R)
S→E$ {n,(} E→TE′ {n,(}
{n, (} F→(E) {(}
A→+ {+} A→− {−}
M→∗ {∗} M→/ {/}
E′ →ε
E′ →ATE′
T → F T′ T′ →ε
{$,)} {+,−}
{n, (} {+,−,$,)}
T→F
F→n {n}
T′ →MFT′
F→n {n}
{∗,/} F → (E) {(}
A→+ {+} A→− {−}
M→∗ {∗} M→/ {/}
7/7