Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Syntax
Dr. Liam O’Connor
University of Edinburgh LFCS UNSW, Term 3 2020
1
Abstract Syntax Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Arithmetic Expressions
i ∈ Z i Atom a Atom
a SExp (a) Atom b PExp
e Atom
e PExp a PExp
e PExp e SExp
Concrete Syntax
a×b PExp
b SExp a+b SExp
All the syntax we have seen so far is concrete syntax. Concrete syntax is described by judgements on strings, which describe the actual text input by the programmer.
2
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Abstract Syntax
Working with concrete syntax directly is unsuitable for both compiler implementation and proofs. Consider:
3+(4×5) 3+4×5 (3+(4×5))
1“There is more than one way to do it”.
3
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Abstract Syntax
Working with concrete syntax directly is unsuitable for both compiler implementation and proofs. Consider:
3+(4×5) 3+4×5 (3+(4×5))
TIMTOWTDI1 makes life harder for us. Different derivations represent the same semantic program. We would like a representation of programs that is as simple as possible, removing any extraneous information. Such a representation is called abstract syntax.
1“There is more than one way to do it”.
4
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Abstract Syntax
Typically, the abstract syntax of a program is represented as a tree rather than as a string.
(3+(4×5)) ←→
Writing trees in our inference rules would rapidly become unwieldy, however. We shall define a term language in which to express trees.
5
3
+
×
4
5
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Terms
Definition
In this course, a term is a structure that can either be a symbol, like Plus or Times or 3; or a compound, which consists of an symbol followed by one or more argument subterms, all in parentheses.
t ::= Symbol | (Symbol t1 t2 …)
These particular terms are also known as s-expressions. Terms can equivalently be thought of a subset of Haskell where the only kinds of expressions allowed are literals and data constructors.
6
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Term Examples
Example
(Plus (Num 3) (Times (Num 4) (Num 5)))
Armed with an appropriate Haskell data declaration, this can be implemented straightforwardly:
data Exp =
| Times Exp Exp
Plus Exp Exp | Num Int
7
3
+
×
4
5
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Concrete Syntax
i ∈ Z i Atom a Atom
a SExp (a) Atom b PExp
e Atom
e PExp a PExp
e PExp e SExp
Concrete to Abstract
Abstract Syntax
a×b PExp
b SExp a+b SExp
i ∈ Z a AST
(Num i) AST (Plus a b) AST (Times a b) AST
Now we have to specify a relation to connect the two!
b AST a AST b AST
8
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Relations
Up until now, most judgements we have used have been unary — corresponding to a set of satisfying objects.
It’s also possible for a judgement to express a relationship between two objects (a binary judgement) or a number of objects (an n-ary judgement).
Example (Relations)
4 divides 16 (binary)
mail is an anagram of liam (binary) 3 plus 5 equals 8 (ternary)
n-ary judgements where n ≥ 2 are sometimes called relations, and correspond to an n-tuple of satisfying objects.
9
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z i Atom
a Atom
a×b PExp
a PExp
a+b SExp
e Atom e PExp
b PExp b SExp
10
e SExp (e) Atom
e PExp e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z
i Atom ←→ (Num i) AST
a Atom
a×b PExp
a PExp
a+b SExp
e Atom e PExp
b PExp b SExp
11
e SExp (e) Atom
e PExp e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z
i Atom ←→ (Num i) AST
12
e SExp (e) Atom
a Atom←→a′ AST a×b PExp
a PExp
a+b SExp
e Atom e PExp
b PExp←→b′ AST b SExp
e PExp e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z
i Atom ←→ (Num i) AST
a Atom←→a′ AST b PExp←→b′ AST a×bPExp←→(Timesa′ b′)AST
13
e SExp (e) Atom
a PExp
a+b SExp
e Atom e PExp
b SExp
e PExp
e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
14
e SExp (e) Atom
i∈Z
i Atom ←→ (Num i) AST
a Atom←→a′ AST b PExp←→b′ AST a×bPExp←→(Timesa′ b′)AST
a PExp←→a′ AST b SExp←→b′ AST a+bSExp←→(Plusa′ b′)AST
e Atom e PExp e PExp e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z
i Atom ←→ (Num i) AST
a Atom←→a′ AST b PExp←→b′ AST a×bPExp←→(Timesa′ b′)AST
a PExp←→a′ AST b SExp←→b′ AST a+bSExp←→(Plusa′ b′)AST
15
e SExp ←→ a′ AST e Atom e PExp (e) Atom ←→ a′ AST e PExp e SExp
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
3+(4×5) 3 + 4 × 5 (3+(4×5))
Parsing Relation
(Plus (Num 3) (Times (Num 4) (Num 5)))
i∈Z
i Atom ←→ (Num i) AST
a Atom←→a′ AST b PExp←→b′ AST a×bPExp←→(Timesa′ b′)AST
a PExp←→a′ AST b SExp←→b′ AST a+bSExp←→(Plusa′ b′)AST
16
e SExp←→a′ AST e Atom←→a AST e PExp←→a AST (e)Atom←→a′ AST ePExp←→aAST eSExp←→aAST
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Relations as Algorithms
The parsing relation ←→ is an extension of our existing concrete syntax rules. Therefore it is unambiguous, just as those rules are.
Furthermore, the abstract syntax for a particular concrete syntax can be unambiguously determined solely by looking at the left hand side of ←→.
17
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Relations as Algorithms
The parsing relation ←→ is an extension of our existing concrete syntax rules. Therefore it is unambiguous, just as those rules are.
Furthermore, the abstract syntax for a particular concrete syntax can be unambiguously determined solely by looking at the left hand side of ←→.
An Algorithm
To determine the abstract syntax corresponding to a particular concrete syntax:
1 Derive the left hand side of the ←→ (the concrete syntax) bottom-up until reaching axioms.
2 Fill in the right hand side of the ←→ (the abstract syntax) top-down, starting at the axioms.
This process of converting concrete to abstract syntax is called parsing.
18
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
i∈ZaSeAeP i A (a) A e P e S
aAbPaPbS a×bP a+bS
1+2×3S
19
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
i∈ZaSeAeP i A (a) A e P e S
aAbPaPbS a×bP a+bS
1P 2×3S 1+2×3S
20
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
i∈ZaSeAeP i A (a) A e P e S
aAbPaPbS a×bP a+bS
1A
1P 2×3S
1+2×3S
21
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
i∈ZaSeAeP i A (a) A e P e S
aAbPaPbS a×bP a+bS
1A 2×3P 1P 2×3S
1+2×3S
22
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
i∈ZaSeAeP i A (a) A e P e S
aAbPaPbS a×bP a+bS
3A 2A 3P
1A 2×3P 1P 2×3S
1+2×3S
23
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
3A 2A 3P
1A 2×3P 1P 2×3S
1+2×3S
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
24
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
3A 2A 3P
1 A ←→ (Num 1) AST 2 × 3 P 1P 2×3S
1+2×3S
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
25
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3A 2A 3P
2 × 3 P 2 × 3 S
26
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3A 2A←→(Num2)AST 3P
2 × 3 P 2 × 3 S
27
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3 A ←→ (Num 3) AST 2A←→(Num2)AST 3P
2 × 3 P 2 × 3 S
28
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3 A ←→ (Num 3) AST 2A←→(Num2)AST 3P←→(Num3)AST
2 × 3 P 2 × 3 S
29
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3 A ←→ (Num 3) AST 2A←→(Num2)AST 3P←→(Num3)AST
2 × 3 P ←→ (Times (Num 2) (Num 3)) AST 2 × 3 S
30
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
1 A ←→ (Num 1) AST 1 P ←→ (Num 1) AST
1+2×3S
3 A ←→ (Num 3) AST 2A←→(Num2)AST 3P←→(Num3)AST
2 × 3 P ←→ (Times (Num 2) (Num 3)) AST 2 × 3 S ←→ (Times (Num 2) (Num 3)) AST
31
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Rules
Example
a S ←→ a′ e A ←→ a (a) A ←→ a′ e P ←→ a
a A←→a′
a×bP←→(Timesa′ b′) a+bS←→(Plusa′ b′)
3 A ←→ (Num 3) AST 2A←→(Num2)AST 3P←→(Num3)AST
1 A ←→ (Num 1) AST 2 × 3 P ←→ (Times (Num 2) (Num 3)) AST 1 P ←→ (Num 1) AST 2 × 3 S ←→ (Times (Num 2) (Num 3)) AST
1 + 2 × 3 S ←→ (Plus (Num 1) (Times (Num 2) (Num 3))) AST
i ∈ Z
i A ←→ (Num i)
e P ←→ a
b P←→b′ a P←→a′
e S ←→ a b S←→b′
32
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
The Inverse
What about the inverse operation to parsing? Unparsing
Unparsing, also called pretty-printing, is the process of starting with the abstract syntax on the right hand side of the parsing relation ←→ and attempting to synthesise a concrete syntax on the left.
33
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
The Inverse
What about the inverse operation to parsing? Unparsing
Unparsing, also called pretty-printing, is the process of starting with the abstract syntax on the right hand side of the parsing relation ←→ and attempting to synthesise a concrete syntax on the left.
Problem
There are many concrete syntaxes for a given abstract syntax. The algorithm is non-deterministic.
While it is desirable to have: It is not usually true that:
parse ◦ unparse = id unparse ◦ parse = id
34
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Example
3+(4×5)
3 + 4 × 5 (Plus (Num 3) (Times (Num 4) (Num 5)))
(3+(4×5))
Going from right to left requires some formatting guesswork to produce readable code. Algorithms to do this can get quite involved!
Let’s implement a parser for arithmetic. to coding
35
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Adding Let
Let us extend our arithmetic expression language with variables, including a let construct to give them values.
Concrete Syntax
x Ident xAtom
x Ident e1 SExp e2 SExp letx=e1 ine2 endAtom
Example
let x = 3 in x + 4
end
let x = 3 in
let y = 4 in x + y end
end
36
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Scope
let x = 5 in let y =2in
x+y end
end
binding occurrence of x
Abstract Syntax
Parsing
Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Scope
let x = 5 in let y =2in
x+y end
end
binding occurrence of x
usage occurrence of x
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Scope
let x = 5 in
binding occurrence of x
scope of x
let y =2in x+y
end
end
usage occurrence of x
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Scope
let x = 5 in
binding occurrence of x
let y =2in x+y
end
scope of x
end
The process of finding the binding occurrence of each used variable is called scope resolution. Usually this is done statically. If no binding can be found, an out of scope error is raised.
usage occurrence of x
40
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Shadowing
What does this program evaluate to?
let x = 5 in let x = 2 in
x+x end
+ x end
41
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Shadowing
What does this program evaluate to?
let x = 5 in
let x = 2 in x+x
end
+ x end
This program results in 9.
x is shadowed here
42
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
α-equivalence What is the difference between these two programs?
let x = 5 in let a = 5 in let x =2in let y =2in
x+x y+y end end
end end
43
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
α-equivalence What is the difference between these two programs?
let x = 5 in let a = 5 in let x =2in let y =2in
x+x y+y end end
end end
They are semantically identical, but differ in the choice of bound variable names. Such expressions are called α-equivalent.
We write e1 ≡α e2 if e1 is α-equivalent to e2. The relation ≡α is an equivalence relation. That is, it is reflexive, transitive and symmetric.
The process of consistently renaming variables that preserves α-equivalence is called α-renaming.
44
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Substitution
A variable x is free in an expression e if x occurs in e but is not bound in e.
Example (Free Variables)
The variable x is free in x + 1, but not in let x = 3 in x + 1 end.
45
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Substitution
A variable x is free in an expression e if x occurs in e but is not bound in e. Example (Free Variables)
The variable x is free in x + 1, but not in let x = 3 in x + 1 end.
A substitution, written e[x := t] (or e[t/x] in some other courses), is the replacement
of all free occurrences of x in e with the term t. Example (Simple Substitution)
(5 × x + 7)[x := y × 4] is the same as (5 × (y × 4) + 7).
46
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Problems with substitution
Consider these two α-equivalent expressions.
let y = 5 in y × x + 7 end
and
let z = 5 in z × x + 7 end
What happens if you apply the substitution [x := y × 3] to both expressions?
47
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Problems with substitution
Consider these two α-equivalent expressions.
let y = 5 in y × x + 7 end
and
let z = 5 in z × x + 7 end
What happens if you apply the substitution [x := y × 3] to both expressions? You get
two non-α-equivalent expressions!
let y = 5 in y × (y × 3) + 7 end
and
This problem is called capture.
let z = 5 in z × (y × 3) + 7 end
48
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Variable Capture
Capture can occur for a substitution e[x := t] whenever there is a bound variable in the expression e with the same name as a free variable occuring in t.
Fortunately
It is always possible to avoid capture.
α-rename the offending bound variable to an unused name, or
If you have access to the free variable’s definition, renaming the free variable, or
Use a different abstract syntax representation that makes capture impossible (More on this later).
49
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Abstract Syntax for Variables
We shall extend our AST and parsing relation to include a definition for let and variables.
Let Syntax
x Ident
x Atom ←→ (Var x) AST
x Ident e1 SExp←→a1 AST e2 SExp←→a2 AST letx=e1 ine2 endAtom←→(Letxa1 a2)AST
50
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
First Order Abstract Syntax
Consider the following two pieces of abstract syntax:
(Let “x” (Num 5) (Plus (Num 4) (Var “x”)))
(Let “y” (Num 5) (Plus (Num 4) (Var “y”))) This demonstrates some problems with our abstract syntax approach.
51
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
First Order Abstract Syntax
Consider the following two pieces of abstract syntax:
(Let “x” (Num 5) (Plus (Num 4) (Var “x”))) (Let “y” (Num 5) (Plus (Num 4) (Var “y”)))
This demonstrates some problems with our abstract syntax approach. 1 Substitution capture is a problem.
52
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
First Order Abstract Syntax
Consider the following two pieces of abstract syntax:
(Let “x” (Num 5) (Plus (Num 4) (Var “x”))) (Let “y” (Num 5) (Plus (Num 4) (Var “y”)))
This demonstrates some problems with our abstract syntax approach.
1 Substitution capture is a problem.
2 α-equivalent expressions are not equal. Determining if an expression is α-equivalent requires us to search for a consistent α-renaming of variables.
53
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
First Order Abstract Syntax
Consider the following two pieces of abstract syntax:
(Let “x” (Num 5) (Plus (Num 4) (Var “x”))) (Let “y” (Num 5) (Plus (Num 4) (Var “y”)))
This demonstrates some problems with our abstract syntax approach.
1 Substitution capture is a problem.
2 α-equivalent expressions are not equal. Determining if an expression is α-equivalent requires us to search for a consistent α-renaming of variables.
3 No distinction is made between binding and usage occurrences of variables. This means that we must define substitution by hand on each type of expression we introduce.
54
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
First Order Abstract Syntax
Consider the following two pieces of abstract syntax:
(Let “x” (Num 5) (Plus (Num 4) (Var “x”))) (Let “y” (Num 5) (Plus (Num 4) (Var “y”)))
This demonstrates some problems with our abstract syntax approach.
1 Substitution capture is a problem.
2 α-equivalent expressions are not equal. Determining if an expression is α-equivalent requires us to search for a consistent α-renaming of variables.
3 No distinction is made between binding and usage occurrences of variables. This means that we must define substitution by hand on each type of expression we introduce.
4 Scoping errors cannot be easily detected — malformed syntax is easy to write.
55
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Indices
One popular approach to address the first issue is de Bruijn indices.
Key Idea
1 Remove all identifiers from binding expressions like Let.
2 Replace the identifier in a Var with a number indicating how many binders we
must skip in order to find the binder for that variable.
(Let “a” (Num 5) (Let “y” (Num 2)
(Plus (Var “a”) (Var “y”))))
56
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Indices
One popular approach to address the first issue is de Bruijn indices.
Key Idea
1 Remove all identifiers from binding expressions like Let.
2 Replace the identifier in a Var with a number indicating how many binders we must skip in order to find the binder for that variable.
(Let “a” (Num 5) (Let “y” (Num 2)
(Plus (Var “a”) (Var “y”))))
(Let (Num 5) (Let (Num 2)
(Plus (Var 1) (Var 0))))
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Indices
One popular approach to address the first issue is de Bruijn indices.
Key Idea
1 Remove all identifiers from binding expressions like Let.
2 Replace the identifier in a Var with a number indicating how many binders we must skip in order to find the binder for that variable.
(Let “a” (Num 5) (Let “y” (Num 2)
(Plus (Var “a”) (Var “y”))))
(Let (Num 5) (Let (Num 2)
(Plus (Var 1) (Var 0))))
58
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Debruijnification
Algorithm
Given a piece of first order abstract syntax with explicit variable names, we can convert to de Bruijn indices by keeping a stack of variable names, pushing onto the stack at each Let and popping after the variable goes out of scope. When a usage occurrence is encountered, replace the variable name with its first position in the stack (starting at the top of the stack).
This approach naturally handles shadowing. It’s also possible, but harder, to have de Bruijn indices going in the other direction (from the bottom of the stack, upwards).
59
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Substitution
Substitution is now capture avoiding by definition.
(Num i)[n := t] = (Num i)
(Plusab)[n:=t] = (Plusa[n:=t]b[n:=t]) (Times a b)[n := t] = (Times a[n := t] b[n := t])
60
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Substitution
Substitution is now capture avoiding by definition. (Num i)[n := t] = (Num i)
(Plusab)[n:=t] = (Times a b)[n := t] =
(Var m)[n := t] =
(Plusa[n:=t]b[n:=t]) (Times a[n := t] b[n := t])
t
(Var (m − 1)) (Var m)
if n = m if m > n otherwise
61
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Substitution
Substitution is now capture avoiding by definition.
(Num i)[n := t] (Plusab)[n:=t] (Times a b)[n := t]
(Var m)[n := t] (Lete1 e2)[n:=t]
= (Numi)
= (Plusa[n:=t]b[n:=t]) = (Times a[n := t] b[n := t])
t if n = m
= (Var(m−1)) ifm>n
(Var m) otherwise
= (Lete1[n:=t]e2[n+1:=t↑0])
62
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
de Bruijn Substitution
Substitution is now capture avoiding by definition.
(Num i)[n := t] (Plusab)[n:=t] (Times a b)[n := t]
(Var m)[n := t]
(Lete1 e2)[n:=t]
Where e↑n is an up-shifting operation defined as follows:
= (Num i)
= (Plusa[n:=t]b[n:=t]) = (Times a[n := t] b[n := t])
t if n = m
= (Var(m−1)) ifm>n (Var m) otherwise
63
(Num i)↑n (Plus a b)↑n (Times a b)↑n
(Var m)↑n (Let e1 e2)↑n
= (Numi)
= (Plus a↑n b↑n ) = (Times a↑n b↑n )
(Var(m+1)) ifm≥n = (Var m) otherwise = Let e1↑n e2↑n+1
= (Lete1[n:=t]e2[n+1:=t↑0])
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Examining de Bruijn indices
How do de Bruijn indices stack up against our explicit names in terms of the problems we identified?
1 Substitution capture solved.
64
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Examining de Bruijn indices
How do de Bruijn indices stack up against our explicit names in terms of the problems we identified?
1 Substitution capture solved.
2 α-equivalent expressions are now equal.
65
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Examining de Bruijn indices
How do de Bruijn indices stack up against our explicit names in terms of the problems we identified?
1 Substitution capture solved.
2 α-equivalent expressions are now equal.
3 We still must define substitution machinery by hand for each type of expression.
66
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Examining de Bruijn indices
How do de Bruijn indices stack up against our explicit names in terms of the problems we identified?
1 Substitution capture solved.
2 α-equivalent expressions are now equal.
3 We still must define substitution machinery by hand for each type of expression.
4 It is still possible to make malformed syntax – indices that overflow the stack, for example.
67
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Examining de Bruijn indices
How do de Bruijn indices stack up against our explicit names in terms of the problems we identified?
1 Substitution capture solved.
2 α-equivalent expressions are now equal.
3 We still must define substitution machinery by hand for each type of expression.
4 It is still possible to make malformed syntax – indices that overflow the stack, for example.
Two out of four isn’t bad, but can we do better by changing the term language?
68
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Higher Order Terms
We shall change our term language to include built-in notions of variables and binding.
t ::= Symbol
| x
| t1 t2
| x. t
(symbols) (variables) (application) (binding or abstraction)
As in Haskell, we shall say that application is left-associative, so (Plus (Num 3) (Num 4)) = ((Plus (Num 3)) (Num 4))
Now the binding and usage occurrences of variables are distinguished from regular symbols in our term language. Let’s see what this lets us do…
69
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Representing Let
a1 AST a2 AST (Let a1 (x. a2)) AST
We no longer need a rule for variables, because they’re baked into the structure of terms.
70
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Representing Let
a1 AST a2 AST (Let a1 (x. a2)) AST
We no longer need a rule for variables, because they’re baked into the structure of terms.
How would we represent this AST in Haskell?
data AST = Num Int
| Plus AST AST
| Times AST AST | Let AST ???
71
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Representing Let
a1 AST a2 AST (Let a1 (x. a2)) AST
We no longer need a rule for variables, because they’re baked into the structure of terms.
How would we represent this AST in Haskell?
data AST =
| Plus AST AST
| LetAST(AST→AST) So let x = 3 in x +2 end becomes, in Haskell:
(Let (Num 3) (λx → Plus x (Num 2))
Num Int
| Times AST AST
72
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Substitution
We can now define substitution across all terms in the meta-logic:
Symbol[x := e] = y[x := e] = (t1 t2)[x :=e] =
(y. t)[x := e] =
Symbol
e ify=x
y otherwise
t1[x :=e] t2[x :=e]
(y. t)
(y. t[x := e]) undefined
if x = y
if y ∈/ FV(e) otherwise
73
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Substitution
We can now define substitution across all terms in the meta-logic:
Symbol[x := e] = y[x := e] = (t1t2)[x:=e] =
Symbol
e ify=x
y otherwise
t1[x := e] t2[x := e]
(y. t)[x := e] = (y. t[x := e]) undefined
Where FV(·) is the set of all free variables in a term: FV(Symbol) = ∅
FV(x)
FV(t1 t2)
FV(x. t) =
(y. t)
if x = y
if y ∈/ FV(e) otherwise
= {x}
= FV(t1) ∪ FV(t2)
74
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Substitution
We can now define substitution across all terms in the meta-logic:
Symbol[x := e] = y[x := e] = (t1t2)[x:=e] =
Symbol
e ify=x
y otherwise
t1[x := e] t2[x := e]
(y. t)[x := e] = (y. t[x := e]) undefined
Where FV(·) is the set of all free variables in a term: FV(Symbol) = ∅
FV(x) FV(t1 t2) FV(x.t)
= {x}
= FV(t1) ∪ FV(t2) = FV(t)\{x}
(y. t)
if x = y
if y ∈/ FV(e) otherwise
75
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Cheating Outrageously
Substitution capture is still a problem in the substitution we just defined but it is not our problem. Because substitution is defined in the meta-language, it’s the job of the implementors of the meta-language (if any) to deal with issues about capture.
When Haskell is our meta-language, it’s the job of the GHC developers to sort out capture.
When we are doing proofs in our meta-logic, there is no implementation, so we can just say that we assume α-equivalent terms to be equal, and therefore assume that variables are always renamed to avoid capture.
So, we have solved the problem by making it someone else’s problem. Outrageous cheating!
76
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Evaluating All Approaches
HOAS
Proofs Haskell Strings de Bruijn
FOAS
Capture α-equivalence Generic subst. Malformed syntax
Cheat Cheat Solved Cheat
Cheat Cheat Solved Cheat
Problem Problem Problem Problem
Solved
Solved
Problem Problem
77
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Evaluating All Approaches
HOAS
Proofs Haskell Strings de Bruijn
FOAS
Capture α-equivalence Generic subst. Malformed syntax
Cheat Cheat Solved Cheat
Cheat Cheat Solved Cheat
Problem Problem Problem Problem
Solved
Solved
Problem Problem
In embedded languages and in pen and paper proofs, HOAS is very common.
78
Abstract Syntax
Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Evaluating All Approaches
HOAS
Proofs Haskell Strings de Bruijn
FOAS
Capture α-equivalence Generic subst. Malformed syntax
Cheat Cheat Solved Cheat
Cheat Cheat Solved Cheat
Problem Problem Problem Problem
Solved
Solved
Problem Problem
In embedded languages and in pen and paper proofs, HOAS is very common.
In conventional language implementations and machine-checked formalisations, de Bruijn indices are more popular.
79
Abstract Syntax Parsing Bindings First Order Abstract Syntax Higher Order Abstract Syntax
Evaluating All Approaches
HOAS
FOAS
Proofs
Haskell
Strings
de Bruijn
Capture α-equivalence Generic subst. Malformed syntax
Cheat Cheat Solved Cheat
Cheat Cheat Solved Cheat
Problem Problem Problem Problem
Solved
Solved
Problem Problem
In embedded languages and in pen and paper proofs, HOAS is very common.
In conventional language implementations and machine-checked formalisations, de
Bruijn indices are more popular.
In your assignments, strings will be used
80