COMP 1130 – Lambda Calculus
based on slides by Jeff Foster, U Maryland
Motivation
• Commonly-used programming languages are large and complex
■ ANSI C99 standard: 538 pages
■ ANSI C++ standard: 714 pages
■ Java language specification 2.0: 505 pages
• Not good vehicles for understanding language features or explaining program analysis
COMP 1130
2
Goal
• Develop a “core language” that has
■ The essential features
■ No overlapping constructs
■ And none of the cruft
– Extra features of full language can be defined in terms of the core language (“syntactic sugar”)
• Lambda calculus
■ Standard core language for single-threaded procedural programming
■ Often with added features (e.g., state); we’ll see that COMP 1130 later
3
Lambda Calculus is Practical!
• An 8-bit microcontroller (Zilog Z8 encore board w/4KB SRAM)computing 1 + 1 using Church numerals in the Lambda calculus
COMP 1130
4
Tim Fraser
Origins of Lambda Calculus
• Invented in 1936 by Alonzo Church (1903-1995) ■ Princeton Mathematician
■ Lectures of lambda calculus published in 1941
■ Also know for – Church’sThesis
– All effective computation is expressed by recursive (decidable) functions, i.e., in the lambda calculus
– Church’sTheorem
– First order logic is undecidable
COMP 1130
5
Lambda Calculus
• Syntax:
e ::= x
| λx.e
| e e
variable
function abstraction function application
• Only constructs in pure lambda calculus
■ Functions take functions as arguments and return functions as results
■ I.e., the lambda calculus supports higher-order functions
COMP 1130
6
Semantics
• To evaluate (λx.e1) e2 ■ Bindxtoe2
■ Evaluate e1
■ Return the result of the evaluation
• This is called “beta-reduction”
(λx.e1) e2 →β e1[e2\x]
(λx.e1) e2 is called a redex ■ We’ll usually omit the beta
COMP 1130
■
■
7
Three Conveniences
• Syntactic sugar for local declarations
let x = e1 in e2 is short for (λx.e2) e1
• Scope of λ extends as far to the right as possible
■
λx.λy.x y is λx.(λy.(x y))
• Function application is left-associative
■ x y z is (x y) z COMP 1130
■
8
Scoping and Parameter Passing
• Beta-reduction is not yet precise
(λx.e1) e2 → e1[e2\x]
■
■ what if there are multiple x’s?
• Example:
■ let x = a in
let y = λz.x in
■
■ let x = b in y x
■ which x’s are bound to a, and which to b?
COMP 1130
9
Static (Lexical) Scope
• Just like most languages, a variable refers to the closest definition
• Make this precise using variable renaming
■ The term
– let x = a in let y = λz.x in let x = b in y x
■ is “the same” as
– let x = a in let y = λz.x in let w = b in y w
■ Variable names don’t matter COMP 1130
10
Free Variables and Alpha Conversion
• The set of free variables of a term is
FV(x) = {x}
FV(λx.e) = FV(e) – {x}
FV(e1 e2) = FV(e1) ∪ FV(e2)
• A term e is closed if FV(e) = ∅
• A variable that is not free is bound
COMP 1130
11
Alpha Conversion
• Terms are equivalent up to renaming of bound variables
λx.e = λy.(e[y\x]) if y ∊ FV(e)
• This is often called alpha conversion, and we will use it implicitly whenever we need to avoid capturing variables when we perform substitution
COMP 1130
12
■
Substitution
• Formal definition:
■ x[e\x] = e
z[e\x] = z if z ≠ x
(λz.e1)[e\x] = λz.(e1[e\x]) if z ≠ x and z ∊ FV(e)
■
■ (e1 e2)[e\x] = (e1[e\x] e2[e\x])
■
• Example:
(λx.y x) x =α (λw.y w) x →β y x
■ (We won’t write alpha conversion down in the future)
COMP 1130
■
13
A Note on Substitutions
• People write substitution many different ways
■ e1[e2\x] ■ e1[x↦e2]
■ [x/e2]e1
■ and more…
• But they all mean the same thing COMP 1130
14
Multi-Argument Functions
• We can’t (yet) write multi-argument functions
E.g., a function of two arguments λ(x, y).e
• Trick: Take arguments one at a time
■
λx.λy.e
function that, given argument y, returns e
■
■ This is a function that, given argument x, returns a
(λx.λy.e) a b → (λy.e[a\x]) b → e[a\x][b\y]
• This is often called Currying and can be used to
represent functions with any # of arguments COMP 1130
■
15
Booleans
• true = λx.λy.x
• false = λx.λy.y
• if a then b else c = a b c
• Example:
if true then b else c → (λx.λy.x) b c →(λy.b) c → b if false then b else c → (λx.λy.y) b c →(λy.y) c → c
COMP 1130
■ ■
16
Combinators
• Any closed term is also called a combinator ■ So true and false are both combinators
• Other popular combinators
I = λx.x
S = λx.λy.x
K = λx.λy.λz.x z (y z)
■ Can also define calculi in terms of combinators – E.g., the SKI calculus
COMP 1130 – Turns out the SKI calculus is also Turing complete
■ ■ ■
17
Pairs
• (a, b) = λx.if x then a else b • fst = λp.p true
• snd = λp.p false
• Then
fst (a, b) →* a snd (a, b) →* b
COMP 1130
■ ■
18
Natural Numbers (Church)
• 0 = λx.λy.y
• 1 = λx.λy.x y
• 2 = λx.λy.x(x y)
• i.e., n = λx.λy.
• succ = λz.λx.λy.x(z x y)
• iszero = λz.z (λy.false) true
COMP 1130
19
Natural Numbers (Scott)
• 0 = λx.λy.x
• 1 = λx.λy.y 0
• 2 = λx.λy.y 1
• I.e., n = λx.λy.y (n-1)
• succ = λz.λx.λy.y z
• pred = λz.z 0 (λx.x)
• iszero = λz.z true (λx.false)
COMP 1130
20
A Nonderministic Semantics
(λx.e1) e2 → e1[e2\x]
e1→ e1′ e1 e2 → e1′ e2
e → e′ (λx.e) → (λx.e′)
e2→ e2′ e1 e2 → e1 e2′
■ Why are these semantics non-deterministic? COMP 1130
21
Example
• We can apply reduction anywhere in a term
λx.(λy.y) x ((λz.w) x) → λx.(x ((λz.w) x) → λx.x w λx.(λy.y) x ((λz.w) x) → λx.((λy.y) x w) → λx.x w
• Does the order of evaluation matter?
COMP 1130
22
■ ■
The Church-Rosser Theorem
• If a →* b and a →* c, there there exists d such that b →* d and c →* d
■ Proof: http://www.mscs.dal.ca/~selinger/papers/ lambdanotes.pdf
• Church-Rosser is also called confluence
COMP 1130
23
Normal Form
• A term is in normal form if it cannot be reduced
Examples: λx.x, λx.λy.z
• By Church-Rosser Theorem, every term reduces
to at most one normal form
■ Warning: All of this applies only to the pure lambda calculus with non-deterministic evaluation
• Notice that for our application rule, the argument need not be in normal form
COMP 1130
■
24
Beta-Equivalence
Let =β be the reflexive, symmetric, and transitive closure of →
E.g., (λx.x) y → y ← (λz.λw.z) y y, so all three are beta equivalent
•
■
•
If a =β b,then there exists c such that a →* c and b →* c
•
■ Proof: Consequence of Church-Rosser Theorem
In particular, if a =β b and both are normal
forms, then they are equal COMP 1130
25
Not Every Term Has a Normal Form
• Consider
Δ = λx.x x
Then Δ Δ → Δ Δ →···
• In general, self application leads to loops ■ …which is good if we want recursion
COMP 1130
26
■ ■
A Fixpoint Combinator
• Also called a paradoxical combinator
Y = λf.(λx.f (x x)) (λx.f (x x)) ThenY F =β F (Y F)
Y F = (λf.(λx.f (x x)) (λx.f (x x))) F → (λx.F (x x)) (λx.F (x x))
→ F ((λx.F (x x)) (λx.F (x x)))
← F (Y F)
COMP 1130
•
■ ■ ■ ■
■
■ Note: There are many versions of this combinator
27
Example
• Fact n = if n = 0 then 1 else n * fact(n-1) • Let G = λf.
I.e., G = λf. λn.if n = 0 then 1 else n*f(n-1) Y G 1 =β G (YG) 1
=β (λf.λn.if n = 0 then 1 else n*f(n-1)) (Y G) 1
=β if 1 = 0 then 1 else 1*((Y G) 0)
=β if 1 = 0 then 1 else 1*(G (Y G) 0)
=β if 1 = 0 then 1 else 1*(λf.λn.if n = 0 then 1 else n*f(n-1) (Y G) 0) =β if 1 = 0 then 1 else 1*(if 0 = 0 then 1 else 0*((Y G) 0)
■
•
■ ■ ■ ■ ■ ■
=β 1*1 = 1 COMP 1130
28
In Other Words
• The Y combinator “unrolls” or “unfolds” its argument an infinite number of times
■ Y G = G (Y G) = G (G (Y G) = G (G (G (Y G))) = … ■ G needs to have a “base case” to ensure termination
• But, only works because we’re call-by-name
■ Different combinator(s) for call-by-value
– Z = λf.(λx.f (λy. x x y)) (λx.f (λy. x x y))
– Why is this a fixed-point combinator? How does its difference from Y make it work for call-by-value?
COMP 1130
29
Encodings
• Encodings are fun
• They show language expressiveness
• In practice, we usually add constructs as primitives
■ Much more efficient
■ Much easier to perform program analysis on and avoid
silly mistakes with
– E.g., our encodings of true and 0 are exactly the same, but we may want to forbid mixing booleans and integers
COMP 1130
30
Lazy vs. Eager Evaluation
• Our non-deterministic reduction rule is fine for theory, but awkward to implement
• Two deterministic strategies:
Lazy: Given (λx.e1) e2, do not evaluate e2 if x does not “need” e1
– Also called left-most, call-by-name, call-by-need, applicative, normal-order (with slightly different meanings)
Eager: Given (λx.e1) e2, always evaluate e2 fully before applying the function
■
■
– Also called call-by-value COMP 1130
31
Lazy Operational Semantics
(λx.e1) →l (λx.e1)
e1 →l λx.e e[e2\x] →l e′
e1 e2 →l e′
• The rules are deterministic and big-step
■ The right-hand side is reduced “all the way”
• The rules do not reduce under λ
• The rules are normalizing:
■ If a is closed and there is a normal form b such that
a →* b, then a →l d for some d COMP 1130
32
Eager (Big-Step) Op. Semantics
(λx.e1) →e (λx.e1)
e1 →e λx.e e2 →e e′ e[e′\x] →e e′′
e1 e2 →e e′′
• This big-step semantics is also deterministic and
and does not reduce under λ • But it is not normalizing
Example: let x = Δ Δ in (λy.y) COMP 1130
■
33
Lazy vs. Eager in Practice
• Lazy evaluation (call by name, call by need)
■ Has some nice theoretical properties
■ Terminates more often
■ Lets you play some tricks with “infinite” objects
■ Main example: Haskell
• Eager evaluation (call by value)
■ Is generally easier to implement efficiently
■ Blends more easily with side effects
■ Main examples: Most languages (C, Java, ML, etc.)
COMP 1130
34
Functional Programming
• The λ calculus is a prototypical functional programming language:
■ Lots of higher-order functions ■ No side-effects
• In practice, many functional programming languages are “impure” and permit side-effects
■ But you’re supposed to avoid using them COMP 1130
35
Functional Programming Today
• Two main camps:
■ Haskell – Pure, lazy functional language; no side effects
■ ML (SML/NJ, OCaml) – Call-by-value, with side effects
• Still around: LISP, Scheme
■ Disadvantage/advantage: No static type systems
COMP 1130
36
Influence of Functional Programming
•Functional ideas in many other languages
■ Garbage collection was first designed with Lisp; most
languages often rely on a GC today
■ Generics in Java/C++ came from polymorphism in ML and from type classes in Haskell
■ Higher-order functions and closures (used widely in Ruby; proposed extension to Java) are pervasive in all functional languages
■ Many data abstraction principles of OO came from ML’s module system
■… COMP 1130
37
Call-by-Name Example
OCaml
let cond p x y = if p then x else y
let rec loop () = loop ()
let z = cond true 42 (loop ())
infinite loop at call
Haskell
cond p x y = if p then x else y
loop () = loop ()
z = cond True 42 (loop ())
3rd argument never used by cond, so never invoked
COMP 1130
38
Two Cool Things to Do with CBN
• Build control structures with functions
cond p x y = if p then x else y
• “Infinite” data structures
COMP 1130
39
integers n = n:(integers (n+1))
take 10 (integers 0) (* infinite loop in cbv *)