G6021 Comparative Programming
Part 3 – foundations
Part 3 – foundations
G6021 Comparative Programming
1/36
The Lambda Calculus
The λ-calculus is a computational model based on the mathematical notion of a function.
Defined by the mathematician Alonzo Church in the 1930’s, as a precise notation for anonymous functions. He noticed that an expression x + y was sometimes interpreted as:
thenumberx+y
thefunctionf:x→x+y
thefunctiong:y→x+y
thefunctionh:x,y→x+y
With the lambda-calculus notation, these can be easily distinguished:
the number x +y is written just x +y
thefunctionf:x→x+yiswrittenλx.x+y
thefunctiong:y→x+yiswrittenλy.x+y
thefunctionh:x,y→x+yiswrittenλx,y.x+y
Part 3 – foundations
G6021 Comparative Programming
2/36
The Lambda Calculus
The λ-calculus is used:
to study computability (as an alternative to Turing Machines),
to define models (denotational semantics) of programming languages,
to study strategies and implementation techniques for functional languages (abstract machines),
to encode proofs in a variety of logics,
to design automatic theorem provers and proof assistants.
Part 3 – foundations
G6021 Comparative Programming
3/36
λ-calculus: Syntax
Definition:
Assume an infinite set X of variables denoted by x, y, z, . . . , then the set of λ-terms is the least set satisfying:
M::=X |(λX.M)|(MM)
which are called variable, abstraction and application
Some examples
x, (λy.y), (λx.(λy.x)), ((λz.z)(λy.y)) An intuition. The following are all the same:
fxy=x+y
f x = λy.x + y
f = λx.λy.x + y
Part 3 – foundations
G6021 Comparative Programming
4/36
λ-calculus: Conventions
write as few parentheses as possible:
(λy.(xy)) = λy.xy application associates to the left:
xyz = ((xy)z) abstractions bind as far as possible to the right
λx.(λy.y)x = (λx.((λy.y)x)) abstractions can be abbreviated:
λx.λy.M = λxy.M
Part 3 – foundations
G6021 Comparative Programming
5/36
Examples of λ-terms
x, λx.x, xy, λx.z, xz(yz), λx.λy.yx
λxy.x, λxyz.z(xy), λxyz.xz(yz)
λx.λy.x, λx.λy.y
(λx.x)y, (λx.λy.xy)(λx.x)
λf.λx.x, λf.λx.fx, λf.λx.f(fx), λf.λx.f(f(fx)) λx .xx
Note: Haskell syntax:
\x -> Misthesameasλx.M
Exercise: Write the above examples in Haskell syntax. Are they all valid in Haskell?
Part 3 – foundations
G6021 Comparative Programming
6/36
Variables
A variable is free in a λ-term if it is not bound by a λ.
More precisely, the set of free variables of a term is defined as:
FV(x) = FV(λx.M) = FV(MN) =
{x} FV(M)−{x} FV(M) ∪ FV(N)
Terms without free variables are called closed terms. We can define:
BV(x) = BV(λx.M) = BV(MN) =
∅
{x}∪BV(M) BV(M) ∪ BV(N)
Question: What is BV defining?
Exercise: Check the FV and BV of the examples.
Part 3 – foundations
G6021 Comparative Programming
7/36
α-conversion
λ-terms that differ only in the names of their bound variables will be
equated. More precisely: If y is not free in M: λx.M =α λy.M{x → y}
where M{x → y} is the term M where each occurrence of x is replaced by y (i.e. we rename every free occurrence of x to y).
IMPORTANT:
λ-terms are defined modulo α-conversion, so λx.x and λy.y are the SAME term.
α-equivalent terms represent the same computation (see below).
Part 3 – foundations
G6021 Comparative Programming
8/36
Computation
Abstractions represent functions, which can be applied to arguments.
The main computation rule is β-reduction, which indicates how to find the result of the function for a given argument.
A redex is a term of the form: (λx.M)N
It reduces to the term M{x → N} where M{x → N} is the term obtained when we substitute x by N taking into account bound variables.
β-reduction:
Note that we use the word “reduce”, but this does not mean that
the term on the right is any simpler. Why?
Notation: if M →β M1 →β M2 ···Mn then we write M →∗β Mn
(λx.M)N →β M{x → N}
Part 3 – foundations
G6021 Comparative Programming
9/36
Substitution
Substitution is a special kind of replacement: M{x → N} means replace all free occurrences of x in M by the term N.
Question: Why only the free occurrences? What happens if we replace all occurrences?
A very useful property of substitution is the following, known as the Substitution Lemma:
If x ̸∈ FV(P):
(M{x → N}){y → P} = (M{y → P}){x → N{y → P}}
Part 3 – foundations
G6021 Comparative Programming
10/36
Examples (conversion, reduction, substitution)
α-conversion: λx.x =α λy.y
λx.λy.xy =α λz1.λz2.z1z2
and β-reduction:
(λx.λy.xy)(λx.x) →β λy.(λx.x)y →β λy.y
Part 3 – foundations
G6021 Comparative Programming
11/36
Normal forms
When do we stop reducing?
Normal form (NF): Stop reducing when there are no redexes left to reduce.
A normal form is a term that does not contain any redex.
A term that can be reduced to a term in normal form is said to be normalisable.
Example:
(λx.a(λy.xy)) b c →β a(λy.by)c
which is a normal form (recall that application associates to the
left).
Weak Head Normal Form (WHNF). Stop reducing when there are no redexes left, but without reducing under an abstraction.
Part 3 – foundations
G6021 Comparative Programming
12/36
Exercises
1 What is the difference between a term having a normal form, and being a normal form? Write down some example terms.
2 If a closed term is a weak head normal form, it has to be an abstraction λx.M. Why?
3 Does the term (λx.xx)(λx.xx) have a normal form?
Part 3 – foundations
G6021 Comparative Programming
13/36
Reduction graphs
The β-reduction graph of a term M, written Gβ(M), is the set: { N | M → ∗β N }
directed by →β. If several redexes give rise to M0 →β M1, then that many directed arcs connect M0 to M1.
Example
Gβ(WWW) with W ≡ λxy.xyy is:
WWW ‘ (λy .yyy )W TT
c
(λy .Wyy )W ‘
E
(λy .(λz .yzz )y )W
Part 3 – foundations
G6021 Comparative Programming
14/36
Reduction graph examples
Exercise: Draw the reduction graph for (II)(II), where I = λx.x. (II )(II )
I (II )
III
II
c
I
Why is one arrow marked “∗”?
Part 3 – foundations
G6021 Comparative Programming
15/36
‘
‘
∗
E
E
Reduction graph examples
Gβ((λx.xx)(λx.xx)) is:
(λx .xx )(λx .xx ) T
&%
Exercise: Draw Gβ((λx.x(II))(II)), where I = λx.x
Part 3 – foundations
G6021 Comparative Programming
16/36
Properties of Computations
Confluence: If M →∗β M1 and M →∗β M2 then there exists a term M3 such that M1 →∗β M3 and M2 →∗β M3
Normalisation: there exists a sequence of reductions which terminates
Strong Normalisation (or Termination): All reduction sequences terminate
The λ-calculus is confluent but not normalising (or strongly normalising).
Confluence implies unicity of normal forms: Each λ-term has at most one normal form.
Exercise:
Find a term that is not strongly normalising (i.e. a term that does not terminate).
Part 3 – foundations
G6021 Comparative Programming
17/36
Strategies for reduction
There can be many different ways in which a term can be reduced to a normal form (resp. WHNF).
The choice that we make can make a huge difference in how many reduction steps are needed.
The leftmost-outermost strategy finds the normal form, if there is one. But it may be inefficient.
Exercise:
Indicate whether the following λ-terms have a normal form: (λx .(λy .yx )z )v
(λx .xxy )(λx .xxy )
Part 3 – foundations
G6021 Comparative Programming
18/36
Remark
Most functional programming languages adopt the following ideas:
1 reduce to WHNF (do not reduce under an abstraction). Exercise: Why?
2 evaluate arguments in a specific way
The difference between many functional languages lies in the choice taken for the second point.
Part 3 – foundations
G6021 Comparative Programming
19/36
Evaluating Arguments
1 Call-by-Value (Applicative order of reduction):
evaluate arguments first so that we substitute the reduced terms (avoid duplication of work).
2 Call-by-name (Normal order of reduction): evaluate arguments each time they are needed.
3 Lazy Evaluation: evaluate arguments at most once.
Question:
Which is the best/worst strategy? Give examples to support your claims.
Part 3 – foundations
G6021 Comparative Programming
20/36
Arithmetic in the λ-calculus: Church Numerals We can define the natural numbers as follows:
0 = λx.λy.y
1 = λx.λy.x y
2 = λx.λy.x(x y)
3 = λx.λy.x(x(x y)) ···
Using this representation we can define arithmetic functions. Example, n → n + 1, is defined by the λ-term S:
λx.λy.λz.y((x y)z)
To check it:
Sn = (λx.λy.λz.y((x y)z))(λx.λy.x . . . (x(x y)) →β λy.λz.y((λx.λy.x . . . (x(x y)) y)z)
→∗β λy.λz.y(y…(y(yz))=n+1
Part 3 – foundations
G6021 Comparative Programming
21/36
In general, to define an arithmetic function
f : Natk → Nat
we will use a λ-term λx1 . . . xk .M , which will be applied to k numbers:
(λx1 …xk.M)n1 …nk
For example, the following term defines addition:
ADD = λx.λy.λa.λb.(x a)(y a b)
Exercise:
Check that this term applied to two numbers computes their sum. Hint: reduce the term λx.λy.λa.λb.(x a)(y a b)n m
Exercises:
1 Show that the λ-term MULT = λx.λy.λz.x(yz) applied to two Church numerals m and n computes their product m × n.
2 What does the term λn.λm.m (MULT n) 1 compute?
Part 3 – foundations
G6021 Comparative Programming
22/36
Booleans
We can represent Boolean values:
False = λx.λy.y
True = λx.λy.x
and Boolean functions. The function NOT is defined by
NOT = λx.(x False) True
We can check that this definition is correct:
NOT False = (λx.(x False) True)False →β (False False)True
→∗β True
and
NOT True = (λx.(x False) True)True →β (True False)True
→∗β False
Part 3 – foundations
G6021 Comparative Programming
23/36
Conditionals
The following term implements an if-then-else:
Note that and
Example:
IF = λx.λy.λz.(x y)z
IF B E1 E2 →∗β E1 if B = True IF B E1 E2 →∗β E2 if B = False
The function is-zero? can be defined as: λn.n(True False)True
Then
is-zero? 0 →∗β True
and
is-zero? n →∗β False if n > 0.
Part 3 – foundations
G6021 Comparative Programming
24/36
The cost of computing
We have seen that different reduction strategies can change the efficiency of the computation (also termination)
We can transform algorithms into more efficient versions. We look at one way in this course:
Continuation Passing Style
Note: tail recursive, or accumulating parameter style.
Program transformation is a very rich topic. Many open research topics here…
Part 3 – foundations
G6021 Comparative Programming
25/36
Continuations
Continuations were originally introduced in the study of semantics of programming languages: to allow the formal definition of control structures.
Jumps in programs are considered a “bad” thing (difficult to reason about)
Many constructs allow controlled jumps (conditional, loops, case, etc)
Do not allow jumping into the middle of a block or function body Continuations allow some of these features to be captured in a “clean” way:
Exceptions to leave a block or function early
callcc allows a point in the program to be “marked”. throw
returns to that point to continue the evaluation.
They are an advanced control construct available in some functional languages (notably Standard ML and Scheme).
Part 3 – foundations
G6021 Comparative Programming
26/36
Continuation Passing Style (CPS)
The idea of CPS is that every function takes an extra argument: a continuation.
A continuation is a function which consumes the result of a function, and produces the final answer.
Thus, a continuation represents the remainder of the current computation.
The simplest way to understand CPS is to think about evaluating a simple functional application:
Part 3 – foundations
G6021 Comparative Programming
27/36
Example CPS: Factorial
fact n = if n==0 then 1 else n*fact(n-1)
fact 4
Consider now the CPS form:
factcps n k = if n==0 then k 1
else factcps (n-1) (\r -> k (n*r))
factcps 4 (\x -> x)
The second argument k is the continuation.
Exercise:
1 What is the relationship between:
k (fact n) and factcps n k
2 What is one main difference between fact and factcps?
Part 3 – foundations
G6021 Comparative Programming
28/36
Factorial: evaluation
fact 4 = if 4==0 then 1 else 4*fact(4-1)
= 4*fact(3)
= 4*(if 3==0 then 1 else 3*fact(3-1))
= 4*(3*fact(2))
… = 4*(3*(2*(1*1)))
factcps 4 (\x -> x)
= factcps 3 (\r -> (\x -> x) (4*r))
= factcps 3 (\r -> (4*r))
= factcps 2 (\r -> (\r -> (4*r)) (3*r))
= factcps 2 (\r -> (4*(3*r)))
= factcps 1 (\r -> (4*(3*(2*r))))
= factcps 0 (\r -> (4*(3*(2*(1*r)))))
= (\r -> (4*(3*(2*(1*r)))))1
= (4*(3*2*(1*1)))
Part 3 – foundations
G6021 Comparative Programming
29/36
Tail Recursion
It is generally well-understood in compiler technology that tail recursive programs can be implemented more efficiently (because they can be transformed into a simple loop).
A well known example: Compare the following two functions:
rev [] = []
rev (h:t) = rev t ++ [h]
revacc [] acc = acc
revacc (h:t) acc = revacc t (h:acc)
Nothing remains to be done after the recursive call (recall the definition of ++).
Formally, we can show that rev l = revacc l []
Part 3 – foundations
G6021 Comparative Programming
30/36
Continued…
rev [] = []
rev (h:t) = rev t ++ [h]
revcps [] k = k []
revcps (h:t) k = revcps t (\r -> k(r++[h]))
Exercise: Verify that rev l = revcps l (\x -> x)
Note that all the continuations here can be represented by lists: \x -> x++l for some list l. Thus revcps can be simplified to revacc.
Part 3 – foundations
G6021 Comparative Programming
31/36
CPS
The previous examples are part of a rich theory of program transformation.
Many advanced compilers perform this transformation automatically (when possible).
In addition to eliminating recursion, these transformations add additional control in the form of strategies.
On a negative note, programs become higher-order, and we might loose termination properties.
Part 3 – foundations
G6021 Comparative Programming
32/36
Worked Example: factorial again
fact n = if n==0 then 1 else n*fact(n-1)
fact 4
Consider now the CPS form:
factcps n k = if n==0 then k 1
else factcps (n-1) (\r -> k (n*r))
factcps 4 (\x -> x)
We can simplify the continuation:
factacc n acc = if n==0 then acc
else factacc (n-1) (n*acc)
factacc 4 1
Part 3 – foundations
G6021 Comparative Programming
33/36
Other uses of CPS
Many programming languages have features like: goto (in pascal like languages) longjmp/setjmp in C
exceptions in Java, Haskell, SML, etc.
which allow for the change of control of a program (to exit the current block).
Continuations are a way of expressing these issues
Achieved by passing a stack as a value to functions: this stack allows the state of the computation to be reinstated at any point—we can move to any past state in a safe way.
Such stacks are known as reified control stacks. However, this is beyond the scope of this course…
Part 3 – foundations
G6021 Comparative Programming
34/36
Summary of CPS
All functions can be written in CPS style.
Some continuations have nice representations as accumulating parameters.
Tail recursive functions can be compiled into a loop: more efficient than a recursion.
Many other program transformation techniques for functional programming
Part 3 – foundations
G6021 Comparative Programming
35/36
Summary
The λ-calculus is the foundation of functional programming (and also the foundation of many programming concepts).
It is possible to program using only the λ-calculus, but easier if we allow data types (pattern matching, richer syntax, etc.)
Test out examples in the notes, and do exercises.
Try writing some of the λ-terms in Haskell
Can you write a data type in Haskell for representing λ-terms?
Part 3 – foundations
G6021 Comparative Programming
36/36