Chapter 3
Type inference
We began Chapter 2 with the observation that the need to annotate every
variable with its type makes programming in System F𝜔 rather inconvenient.
In contrast it is often possible to write programs in OCaml without specifying
any types at all. For example, here is the the result of entering the composition
function at the OCaml top level:
# fun f g x -> f (g x) ; ;
– : ( ’ a -> ’b) -> ( ’ c -> ’a) -> ’ c -> ’b =
Comparing the code that we entered with the type that OCaml prints, we
see that OCaml has determined the following:
• f has type ’a -> ’b
• g has type ’c -> ’a
• x has type ’c
• the result of the function has type ’b
Although we didn’t specify the types of the variables f, g or x, OCaml de-
duced that f and g have function type, that the output of g has the same type
(’a) as the input of f, and that x has the same type (’c) as the input to g.
The process of determining a suitable type from an unannotated expression
is known as type inference or type reconstruction. As we shall see, the standard
algorithms for type inference have a number of appealing properties which have
led to their adoption in many typed functional programming languages. Per-
haps most importantly, the types that they infer are guaranteed to be the best
possible types (in a sense which we shall make more precise in Section 3.3.2), so
we may be confident that we do not lose generality when we allow type infer-
ence to do its job. The widespread adoption of type inference may be attributed
both to these properties and to the fact that it is able to support many con-
structs beyond the simple abstractions and applications in our example above,
including a form of polymorphism, algebraic data types, recursion (with some
restrictions) and imperative features such as mutable references.
31
32 CHAPTER 3. TYPE INFERENCE
3.1 Background
The appeal of type inference is easy to understand. Types give structure to
programs and can prevent many errors from occurring, but type annotations
can make programs quite difficult to read, as we saw in Chapter 2. If we could
somehow combine the succinctness of annotation-free code with the safety and
expressiveness of System F𝜔 then we would have the best of both worlds.
Unfortunately, it turns out that the problem of reconstructing a type for
an unannotated System F𝜔 program — or even for an unannotated System F
program — is undecidable. For type inference to be consistent and reliable we
must introduce a number of restrictions on exactly where polymorphism can
occur. In particular, we will only allow variables bound with let, not variables
bound with fun, to have polymorphic types, and we must place corresponding
restrictions on the places where quantifiers can occur. Whereas System F allow
quantified types such as ∀𝛼.𝛼 → 𝛼 to occur anywhere that types can occur, in
order to support type inference we must restrict types to prenex form — that
is, we will only allow quantifiers at the outermost level.
3.2 The calculus
We will present the type inference algorithm using the variant of 𝜆→ shown in
Figures 3.1–3.4. Since much of the calculus is now familiar we will remark only
on the changes.
In addition to base types and function types we have type variables (Fig-
ure 3.1), as in System F. However, unlike System F our calculus does not sup-
port quantifiers in types. Instead, we introduce a new category of schemes
(Figure 3.2). A scheme ∀ ̅𝛼.𝐴 is conceptually similar to a type, but supports
quantification over multiple variables — we write ̅𝛼 to denote a collection of type
variables 𝛼1, 𝛼2, … 𝛼𝑛. Further, the body of a scheme is not itself a scheme, but
a type: we cannot build schemes from other schemes. In fact, schemes occur
only in the environment: the environment rules (Figure 3.3) now associate term
variables with schemes, rather than with types.
Finally, there are four rules for constructing terms (Figure 3.4):
There is a new rule for let bindings, scheme-intro, which tells us that if M has
type A in an environment Γ then we can build a scheme from A by quantifying
over those free variables that do not occur in the environment and use the
scheme to type a second term N in an extended environment. The type of N
becomes the type of the whole term let x = M in N.
We have renamed the variable rule to scheme-elim to reflect the fact that
it is at uses of variables that we turn schemes back into types. If a variable x
is associated with the scheme ∀ ̅𝛼.𝐴 in an environment then we may give x the
type that results from instantiating the bound variables ̅𝛼 with with types �̅�.
The tvar rule from Chapter 2 is a special case of scheme-elim where ̅𝛼 is the
empty collection.
The →-intro rule is largely similar to the version we saw in Chapter 2, but
3.2. THE CALCULUS 33
ℬ-typesℬ is a type 𝛼-types𝛼 is a type 𝐴 is a type
𝐵 is a type →-types
𝐴 → 𝐵 is a type
Figure 3.1: Type formation rules
𝐴 is a type
scheme∀ ̅𝛼.𝐴 is a scheme
Figure 3.2: Scheme rule
Γ-·· is an environment Γ is an environment𝑆 is a scheme Γ-∶Γ, 𝑥∶𝑆 is an environment
Figure 3.3: Environment rules
Γ ⊢ 𝑀 ∶ 𝐴 ̅𝛼 ∉ 𝑓𝑣(Γ)
Γ, 𝑥∶ ∀ ̅𝛼.𝐴 ⊢ 𝑁 ∶ 𝐵
scheme-introΓ ⊢ let x = 𝑀 in 𝑁 ∶ 𝐵
𝑥 ∶ ∀ ̅𝛼.𝐴 ∈ Γ
Γ ⊢ 𝐵 is a type (𝐵 ∈ �̅�)
scheme-elim
Γ ⊢ 𝑥 ∶ 𝐴[ ̅𝛼 ∶= �̅�]
Γ, 𝑥∶𝐴 ⊢ 𝑀 ∶ 𝐵
→-introΓ ⊢ 𝜆𝑥.𝑀 ∶ 𝐴 → 𝐵
Γ ⊢ 𝑀 ∶ 𝐴 → 𝐵
Γ ⊢ 𝑁 ∶ 𝐴 →-elimΓ ⊢ 𝑀 𝑁 ∶ 𝐵
Figure 3.4: Typing rules
34 CHAPTER 3. TYPE INFERENCE
unify ∶ ConstraintSet → Substitution
unify(∅) = []
unify({𝐴 = 𝐴} ∪ 𝐶) = unify(𝐶)
unify({𝛼 = 𝐴} ∪ 𝐶) = unify([𝛼 ↦ 𝐴]𝐶)◦[𝛼 ↦ 𝐴] when 𝛼 ∉ 𝑓𝑡𝑣(𝐴)
unify({𝐴 = 𝛼} ∪ 𝐶) = unify([𝛼 ↦ 𝐴]𝐶)◦[𝛼 ↦ 𝐴] when 𝛼 ∉ 𝑓𝑡𝑣(𝐴)
unify({𝐴 → 𝐵 = 𝐴′ → 𝐵′} ∪ 𝐶) = unify({𝐴 = 𝐴′, 𝐵 = 𝐵′} ∪ 𝐶)
unify({𝐴 = 𝐵} ∪ 𝐶) = 𝐹𝐴𝐼𝐿
Figure 3.5: The unification algorithm
there is now no type annotation associated with the variable x. It is the job of
the inference algorithm to determine what the missing type should be.
The →-elim rule is unchanged.
It is important to note that in contrast to the typing rules of Chapter 2, the
rules in Figure 3.4 are not “algorithmic”: we need some mechanism not given in
the rules to find appropriate instantiations for the type A in the →-intro rule.
3.3 An algorithm for type inference
Before we introduce the type inference algorithm we need some auxiliary con-
cepts, substitutions and constraints, and an auxiliary algorithm, unification.
Substitutions Substitutions play a central role in the algorithm. A substitu-
tion is a map from variables to types. We write
[ a1 ↦ A1 , a2 ↦ A2 , … a𝑛 ↦ A𝑛 ]
to denote substitutions from variables a1, a2, … a𝑛 to types A1, A2, … A𝑛. For
example, the substitution
[ a ↦ ℬ , b ↦ (ℬ → ℬ) ]
maps a to ℬ and b to ℬ → ℬ.
Substitutions can be applied to types and larger objects which contain types,
such as environments. Applying a substitution to a type involves replacing
the free type variables in the type with the corresponding values in the map.
For example, applying the substitution [a ↦ ℬ, b ↦ (ℬ → ℬ)] to the type a→b→a
produces the type ℬ → (ℬ → ℬ) → ℬ.
We write 𝜎A for the application of the substitution 𝜎 to the type A. If there
is a substitution 𝜎 such that 𝜎A = B then we say that B is a substitution instance
of A.
3.3. AN ALGORITHM FOR TYPE INFERENCE 35
Constraints A constraint is a requirement that two types have a common
substitution instance. Constraints have the form A = B. For example, the fol-
lowing are all constraints:
𝑎 = 𝑏
𝑎 → 𝑏 = ℬ → 𝑏
ℬ = ℬ
ℬ = ℬ → ℬ
Some constraints are unsatisfiable. For example the last constraint above, ℬ
= ℬ → ℬ cannot be satisfied because there is no substitution that turns ℬ into
ℬ → ℬ or vice versa.
3.3.1 Unification
Unification is a partial function that either turns a set of constraints into a
substitution or fails when it encounters unsatisfiable constraints.
The unification algorithm is shown in Figure 3.5. The function unify accepts
a constraint set as input and produces a substitution (or fails). There are six
cases:
1. If there are no constraints in the set the result is the empty substitution.
2. Unifying a type A with itself has no effect: the constraint is dropped.
3,4 Unifying a variable a with a type A builds a substitution that maps a to
A. The notation 𝜎 ∘ 𝜎′ composes substitutions: applying the resulting
substitution has the same effect as applying 𝜎′ then applying 𝜎.
The check that a does not occur in the free type variables of A prevents
the construction of recursive types.
5 Unifying types A→B and A’→B’ proceeds by unifying A with A’ and B with
B’.
6 Attempting to unify any other types A and B causes unification to fail.
3.3.2 Algorithm J
The type inference algorithm described in this chapter was introduced by Robin
Milner in a 1978 paper, A theory of type polymorphism in programming. Mil-
ner’s paper presented two algorithms for reconstructing types for unannotated
expressions. Algorithm W is a purely functional algorithm that accumulates
substitutions and carefully applies them at appropriate points. Algorithm J is
an imperative algorithm that simulates W and uses a single global substitution
which is updated as type inference progresses. Algorithm W is amenable to
formal proofs, while Algorithm J is more efficient, and easier to understand. In
this chapter we will focus on Algorithm J.
One important property of both Algorithm W and Algorithm J is that they
infer principal types. A type A is principal for a term M if every other derivable
36 CHAPTER 3. TYPE INFERENCE
type for M is a substitution instance of A. The principal types property assures
us both that we can rely on the type inference algorithm to give us the best
possible type, and that any valid annotation that we add to our terms will be
compatible with the inferred type.
Figure 3.6 presents Algorithm J. The algorithm is defined by cases on the
four syntax constructors. The binding forms 𝜆x.M and let x = M in N both ex-
tend the environment, but only let bindings introduce schemes with quantifiers.
Schemes are instantiated at variable uses and unification takes place at function
applications, where we must ensure that the argument supplied to a function is
compatible with its inferred input type.
A worked example Let’s use algorithm J to find a type for the expression
let apply = 𝜆 f .𝜆x . f x in let id = 𝜆y . y in apply id
in an empty environment. The algorithm begins at the fourth case, since we’re
typing a let expression. We must find
J(·, let apply = 𝜆 f .𝜆x . f x in let id = 𝜆y . y in apply id )
The first step is to type the right hand side M, i.e. 𝜆f.𝜆x.f x:
J(·, 𝜆 f .𝜆x . f x)
The third case for J tells us how to type lambda abstractions. We pick a fresh
variable b1 for the type of f, add a binding to the environment and type the
body:
J(·, f : b1 , 𝜆x . f x)
The body is also a lambda abstraction, so we do the same again:
J(·, f : b1 ,x : b2 , f x)
Now we must type the application f x. We must first type the function term f.
We use the case for variables, which is trivial in this instance since the scheme
for f in the environment does not have any bound type variables to instantiate.
J(·, f : b1 ,x : b2 , f ) = b1
The typing for the argument term x is similar:
J(·, f : b1 ,x : b2 , x) = b2
Continuing with the case for application, we must unify, using a fresh variable
b3 for the result type:
unify ({b1 = b2 → b3 })
The result of unification is the map
[ b1 ↦ b2 → b3 ]
and the type of 𝜆x.f x is b2 → b3, and the type of 𝜆f.𝜆x.f x is b1 → b2 → b3.
Applying the substitution transforms this last type into (b2 → b3) → b2 → b3.
Returning to the let case, we can now extend the environment and type the
body. We first compute the free variables in the type and in the environment:
3.3. AN ALGORITHM FOR TYPE INFERENCE 37
Algorithm J infers a type for an expression M in an environment Γ, if there
is such a type. When the algorithm succeeds in producing a type A, the
type is guaranteed to be principal: that is, every other valid type for M in
Γ is a substitution instance of A.
J : Environment × Expression → Type
The algorithm operates on a single global substitution E.
There are four cases, corresponding to the four term constructors.
J (Γ , x) = A[ �̅�:=�̅� ]
where Γ(x) = ∀�̅�.𝐴
and �̅� are f re sh
The first case deals with variables. A variable x is given a type by instan-
tiating the scheme ∀ ̅𝛼.𝐴 associated with x in Γ — that is, by replacing the
quantified variables ̅𝛼 with fresh variables ̅𝑏, which may later be unified
with other types.
J (Γ , M N) = 𝑏
where A = J (Γ , M)
and B = J (Γ , N)
and unify ’ ({A = B → b}) succeeds
and b i s f re sh
The second case deals with applications. An application M N is given a
type by finding types A and B for M and N, then unifying A with a function
type whose argument type is B. The unify’ procedure (not shown) calls unify
to build a substitution from the constraint set argument, then applies the
substitution to E to produce the new value of E.
J (Γ , 𝜆x .M) = b → A
where A = J (Γ , x∶ 𝑏 , M)
and b i s f re sh
The third case deals with function bindings. A function 𝜆x.e is assigned a
type by choosing a fresh variable b for the argument, adding the argument
to the environment, then inferring a type A for the body M. The fresh
variable may be unified with a concrete type if x is used in M.
J (Γ , let x = M in N) = B
where A = J (Γ , M)
and B = J (Γ , x :∀�̅�.𝐴 , N)
and �̅� = ftv (A) \ ftv (Γ)
The fourth case deals with let bindings. A let binding let x = M in N is given
a type by first finding a type A for M, then generalising A by quantifying
over any free type variables ̅𝛼 in A that do not appear in Γ. The type B of
the body M is then determined in an environment extended with x:∀�̅�.𝐴.
Figure 3.6: Algorithm J
38 CHAPTER 3. TYPE INFERENCE
f tv ((b2 → b3 )→ b2 → b3 ) = {b2 , b3}
ftv ( ·) = {}
So ftv(A)\ ftv(Γ) is {b2, b3} and we extend the environment with a scheme that
quantifies over b2 and b3. We must next find
J( · , apply :∀𝛼2𝛼3 . ( 𝛼2 → 𝛼3 )→ 𝛼2 → 𝛼3 , let id = 𝜆y . y in apply id )
Once again, we first type the right hand side 𝜆y.y
J( · , apply :∀𝛼2𝛼3 . ( 𝛼2 → 𝛼3 )→ 𝛼2 → 𝛼3 , 𝜆y . y)
We pick a fresh type variable b4 for y and type the body:
J( · , apply :∀𝛼2𝛼3 . ( 𝛼2 → 𝛼3 )→ 𝛼2 → 𝛼3 ,y : b4 , y) = b4
So the type of 𝜆y.y is b4 → b4. We must generalize the type and then type the
body of the let. We have
f tv (b4 → b4 ) = {b4}
ftv (·, apply :∀𝛼2𝛼3 . ( 𝛼2 → 𝛼3 ) → 𝛼2 → 𝛼3 ) = {}
so we quantify over b4 and add the scheme to the environment. We then have
J(·, apply :∀𝛼2𝛼3.(𝛼2 → 𝛼3) → 𝛼2 → 𝛼3 , id :∀𝛼4.𝛼4 → 𝛼4 , apply id )
Typing the left and right hand sides of the application gives
J(·, apply :∀𝛼2𝛼3.(𝛼2 → 𝛼3) → 𝛼2 → 𝛼3 , id :∀𝛼4.𝛼4 → 𝛼4 , apply )
= (b5 → b6 )→ b5 → b6
J(·, apply :∀𝛼2𝛼3.(𝛼2 → 𝛼3) → 𝛼2 → 𝛼3 , id :∀𝛼4.𝛼4 → 𝛼4 , id )
= b7 → b7
We choose a fresh variable b8 for the result of the application and unify to build
a substitution:
unify ({(b5 → b6 )→ b5 → b6 = (b7 → b7 ) →b8 })
= unify ({b5 → b6 = b7 → b7 ,
b5 → b6 = b8 })
= unify ({b5 = b7 ,
b6 = b7 ,
b5 → b6 = b8 })
= {b5 ↦ b7 , b6 ↦ b7 , b8 ↦ b5 → b6}
Applying the substitution to b8 gives b7 → b7, which is the inferred type for the
whole expression.
3.4 Type inference in practice
Up to this point we have focused on the core type inference algorithm. There
are a number of additional considerations when programming in a language with
type inference.
3.4. TYPE INFERENCE IN PRACTICE 39
3.4.1 Turning let bindings into function arguments
System F𝜔 has the appealing property that it is always possible to generalize a
function by turning its free identifiers into parameters. Unfortunately, OCaml
does not share this property: whether an identifier used inside a function can
be turned into a parameter depends on whether the identifier is used polymor-
phically. For example, consider the function that “doubles” each of a pair of
lists using the list append operator @:
let double2 (( l : int l i s t ) , ( r : s t r ing l i s t ) )
= ( l @ l , r @ r )
The append operator has a type that allows it to be used both on lists of integers
and lists of strings:
val (@) : ’ a l i s t -> ’a l i s t -> ’a l i s t
We might like to generalize the function by turning the append operation into
a parameter as follows:
let double2 (@) (( l : int l i s t ) , ( r : s t r ing l i s t ) )
= ( l @ l , r @ r )
This definition is rejected by OCaml because it would require the first param-
eter of double to be given a polymorphic type, which is incompatible with type
inference. On the other hand, the following definition is accepted, since the
parameter @ is only used at a single type in the body of the function:
let double2 (@) (( l : int l i s t ) , ( r : int l i s t ) )
= ( l @ l , r @ r )
As this example illustrates, the restriction to prenex polymorphism brings a kind
of non-uniformity to languages with type inference, since only certain types of
values can be lambda-abstracted.
Chapter 6 describes some of the features available in OCaml that make it
possible to work around these limitations.
3.4.2 Type inference and recursion
Extending the calculus in Figures 3.1–3.4 to handle recursion is straightforward.
The following rule supports an OCaml-style let rec binding:
Γ, 𝑥∶𝐴 ⊢ 𝑀 ∶ 𝐴 ̅𝛼 ∉ 𝑓𝑣(Γ) Γ, 𝑥∶ ∀ ̅𝛼.𝐴 ⊢ 𝑁 ∶ 𝐵
let-recΓ ⊢ let rec x = M in N ∶ 𝐵
Unlike the rule for let binding, the environment is extended both when typing
N and when typing M, since let rec brings x into scope in both expressions.
However, the terms M and N are not typed in the same environment. It is
only when typing N that the free variables in the type A are generalized. When
we begin typing M we do not yet know the type A, and so it is not possible
to determine its free variables. In concrete terms this means that the type
inference algorithm will reject functions that are used polymorphically in their
own definitions.
40 CHAPTER 3. TYPE INFERENCE
Let’s look at an example. In Chapter 2 we saw the following data type
definition:
type ’ a per f ec t =
ZeroP : ’ a -> ’a per f ec t
| SuccP : ( ’ a * ’a) per f ec t -> ’a per f ec t
Even fairly simple functions over values of type perfect require polymorphic
recursion. For example, consider the function that determines the depth of a
tree:
let rec depthP = function
ZeroP _ -> 0
| SuccP p -> succ (depthP p)
We expect the function to have type ’a perfect -> int, but the recursive call
is given a value of type (’a * ’a) perfect. Consequently, type inference fails and
OCaml rejects the definition:
Characters 67 -68:
| SuccP p -> succ (depthP p)
^
Error : This express ion has type ( ’ a * ’a) per f ec t
but an express ion was expected of type ’a per f ec t
The type var iab le ’ a occurs ins ide ’a * ’a
We can fix the problem by telling OCaml the expected type in advance —
the equivalent of supplying a pre-generalized A to use when typing M in the
let-rec rule:
let rec depthP : ’a . ’ a per f ec t -> int = function
ZeroP _ -> 0
| SuccP p -> succ (depthP p)
While nested data types like perfect are fairly rare, polymorphic recursion is
often needed when using GADTs (Chapter 8), which are increasingly common
in OCaml programs.
3.4.3 Supporting imperative programming: the value re-
striction
A number of other OCaml features besides recursion have non-trivial interac-
tions with type inference. In particular, the presence of mutable state requires
changing the type inference algorithm to avoid unsoundness.
Mutable references in OCaml OCaml supports mutability via the mutable
keyword, which can be attached to record fields. For example, here is the
definition of the standard library ref type, which represents a mutable reference
cell:
type ’ a r e f = { mutable contents : ’ a }
The standard library also provides a number of functions for manipulating
reference cells. The ref function creates a fresh reference cell:
3.4. TYPE INFERENCE IN PRACTICE 41
val r e f : ’ a -> ’a r e f
The prefix operator ! retrieves the current contents of a reference cell:
val ( ! ) : ’ a r e f -> ’a
The infix operator := updates the contents of a reference cell:
val (:=) : ’ a r e f -> ’a -> unit
We can use this interface to construct an example that shows the unsound-
ness that results when the inference algorithm is applied to a language with
mutable state:
let r = re f None in
r := Some ”boom”;
match ! r with
None -> ()
| Some f -> f ()
The unmodified Algorithm J infers the scheme ’a .’a option ref for the expres-
sion ref None. This scheme allows r to be instantiated at the type string option ref
on the second line, which stores the value Some ”boom” in the cell, and instan-
tiated at the type (unit -> unit) option ref on the last line, which retrieves the
contents of the cell and invokes it. Treating the string stored in the cell as a
function causes the program to crash.
How can we characterise this unsoundness? If we view the inference algo-
rithm in the light of what we know about System F then things become clear.
In System F generalisation and instantiation appear as a type of function ab-
straction (Λ𝛼.M) and application (M [𝛼]). If we view the inference algorithm as
a program transformation that transforms terms to insert these introduction
and elimination forms for polymorphism then the problem is apparent: chang-
ing ref None into Λ𝛼.ref None changes the order of evaluation, since Λ𝛼.M delays
evaluation of the term M. This intuition suggests the solution: we should only
allow generalization of a type when the associated term is already in value form
— that is, when evaluating it has no effect.
This solution, was adopted by SML and by OCaml following a study in 1995
of several ML codebases by Andrew Wright. Wright discovered that almost all
terms that were used polymorphically were already syntactic values, making the
“value restriction” relatively benign. For those terms that are not in value form
𝜂-expansion often fixes the problem: we can turn
let f = g x in e
into
let f = fun y -> g x y in e
to recover the lost polymorphism. Of course, if g x does not have function type,
or if the program depends on the existing evaluation order then this solution is
not suitable, but Wright found almost no bindings that could not be straight-
forwardly fixed using this approach and the value restriction was soon adopted
for both Standard ML and OCaml.
42 CHAPTER 3. TYPE INFERENCE
3.4.4 Relaxing the value restriction
Although Wright’s study found few cases where the value restriction caused
existing code to be rejected, there are situations where restricting generalisation
to syntactic values is a source of inconvenience. For example, we might represent
a bounded list type, which can hold no more than some specified number of
elements, as a pair of the bound and the list:
type ’ a b l i s t = int * ’a l i s t
It is straightforward to create empty blist values:
let bl5 : ’ a b l i s t = (5 , [ ] )
let bl3 : ’ a b l i s t = (3 , [ ] )
We might like to abstract empty blist creation with a function:
let empty_blist : int -> ’a b l i s t = fun n -> (n , [ ] )
However, the value restriction means that we lose polymorphism by abstract-
ing. The following definition results in a non-generalized type for bl5, since the
right hand side is not a syntactic value.
let bl5 = empty_blist 5
Since the expression does not have a function type it is not possible to recover
the lost polymorphism by eta expansion.
As this example illustrates, the simple check for whether an expression is
a syntactic value is sometimes too crude a test. For this reason OCaml uses
a refinement of the value restriction which assigns polymorphic types to more
expressions. In order to understand OCaml’s approach it is first necessary to
understand the concepts of covariance and contravariance.
Covariant and contravariant type variables In a function type s -> t the
types s and t play quite different roles: s indicates the type of values that are
consumed by the function, whereas t indicates the type of values produced.
These different roles must form part of our reasoning when we consider the
effects the changes in a function’s behaviour have on callers of that function.
We can always extend a function to accept a wider range of inputs without a
negative impact on callers; similarly, it is always safe to restrict a function to
produce a narrower range of outputs. However, in neither case is the converse
true: either restricting the input set or expanding the output set breaks the
“contract” between the existing function and its callers.
For simple function types like s -> t it is straightforward to identify inputs
and outputs. However, we sometimes want to reason about more complex types,
where it is helpful to generalize the idea of input and output to contravariant
and covariant positions in a type. For example, consider the type
type ’ a pr inter = ’a -> st r ing
Since the type parameter ’a is used as an input in the definition of printer,
OCaml marks it as contravariant1. Contravriance acts like negation, so that a
1The contravariance flag is an internal feature that is not usually displayed
3.4. TYPE INFERENCE IN PRACTICE 43
contravariant parameter of a type constructor that is itself in a contravariant
position is covariant. For example, in the type
( ’ a pr inter ) pr inter
the type ’a is marked as covariant.
Here is a larger example. The following type has two type variables ’a and
’b which respectively appear only in covariant and contravariant positions:
( ’ a -> ’b) -> ’a pr inter -> ’b pr inter l i s t
We can visualize the type as a tree, where each box represents a contravariant
position:
→
→
𝛼+ 𝛽−
→
printer
𝛼+
list
printer
𝛽−
Each type variable that is surrounded by an odd number of boxes is in a
contravariant position, and every other type variable is covariant.
What does variance have to do with the value restriction? Since covariant
positions represent outputs we know that if a type variable is only used in
covariant positions it represents something that is only read, not written2. It is
safe to generalize such variables because the unsound situation in Section 3.4.3,
where a reference cell was modified at one type and accessed at another, can
never occur.
Let’s look at the blist example again in the light of this “relaxed value re-
striction”. The expression
empty_blist 5
has type ’a blist , which is equivalent to int * ’a list .In this type the variable ’a
occurs only covariantly, so it is safe to generalize, which is what OCaml does. On
the other hand the parameter of the ref type constructor is considered invariant
by OCaml, since it can be used for both input and output, so the expression
r e f None
which has type ’a option ref is not generalized.
3.4.5 Closing remarks
Historically type inference has played a large part in the development of OCaml
and other ML-family languages, and there has been reluctance to add new
2The actual justification for generalizing covariant type variables is based on a technical
argument about subtyping.
44 CHAPTER 3. TYPE INFERENCE
features for which types could not be inferred. In recent years type inference
has played a less central role, as OCaml has acquired features like GADTs
(Chapter 8) which do not support inference. This shift in focus has brought
with it a number of smaller changes: for example, recent versions of OCaml use
types to disambiguate different data constructors which have the same name.
In the next chapter we will consider the correspondence between proposi-
tions and types, and between programs and proofs. Viewed from this perspec-
tive, type inference seems a little strange. It is easy to see why we might want
a proof system where we can specify propositions and have the computer syn-
thesize proofs, but it is less clear why we might want to specify proofs and have
the computer infer propositions. The more expressive types available in modern
functional programming languages have less to do with simple data layout and
more to do with structuring programs and describing behaviour. As types ac-
quire this larger and more interesting role, we might start to question whether
leaving them to computers to determine is the right decision after all.
3.5 Exercises
1. [HH] Step through the inference algorithm for the following expression
l e t k = 𝜆x .𝜆y . x in
l e t i = 𝜆x . x in
k k i
2. [HHH] What happens in the following example if the free variables from
the environment are not removed when inferring a type for the let binding?
l e t f = 𝜆x .
l e t f = 𝜆y . x in
( f ⟨⟩) ⟨⟩
in f ⟨⟩
(Assume that ⟨⟩ is assigned the type unit and behaves like OCaml’s unit
value.)
3. [HHH] Give an example of an OCaml function that is rejected due to the
lack of polymorphic types for function arguments.
4. [HHH] OCaml rejects the following function with an error:
let rec f = fun (x , y) -> f (y , (x , x) )
Add a type annotation (between f and =) that allows the function to pass
OCaml’s type checking.
5. [HHH] Give an example of a function that is excluded by OCaml’s relaxed
value restriction but that would behave correctly when executed.
6. [HH] The check in the unification algorithm that prevents the formation
of recursive types can be disabled in OCaml by the command-line flag
3.5. EXERCISES 45
-rectypes. Find a function that is accepted by OCaml when -rectypes is
specified but rejected otherwise.
46 CHAPTER 3. TYPE INFERENCE
Further reading
• Milner’s 1978 paper introduced type inference for polymorphic pro-
grams, and describes the algorithms W and J that form the basis of
many type inference implementations:
A theory of type polymorphism in programming
Robin Milner
Journal of computer and system sciences 17, 348–375 (1978)
• Milner’s paper showed that Algorithm W is sound. The follow-up
paper shows that Algorithm W is also complete — i.e. that it infers
principal types.
Principal type-schemes for functional programs
Luis Damas and Robin Milner
Principles of programming languages (1982)
• Wright’s paper introducing the value restriction contains an interest-
ing account of the history of combining type inference with imperative
programming.
Simple imperative polymorphism
Andrew K. Wright
Lisp and symbolic computation 8(4), 343-355 (1995)
• The relaxed value restriction used by OCaml is described in the fol-
lowing paper:
Relaxing the value restriction
Jacques Garrigue
International symposium on functional and logic programming (2004)