COMP3161/COMP9164 Supplementary Lecture Notes
Type Inference
Liam O’Connor December 10, 2019
Explicitly typed polymorphic languages, where the user must make explicit type abstractions and applications, such as the version of MinHS introduced with parametric polymorphism, are very awkward to use in practice. Ideally, we would like to leave these type annotations implicit, and have the compiler infer types for us.
Considering the following expression,
recfunf x=fstx+1
We could give a number of possible types to this expression: • (Int×Int)→Int
• (Int×Bool)→Int
• (Int×0)→Int
The exact type inferred must depend on the surrounding context, that is, the argument to which this function is applied. If the function is applied to many different arguments, then we would need to generalise the type to ∀a. (Int × a) → Int.
In order to make type annotations implicit, we will start with polymorphic MinHS but remove the following features:
• type signatures from recfun, let, etc.
• explicit type abstractions, and type applications (the @ operator).
• recursive types, because there is no unique most general type (principal type) for a given term if we have general recursive types.
Our types may still contain type variables quantified by the ∀ operator, however now the compiler, not the user, determines when to generalise and specialise types.
1 Implicitly-typed MinHS
The basic constructs of implicitly-typed MinHS are identical to explicitly-typed MinHS:
x:τ ∈Γ Γ⊢e1 :τ1 →τ2 Γ⊢e2 :τ1 Γ⊢x:τVar Γ⊢e e :τ
App
122 Γ⊢e1 :Bool Γ⊢e2 :τ Γ⊢e3 :τ
If
Γ⊢(Ife1 e2 e3):τ
1
For simplicity, however, we will treat constructors and primitive operations as functions, whose types are available in the environment. Uses of these operations and constructors are then just function applications:
(+) : Int → Int → Int, Γ ⊢ (App (App (+) (Num 2)) (Num 1)) : Int
Other functions are defined as usual with recfun, but now types are not mentioned in the term:
x : τ1,f : τ1 → τ2,Γ ⊢ e : τ2 Γ⊢(Recfun(f.x.e)):τ →τ Func
The two constructs for polymorphism, type abstraction (type) and application (the @ operator), have now been removed. But, we still have the typing rules that allow us to specialise a polymorphic type (replacing @):
Γ ⊢ e : ∀a.τ
Γ ⊢ e : τ[a := ρ]AllE
And to quantify over any variable that has not already been used (replacing type)1: Γ⊢e:τ a∈/TV(Γ)
12
2 An Algorithm
Γ ⊢ e : ∀a. τ
AllI
We want a fully deterministic algorithm for type inference, which has a clear input and output. We could imagine interpreting our existing rules as an algorithm, where the context and expression are the input and the type is the output:
Γ⊢e1 :τ1 →τ2 Γ⊢e2 :τ1 Γ⊢e1 e2 :τ2
infer :: Context → Expr → Type
However this causes problems when we examine the rules for polymorphism (AllE and AllI). Neither the rule to introduce nor the rule to eliminate ∀ quantifiers is syntax directed. They can be applied at any time. For example, our AllI rule:
Γ⊢e:τ a∈/TV(Γ) Γ ⊢ e : ∀a. τ
Because this rule works on any expression and context, we have an infinite number of possible types for every possible expression. Num 5 could be of type Int or ∀a. Int or ∀a. ∀b. Int etc.
In order to have an algorithmic set of rules, we need to fix not just when these rules are applied but also how they are applied. For example, the rule to specialise a polymorphic type replaces a quantified type variable with any type ρ, where this type is not able to be determined from the input context and expression:
Γ ⊢ e : ∀a.τ
Γ ⊢ e : τ[a := ρ]AllE
If the compiler makes the wrong decision when applying this rule, it can lead to typing errors even for well-typed programs:
Γ ⊢ fst : ∀a. ∀b. (a × b) → a · · · Γ⊢fst:(Bool×Bool)→Bool Γ⊢(Pair1True):(Int×Bool)
Γ ⊢ (Apply fst (Pair 1 True)) : ???
In the above example, we instantiated the type variable a to Bool, even though the provided pair
is actually Int × Bool.
1Where TV(Γ) here is all type variables occurring free in the types of variables in Γ
AllI
App
2
The Solution
To start with, we will make two decisions:
1. ∀ quantified type variables will be instantiated to particular types as soon as a polymorphic type is found in the context for a particular term variable. That is, we shall merge the AllE and Var rules, and not have a separate AllE rule.
2. ∀ quantifiers will only be introduced for the types of variables bound in let expressions. So, we will not have a separate AllI rule either.
Leaving the second decision aside for a moment, we still have a problem with the first. We have fixed when the rule is applied but not how: If we instantiate each ∀-quantified variable to a particular type as soon as possible, we will not (yet) know what type to instantiate it to. For example, looking up the type of fst in the context gives us a type ∀a. ∀b. (a × b) → a, but we do not know at that point what a and b should be replaced with.
To resolve this, we allow types to include unknowns, also known as unification variables or schematic variables. These are placeholders for types that we haven’t worked out yet. We shall use α, β etc. for these variables. For example, (Int × α) → β is the type of a function from tuples where the left side of the tuple is Int, but no other details of the type have been determined yet.
As we encounter situations where two types should be equal, we unify the two types to deter- mine what the unknown variables should be, producing a substitution to these unknowns.
Γ ⊢ fst : ∀a. ∀b. (a × b) → a · · ·
Γ ⊢ fst : (α × β) → α Γ ⊢ (Pair 1 True) : (Int × Bool)
Γ ⊢ (Apply fst (Pair 1 True)) : γ
In the above example, we instantiated the quantified variables a and b in the type of fst to α and β, and used a placeholder γ to refer to the return type of the overall function application. Once we inferred the type of the argument as (Int × Bool), we must now unify the type of the function we inferred ((α × β) → α) and the type of the function we expect based on the type of the argument we inferred (Int × Bool) → γ:
(α×β)→α ∼ (Int×Bool)→γ Once we unify these two types, we get the unifier substitution:
[α := Int,β := Bool,γ := Int]
Observe that if this substitution is applied to the two types above, they become the same.
Unifiers
A substitution S to unification variables is a unifier of two types τ and ρ iff Sτ = Sρ. Furthermore, it is the most general unifier, or mgu, of τ and ρ if there is no other unifier S′
where Sτ ⊑ S′τ.
Wewriteτ∼U ρifUisthemguofτandρ.
Sometimes two types do not have a unifier. A clear example is Int and String — both types
are concrete, and no amount of substitution to unknown variables will make them the same.
We can compute unifiers by structurally matching them. Our unify function would have a type like below, where the Type arguments do not include any ∀ quantifiers and the Unifier returned
is the mgu:
unify :: Type → Type → Maybe Unifier
3
We shall discuss cases for unify τ1 τ2:
1. Both are type variables: τ1 = v1 and τ2 = v2:
• v1 = v2 ⇒ empty unifier • v1 ̸=v2 ⇒ [v1 :=v2]
2. Both are primitive type constructors: τ1 = C1 and τ2 = C2: • C1 = C2 ⇒ empty unifier
• C1 ̸= C2 ⇒ no unifier
3. Bothareproducttypesτ1 =τ11×τ12 andτ2 =τ21×τ22.
(a) Compute the mgu S of τ11 and τ21.
(b) Compute the mgu S′ of Sτ12 and Sτ22.
(c) Return S ∪ S′
(same for sum, function types)
4. One is a type variable v, the other is just any term t. • v occurs in t ⇒ no unifier
• otherwise ⇒ [v := t]
5. Any other case ⇒ no unifier.
Try the algorithm out on the following examples:
1. α×(α×α)
2. (α×α)×β
3. Int+α ∼
4. (α×α)×α ∼ α×(α×α)
∼ β×γ ∼ β×γ α+Bool
[β := α, γ := (α × α)]
[γ := (α × α), β := (α × α)] (no unifier)
(no unifier)
The last example is particularly interesting because if we ignore the “occurs check” in case 4 of the algorithm, and naively try to structurally match, we end up with a substitution:
[α := (α × α)]
But, applying this substitution to both sides of the original problem yields:
((α×α)×(α×α))×(α×α) ∼ (α×α)×((α×α)×(α×α))
And both type terms are still not the same. Even worse, trying again yields the exact same
substitution we started with. This is called an infinite type error.
4
Type Inference Rules
We will decompose the typing judgement to allow for an additional output — a substitution that contains all the unifiers we have found about unknowns so far.
Inputs Expression, Context Outputs Type, Substitution
We will write this as SΓ ⊢ e : τ, to make clear how the original typing judgement may be reconstructed.
Our new, combined variable and instantiation rule replaces all quantified variables with fresh unknown variables. Here “fresh” just indicates that the variable name has never been used before:
(x : ∀a1. ∀a2. …∀an. τ) ∈ Γ Γ⊢x:τ[a1 :=α1,a2 :=α2,…,an =αn]
(α1…αn fresh)
Observe that when the variable’s type is not polymorphic (i.e. no quantifiers), then the above rule
simplifies to our previous Var rule.
S1Γ⊢e1 :τ1 S2S1Γ⊢e2 :τ2 S2τ1 ∼U (τ2 →α)
US2S1Γ ⊢ (Apply e1 e2) : Uα
(α fresh)
Our rule for function application above mirrors the process we used informally in the previous example. A type is inferred for the function (τ1) and a type for its argument τ2. We generate a new placeholder α for the overall type of the application, and unify the type of the function with the type we expect given the type of the argument. We also apply the substitution S2 we get from inferring τ2 to the type τ1 here, so that any information we learn about unification variables during the inference of τ2 is applied before we attempt to unify the two types. Ultimately, we return the unifier applied to the α placeholder as our type, and the union of all of the substitutions computed so far as our returned substitution.
S(Γ,x:α1,f :α2)⊢e:τ Sα2 ∼U (Sα1 →τ) USΓ ⊢ (Recfun (f.x. e)) : U(Sα1 → τ)
(α1, α2 fresh)
For functions, we generate two placeholders for the type of the function and its argument respec- tively, and then unify the function’s type with the expected one based on the inferred return type τ.
Let Generalisation
Earlier we decided to use let expressions as the syntactic point for ∀-generalisation. If we consider this example:
let f = (recfun f x = (x,x)) in (fst (f 4),fst (f True))
Just examining the inner recfun, we would compute a type like α → (α × α). The placeholder α would not be in use anywhere else — it would not be mentioned in the context outside of the recfun. We would expect the function f in the context to have a type like ∀a. a → (a × a). Thus, we can define our generalisation operation to take all free placeholder variables in the type that are not still in use in our context, and ∀ quantify them. More formally, we define Gen (Γ, τ ) = ∀(TV (τ ) \ TV (Γ)). τ
Then our rule for let expressions generalises the type before adding it to the context: S1Γ ⊢ e1 : τ S2(S1Γ,x : Gen(S1Γ,τ)) ⊢ e2 : τ′
S2S1Γ ⊢ (Let e1(x. e2)) : τ′
This means that let expressions are now not just sugar for a function application, they actually
play a vital role in the language’s syntax, as a place for generalisation to occur. 5
Overall
We’ve specified Robin Milner’s algorithm W for type inference, also called Damas-Milner type in- ference. Many other algorithms exist, for other kinds of type systems, including explicit constraint- based systems. This algorithm is restricted to the Hindley-Milner subset of decidable polymorphic instantiations, and requires that polymorphism is top-level — polymorphic functions are not first class.
6