COMP3161/COMP9164 Supplementary Lecture Notes
Parametric Polymorphism
Liam O’Connor November 7, 2019
Polymorphism is a prominent part of most modern programming languages. It allows some form of generic programming, where values of different types can be manipulated by the same function. Parametric polymorphism, sometimes called generics in OO languages, is the simplest form of polymorphism1, where a function can declared to operate over any type at all. For example,
suppose we had a swap function:
swap (x,y)=(y,x)
What would the type of this function be? In a monomorphic language like MinHS, we couldn’t write this function generically. We would have to have a variety of functions, swapBI : Bool×Int → Int×Bool, swapIB : Int×Bool → Bool×Int, and so on — a total of T2 functions where T is the number of types in the language2. This is obviously highly impractical, seeing as all these functions have the same implementation. What we want is to express a type swap : a × b → b × a for all types a and b. That is what parametric polymorphism gives us.
1 Type Parameters
Currently, all functions in MinHS take some values of a concrete type, and return a value of a concrete type. The literature on typed λ-calculus denotes these functions from values to values with the notation λ(x : τ). y, which is analogous to recfun (f : τ → τ′) x = y in MinHS. A function that constructs a pair of two integers could be written with this notation as follows:
mkIntIntPair = (λ(x : Int). (λ(y : Int). (x, y)))
This function takes an argument x of type Int, and returns a function, which, given a y of type Int, will produce a pair of x and y. This nesting of functions is how we achieve n-ary functions in Haskell and similar languages, and is called currying.
In order to get parametric polymorphism, we extend functions slightly. In addition to having functions from values to values, like above, we include functions from types to values, usually written like Λt. v. These Λ binders introduce type variables which can be used in type signatures for values wherever it is in scope. For example, a generic mkPair function could be written like this:
mkPair = Λa. Λb. λ(x : a). λ(y : b). (x,y)
Applying a type to one of these generic functions is called type application or specialisation, and is written a variety of ways in the literature, including mkPair@Int@Int, mkPair [Int] [Int] or mkPair {Int} {Int}.
1And yet, it remains one of the worst-implemented features of all time in C++, and it simply doesn’t exist in Go
2Since we have products and sums, T = ∞
1
Universal Quantification
To give a type to our new Λt. e form, and thus to our mkPair function, we need to reflect the type variables introduced by the Λ on to the type level, where they are introduced by the universal quantifier, ∀:
Γ⊢e:τ t∈/FV(Γ)
Γ ⊢ Λt. e : ∀t. τ
Note that we add a requirement that the introduced variable t is not in the set of free type variables
in the environment Γ. This ensures that we don’t have clashing type variable names. This generalisation rule allows us to provide a type to our mkPair function.
x:a;y:b⊢x:a x:a;y:b⊢y:b
x : a; y : b ⊢ pair(x, y) : a × b
x : a ⊢ λ(y : b). pair(x, y) : b → a × b
⊢ λ(x : a). λ(y : b). pair(x, y) : a → b → a × b
⊢ Λb. λ(x : a). λ(y : b). pair(x, y) : ∀b. a → b → a × b
⊢ Λa. Λb. λ(x : a). λ(y : b). pair(x, y) : ∀a. ∀b. a → b → a × b
To type the application of our mkPair function, we need a type for the type application form, e@τ: e : ∀a.τ′
e@τ : τ′[a := τ]
This states that we can substitute the type variable a in the type for e with the type after the @
and get a well-typed result. Now we can type a term like mkPair@Int@Bool 3 True:
· · · ⊢ mkPair : ∀ab. a → b → a × b
· · · ⊢ mkPair@Int : ∀b. Int → b → Int × b
···⊢mkPair@Int@Bool:Int→Bool→Int×Bool ···⊢3:Int
· · · ⊢ mkPair@Int@Bool 3 : Bool → Int × Bool · · · ⊢ True : Bool
· · · ⊢ mkPair@Int@Bool 3 True : Int × Bool
This lets us define functions that are generic over their arguments, as required, so now let us examine what extensions we need to add to MinHS to make parametric polymorphism possible in MinHS programs.
2 Applying to MinHS 2.1 New Syntax
We introduce two new forms of expression syntax: type a. e, which is analogous to the Λa. e notation from earlier, and e@τ for type application.
We also extend type syntax with the universal quantifier forall a. τ and type variables a, b, etc. This means our static semantics must ensure that types are well formed – that all type variables have an accompanying quantifier:
tbound∈∆
∆ ⊢ t Ok ∆ ⊢ Int Ok ∆ ⊢ Bool Ok
∆, a bound ⊢ τ Ok ∆⊢forall a.τOk
2.2 Typing Rules
The typing rules for type a. e are the same as the typing rules for Λα. e, only using our explicit wellformedness checking for types (which necessitates an additional context of type variables):
a bound, ∆; Γ ⊢ e : τ ∆; Γ ⊢ type a. e : ∀a. τ
∆⊢αOk ∆⊢βOk ∆ ⊢ α → β Ok
2
Similarly for type application:
∆;Γ⊢e:foralla.τ ∆⊢ρok ∆; Γ ⊢ e@ρ : τ [a := ρ]
2.3 Dynamic Semantics
Firstly, we evaluate the expression in the left-hand side as much as possible:
e →M e′ e@τ →M e′@τ
Once it has fully evaluated, we expect to see a type-abstraction remaining. Then we apply a substitution, similarly to normal function application (i.e. β-reduction):
(type a. e)@τ →M e[a := τ] 3 Implementation Techniques
Our simple dynamic semantics belies a complex implementation headache.
While we can easily define functions that operate uniformly on multiple types, when this is compiled to machine code the results may differ depending on the size of the type in question on the machine stack. For example, a function that takes a int parameter will allocate a different
amount of stack space than a function that takes a long parameter. There are two main approaches to solve this problem:
Template Specialisation In this approach, we automatically generate a monomorphic copies of each polymorphic function, one for each of the different types applied to it. For example, if we defined our polymorphic swap function:
swap = type a. type b.
recfun swap :: (a × b) → (b × a)
p = (snd p,fst p)
Then a type application like swap@Int@Bool would be replaced statically by the compiler
with the monomorphic version:
swapIB = recfun swap :: (Int × Bool) → (Bool × Int)
p = (snd p,fst p)
A new copy is made for each unique type application.
This approach is not too difficult to implement, produces code with minimal run-time over- head, and is relatively easy for a programmer to understand and debug. It also allows the compiler to make optimisations based on the specific types used for a polymorphic function. For example, a list of booleans could be represented more efficiently as a bit vector.
However, as a copy is made for each type application, the binary sizes end up rather large, which can slow compilation times. More importantly, it imposes a severe restriction on the kinds of polymorphism that can be allowed — we must be able to determine all of the possible type instantiations statically, which rules out polymorphic recursion and other situations where the type instantiations is dependent on run-time information.
Boxing An alternative to the copy-paste-heavy template instantiation approach is to make all types represented on the stack in the same way. Thus, a polymorphic function only requires one function in the generated code.
3
Typically this is done by boxing each type. That is, all data types are represented as a pointer to a data structure on the heap. If everything is a pointer, then all values use exactly 32 (or 64) bits of stack space.
The extra indirection has a run-time penalty, and it can make garbage collection more necessary, but it results in smaller binaries and unrestricted polymorphism.
4 Generality and Parametricity
If we need a function of type Int → Int, a polymorphic function of type forall a. a → a will do just fine, we can just instantiate the type variable to Int. But the reverse is not true. This gives rise to an ordering.
We say that a type τ is more general than a type ρ, often written ρ ⊑ τ, if type variables in τ can be instantiated to give the type ρ. For example, in the below example, the most general type is ∀a. a, as a could be instantiated to any type, therefore this type is more general than all other types.
Int→Int ⊑ ∀z.z→z ⊑ ∀xy.x→y ⊑ ∀a.a
As we get more and more general, the number of possible total, terminating implementations of
the type gets smaller and smaller:
This means that the more general we make our type, the more constrained we are in imple- mentation. This allows us to conclude some things about a function just by looking at its type signature.
For example, consider this list function foo:
foo :: ∀a. [a] → [a]
This function has a type that immediately lets me conclude that the output list consists only of elements in the input list. This is because foo is defined regardless of what the type a is. Therefore, the only values of type a that foo knows about are those values that are in the input. Therefore, the output list can only be made from elements in the input.
The formal principle that captures this intuition is called parametricity.
Formally, we can say that if we apply any arbitrary function f to the polymorphically-typed inputs of a polymorphic function, then that is equivalent to applying that same function to the polymorphically-typed outputs of the same function. For example, for foo above, the parametricity theorem is, for any f and ls:
map f (foo ls)=foo (map f ls) The identity function, id :: ∀a. a → a, has an even simpler theorem:
id (f x)=f(id x) There are many other examples, like head:
Type
Size
Int → Int ∀z. z → z ∀a b. a → b ∀a. a
(232 )232 1
0
0
Which has the following theorem:
head :: ∀a. [a] → a
f (head l) = head (map f l) 4
The list appending function: Which has the following theorem:
(++) :: ∀a. [a] → [a] → [a]
Concatenating a list of lists:
map f (a ++ b) = map f a ++ map f b concat :: ∀a. [[a]] → [a]
Has the following theorem:
map f (concat ls) = concat (map (map f) ls)
It even works for higher order functions, like filter:
filter :: ∀a. (a → Bool) → [a] → [a]
Which has the following theorem:
filter p (map f ls) = map f (filter (p ◦ f) ls)
5