COMP3161/COMP9164 Supplementary Lecture Notes
Data Types
Liam O’Connor, Gabriele Keller November 5, 2019
1 Composite Data Types
Up to now, we only discussed primitive data types, like integers boolean values, and function types. This is not only an inconvenience, but it seriously restricts the expressiveness of the language. For example, we cannot define a function which returns multiple values at once, or, depending on the input, a boolean or an integer value. In this section we will extend MinHS to have a type system that is able to encode more complex and interesting data types.
1.1 Products
Many languages include ways to group types together into one composite type – a struct in C, a class in Java. In Haskell, we can bundle a fixed number of values of possibly different types by simply using tuples. For example, the function
foo :: Int -> (Bool, Int )
foo x = (even x, x * x )
takes an int-value as argument and return a pair of a boolean and an integer value. In this example, we can see that ( , ) is overloaded to be textbf0h a type constructor (in the type annotation, constructing the type pair of Bool and Int) and a data constructor (in the function body, constructing a new value by bundling a boolean value and an int-value). Similarly, we can construct tuples of (theoretically) arbitrary size in Haskell.1 More values can be bundled, if this is for some reason necessary, by using nested tuples. There are also ways to define record types, similar to C-structs in that the fields can be referred to by names, not just their position.
In MinHs, we keep it as simple as possible, and only introduce a pair constructor. The type of pairs is called a product type, for reasons that will become clear later2. The type of the pair of τ1 and τ2 is therefore written τ1 × τ2, in contrast to Haskell.
We also introduce a data constructor, ( , ) to MinHS, which produces values of a product type:
Γ⊢e1 :τ1 Γ⊢e2 :τ2 Γ ⊢ (e1, e2) : τ1 × τ2
Note that we can combine more than two types simply by using nested products: a type Int × (Int × Int) bundling three integer values can be constructed with (3, (4, 5)).
1In practice, most compilers only support tuples up to some fixed size, but according to the standard, it should not be less than 15.
2Product types are analogous to cartesian products in set theory, for those that are familiar with sets.
1
Once we have a pair value, how do we get each component out of it? We introduce a bit more syntax for some destructors, fst and snd, which, given a pair, return the first or second component of the pair respectively:
Γ⊢e:τ1 ×τ2 Γ⊢e:τ1 ×τ2 Γ ⊢ fst e : τ1 Γ ⊢ snd e : τ2
Evaluation rules are omitted as they are straightforward.
1.2 Sums
Sum types are less common than product types in programming languages, but they can be simulated with tagged unions in C and interface inheritance in Java. In essence τ1 + τ2 allows either a value of type τ1 or a value of type τ2, but not textbf0h. In Haskell, an equivalent definition would be:
data Sum a b = InL a | InR b
So, we will introduce similar constructors, InL and InR, for our sum type in MinHS:
Γ ⊢ e : τ1 Γ ⊢ e : τ2
Γ ⊢ InL e : τ1 + τ2 Γ ⊢ InR e : τ1 + τ2
Once again, we can combine multiple types to give n-ary sums via nesting.
Instead of function-like destructors for sum types, we will introduce a new Case expression, similar to Haskell’s equivalent, which provides two alternative expressions. One for InL, and one
for InR. As an example:
recfun sumToProd :: (Int + Int -> Bool * Int) e =caseeofInLu =(False,u)
InR v = (True, v);
The abstract syntax of a case expression is as follows: Case τ1 τ2 e (x.e1) (y.e2), where the input sum value e is of type τ1 + τ2, and e1 is the alternative for InL, and e2 is the alternative for InR. Typing rules:
Γ⊢e:τ1 +τ2 x:τ1,Γ⊢e1 :τ y:τ2,Γ⊢e2 :τ Γ⊢Caseτ1 τ2 e(x.e1)(y.e2):τ
Big step evaluation rules:
e⇓InLv e1[x:=v]⇓r e⇓InRv e2[y:=v]⇓r
Case τ1 τ2 e (x.e1) (y.e2) ⇓ r Case τ1 τ2 e (x.e1) (y.e2) ⇓ r
With this, we can also make a four-valued type: Bool + Bool has values InL False, InR False,
InL True, InR True. 1.3 Unit Type
Could we define a type with only three values? At the moment, the smallest type we’ve got is Bool, which has two values, and textbf0h products and sums of Bool give us a type too large with four values.
2
The way we resolve this is to introduce a new type, called 1, sometimes written ⊤, which has exactly one value – (). It can be thought of as the “empty” tuple, and corresponds to a void type in C.
Typing rules are obvious:
Γ ⊢ () : 1
With this type, we can construct a type of three values using a sum type: The type 1 + (1 + 1)
has three values: InL (), InR (InL ()), InR (InR ()). 1.4 Empty Type
We add another type, called 0, that has no inhabitants. This can be useful in a function’s type, as a function like Int → 0 indicates that the function will not return3.
Because it is empty, there is no way to construct it. We do have a way to eliminate it, however:
Γ⊢e:0 Γ⊢absurde: τ
If I have a variable of the empty type in scope, we must be looking at an expression that will never be evaluated. Therefore, we can assign any type we like to this expression, because it will never be executed.
1.5 Type Isomorphism
The above type of three values could be represented as 1 + (1 + 1) as shown, but it could also be represented as 1 + Bool (as it also has three values). We will define the notion of type isomorphism to capture the similarity between these terms.
Formally, two mathematical objects A and B are considered isomorphic if there exists a structure preserving mapping (or a morphism) f from A to B, and an inverse morphism f−1 from B to A such that f ◦ f−1 is an identity morphism IA from A to A (i.e f(f−1(x)) = x) and f−1 ◦f is an identity morphism IB from B to B (i.e f−1(f(y)) = y). It is analogous to bijection in set theory or logic, and is often written as A ≃ B where A is isomorphic to B.
Seeing as types can be thought of as (slightly restricted) sets of values, we can construct such mappings to show types A and B isomorphic if and only if A has the same number of values as B.
As an example, we could map InL () to InL (), InR (InL ()) to InR False and InR (InR ()) to InR True (and vice-versa) and thus show that 1+(1+1) ≃ 1+Bool. Generally, it is sufficient to show that the two types have the same number of values in order to show that they are isomorphic.
As you have probably already realised, computing the size of types can be achieved easily by using this function:
3Some functions are expected to not return, for example functions that throw exceptions.
3
1.6
Type Algebra
|0| =0 |1| =1
|Bool| |Int|
|τ1 ×τ2| |τ1 +τ2| |τ1 → τ2|
= 2
= 232
= |τ1| × |τ2| = |τ1| + |τ2| = |τ2 ||τ1 |
for total, terminating functions only
It is worth noting that seeing as type equivalence is based simply on the number of values within the types, and sum and product types correspond to addition and multiplication of type size, types follow all the same algebraic laws as the natural numbers. This is called a commutative semiring.
Laws for (τ, +, 0):
• Associativity: (τ1 + τ2) + τ3 ≃ τ1 + (τ2 + τ3) • Identity: 0+τ ≃ τ
• Commutativity: τ1 + τ2 ≃ τ2 + τ1
Laws for (τ, ×, 1)
• Associativity: (τ1 × τ2) × τ3 ≃ τ1 × (τ2 × τ3) • Identity: 1×τ ≃ τ
• Commutativity: τ1 × τ2 ≃ τ2 × τ1
Combining × and +:
• Distributivity: τ1 ×(τ2 +τ3) ≃ (τ1 ×τ2)+(τ1 ×τ3) • Absorption: 0 × τ ≃ 0
1.7 Curry-Howard Correspondence
Before moving on, let us gather all the typing rules we have defined so far, including typing rules for λ-functions and application:
Γ⊢e:0
Γ ⊢ absurd e : τ
Γ ⊢ e : τ1
Γ ⊢ InL e : τ1 + τ2
Γ ⊢ () : 1 Γ ⊢ e : τ2
Γ ⊢ InR e : τ1 + τ2 Γ⊢e:τ1 +τ2 x:τ1,Γ⊢e1 :τ y:τ2,Γ⊢e2 :τ
Γ ⊢ (case e of InL x → e1;InR y → e2) : τ
Γ⊢e1 :τ1 Γ⊢e2 :τ2 Γ⊢e:τ1 ×τ2 Γ⊢e:τ1 ×τ2 Γ ⊢ (e1, e2) : τ1 × τ2 Γ ⊢ fst e : τ1 Γ ⊢ snd e : τ2
Γ⊢e1 :τ1 →τ2 Γ⊢e2 :τ1 x:τ1,Γ⊢e:τ2
Γ⊢e1 e2 :τ2
Γ⊢λx.e:τ1 →τ2
4
If I remove all the terms, that is, everything between the ⊢ and the type, and just leave contexts and types in the rules, I get something quite remarkable:
Γ⊢0
Γ⊢τ Γ⊢1
Γ ⊢ τ1 Γ ⊢ τ2 Γ⊢τ1 +τ2 Γ⊢τ1 +τ2
Γ⊢τ1 +τ2 Γ⊢τ2
τ2,Γ⊢τ Γ⊢τ1 ×τ2
Γ ⊢ τ2 τ1,Γ⊢τ2
Γ ⊢ τ1 → τ2
τ1,Γ⊢τ Γ⊢τ Γ⊢τ1 ×τ2
Γ ⊢ τ1 Γ⊢τ1
Γ⊢τ1
Γ ⊢ τ1 × τ2
Γ⊢τ1 →τ2
Γ ⊢ τ2
As it happens, the rules for our types correspond exactly to the rules of constructive propositional logic!
The type 1 corresponds to True or ⊤, the type 0 corresponds to False or ⊥, sum types corre- spond to disjunctions, and product types to conjunctions. Functions correspond to implication.
Γ⊢⊥
Γ⊢P Γ⊢⊤
Γ ⊢ P1 Γ ⊢ P2 Γ⊢P1 ∨P2 Γ⊢P1 ∨P2
Γ⊢P1 ∨P2 Γ⊢P1 Γ⊢P2
Γ ⊢ P1 ∧ P2 Γ⊢P1 →P2
Γ ⊢ P2
P1,Γ⊢P Γ⊢P Γ⊢P1 ∧P2
Γ ⊢ P1 Γ⊢P1
P2,Γ⊢P Γ⊢P1 ∧P2
Γ ⊢ P2 P1,Γ⊢P2
Γ ⊢ P1 → P2
This means that constructing a well-typed program is equivalent to constructing a proof of the proposition encoded by its type. Programs are proofs, and types are propositions.
This correspondence goes by many names, but is usually attributed to Haskell Curry and William Howard. It is a very deep result, carrying through every aspect of programs and proving:
It turns out, no matter what logic you want to define, there is always a corresponding λ-calculus, and vice versa.
5
Programming
Logic
Types Programs Evaluation
Propositions Proofs
Proof Simplification
Constructive Logic Classical Logic Modal Logic Linear Logic Separation Logic
Typed λ-Calculus Continuations Monads
Linear Types, Session Types Region Types
Philip Wadler has written a much more detailed exposition of this correspondence in this very readable paper, located here:
http://homepages.inf.ed.ac.uk/wadler/papers/propositions-as-types/propositions-as-types.pdf
He has also given a nice talk on the topic at Strange Loop in 2015:
Some examples of proofs of basic logical properties are given below:
To prove that A ∧ B → B ∧ A, i.e. commutativity of conjunction, we can just flip the order
of the values in the pair:
andComm :: A × B → B × A andComm p = (snd p, fst p)
We can prove transitivity of implication as well:
transitive :: (A → B) → (B → C) → (A → C)
transitive f g x=g (f x) Transitivity of implication is just function composition.
1.7.1 Caveats
This correspondence is predicated on the condition that all functions we define have to be total and terminating. Otherwise we get an inconsistent logic that lets us prove false things:
proof 1 :: P = NP proof 1 = proof 1
proof2 :: P ̸= NP proof 2 = proof 2
Most common calculi correspond to constructive logics, not classical ones, so principles like the law of excluded middle or double negation elimination do not hold:
¬¬P → P
It is possible to construct λ-calculi where they do hold, however for the purposes of this course,
we will concern ourselves primarily with constructive logics.
2 Recursive Types
What about if we wanted to write a common linked-list? In Haskell, this would be easily defined:
data ListOfInt = Cons Int ListOfInt
| Nil
In MinHS, however, we are stuck – we have no name by which we can introduce recursive types. All types in MinHS are anonymous. What we want to be able to declare is the type τ = (Int × τ ) + 1 (which is isomorphic to the Haskell definition above).
6
To solve this, we introduce a new type system feature, called Rec (sometimes written μ). It would allow a linked list type to be written as follows:
Rect.(Int×t)+1
We also need to introduce constructors and destructors for this Rec construct. We call then Roll
and Unroll. The typing rules for Roll are straightforward, and mostly uninteresting: Γ ⊢ x : τ [t := Rec t. τ ]
Γ ⊢ Roll x : Rec t. τ
Unroll, however, gives us an insight into how these recursive types actually work:
Γ ⊢ x : Rec t. τ
Γ ⊢ Unroll x : τ [t := Rec t. τ ]
That is, when we Unroll a recursive type, the recursive name variable (t) is replaced with the entire recursive type. Each Unroll eliminates one level of recursion.
We can define a value of our linked list type, written in Haskell as [3, 4], as follows: Roll (InL ( (3, (Roll (InL ( (4, (Roll (InR ())))))))
Not very pretty, but isomorphic to the Haskell list.
3 Haskell Datatypes
MinHS’s type system, with these extensions, is now nearly as powerful as Haskell’s. To show this, I will demonstrate a technique which gives an isomorphic type in MinHS to any (non- polymorphic) type in Haskell. Suppose we have the following Haskell type:
data Foo = Bar Int Bool | Baz | Herp Foo First, replace all data constructors with 1:
data Foo = 1 Int Bool | 1 | 1 Foo Then, multiply the constructors with all of their arguments:
data Foo = 1 × Int × Bool | 1 | 1 × Foo Then, replace all the alternatives with sums:
dataFoo=1×Int×Bool + 1 + 1×Foo Apply algebraic simplification (usually just 1 × x ≃ x):
dataFoo=Int×Bool + 1 + Foo Then, replace the recursive references (if any) with a Rec construct:
Rect.Int×Bool + 1 + t
As can be seen from the above example, this technique will convert non polymorphic, non
parameterised Haskell types into isomorphic MinHS ones. A useful trick for your exam 😉 7