OCaml Datatypes
OCaml Datatypes
slides copyright 2017, 2018, 2019, 2020
Author David Walker, updated by Amy Felty
permission granted to reuse these slides for non-commercial educational purposes
OCaml So Far
• We have seen a number of basic types:
– int
– float
– char
– string
– bool
• We have seen a few structured types:
– pairs
– tuples
– options
– lists
• In this lecture, we will see some more general ways to define
our own new types and data structures
2
Type Abbreviations
• We have already seen some type abbreviations:
type point = float * float
3
Type Abbreviations
• We have already seen some type abbreviations:
• These abbreviations can be helpful documentation:
• But they add nothing of substance to the language
– they are equal in every way to an existing type
type point = float * float
let distance (p1:point) (p2:point) : float =
let square x = x *. x in
let (x1,y1) = p1 in
let (x2,y2) = p2 in
sqrt (square (x2 -. x1) +. square (y2 -. y1))
4
Type Abbreviations
• We have already seen some type abbreviations:
• As far as OCaml is concerned, you could have written:
• Since the types are equal, you can substitute the definition for
the name wherever you want
– we have not added any new data structures
type point = float * float
let distance (p1:float*float)
(p2:float*float) : float =
let square x = x *. x in
let (x1,y1) = p1 in
let (x2,y2) = p2 in
sqrt (square (x2 -. x1) +. square (y2 -. y1))
5
DATA TYPES
6
Data types
• OCaml provides a general mechanism called a data type for
defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
a value with type my_bool
is one of two things:
• Tru, or
• Fal
read the “|” as “or”
7
Data types
• OCaml provides a general mechanism called a data type for
defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
a value with type my_bool
is one of two things:
• Tru, or
• Fal
read the “|” as “or”
Tru and Fal are called
“constructors”
8
Data types
• OCaml provides a general mechanism called a data type for
defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
type color = Blue | Yellow | Green | Red
there’s no need to stop
at 2 cases; define as many
alternatives as you want
9
Data types
• OCaml provides a general mechanism called a data type for
defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
type color = Blue | Yellow | Green | Red
• Creating values:
let b1 : my_bool = Tru
let b2 : my_bool = Fal
let c1 : color = Yellow
let c2 : color = Red
use constructors to create values
10
Data types
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow
let c2 : color = Red
• Using data type values:
let print_color (c:color) : unit =
match c with
| Blue ->
| Yellow ->
| Green ->
| Red ->
use pattern matching to
determine which color
you have; act accordingly
11
Data types
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow
let c2 : color = Red
• Using data type values:
let print_color (c:color) : unit =
match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow”
| Green -> print_string “green”
| Red -> print_string “red”
12
Data types
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow
let c2 : color = Red
• Using data type values:
let print_color (c:color) : unit =
match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow”
| Green -> print_string “green”
| Red -> print_string “red”
Why not just use strings to represent colors instead of defining a new type?
13
Data types
type color = Blue | Yellow | Green | Red
| Blue -> print_string “blue”
| Yellow -> print_string “yellow”
| Red -> print_string “red”
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
Green
oops!:
let print_color (c:color) : unit =
match c with
14
Data types
type color = Blue | Yellow | Green | Red
| Yellow -> print_string “yellow”
| Red -> print_string “red”
oops!:
let print_color (c:color) : unit =
match c with
| Blue -> print_string “blue”
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
Green
OCaml’s datatype mechanism allow you to create types
that contain precisely the values you want!
15
Data Types Can Carry Additional Values
• Data types are more than just enumerations of constants:
• Read as: a simple_shape is either:
– a Circle, which contains a pair of a point and float, or
– a Square, which contains a pair of a point and float
type point = float * float
type simple_shape =
Circle of point * float
| Square of point * float
(x,y)
s (x,y)
r
16
Data Types Can Carry Additional Values
• Data types are more than just enumerations of constants:
type point = float * float
type simple_shape =
Circle of point * float
| Square of point * float
let origin : point = (0.0, 0.0)
let circ1
let circ2
: simple_shape = Circle (origin, 1.0)
: simple_shape = Circle ((1.0, 1.0), 5.0)
let square : simple_shape = Square (origin, 2.3)
17
Data Types Can Carry Additional Values
• Data types are more than just enumerations of constants:
type point = float * float
type simple_shape =
Circle of point * float
| Square of point * float
let simple_area (s:simple_shape) : float =
match s with
| Circle (_, radius) -> 3.14 *. radius *. radius
| Square (_, side) -> side *. side
18
Compare
• Data types are more than just enumerations of constants:
type point = float * float
type simple_shape =
Circle of point * float
| Square of point * float
let simple_area (s:simple_shape) : float =
match s with
| Circle (_, radius) -> 3.14 *. radius *. radius
| Square (_, side) -> side *. side
type my_shape = point * float
let simple_area (s:my_shape) : float =
(3.14 *. radius *. radius) ?? or ?? (side *. side)
19
More General Shapes
r1
r2
Square s =
Ellipse (r1, r2) =
s2
s1RtTriangle (s1, s2) =
v2
v1 v3
v4v5
Polygon [v1; …;v5] =
type point = float * float
type shape =
Square of float
| Ellipse of float * float
| RtTriangle of float * float
| Polygon of point list
s
20
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
Type abbreviations can
aid readability
r1
r2
Square s =
Ellipse (r1, r2) =
s2
s1RtTriangle (s1, s2) =
v2
v1 v3
v4v5
RtTriangle [v1; …;v5] =
s
21
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
let sq : shape = Square 17.0
let ell
let rt
: shape = Ellipse (1.0, 2.0)
: shape = RtTriangle (1.0, 1.0)
let poly : shape = Polygon [(0., 0.); (1., 0.); (0.; 1.)]
they are all shapes;
they are constructed in
different ways
Polygon builds a shape
from a list of points
(where each point is itself a pair)
Square builds a shape
from a single side
RtTriangle builds a shape
from a pair of sides
22
Ellipse builds a shape
from a pair of radiuses
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
let area (s : shape) : float =
match s with
| Square s ->
| Ellipse (r1, r2)->
| RtTriangle (s1, s2) ->
| Polygon ps ->
a data type also defines
a pattern for matching
23
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
let area (s : shape) : float =
match s with
| Square s ->
| Ellipse (r1, r2)->
| RtTriangle (s1, s2) ->
| Polygon ps ->
Square carries a value
with type float so s is
a pattern for float values
RtTriangle carries a value
with type float * float
so (s1, s2) is a pattern
for that type
a data type also defines
a pattern for matching
24
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2) -> pi *. r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> ???
a data type also defines
a pattern for matching
25
Computing Area
• How do we compute polygon area?
• For convex polygons:
– Case: the polygon has fewer than 3 points:
• it has 0 area! (it is a line or a point or nothing at all)
– Case: the polygon has 3 or more points:
• Compute the area of the triangle formed by the first 3 vertices
• Delete the second vertex to form a new polygon
• Sum the area of the triangle and the new polygon
v2
v1 v3
v4v5
= +
26
Computing Area
• How do we compute polygon area?
• For convex polygons:
– Case: the polygon has fewer than 3 points:
• it has 0 area! (it is a line or a point or nothing at all)
– Case: the polygon has 3 or more points:
• Compute the area of the triangle formed by the first 3 vertices
• Delete the second vertex to form a new polygon
• Sum the area of the triangle and the new polygon
• Note: This is a beautiful inductive algorithm:
– the area of a polygon with n points is computed in terms of a
smaller polygon with only n-1 points!
v2
v1 v3
v4v5
= +
27
Computing Area
v2
v1 v3
v4v5
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2) -> r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> poly_area ps
let rec poly_area (ps : point list) : float =
match ps with
| p1 :: p2 :: p3 :: tail ->
tri_area p1 p2 p3 +. poly_area (p1::p3::tail)
| _ -> 0.
= +
This pattern says the
list has at least 3 items
28
Computing Area
let tri_area (p1:point) (p2:point) (p3:point) : float =
let a = distance p1 p2 in
let b = distance p2 p3 in
let c = distance p3 p1 in
let s = 0.5 *. (a +. b +. c) in
sqrt (s *. (s -. a) *. (s -. b) *. (s -. c))
let rec poly_area (ps : point list) : float =
match ps with
| p1 :: p2 :: p3 :: tail ->
tri_area p1 p2 p3 +. poly_area (p1::p3::tail)
| _ -> 0.
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2)-> pi *. r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> poly_area ps
29
INDUCTIVE DATA TYPES
30
Inductive data types
• We can use data types to define inductive data
• A binary tree is:
– a Leaf containing no data
– a Node containing a key, a value, a left subtree and a right subtree
31
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
• We can use data types to define inductive data
• A binary tree is:
– a Leaf containing no data
– a Node containing a key, a value, a left subtree and a right subtree
32
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
let rec insert (t:tree) (k:key) (v:value) : tree =
33
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
let rec insert (t:tree) (k:key) (v:value) : tree =
match t with
| Leaf ->
| Node (k’, v’, left, right) ->
Again, the type definition
specifies the cases you must
consider
34
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
let rec insert (t:tree) (k:key) (v:value) : tree =
match t with
| Leaf -> Node (k, v, Leaf, Leaf)
| Node (k’, v’
,
left, right) ->
35
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
match t with
| Leaf -> Node (k, v, Leaf, Leaf)
let rec insert (t:tree) (k:key) (v:value) : tree =
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
36
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
match t with
| Leaf -> Node (k, v, Leaf, Leaf)
let rec insert (t:tree) (k:key) (v:value) : tree =
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
37
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
match t with
| Leaf -> Node (k, v, Leaf, Leaf)
let rec insert (t:tree) (k:key) (v:value) : tree =
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
38
Inductive data types: Another Example
• We can use the type “int” to represent natural numbers
– but that is kind of broken: it also contains negative numbers
– we have to use a dynamic test and a default value:
let double (n : int) : int =
if n < 0 then
0
else
double_nat n
39
Inductive data types: Another Example
• We can use the type "int" to represent natural numbers
– but that is kind of broken: it also contains negative numbers
– we have to use a dynamic test and a default value:
– or raise an exception
– it would be nice if there was a way to define the natural
numbers exactly, and use OCaml's type system to guarantee no
client ever attempts to double a negative number
let double (n : int) : int =
if n < 0 then
raise (Failure "negative input!")
else
double_nat n
40
Inductive data types
• Recall, a natural number n is either:
– zero, or
– m + 1
• We use a data type to represent this definition exactly:
41
Inductive data types
• Recall, a natural number n is either:
– zero, or
– m + 1
• We use a data type to represent this definition exactly:
type nat = Zero | Succ of nat
42
Inductive data types
• Recall, a natural number n is either:
– zero, or
– m + 1
• We use a data type to represent this definition exactly:
type nat = Zero | Succ of nat
let rec nat_to_int (n : nat) : int =
match n with
Zero -> 0
| Succ n -> 1 + nat_to_int n
43
Inductive data types
• Recall, a natural number n is either:
– zero, or
– m + 1
• We use a data type to represent this definition exactly:
type nat = Zero | Succ of nat
let rec nat_to_int (n : nat) : int =
match n with
Zero -> 0
| Succ n -> 1 + nat_to_int n
let rec double_nat (n : nat) : nat =
match n with
| Zero -> Zero
| Succ m -> Succ (Succ (double_nat m))
44
Summary
• OCaml data types: a powerful mechanism for defining
complex data structures:
– They are precise
• contain exactly the elements you want, not more elements
– They are general
• recursive, non-recursive (mutually recursive and polymorphic)
– The type checker helps you detect errors
• missing cases in your functions
45
PARAMETERIZED TYPE DEFINITIONS
46
type (‘key, ‘val) tree =
Leaf
| Node of ‘key * ‘val * (‘key, ‘val) tree * (‘key, ‘val) tree
type ‘a inttree = (int, ‘a) tree
type istree = string inttree
type ‘x f = body
arg f
use:
General form:
definition:
type f x = body
f arg
A more conventional notation
would have been (but is not ML):
definition:
use:
47
Note: istree is an abbreviation for (int,string) tree
Take-home Message
• Think of parameterized types like functions:
– a function that take a type as an argument
– produces a type as a result
• Theoretical basis:
– System F-omega
– a typed lambda calculus with general type-level functions as
well as value-level functions
48
OCaml Datatypes
OCaml So Far
Type Abbreviations
Type Abbreviations
Type Abbreviations
DATA TYPES
Data types
Data types
Data types
Data types
Data types
Data types
Data types
Data types
Data types
Data Types Can Carry Additional Values
Data Types Can Carry Additional Values
Data Types Can Carry Additional Values
Compare
More General Shapes
More General Shapes
More General Shapes
More General Shapes
More General Shapes
More General Shapes
Computing Area
Computing Area
Computing Area
Computing Area
INDUCTIVE DATA TYPES
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Inductive data types: Another Example
Inductive data types: Another Example
Inductive data types
Inductive data types
Inductive data types
Inductive data types
Summary
PARAMETERIZED TYPE DEFINITIONS
Slide Number 47
Take-home Message