OCaml Datatypes
CSI 3120
Amy Felty University of Ottawa
slides copyright 2017, 2018, 2019, 2020 Author David Walker, updated by Amy Felty permission granted to reuse these slides for non-commercial educational purposes
2
OCaml So Far
• We have seen a number of basic types: – int
– float – char
– string – bool
• We have seen a few structured types: – pairs
– tuples – options – lists
• In this lecture, we will see some more general ways to define our own new types and data structures
3
Type Abbreviations
• We have already seen some type abbreviations:
type point = float * float
Type Abbreviations
• We have already seen some type abbreviations:
type point = float * float
• These abbreviations can be helpful documentation:
4
let distance (p1:point) (p2:point) : float = let square x = x *. x in
let (x1,y1) = p1 in
let (x2,y2) = p2 in
sqrt (square (x2 -. x1) +. square (y2 -. y1))
• But they add nothing of substance to the language – they are equal in every way to an existing type
Type Abbreviations
• We have already seen some type abbreviations:
type point = float * float
• As far as OCaml is concerned, you could have written:
• Since the types are equal, you can substitute the definition for the name wherever you want
5
let distance (p1:float*float) (p2:float*float) : float =
let square x = x *. x in
let (x1,y1) = p1 in
let (x2,y2) = p2 in
sqrt (square (x2 -. x1) +. square (y2 -. y1))
– we have not added any new data structures
DATA TYPES
6
7
Data types
• OCaml provides a general mechanism called a data type for defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
a value with type my_bool is one of two things:
• Tru, or
• Fal
read the “|” as “or”
Data types
• OCaml provides a general mechanism called a data type for defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
8
a value with type my_bool is one of two things:
• Tru, or
• Fal
Tru and Fal are called “constructors”
read the “|” as “or”
9
Data types
• OCaml provides a general mechanism called a data type for defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
type color = Blue | Yellow | Green | Red
there’s no need to stop
at 2 cases; define as many alternatives as you want
10
Data types
• OCaml provides a general mechanism called a data type for defining new data structures that consist of many alternatives
type my_bool = Tru | Fal
• Creating values:
type color = Blue | Yellow | Green | Red
let b1 : my_bool = Tru let b2 : my_bool = Fal let c1 : color = Yellow let c2 : color = Red
use constructors to create values
11
Data types
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow let c2 : color = Red
• Using data type values:
let print_color (c:color) : unit = match c with
| Blue ->
| Yellow ->
| Green ->
| Red ->
use pattern matching to determine which color you have; act accordingly
• Using data type values:
Data types
12
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow let c2 : color = Red
let print_color (c:color) : unit = match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow” | Green -> print_string “green”
| Red -> print_string “red”
Data types
• Using data type values:
Why not just use strings to represent colors instead of defining a new type?
13
type color = Blue | Yellow | Green | Red
let c1 : color = Yellow let c2 : color = Red
let print_color (c:color) : unit = match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow” | Green -> print_string “green”
| Red -> print_string “red”
14
Data types
type color = Blue | Yellow | Green | Red oops!:
let print_color (c:color) : unit = match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow” | Red -> print_string “red”
Warning 8: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: Green
15
Data types
type color = Blue | Yellow | Green | Red oops!:
let print_color (c:color) : unit = match c with
| Blue -> print_string “blue”
| Yellow -> print_string “yellow” | Red -> print_string “red”
Warning 8: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: Green
OCaml’s datatype mechanism allow you to create types that contain precisely the values you want!
16
Data Types Can Carry Additional Values
• Data types are more than just enumerations of constants:
• Read as: a simple_shape is either:
– a Circle, which contains a pair of a point and float, or – a Square, which contains a pair of a point and float
type point = float * float
type simple_shape = Circle of point * float
| Square of point * float
s
(x,y)
(x,y)
r
17
Data Types Can Carry Additional Values • Data types are more than just enumerations of constants:
type point = float * float
type simple_shape = Circle of point * float
| Square of point * float
let origin : point = (0.0, 0.0)
let circ1 : simple_shape = Circle (origin, 1.0)
let circ2 : simple_shape = Circle ((1.0, 1.0), 5.0) let square : simple_shape = Square (origin, 2.3)
18
Data Types Can Carry Additional Values • Data types are more than just enumerations of constants:
type point = float * float
type simple_shape = Circle of point * float
| Square of point * float
let simple_area (s:simple_shape) : float =
match s with
| Circle (_, radius) -> 3.14 *. radius *. radius | Square (_, side) -> side *. side
Compare
• Data types are more than just enumerations of constants:
19
type point = float * float
type simple_shape = Circle of point * float
| Square of point * float
let simple_area (s:simple_shape) : float =
match s with
| Circle (_, radius) -> 3.14 *. radius *. radius | Square (_, side) -> side *. side
type my_shape = point * float
let simple_area (s:my_shape) : float =
(3.14 *. radius *. radius) ?? or ?? (side *. side)
20
More General Shapes
type point = float * float
type shape =
Square of float
| Ellipse of float * float
| RtTriangle of float * float
| Polygon of point list
Square s =
s RtTriangle (s1, s2) = s1
Ellipse (r1, r2) =
r1
r2 Polygon [v1; …;v5] =
s2
v2 v1
v5
v3 v4
21
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
Type abbreviations can aid readability
Square s =
s RtTriangle (s1, s2) = s1
Ellipse (r1, r2) =
r1
r2 RtTriangle [v1; …;v5] =
s2
v2 v1
v5
v3 v4
22
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
Square builds a shape from a single side
Ellipse builds a shape from a pair of radiuses
RtTriangle builds a shape from a pair of sides
let sq : shape = Square 17.0
let ell : shape = Ellipse (1.0, 2.0)
let rt : shape = RtTriangle (1.0, 1.0)
let poly : shape = Polygon [(0., 0.); (1., 0.); (0.; 1.)]
they are all shapes; they are constructed in
different ways
Polygon builds a shape
from a list of points
(where each point is itself a pair)
23
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
a data type also defines a pattern for matching
let area (s : shape) : float =
match s with
| Square s ->
| Ellipse (r1, r2)->
| RtTriangle (s1, s2) ->
| Polygon ps ->
24
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
a data type also defines a pattern for matching
let area (s : shape) : float =
match s with
| Square s ->
| Ellipse (r1, r2)->
| RtTriangle (s1, s2) ->
| Polygon ps ->
Square carries a value with type float so s is
a pattern for float values
RtTriangle carries a value with type float * float
so (s1, s2) is a pattern for that type
25
More General Shapes
type point = float * float
type radius = float
type side = float
type shape =
Square of side
| Ellipse of radius * radius
| RtTriangle of side * side
| Polygon of point list
a data type also defines a pattern for matching
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2) -> pi *. r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> ???
26
Computing Area
• How do we compute polygon area?
• For convex polygons:
– Case: the polygon has fewer than 3 points:
• ithas0area! (itisalineorapointornothingatall)
– Case: the polygon has 3 or more points:
• Compute the area of the triangle formed by the first 3 vertices • Delete the second vertex to form a new polygon
• Sum the area of the triangle and the new polygon
v2 v1
v3
=
+
v5
v4
27
Computing Area
• How do we compute polygon area?
• For convex polygons:
– Case: the polygon has fewer than 3 points:
• ithas0area! (itisalineorapointornothingatall)
– Case: the polygon has 3 or more points:
• Compute the area of the triangle formed by the first 3 vertices • Delete the second vertex to form a new polygon
• Sum the area of the triangle and the new polygon
• Note: This is a beautiful inductive algorithm:
– the area of a polygon with n points is computed in terms of a
smaller polygon with only n-1 points! v2
v1 v5
v3
v4
=
+
Computing Area
28
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2) -> r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> poly_area ps
let rec poly_area (ps : point list) : float =
match ps with
| p1 :: p2 :: p3 :: tail ->
tri_area p1 p2 p3 +. poly_area (p1::p3::tail)
| _ -> 0.
v2 v1
v3
=
+
v5
v4
This pattern says the list has at least 3 items
29
Computing Area
let tri_area (p1:point) (p2:point) (p3:point) : float =
let a = distance p1 p2 in
let b = distance p2 p3 in
let c = distance p3 p1 in
let s = 0.5 *. (a +. b +. c) in
sqrt (s *. (s -. a) *. (s -. b) *. (s -. c))
let rec poly_area (ps : point list) : float =
match ps with
| p1 :: p2 :: p3 :: tail ->
tri_area p1 p2 p3 +. poly_area (p1::p3::tail)
| _ -> 0.
let area (s : shape) : float =
match s with
| Square s -> s *. s
| Ellipse (r1, r2)-> pi *. r1 *. r2
| RtTriangle (s1, s2) -> s1 *. s2 /. 2.
| Polygon ps -> poly_area ps
INDUCTIVE DATA TYPES
30
31
Inductive data types
• We can use data types to define inductive data
• A binary tree is:
– a Leaf containing no data
– a Node containing a key, a value, a left subtree and a right subtree
32
Inductive data types
• We can use data types to define inductive data
• A binary tree is:
– a Leaf containing no data
– a Node containing a key, a value, a left subtree and a right subtree
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
Inductive data types
33
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree =
Inductive data types
34
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree = match t with
| Leaf ->
| Node (k’, v’, left, right) ->
Again, the type definition specifies the cases you must consider
Inductive data types
35
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree = match t with
| Leaf -> Node (k, v, Leaf, Leaf)
| Node (k’, v’, left, right) ->
Inductive data types
36
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree = match t with
| Leaf -> Node (k, v, Leaf, Leaf)
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
Inductive data types
37
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree = match t with
| Leaf -> Node (k, v, Leaf, Leaf)
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
Inductive data types
38
type key = int
type value = string
type tree =
Leaf
| Node of key * value * tree * tree
let rec insert (t:tree) (k:key) (v:value) : tree = match t with
| Leaf -> Node (k, v, Leaf, Leaf)
| Node (k’, v’, left, right) ->
if k < k' then
Node (k', v', insert left k v, right)
else if k > k’ then
Node (k’, v’, left, insert right k v)
else
Node (k, v, left, right)
39
Inductive data types: Another Example
• We can use the type “int” to represent natural numbers
– but that is kind of broken: it also contains negative numbers – we have to use a dynamic test and a default value:
let double (n : int) : int =
if n < 0 then 0
else
double_nat n
40
Inductive data types: Another Example
• We can use the type "int" to represent natural numbers
– but that is kind of broken: it also contains negative numbers
– we have to use a dynamic test and a default value:
– or raise an exception
– it would be nice if there was a way to define the natural numbers exactly, and use OCaml's type system to guarantee no client ever attempts to double a negative number
let double (n : int) : int =
if n < 0 then
raise (Failure "negative input!")
else
double_nat n
41
Inductive data types
• Recall, a natural number n is either: – zero, or
–m+1
• We use a data type to represent this definition exactly:
42
Inductive data types
• Recall, a natural number n is either: – zero, or
–m+1
• We use a data type to represent this definition exactly:
type nat = Zero | Succ of nat
Inductive data types
• Recall, a natural number n is either: – zero, or
–m+1
• We use a data type to represent this definition exactly:
43
type nat = Zero | Succ of nat
let rec nat_to_int (n : nat) : int = match n with
Zero -> 0
| Succ n -> 1 + nat_to_int n
44
Inductive data types
• Recall, a natural number n is either: – zero, or
–m+1
• We use a data type to represent this definition exactly:
type nat = Zero | Succ of nat
let rec nat_to_int (n : nat) : int = match n with
Zero -> 0
| Succ n -> 1 + nat_to_int n
let rec double_nat (n : nat) : nat = match n with
| Zero -> Zero
| Succ m -> Succ (Succ (double_nat m))
45
Summary
• OCaml data types: a powerful mechanism for defining complex data structures:
– They are precise
• contain exactly the elements you want, not more elements
– They are general
• recursive, non-recursive (mutually recursive and polymorphic)
– The type checker helps you detect errors • missing cases in your functions
PARAMETERIZED TYPE DEFINITIONS
46
General form:
definition: definition:
47
type (‘key, ‘val) tree =
Leaf
| Node of ‘key * ‘val * (‘key, ‘val) tree * (‘key, ‘val) tree
type ‘a inttree = (int, ‘a) tree
type istree = string inttree
Note: istree is an abbreviation for (int,string) tree
A more conventional notation would have been (but is not ML):
type ‘x f = body
use: use:
type f x = body
arg f
f arg
48
Take-home Message
• Think of parameterized types like functions: – a function that take a type as an argument – produces a type as a result
• Theoretical basis:
– System F-omega
– a typed lambda calculus with general type-level functions as well as value-level functions