Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
programmer might use f instead of fact or fact_tr. This is the first simple instance of information hiding – this is an important software engineering principle we will see also later in these notes.
2.3 Data Types and Pattern Matching
2.3.1 Introduction to Non-Recursive and Recursive Data Types
Copyright By PowCoder代写 加微信 powcoder
Often it is useful to define a collection of elements or objects and not encode all data we are working with using our base types of integers, floats or strings. For example, we might want to model a simple card game. As a first step, we want to define the suit and rank present in the game. In OCaml, we define a finite, unordered collection of elements using a (non-recursive) data type definition. Here we define a new type suit that contains the elements Clubs, Spades, Hearts, and Diamonds.
1 type suit = Clubs | Spades | Hearts | Diamonds
We also say Clubs, Spades, Hearts, and Diamonds inhabit the type suit. Often, we simply say Clubs, Spades, Hearts, and Diamonds are of type suit. When we talk about the input or outputs of a given program, we also might simply say “given a suit” meaning that we are given an element of type suit.
When we define a collection of elements the order does not matter. Further, OCaml requires that each element begins with a capital letter.
Elements of a given type are defined by constructors, i.e. constants that allow us to construct elements of a given type. In our previous definition of our type suit we have four constructors, namely Clubs, Spades, Hearts, and Diamonds. In fact, these are simply user-defined constants that have type suit.
How do we write programs about elements of type suit? – The answer is pattern matching using a match-expression.
match
|
|
We call the expression that we analyze the scrutinee. Patterns characterize the shape of values the scrutinee might have by considering the type of the scrutinee. Let’s make this more precise by looking at an example. We want to write a function dom that takes in a pair s1 and s2 of suit. If s beats of is equal to suit s2 we return true – otherwise we return false. We will use the following ordering on suits:
Spades > Hearts > Diamonds > Clubs
19 c B. Pientka – Fall 2020
1 2 3 4 5 6
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
The type of the function dom is suit * suit -> bool.
let rec dom (s1, s2) = match (s1, s2) with
| (Spades, _)
| (Hearts, Diamonds) | (Hearts, Clubs)
| (Diamonds, Clubs) | (s1, s2)
-> true ->s1= s2
In the example above, we want to form a pattern of type suit * suit, i.e. a pair of two suits. A pattern for a pair of type suit * suit consists of two parts and can be written as ( patt1 , patt2 ) where both patt1 and patt2 must be a pattern for a suit. What are patterns of type suit? – All constants, i.e. Spades, Hearts, Diamonds, and Clubs , that inhabit suit are valid patterns. We can also write generic patterns such as a variable (as in (s1, s2) or an under score or wild card (as in _).
In general, a pattern can be built from constructors. What possible patterns we can be formed is guided by the type of the pattern and can be described inductively.
A pattern pat of type T can be characterized as follows:
• A pattern variable x (or y, s1, s2, etc.) is a pattern of type T.
• A wild card_ is a pattern of typeT.
• A constant c of type T is a pattern of type T.
• Apair(pat1,patt2)isapatternoftypeT*S,ifpat1isapatternoftypeTand pat2 is a pattern of type S.
What is the operational behavior of writing a program using patterns and pattern matching? – When we execute the function dom we must pass it a pair of two suit, for example (Hearts, Diamonds), and we will find a case in that matches, i.e. it is an instance of the given pattern.
Let’s look at a few examples.
• The pair (Hearts,Diamonds) matches the second pattern, since it is equal to it.
• The pair (Spades, Diamonds) matches the first pattern, as a wild card matches anything.
• The pair (Hearts, Spades) matches the last pattern (s1, s2) and binds s1 to the value Hearts and s2 to the value Spades.
20 c B. Pientka – Fall 2020
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
Operationally, OCaml will try each pattern in the order specified. Indeed, we will only hit the last case where (s1,s2) only if none of the previous cases matched. Hence we are guaranteed that s1 cannot be Spades, for example.
In our given examples, we always return true for each of the first four cases. We hence can write the pattern matching even more compactly grouping all the first cases together.
1 2 3 4 5 6
To describe a card, we introduce a type definition or abbreviation. 1 type card = rank * suit
For now this allows us to write in our type specification simply card instead of rank * suit and makes type specifications easier to read.
To actually play a simple card game, we also need to be able to describe a collec- tion of cards we may hold in our hand. This is not a finite collection – we may hold no cards, one card, or any number of cards! In our setting, we do not want to restrict the number of cards we can hold, as this may also differ from game to game.
We can say that a hand (i.e. the collection of cards we hold in a given hand) is either empty or it consists of a card followed by the rest of the hand.
21 c B. Pientka – Fall 2020
let rec dom (s1, s2) = match (s1, s2) with
| (Spades, _) | (Hearts, Diamonds)
| (Hearts, Clubs) | (Diamonds, Clubs) -> true
| (s1, s2) -> s1 = s2
Patterns are a flexible and compact way to compare data. In particular, we can write deep patterns. You might have noticed that our definition of patterns was inductive and allows us to inspect deeply nested pairs. For example, let’s build a tuple of type (suit * int) * (suit * int) where we pair every suit with an integer to describe its value and we now want to add up the values of matching suits and return 0 otherwise.
let rec add c1 c2 = match c1, c2 with | (Hearts, v1), (Hearts , v2)
| (Diamonds, v1), (Diamonds,v2)
| (Spades, v1), (Spades, v2)
| (Clubs, v1) , (Clubs, v2) -> v1 + v2 | _ , _ -> 0
Imagine writing this program with if-expressions in the traditional way! This will be long, cumbersome and difficult to read.
Let’s continue with our example. We define the rank of cards similarly to the suit of cards introducing a new type cards.
type rank =
Two | Three | Four | Five | Six | Seven | Eight | Nine | Ten | Jack | Queen | King | Ace
1 2 3 4 5 6 7 8
Chapter 2: Basic Concepts
hand1 = Hand
(Ace, Hearts)
2.3 Data Types and Pattern Matching
hand2 = Hand
(Queen, Diamonds) Hand
(Ace, Hearts)
Figure 2.1: Representations for hand1 and hand2 as degenerate binary trees
We will now define a recursive data type hand that characterizes the collection of
cards we hold more precisely:
• The constant Empty is of type hand. It describes the empty collection of cards one holds in a given hand.
• Ifc is acard andh is of typehand thenHand (c, h) is of typehand.
• Nothing else is of type hand.
in the above definition Empty and Hand are constructors. Empty is a constant, i.e. a constructor that takes in no arguments. Hand is a constructor that takes in a tuple consisting of a card and another hand.
In OCaml, we can define the recursive data types hand as follows: 1 type hand = Empty | Hand of card * hand
Here are some examples of hands of type hand we can construct.
let hand0:hand = Empty
let hand1:hand = Hand((Ace, Hearts), Empty)
let hand2:hand = Hand((Queen, Diamonds), hand1) let hand5:hand = Hand((Ace, Spades),
Hand((Ten, Diamonds), Hand((Seven, Clubs),
Hand((Queen, Spades),
Hand((Eight, Clubs), Empty)))))
We can think of the type hand as describing a binary tree. The leaf is described by the constant Empty. The constructor Hand builds a binary tree with two children: the left child is a card and the right child is another binary tree. The tree is a degenerate tree that will only grow to the right (see Fig. 2.1).
Let’s see how we use pattern matching on elements of type hand. We want to write a function extract of type suit -> hand -> hand which given a suit and a hand it extracts
22 c B. Pientka – Fall 2020
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
from the input hand all those cards that match the given suit and returns a new hand containing only those cards.
To illustrate here is the expected behaviour when we execute the function extract
# extract Spades hand5;;
– : hand = Hand ((Ace, Spades), Hand ((Queen, Spades), Empty)) # extract Diamonds hand5;;
– : hand = Hand ((Ten, Diamonds), Empty)
# extract Hearts hand5;;
– : hand = Empty
1 2 3 4 5 6
1 extract : suit -> hand -> hand not as
1 extract : suit * hand -> hand
There is a subtle, but important difference between these two types. The first one says, we pass to the function extract first a suit followed (possibly later) by a hand. Indeed when we called the function extract in the above example we simply passed the two inputs one after the other. The second one says, we pass to the function extract a tuple consisting of a suit and a hand, i.e. we pass at the same time both the suit together with the hand. For now, you simply should understand that they describe different ways of defining functions. We will see later that in fact we can translate between these two ways of defining functions and how we can exploit the first kind of definition.
Now that we understand what we want to write, let’s write the function extract. We need to recursively analyze all possible hands. What are possible hands? – The recursive definition of hand tells us that they are either built by Empty or by Hand(c, h) where c is a card and h is a hand. If we have an Empty hand, there is nothing to extract and we simply return Empty. When we have Hand(c,h), then we must compare whether the card c has the appropriate suit. If it does not, we recursively analyze the remaining hand h. If it does, we want to keep the card c and recursively analyze the remaining hand h.
Before we define the function extract, let’s pay attention to the type of the func- tion. We defined it as
This recipe translates directly into a recursive program in OCaml.
let rec extract (s:suit) (h:hand) = match h with | Empty -> Empty
| Hand ((_, s’) as c, h’) ->
if s = s’ then Hand(c, extract s h’) else extract s h’
In the code above, we annotated the variable names s and h with their correspond- ing type. This is not necessary since in general OCaml will infer their types, however it sometimes is useful to express the type information to document the code.
23 c B. Pientka – Fall 2020
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
We proceed by pattern matching on h. If h is Empty, we simply return Empty. To analyze a hand that is not Empty, we write the pattern Hand ((_, s’) as c, h’). As we need to know the suit of the card c, we use deep pattern matching and write directly a tuple (_,s’) for the card. As we want to also refer to the card by the variable name c in the body of the branch, we name the tuple using the as keyword writing (_,s’) as c. In the body of the branch, we now simply compare the suit s’ with the given suit s. If the two suits are equal, we need to build a new hand keeping the card c.
Note that we do not modify the input hand h. We are simply analyzing it and returning a new hand copying some cards from the input hand.
As a last example, we want to find the first card in a hand of a given rank. For example, given the hand5 earlier we want to find the first card whose rank is Queen and return the corresponding suit, in this case Spades. Intuitively, we need to search through a hand and inspect the rank of each card. If we find a card whose rank matches the given one, then we return the corresponding suit. But what shall we return, if there is no such card? – We might raise an error, but here we explore another option: an optional data type which is in fact built-in in OCaml. To illustrate we define it here:
1 type ’a option = None | Some of ’a
The option type is a non-recursive type. It describes a collection of elements where we can tag elements of type ’a with the constructor Some and also include the default element None. This type is polymorphic and can be used for any type. This is particu- larly convenient to describe optional values. In our given example, we can use it to return some suit, if we find a card with the given rank and None otherwise.
Let us return to writing the function find whose type is: 1 find: rank * hand -> suit option
It takes a tuple consisting of a (r:rank) and a (h:hand) as input and returns an optional value of type suit. If our given hand h contains a card of rank r’ and suit s’ where r = r’ then we return Some s’ – otherwise the result will be None. We implement this function recursively in OCaml by pattern matching on the hand h.
let rec find (r, h) = match h with
| Empty -> None
| Hand ((r’, s’), h’) -> if r = r’ then Some s’
else find (r, h’)
2.3.2 Recursive Data Types: Lists
One of the most convenient and most used data types are lists. We can define poly- morphic lists of type ’a list inductively similarly to how we defined a hand.
24 c B. Pientka – Fall 2020
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
• The constant [] is of type ’a list. It describes the empty list.
• Ifh is an element of type’a andt is of type’a list thenh :: t is of type’a list. • Nothing else is of type list.
Here, ’a describes a type variable. Our inductive definition for lists is polymorphic (from the Greek meaning ”having multiple forms”). In object-oriented language such as Java, it is also referred to as generic.
A polymorphic data type definition allows us to construct many different instances using the same constructors. Concretely, we can for example use it to create lists of floating point numbers:
1 let fl0 : float list = 8.6::5.4::[]
describes the list of floating point numbers [8.6 ; 5.4]. We could also have simply
defined fl0 as
1 let fl10 = [8.6 ; 5.4]
where we omitted the type annotation on fl0 and wrote the list in a more compact form. We can think of the latter way of simply writing [8.6 ; 5.4] as syntactic sugar for the first definition which makes it clear how the lists are built. Similar to how we built a hand, here :: is the constructor which we use as an infix operator to build lists.
# let l0 = 1::2::3::[];;
val l0 : int list = [1; 2; 3]
# let l1 = “a”::”b”::”c”::”d”::[];;
val l1 : string list = [“a”; “b”; “c”; “d”] # let l2 = “x”::”y”::”z”::[];;
val l2 : string list = [“x”; “y”; “z”]
1 2 3 4 5 6
Figure 2.2: Representation of lists
c B. Pientka – Fall 2020
Building lists can also be thought of as a bi- nary tree where the leaf is the constant [] and the constructor :: builds a binary tree with two children: the left child is an expression of type ’a and the right hand side is another binary tree. The tree is a degenerate tree that will grow to the right and has a similar shape as the degenerate tree describing a hand (see Fig. 2.2).
As lists are defined polymorphically, we can build not only lists containing floating point numbers, but also lists containing strings, lists containing integers, even lists containing other lists!
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
# let l3 = l1 :: l2 :: [];;
val l3 : string list list = [[“a”; “b”; “c”; “d”]; [“x”; “y”; “z”]] # let l4 = [3;4;0;7];;
val l4 : int list = [3; 4; 0; 7]
Appending Lists Let’s see how we write some simple recursive programs about lists using pattern matching. The first example we consider is appending two lists, i.e. given the list l0 and l4 we get [1; 2; 3; 3; 4; 0; 7]. Appending two lists simply glues them together. To achieve this, we recursively traverse the first list until it is empty and we return the second list. As we come back out of the recursion we paste each element of the first list back onto the computed result.
let rec append l1 l2 = match l1 with |[] ->l2
| x::t -> x::append t l2
OCaml has the append operation built-in. One can use @ as an infix operator to append two lists simply writing l0 @ l4. However, it is useful to understand how the append function is written.
To emphasize the value of pattern matching, let’s pause for a moment and think about how we could write this program without pattern matching in OCaml. From a functional programmer’s perspective this is terrible code, but sometimes it is useful to see the ugly side of things to appreciate the beauty of a concept.
In languages without sophisticated pattern matching, we might first define two destructors head and tail which allow us to take a list apart. We can define these destructors as functions in OCaml as follows:
(* head : ’a list -> ’a *)
let head (h::t) = h
(* tail: ’a list -> ’a list *)
let tail l = match l with | [] -> []
| h::t -> t
1 2 3 4 5 6 7
When type checking the program head in OCaml, you will get a warning:
# let head (h::t) = h;; ^^^^^^^^^^
Warning P: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: []
OCaml not only will infer the type of a program expression, but it also checks whether you have covered all cases. Here, the function head is not defined for the empty list. If you were to call head [], OCaml raises a non-exhaustive match exception and will abort the execution safely.
26 c B. Pientka – Fall 2020
Chapter 2: Basic Concepts 2.3 Data Types and Pattern Matching
This illustrates an important point about pattern matching, namely that OCaml can give you useful warning messages which allow you to revise your code and take care of some corner cases. It is a great debugging tool that statically analyzes your program before ever running it.
Let’s revisit now how to append two lists using if-expressions. In a language without pattern matching you might write:
1 2 3 4 5 6 7
let rec app (l1, l2) = if l1 = [] then l2
head(l1)::(app (tail(l1), l2))
Arguably this program is harder to read and harder to understand than the pro- gram using pattern matching.
This is the second important benefit of pattern matching: it makes programs easier to read and understand.
Sorting To further illustrate the usefulness of pattern matching and give more ex- ample of recursive programs, let’s implement merge sort. The idea behind merge sort is to split a given list in two, then sort the sublists recursively and merge their results.
For example, given a list [4;1;2;7;5;3] we split the list into two parts: list a = [4;2;5] and list b =[1;7;3]. For simplicity, we simply put elements at even positions into the list a and the elements at odd positions into the list b.
Recursively sorting the list a will return [2;4;5] and recursively sorting the list b will give us [1;3;7]. Now we need to simply merge the results into a sorted list.
To implement merge sort, we hence write two auxiliary functions: split which takes a list and splits it into two and merge which given two sorted lists returns the sorted combination of the two input lists.
The split function has type ’a list -> (’a list * ’a list). Recall that we want to put elements at even positions into the first list and elements at odd positions into the second list. This is elegantly accomplished by inspecting the first two elements of a given input list at the same time. Hence, we consider the following three cases: either the input list is empty, the input list contains one element, and the input list contains at least two elements. In the last case we simply split recursively the remaining elements and then glue the elements back to the respective lists.
(* split: ’a list -> (’a list * ’a list) *)
let rec split l = match l with |[] -> ([],[]) | [h] -> ([h],[ ]) | (h1::h2::t) ->
let (left, right) =
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com