COMP1100/1130 [2020 S1]:
PROGRAMMING AS PROBLEM SOLVING Research School of Computer Science
Menu
menu
» Labs » Week 10: Type Classes, Ad Hoc Polymorphism, Binary Search Trees
In this lab, we cover type classes and ad hoc polymorphism, and how we can use these concepts to generalise functions that require some assumptions about the input type. We also continue the topic of trees from last lab, and introduce binary search trees, which are trees with a special ordering constraint that gives them a great advantage over binary trees in terms of computational eUciency.
Table of Contents
Pre-Lab Checklist
Types
Type Classes
Applications of Type Classes Instances
Deriving Type Classes Binary Search Trees Extensions
Appendix: Bisection Search
Pre-Lab Checklist
You have completed Lab 9, understand binary trees, and how to write recursive functions over them.
You are comfortable with the concept of parametric polymorphism.
You are comfortable with the recursive data type Nat from Lab 06
Getting Started
1. Go to the project’s dashboard under the Project tab and click on the Fork button. The gitlab url for this lab is https://gitlab.cecs.anu.edu.au/comp1100/2020s1student_les/lab10
2. You will be asked where to fork the repository. Click on the user or group to where you’d like to add the forked project. You should see your own name here. Once you have successfully forked a project, you will be redirected to a new webpage where you will notice your name before > project-name. This means you are now the owner of this repository. The url in your browser should re`ect the change: https://gitlab.cecs.anu.edu.au/uXXXXXXX/lab10
3. Finally, clone your forked repository to your computer according to the instructions in the Week 2 Lab.
4. Put your exercise solutions in the _le Lab10.hs. This _le is pre-_lled with un_nished functions that match the exercises below.
Types
We’ve seen before in Lab06 that some functions can be written to work on an arbitrary type, like length:
The actual values of the elements in the list do not concern us, we only want to know how many there are, which is why this function can be written in a way that’s parametrically polymorphic, that is, it will work regardless of the type of the elements in the list.
Now we would like to generalise other functions. Consider the following function sumInteger:
It’s possible to write the same function for types other than Integer, like Double.
Clearly this function cannot be made parametrically polymorphic, as the addition operator + isn’t de_ned for all types. (We can’t add together strings or booleans, for example.)
So we want a way to de_ne the sum function to operate over all lists that contain types that “act like numbers”, that is, types where addition is well de_ned. This is a weaker form of polymorphism, called ad-hoc polymorphism.
If we had just tried to make the type of sum as general as possible by changing the type signature to
sumEverything :: [a] -> a then we get the error message
No instance for (Num a) arising from a use of `+’
This is a rather cryptic way of saying “The input type can be anything, but addition is only de_ned on things that are numbers, that is, types that belong to the type class of Num.” To solve this problem, we need to introduce type classes.
Type Classes
Haskell handles ad-hoc polymorphism by using type classes, grouping types together into classes that all share similar properties, (they all act like numbers, they all have equality de_ned, etc.)
The main type classes you’re likely to see in Haskell are as follows:
Some types depend on others, and the type classes form a hierarchy, as shown in the following diagram.
We’ve left out some functions here, if you want to see the full list, or if you ever want to know more about a type, you can type in :info TypeName into GHCi. It will also tell you what other types are instances, or belong to, this type class.
Here, we can see that members of the type class Eq have the functions (==) and (/=) de_ned. We can also see that a few familar types, Double, Char, Bool, Int all have equality de_ned on them. Digging deeper, we _nd the interesting lines
which tell us that if a type a has equality de_ned, then so too does the type [a]. This is very powerful, as it saves on extra code that we don’t need to write. As soon as you’ve told haskell how to compare integers, it can automatically work out that [1,2] == [1,3] should evaluate to False.
The same goes for tuples: since Integer and Char both have equality de_ned, so does (Integer, Char) without any further work.
Applications of Type Classes
Consider the following function:
The notation
Num a => [a] -> a
means that the function has type [a] as input, and returns an element of type a as output, where a is any type from the type class Num a. So [Integer] -> Integer, or [Double] -> Double are special cases of Num a => [a] -> a, but [Double] -> Integer or [Bool] -> Bool is not.
If we want a type to be a member of two or more type classes, we can enforce this.
Here, we are enforcing that the input type has equality de_ned, and can be turned into a string. Sometimes specifying two type classes is redundant, e.g. (Ord a, Eq a) => … is the same as Ord a => … as Ord is a superclass of Eq.
Exercise 1 (Optional)
See if you can work out what type classes the following types belong to. You might like to experiment and try out some operations in GHCi (like trying to compare two elements with <=, or trying to add two elements together) to see what is de_ned.
Exercise 2
You can check your answers by asking GHCi what it had inferred the type to be, by entering :type functionName.
Submission required: Exercise 2 inferring types Instances
When we create a new type, we would like to tell Haskell what type class our new type shall belong to. We use the following code,
which breaks down as:
instance is a reserved keyword indicating we are adding a new type to a type class
TypeClass is the type class we want our new type to belong to
MyNewType is the name of our new type
We then have to de_ne all the functions that a member of that type class should have de_ned.
For the Eq type class, we need only (==), (and we get (/=) for free.)
For Ord, we need to already be in Eq, and then also de_ne either (<=) or compare (and Haskell can work out all the other operators like (<), (>), (>=) for free.)
If you’re interested, you can look at the documentation for the Prelude, and under each type class, the minimal complete de_nition tells you the fewest functions that you need to de_ne, and then the rest are automatically derived for you.
For example, (see Fruit.hs) suppose we de_ne a new type for fruit, data Fruit = Apple | Banana | Orange
and we wanted to de_ne equality on our fruit. We need to tell Haskell what (==) means for fruit.
So we de_ne equality in the obvious way. All fruits are equal to themselves, and any fruit is not equal to a different fruit.
I can also de_ne what it means to show a Fruit.
So when I type Banana into GHCi, a Banana is printed to the terminal using my custom de_nition for show.
I can also de_ne ordering on fruit. I’m going to order fruit by how much I enjoy eating them: Banana being the best, Apple being okay, and Orange is the worst. (Deepest apologies to those that love oranges.)
If we give a full de_nition of the (<=) function:
then Haskell can use that to de_ne all the usual ordering operators:
You’ll note how tedious it is to write out all the cases. Surely if Orange <= Apple and Apple <= Banana, then Orange <= Banana, right?
Sadly, Haskell doesn’t place any sensible constraints on the <= function we de_ne. It’s up to the user to de_ne <= in such a way that the usual ordering properties you would expect, are satis_ed.
Deriving Type Classes
Rather than doing the hard work ourselves, we can ask Haskell to de_ne a lot of these properties for us, using the deriving keyword.
Haskell will assume the equality you wanted is the “obvious” implementation, that every fruit is equal to itself, and that no two different fruits are equal.
It will also place a ordering on fruit, on the order they appear in the de_nition of Fruit, so Orange is less than Apple, which is in turn less than Banana.
The default de_nitions of equality and ordering Haskell provides satisfy all the nice properties you would expect them to. The default implementation of show will also print each fruit exactly as they are de_ned.
Type classes where you can ask Haskell to do the hard work (de_ning all the functions automatically) for you are called derivable classes. Not all type classes are derivable. If we tried to add arithmetic to our fruit by making it a member of the type class Num,
then Haskell will throw an error
and inform you that it doesn’t know how to de_ne arithmetic on Fruit. What should Apple + Banana be? What about Banana * Orange? There’s no obvious way to de_ne it.
Exercise 3A
Recall our de_nition of Nat from Lab 06.
We’d like to de_ne equality on natural numbers: Two arbitrary natural numbers are equal if they have the successor S applied the same number of times.
Using the template given, de_ne the function (==) for the type Nat. Do not use deriving Eq here. You should de_ne equality yourself.
Exercise 3B
It’s possible to place an ordering on natural numbers as follows:
Z < (S Z) < S (S Z) < ...
So if we want to ensure Nat belongs to the type class Ord, we need only to de_ne a de_nition of (<=), and Haskell can work out corresponding de_nitions for all the other operators de_ned on Ord: (<), (>), max, min, compare,.
Using the template given, de_ne the function (<=) for the type Nat. Do not use deriving Ord here. You should de_ne equality yourself.
Submission required: Exercise 3A, 3B, instance Eq, Ord Binary Search Trees
We saw in Lab 9 how we can write functions over binary trees, and over rose trees, but we haven’t yet seen how eUcient the tree data structure can be. You might have noticed that for the exercises in last lab, you were forced to look through all the elements in a tree to check if something was there, or to _nd the maximum/minimum element, as you would have to do with a list. So far, trees appear to just be lists, but more annoying to deal with.
We need to add one extra property to provide trees with much more power than they currently have, a property we call the binary search ordering constraint.
A binary tree satis_es the binary search ordering constraint, if either:
It is a Null tree
Every element in the left hand subtree is strictly less than the root, and every element in the right hand subtree is strictly more than the root. Furthermore, both subtrees also satisfy the binary search ordering constraint.
Binary trees that meet this constraint are called binary search trees.
Note that it is very important for the subtrees to also pass the ordering constraint, given
the following example tree. (This tree is included for you as notBStree.)
See how on the left subtree, it is true that all elements {1, 3, 2} are less than the root node of 5. But if we consider the left subtree in isolation:
We can see that the left subtree does not satisfy the ordering constraint, as the root node 3, is bigger than the biggest element in the right subtree, 2.
This means the original tree is not a binary search tree.
In Haskell, binary search trees are no different structurally from binary trees, and so we de_ne them appropriately
type BSTree a = BinaryTree a
while keeping in mind that any instance of a BSTree had better satisfy the ordering
constraint, as there’s no easy way to enforce that property in the type itself.
This also means the type a that a BSTree contains must be a member of the type class
Ord.
This ordering property is what gives binary search trees a huge improvement in performance on lists. We can search through the tree, moving left if the root node is too large, and moving right if it’s too small. To see an example, see the appendix at the end of this lab. Otherwise, you can move on with the exercises.
Exercise 4
elemBSTree :: (Ord a) => a -> (BSTree a) -> Bool
Do not use the same function as before, you should be able to search more eUciently. As a reminder, this function should take an element, and a binary search tree, and check if that element is present in the tree.
Submission required: Exercise 4 elemBSTree Exercise 5
> treeBSMax goodTree
31
> treeBSMin goodTree
1
Submission required: Exercise 5 treeMaximum, treeMinimum Exercise 6
isBSTree :: (Ord a) => BinaryTree a -> Bool
>isBSTree goodTree
True
>isBSTree badTree
True
>isBSTree notBSTree
False
Submission required: Exercise 6 isBSTree Exercise 7
treeInsert :: (Ord a) => BSTree a -> a -> BSTree a
(If the element is already in the tree, leave the tree unchanged.)
Submission required: Exercise 7 treeInsert Exercise 8
(That is, when a binary search tree is `attened, the resulting list should be sorted.)
flattenTreeOrd :: BinaryTree a -> [a]
> flattenTreeOrd smallTree
[1,5,10]
> flattenTreeOrd notBSTree
[1,3,2,5,6,7,9]
Submission required: Exercise 8 flattenTreeOrd
Extensions [COMP1100 Optional, COMP1130
Compulsory] Extension 1
Write a function
treeDelete :: (Ord a) => (BSTree a) -> a -> (BSTree a)
that takes a binary search tree, and an element, and removes it from the tree (if it is present).
Submission required: Extension 1 treeDelete Extension 2 (Tricky)
Write a function
treeBalance :: (Ord a) => BSTree a -> BSTree a
that takes a binary search tree of integers, and rearranges the structure of the tree so it is now balanced. You may have to do some research as to how to implement this. (Really Tricky: Do it without rebuilding the entire tree from scratch.)
Submission optional: Extension 2 treeBalance Extension 3
Nat acts like a number, so we’d also like it to be a member of the type class Num.
Note that for an instance a of Num:
abs :: a -> a is the absolute value function
signum :: a -> a returns the sign of a number, +1 if the number is positive, 0 if the number is zero, and −1 if it is negative.
fromInteger :: Integer -> a takes an Integer, and converts it to the new number type a.
You can get away without a de_nition for negate, as once (-) is de_ned, Haskell can work out negate for free. (Given that natural numbers cannot be negative anyway, the negate function is not very useful to us.)
Submission required: Extension 3 Nat as an instance of Num Extension 4
This lab contains additional exercises for COMP1130 students. COMP1100 students are encouraged to attempt them once they have completed the compulsory exercises to receive the participation marks for the lab.
This is a long lab, and there are two main topics, both of which are of high importance. If you are still working on exercises 1-3 after the _rst hour, it is recommended that you skip to Exercise 4. Don’t stress though! This is just so you have a chance to ask your tutor questions about both typeclasses, and binary search trees. As always, you can _nish up any remaining questions afterwards before next week’s lab.
length :: [a] -> Integer length list = case list of
[] -> 0
_:xs -> 1 + length xs
sumInteger :: [Integer] -> Integer sumInteger list = case list of
[] -> 0
x:xs -> x + sumInteger xs
sumDouble :: [Double] -> Double sumDouble list = case list of
[] -> 0
x:xs -> x + sumDouble xs
Type Class
(some of the) functions de_ned
Description
Depends on?
Show
show
Can be printed as a string
N/A
Eq
(==), (/=)
Has equality de_ned
N/A
Ord
<=, max, compare
Has ordering de_ned
Eq
Num
(+), (*), (-), abs, negate, signum, fromInteger
Numbers, can perform arithmetic
N/A
Fractional
(/), ect.
Fractional numbers, can be divided
Num
Floating
sqrt, exp, log, sin,cos, pi, ect.
Acts like real numbers, many mathematical functions de_ned
Fractional
Real
toRational
Numeric types that can be written as a rational
Num,Ord
Enum
succ, pred, toEnum, fromEnum, ect.
Sequentially ordered types
N/A
Integral
div, mod, ect.
Acts like whole numbers, with integer division
Real, Enum
Prelude> :info Eq class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
instance Eq a => Eq [a] — Defined in ‘GHC.Classes’ instance Eq Word — Defined in ‘GHC.Classes’ instance Eq Ordering — Defined in ‘GHC.Classes’ instance Eq Int — Defined in ‘GHC.Classes’ instance Eq Float — Defined in ‘GHC.Classes’ instance Eq Double — Defined in ‘GHC.Classes’ instance Eq Char — Defined in ‘GHC.Classes’ instance Eq Bool — Defined in ‘GHC.Classes’
…
instance Eq a => Eq [a] — Defined in ‘GHC.Classes’
instance (Eq a, Eq b) => Eq (a, b) — Defined in ‘GHC.Classes’
sum :: Num a => [a] -> a sum list = case list of
[] -> 0
x:xs -> x + sum xs
areTheyEqual :: (Eq a, Show a) => a -> a -> String areTheyEqual x y
|x == y = (show x) ++ ” is equal to ” ++ (show y) |otherwise = (show x) ++ ” is not equal to ” ++ (show y)
This exercise is optional, and is here just to help you further get your head around type classes. If you feel con_dent, feel free to skip ahead to Exercise 2.
Integer
Double
Char
Bool
(Integer, Integer) (Integer, Double) String
[Integer]
Integer -> Integer Maybe Integer
Try and work out the type of the following functions. Try to make the type as general as possible.
— cow :: ???
cow x y z = x && (y `elem` z) — foo :: ???
foo x y z = x `elem` y && y `elem` z
— bar :: ???
bar x y = case y of Nothing -> x
Justz ->x+z
— snap :: ???
snap x y
|x > y = show x
|otherwise = show y
instance TypeClass MyNewType where …
instance Eq Fruit where Apple == Apple = True
Banana == Banana = True Orange == Orange = True _ == _ = False
Note that there was nothing forcing me to de_ne equality in this way, I could have de_ned Apple == Banana to be True but Banana == Apple to be False. This doesn’t mean that we should, though. To do so would be very poor style, and go against the Haskell standard.
instance Show Fruit where
show fruit = case fruit of
Apple -> “An apple.”
Banana -> “A banana! Yum!” Orange -> “Yuck, an orange.”
> Banana
A banana! Yum!
instance Ord Fruit where
(<=) Orange Apple = True
(<=) Apple Banana = True (<=) Orange Banana = True
(<=) Banana Apple = False (<=) Apple Orange = False (<=) Banana Orange = False
(<=) Banana Banana = True (<=) Apple Apple = True (<=) Orange Orange = True
> (Banana > Orange)
True
data Fruit = Orange | Apple | Banana deriving (Ord, Show)
>show Banana “Banana”
data Fruit = Orange | Apple | Banana deriving (Ord, Show, Num)
Fruit.hs:4:14:
Can’t make a derived instance of `Num Fruit’:
`Num’ is not a derivable class
data Nat = Z | S Nat deriving Show
> Z >= (S Z)
False
> max (S Z) (S (S Z)) (S (S Z))
> (S Z) == (S Z)
True
> (S Z) == (S (S Z)) False
Rewrite the elemTree function from Lab 9 but this time you may assume that the input tree satis_es the binary search ordering constraint.
> elemBSTree 5 goodTree True
> elemBSTree 30 goodTree False
Rewrite the treeMaximum and treeMinimum functions, again assuming the input tree is a binary search tree. Be eUcient!
treeBSMax :: (Ord a) => BSTree a -> a treeBSMin :: (Ord a) => BSTree a -> a
Write a function isBSTree that takes a binary tree as input, and checks if the binary search constraint holds.
Write a function treeInsert that takes a binary search tree, and an element, and inserts that element into the tree, ensuring the binary search ordering constraint still holds.
You may _nd the printTree function useful for this exercise, so that you can verify that the function inserts a new element into the correct place.
> treeInsert smallTree 3
Node (Node Null 1 (Node Null 3 Null)) 5 (Node Null 10 Null) > treeInsert smallTree 20
Node (Node Null 1 Null) 5 (Node Null 10 (Node Null 20 Null)) > treeInsert smallTree 1
Node (Node Null 1 Null) 5 (Node Null 10 Null)
Write a function flattenTreeOrd that `attens a binary tree, but preserves the ordering.
Note that this is much harder than insertion. Take the example tree above. Deleting 5 is obvious, as we just trim off that subtree and replace it will Null. But what should we do if I wanted to delete 4, or even worse, delete the root note, 3? The entire tree falls into two pieces, and must be restructured into a new tree that contains the remaining elements, while maintaining the ordering property. I recommend drawing out a few examples by hand to get your head around it _rst.
The following extension exercise is optional, and is NOT required to receive the participation marks for COMP1100 or for COMP1130. The content within is non- assessable.
Depending on how much you care about eUciency, this problem can be very diUcult to implement eUciently, and may require some external research. Many different kinds of trees (Red-Black, AVL, ect.) are speci_cally de_ned to make certain operations like balancing easier to do.
For those that are mathematically inclined, the operators (+) and (*) can actually be far more general than just addition and multiplication. The general term for any structure containing a set of values, a de_nition for + and for × is called an algebraic ring, which you may see if you go on to study formal mathematics. In general, (+) and (*) can be any operator that satis_es a set of “nice proprieties”, such as the law of distributivity: x * (y + z) == (x * y) + (x * z). The full set of laws that (+) and (*) should satisfy is available here. As an example, if we choose (+) to be logical exclusive or, and (*) to be logical and, then we can de_ne Bool as an instance of Num, and all the nice properties hold.
Using the template given, de_ne the functions (+),(-), (*),abs,signum,fromInteger for the Nat type.
Note that some of the functions required to make Nat an instance of Num may be sometimes unde_ned. There is no “minus one” in Nat, after all. Try to ensure as many functions are de_ned as possible, for as many inputs as possible. You should throw an error for attempting to perform any operation unde_ned in the natural numbers, like trying to compute 2 – 3.
> (S Z) + (S (S Z))
(S (S (S Z)))
> (S Z) – (S Z)
Z
> (S (S Z)) * (S (S (S Z)))
S (S (S (S (S (S Z)))))
> (fromInteger 2) :: Nat
S (S Z)
> (S Z) – (S (S Z))
*** Exception: Can’t have negative natural numbers
We’d like to be able to de_ne equality on binary trees based on the elements in the tree. If we just used deriving Eq, then two trees would be identical if they have precisely the same elements and structure.
For the purposes of this question, we will de_ne equality on binary trees to be the following weaker de_nition: Two trees are de_ned to be equal if they share the same elements, with the same ordering (although the structure need not be the same).
Since we want this de_nition to work on any tree with elements who have equality de_ned on them, we use the following notation.
which means that we are de_ning what == means for anything of type BinaryTree a, where a must be a member of the type class Eq.
Submission required: Extension 4 instance (Eq a) => Eq (BinaryTree a) Appendix: Bisection Search
Imagine you were trying to _nd a name in the telephone book, and it was stored as a list. You’d have to go through the names, one by one, till you found the one you were looking for. Not very eUcient. In practise when we are trying to _nd a name in the telephone book, we go to the middle of the book, and see if the name we are looking for comes before or after the letter M, which is in the middle of the alphabet. We then take the section that’s left over, and divide it in half too. By using this method of bisection search, so called because we are halving, or bisecting, the search space in half each time, we can _nd a single name in a book of millions, in only a minute or so. Not bad. As we will see, binary search trees share some of these properties, which makes them very useful for quick retrieval of data.
Take for example the binary search tree below, included as goodTree. (The dashed circles represent nodes that we haven’t explored yet.)
If we were searching to see if the number 16 was present or not, we would look at the root node, and see if it was 16 (which it isn’t). 16 is bigger than 15, so if 16 was in the tree, it must be to the right, by the ordering constraint. So we can immediately discard the entire left subtree, and only check the right.
Is 23 the element we are looking for? No, it’s too big, so that means if 16 exists, it has to be to the left.
17 is still bigger, so we keep looking left.
There it is! And look at how much of the tree we never needed to look at! Had we been looking for 26, we could abandon the search even earlier, as we would not need to go down to the deepest layer. Had these numbers been in a list instead, we would have to manually check every single element until we found 16.
This is an ideal case however, as our tree was perfectly balanced, that is, this tree cannot be restructured in any other way without increasing the depth. We could have the same elements arranged in the following tree (included as badTree):
Add an instance of equality for binary trees that satis_es this weaker de_nition of equality.
instance (Eq a) => Eq (BinaryTree a) where — (==) …
> goodTree == badTree
True
Try come up with some other test cases to verify this function works. You may wish to draw them out by hand _rst.
This tree is still a valid binary search tree, all elements to the left are smaller, all elements to the right are bigger. It’s just not a very useful tree, as we’d have to go through half of the elements in the tree to _nd 31, much more than the balanced tree. However, 1 is much easier to _nd, only after a single comparison.
In the absolute worst case, all the elements are listed one by one in the right subtrees, and all left subtrees are always Null. We call such a tree a degenerate tree, as it’s essentially the same as a list.
Usually, we want our trees to be balanced, to minimise the access time of any element.
Might there be circumstances when we would we want an unbalanced tree in this fashion? Think about this and discuss with a neighbour.
For a balanced tree with n elements, on average, how many lookups would you have to perform to check if a particular element is present? What if the tree was degenerate? We will revisit these questions in more depth in Lab 11.
Updated: 06 Jun 2020
Responsible OUcer: Director, RSCS
Page Contact:
Course Convenor
Contact ANU
Copyright
Disclaimer
Privacy
Freedom of Information
+61 2 6125 5111
The Australian National University, Canberra CRICOS Provider : 00120C ABN : 52 234 063 906