CSci 2041 Computer Laboratory 7
Computer Laboratory 7
CSCI 2041: Advanced Programming Principles
March 9, 2021
0. Introduction.
In this laboratory assignment, you will write functions that implement a
hash table which resolves collisions by chaining. (You may have discussed
such hash tables in a data structures course before CSCI
2041.) It is similar to the hash table that was discussed in the lectures
for use in memoization. However, this hash table is implemented in a
somewhat different way, using mutable objects that contain variables.
1. Theory.
A hash table is a data structure that maps keys to
their values. It supports operations that (1) add a new key and
value to the table, (2) change the value associated with a key, and (3)
delete a key and its value entirely.
One way to implement a hash table uses an
array. Each element of the array is a linear linked list of nodes, called a
bucket. Each node in the bucket contains a key, its value, and a
pointer to the next node in the bucket (if there is one). The pointer to the
next node lets us access the remaining nodes in the bucket that follow the
first node.
Now suppose that we want to find the value
associated with a key. We call a hash function on the key,
giving an index into the array. Then we obtain a bucket from the array at
that index. We search the bucket for a node that contains the key. If we
find it, then the node will also contain the key’s value. If we
don’t find it, then the hash table assigns no value to the key.
The buckets are needed because the hash
function may send many different keys to the same bucket. This is called a
collision. However, if the hash function is correctly designed,
then each key has an approximately equal probability of being sent to any
bucket within the array. If we make the array big enough, then most buckets
will either be empty, or will have only one or two nodes.
As a result, buckets can be searched very
quickly. It is possible to implement a hash table so that its operations can
be performed in O(1) key comparisons, on average.
2. Implementation.
For this laboratory assignment, you will implement a hash table as an array
of buckets, as described in the previous section. You must write an OCaml
type and a few OCaml functions, described below. Unlike most Ocaml code
written for this course, you will use variables to avoid copying.
(‘key, ‘value) pair
The type pair describes a list of key-value pairs for each
hash table bucket. It must have two constructors, called
NoPair and Pair. The constructor
NoPair makes a pair with no parts,
representing an empty list of pair’s. The constructor
Pair makes a pair with three parts: a
key, a variable that holds a value, and a
variable that holds a pointer to the next pair in the
bucket. Since the value and the next pair are
variables, they can be changed by assignment.
hashDelete table key
Use the function hash to find a bucket in
table. Search that bucket for a Pair that
contains key. If there is such a Pair, then
delete it from the bucket. If there is no such Pair, then
do nothing. Hint: do not copy the Pair’s in the
bucket, but rather delete only the Pair that contains
key. See below for a longer version of this hint.
hashGet table key
Use the function hash to find a bucket in
table. Search that bucket for a Pair that
contains key and its value. Return that value. If there is
no such Pair, then raise the exception
NoSuchKey. Hint: the value will be a variable, so make sure
to return the variable’s value, not the variable itself.
hashHas table key
Use the function hash to find a bucket in
table. Search that bucket for a Pair that
contains key. If there is such a Pair, then
return true, otherwise return false. Hint:
this will be much like hashGet.
hashPut table key value
Use the function hash to find a bucket in
table. Search that bucket for a Pair that
contains key and a variable that contains its value. If
there is such a Pair, then reset the variable to
value. If there is no such Pair, then add a
new Pair to the front of the bucket. It contains
key and a variable whose value is value.
Here are two helper functions that may be useful. The function
hashMake returns a hash table. It is implemented as an array
of length modulus whose elements are empty buckets
(NoPair). Here modulus is assumed to be a prime
number.
let hashMake modulus =
Array.make modulus NoPair ;;
The function hash returns the index of a bucket within
table that may have a Pair containing
key. Here table was created by calling
hashMake.
let hash table key =
abs ((Hashtbl.hash key) mod (Array.length table)) ;;
And here is a hint about how to write hashDelete. Use a helper
function, which might be called hashDeleting, to search a
bucket from table. The function hashDeleting has
three possible cases.
If the bucket is empty, then return an empty bucket (NoPair).
If the first Pair in the bucket contains key,
then return the rest of the bucket, not including the first
Pair. There is a pointer to the rest of the bucket in the
first Pair.
If the first Pair in the bucket does not contain
key, then call hashDeleting on the rest of the
bucket. Set pointer to the rest of the bucket, in the first
Pair, to the result returned by hashDeleting.
The function hashDelete should use hash to find
the bucket in table that may contain key. It
then calls hashDeleting on that bucket. It finally replaces
the original bucket in table with the result returned from
hashDeleting. In this way, hashDelete can delete
the first pair from the bucket, in the same way that it
deletes any other Pair, without needing a special case.
3. Deliverables.
The file
tests7.ml
contains definitions of some helper functions along with some tests, worth
45 points. Insert your code into this file, then run it with
OCaml. When you think your code is correct, submit it to Gradescope and/or
Canvas. If you do not know how to submit your work, then please ask your lab
TA’s. It must be submitted by 11:55 PM
on Tuesday, March 16, 2021.
NOTE: increasing numbers
of students have asked for more time to submit their work, claiming that
they have ‘‘submitted the wrong file.’’ We are
unsympathetic to most such claims, because this excuse is often used by
cheaters to get additional time they are not entitled to. Please check that
you have submitted the correct file when you turn in your work! This
probably takes no more effort than a couple of mouse clicks. Thanks.