COMP 2230 – Data Structures
Seminar #6: Hashing
Due Date: Beginning of the lecture on Thursday March 30, 2017
Purpose: To get some practice with a hashing using a 2D and an overflow array.
You can do all this in one driver program if you like, just include the methods in the driver file.
This assignment will give you practice using the fixed-size buckets hashing method.
1. Create a hash table capable of holding 250 data elements in 50 buckets each having 5
s1ots. Allow an overflow area of 50 storage units.
2. Use a random number generator to generate 200 data values between 1 and 99999. Using a hash function try to store the 200 random numbers in the hash table. Use the overflow if a bucket gets full; just put the data element in the next available spot in the overflow.
3. Data stored in a hash table and overflow must be unique (key values). The random number generator may generate a few duplicates but we’ll pretend it didn’t.
4. Print the contents of the hash table and the overflow after you’ve added 200 values to the hash table and overflow.
5. Delete items from the hash table and overflow. Be sure to test all possible situations that could occur with this hash table and overflow. Circle the numbers in the printout from #4 that will be deleted. Decide what you will do with the “gaps” created in the hash table and overflow; pick some strategy that will ensure that the next search and delete is as fast as possible.
6. Print the contents of the hash table and overflow after all deletions have been made. Make written comments on the print out to explain what happened when you deleted items.
7. In the seminar we will estimate the order of this hashing technique.
0.9*(3)+0.1*(5+20/2) = 2.7 + 1.5 = 4.2
3 and 20/2 are average
Good order of hashing is between 3-5
8. Is this a good hashing strategy? Why? Could it be improved?
Load factor = 200/250 = 80%, which is a bit high and java uses 75% load factor.
We can expand the capacity of array by creating more number of rows than number of columns in order to lower the load factor.
9. Marks will be assigned for the following:
I) program accuracy, efficiency, and readability including documentation
II) efficiency of the deletion strategy used in part 5
III) proper selection of data to delete
IV) explaining on the print outs, what items were deleted and the results of those deletions.
V) answering the questions in #8
NOTE: random numbers are more appropriately called pseudo-random numbers. The numbers themselves exhibit no pattern, so it is not possible to guess the next number given what you’ve already observed. However the actual sequence of numbers you get is completely determined by the seed value, which is used to initialize the random number generator. Typically the seed value is determined by the computer’s internal clock so it is extremely unlikely that you’ll get the same sequence of numbers if you run the program twice.
The programmer however can set the seed value, and thus ensure that the same set of numbers is generated every time the program is run. The best seed values are prime numbers. Here’s the info from the Java website:
public Random(long seed)
Creates a new random number generator using a single long seed. The seed is the initial value of the internal state of the pseudorandom number generator which is maintained by method next(int).
The invocation Random rnd = new Random(seed) is equivalent to:
Random rnd = new Random();
rnd.setSeed(seed);
Assignment Submission:
Submit a print-out of the program source code and a sample of the output including the 2 printouts described above.
Notes:
If arr row is full, need to check if anything from overflow can be placed back to the arr
If arr row has empty spaces in middle, ie. X X null X, shift to left so no space between