CS325 programming assignment 1
Electronic submission of Code and Report to TEACH at 11:59PM Feb 1st
January 13, 2019
TA in charge William Maxwell (maxwellw@oregonstate.edu)
General Instructions.
• Code should be submitted through the TEACH website.
• Students are encouraged to work in groups. Each group can contain up
to 3 students.
• Each group should create a single submission. In your submission, please explicitly state the names of all group members.
• You can choose any of the following programming languages:
java, python, c/c++.
If you wish to use a language that is outside of this list, you will need to obtain advance approval from the TA in charge.
• Your report should be typed and submitted in pdf format.
Closest pair of points In this assignment, you will solve the problem of finding the closest pair of points among a set of n points in the plane, which is an important computational geometry problem that has many applications in graphics, computer vision, GIS and air traffic control etc. In particular the main idea you will explore for this problem is divide and conquer, which I briefly describe below.
The divide-and-conquer algorithm works by dividing the point set into two nearly-equal halves, achieved by splitting the set at the median x-coordinate. Let xm be the x-coordinate of the dividing line separating the two point sets.
We recursively compute the closest pair in each half, and then solve the merge step based on their results. More specifically, suppose d1 and d2, respectively, are the closest pair distances for the left and right subproblems, and let d = min(d1,d2). Determine the set of points whose x-coordinates lie in the range [xm − d, xm + d]. Call this set M, for the middle. Order the points of M in ascending order of their y-coordinates. We find the closest pair of points in M
1
by comparing each point p to only those whose y-coordinate differs from p by at most d. Let their distance be dm, we return min(d,dm).
For this assignment, you will implement three different algorithms for this problem.
1. Brute-force. This algorithm works simply by computing the distance be- tween all n pairs of points and finding the closest pair. This algorithm
2. Naive Divide and Conquer. Implement the above outlined divide-and- conquer algorithm for computing the closest pair. In this naive version, you will sort the points within M based on y-coordinates in each recursive call from scratch.
3. Enhanced divide and conquer. In this version, you will eliminate the re- peated sorting by pre-sorting all the points just once based on x-coordinates and once based on y-coordinates. All other ordering operations can then be performed by copying from these master sorted lists.
Empirically testing the correctness of your algorithm. Your program should take an input text file of points and output its results to an output text file. Your program will be run by the TAs, so please leave instructions on the steps to compile/run your program in a “README.txt” file.
Suppose you are given a file “example.input”, and it contains the following points:
00 55 98 89 18 17 95 40 96 97
Your program should take these and output a file “output.txt” in the fol- lowing format:
1.0 1718 9596 9697 9798
The minimum distance is printed at the top, followed by all of the corre- sponding minimum pairs of points. In the case of ties (like in this example),
2
2
will have O(n2) run time.
you should sort the matching points in order1. Here, you see 4 pairs of points that happened to achieve the minimum distance of 1.0. 2
Your output file should look like the above, and you will be given “exam- ple.input” to help test your algorithms. When it comes time to grade, the TAs will test your code using a separate file.
Make a separate runnable command for each algorithm (Brute Force, Divide and Conquer, Enhanced DnC). This could be different command-line flags, or even different programs. Mention how to run your programs in a text file called “README.txt”.
So for example, we might need to run:
> ./bruteforce example.input
> ./divideandconquer example.input
> ./enhanceddnc example.input
To run the each algorithm. And then these programs could output “out- put bruteforce.txt”, “output divideandconquer.txt”, “output enhanceddnc.txt” respectively.
Empirical analysis of run time. For this part, you need to generate your own random inputs (for your input, please make sure that no points have the same x value or y value) of different sizes using a random number generator and run your algorithms on these inputs to measure their empirical run time. The suggested range for input size n is 102, 103, 104, 105. If time allows, you are encouraged to try even larger input sizes for the divide and conquer algorithms. For each size, you should generate 10 random inputs and run each algorithm on them and report the average run time over the ten runs.
Write up In addition to submitting the source code (with clear instructions for running them), you will also need to submit a brief write up, which should include the following:
• Pseudo-code. Please provide the Pseudo-code for each of the three al- gorithms.
• Asymptotic Analysis of run time. Please analyze the runtime for each of your three algorithms.
• Plotting the runtime. Plot the empirically measured runtime of the three algorithms as a function of the input size. Your plot should have clearly labeled axes and legends.
1The sorting should be done by X value, then by Y value. For example, if you have 2 sorted points (x1, y1), (x2, y2), then x1 < x2 with ties broken by y1 < y2. Don’t worry, this should be the default sorting behavior if you call some library’s array.sort() function on an array of points.
2For simplicity, you can assume that all data points will have distinct x and y values. This avoids complications that might arise in the median calculation.
3
• Interpretation and discussion Discuss the runtime plot. Do the growth curves match your expectation based on their theoretical bounds? Dis- cuss and provide possible explanations for any discrepancy between the experimental runtime and the asymptotic runtime.
4