1. Submit a report and a single set of programs.
2. You are required to use a single node of Stampede for this homework assignment.
3. You are permitted to look up algorithms and study code that you find online, but all the work that is submitted must be your own. Copying code you find online, or making simple modifications to code you find online, or linking to libraries, is not permitted. If you base your approach on existing algorithms or code, please document the sources carefully.
4. Serial programs and sample input instances are available in the homework folder.
1. Sorting 32-bit floating point values
(a) Given an array of 32-bit floating-point values, write a parallel program (using OpenMP or Cilk Plus) to sort them in ascending order. The sort does not need to be in-place (i.e., you can use additional storage for the output). The sort need not be stable. You may use any parallel sorting algorithm (see references below).
(15 points: 7 for correctness, 8 for scalability/speedup/performance w.r.t. serial code)
(b) Discuss the parallelization strategy in the report. What assumptions did you make about the input array? Are there any drawbacks to your parallelization approach?
(10 points)
(c) Report performance as you vary thread concurrency, problem size (array size), and input array characteristics (e.g., in addition to uniform random floats, consider input sequences that are already sorted, almost sorted, sorted in reverse, with high percentage of repeats).
(10 points)
References:
1. The Wikipedia page on sorting algorithms: http://en.wikipedia.org/wiki/Sorting_algorithm
The page lists several possible approaches for sorting. Avoid the ones with worst case O(n^2) complexity, such as bubble sort and insertion sort. Quicksort, merge sort, and radix sort are nice candidates for parallelization.
2. Kale, Solomonik, Parallel Sorting Pattern. http://parlab.eecs.berkeley.edu/wiki/_media/patterns/sortingpattern.pdf
This article gives a high-level overview of issues with parallel sorting.
3. The Thurst library: http://thrust.github.com/
Thrust has several optimized implementations of parallel sort, mostly for GPUs.
4. A tutorial on OpenMP with hints on parallelizing quicksort.
http://msdn.microsoft.com/en-us/magazine/cc163717.aspx
5. Articles by Victor Duvanenko on parallel merge sort and radix sort in Dr. Dobb’s.
http://www.drdobbs.com/parallel/parallel-merge-sort/229400239
http://www.drdobbs.com/parallel/parallel-in-place-n-bit-radix-sort/226600004
6. There are a lot of recent technical papers on parallel sorting. Here’s a recent tech report:
http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/intel-labs-radix-sort-mic-report.pdf
2. Finding unique strings
(a) You are given a text file as input, with a string in each line of the file. You cannot make assumptions on the lengths of the strings, or the size of the file. Your goal is to write a parallel program to determine the number of unique strings in the file, and the number of times each unique string appears in the file. You may use a set representation for this purpose, or alternately, sort all strings in some ordering, and count the unique strings. All steps of the approach need to be parallelized, no matter which scheme (set representation or sorting) you use. I have provided serial code for reference.
(15 points: 7 for correctness, 8 for scalability/speedup/performance w.r.t serial code)
(b) Discuss the parallelization strategy in the report. What assumptions did you make about the input string sequences? Are there any drawbacks to your parallelization approach?
(10 points)
(c) Report performance as you vary thread concurrency, problem size (I will provide a few representative text files), and input text file characteristics (e.g., consider text files with strings that are already ordered, strings that are almost sorted, files with all unique strings, files with a high percentage of duplicates, etc.).
(10 points)
/docProps/thumbnail.jpeg