代写 C++ algorithm Java javascript statistic ECS 60 Winter 2019 Dept. of Computer Science, UC Davis

ECS 60 Winter 2019 Dept. of Computer Science, UC Davis
Program #2
Programs must be written in C++ and are to be submitted using handin on the CSIF by the due date using the command:
handin cs36c P2 file1 file2 … fileN
Your programs must compile and run on the CSIF. Use handin to submit all 􏰁les that are required to compile (even if they come from the prompt). Programs that do not compile with make on the CSIF will lose points and possibly get a 0.
There are 100 example inputs along with their expected outputs in the directory as well as the autogragrader to download.
The Example*-TimedAlgorithmsSolution.csv have 0’s for the running time 􏰁elds (since this is variable, you do not need to match it), and your solutions should have the actual running time in these 􏰁elds.
1 Overview & Learning Objectives
In this program you will study and experiment with InsertionSort, MergeSort, and QuickSort. There are multiple objectives of this assignment:
1. introduce the JSON (JavaScript Object Notation) and CSV (comma sep- arated values) data formats, which are frequently used in professional settings,
2. examinethewallclockrunningtimeofInsertionSort,MergeSort,andQuick- Sort,
3. understand 􏰁ne-grained details of algorithm analysis (wallclock vs. worst- case Big-O vs. average-case Big-O),
4. introduce automated testing, and
5. use your automated tests to detect code with bugs.
1

2 Data Formats
JSON
We will call each array that is to be sorted a sample. You will be running your programs on one or two input 􏰁les that contain a number of samples. These 􏰁les will be formatted in JavaScript Object Notation (or JSON). This data format, and many like it, is frequently used in industry, and every computer science student should be exposed to it. In this class, we will use the third-party library https://github.com/nlohmann/json. See JSON.pdf on Canvas for a tutorial on JSON and this library.
CSV
Measurements of the sorting algorithms will be recorded in CSV 􏰁les. This data
format, and many like it, are frequently used in industry, and every computer
1 science student should be exposed to it. There are many types of “CSV” 􏰁les ,
and we describe one of the simplest versions here.
A comma separated values 􏰁le, or CSV 􏰁le, consists of a header row on the 􏰁rst line of the 􏰁le, followed by data records on subsequent lines. The header row consists of a collection of column names separated by commas. The data records consists of data (this may be strings, numbers, etc.) separated by commas. For example, the contents of a CSV 􏰁le for student information:
Name,ID,email,year
AlexGrothendieck,423518,alexg@myuni.edu,3
EmmyNoether,4245534,emmynoether@myuni.edu,2
JuliaRobinson,23634563,jrob@myuni.edu,2
MartinDavis,2359830,mdavis@myuni.edu,1
In the above example, the header row consists of four column names, Name, ID, email, and year. Each data record therefore has four data, and the order is signi􏰁cant: on line 2 (the 􏰁rst data record), the Name is AlexGrothendieck, the ID is 423518, the email is alexg@myuni.edu, and the year is 3.
3 Executables
Executable #1
Executable Name: sortedverification Source: sortedverification.cxx Usage: sortedverification file.json
This program takes the name of a JSON 􏰁le as a command-line argument file.json that represents the output of a sorting algorithm and veri􏰁es that
1
For example, there are CSV formats that allow tab delimiters instead of commas.
2

each sample is a sorted array. If a sample array is not sorted, there must be some position i such that the ith element is equal to or larger than the i + 1st element. We call this a consecutive inversion. For example, if A = [−2, 0, 3, 2, 5] there is a consecutive inversion at location i = 2 because A[2] = 3 > 2 = A[3]. For example, the samples
Sample1=[−1641818748,1952682320,−195384256,−1702150187], and Sample2 = [−683761375, −406924096, −362070867, −592214369]
are de􏰁ned by the following input 􏰁le SampleExample.json:
{
“Sample1”: [-319106570,811700988,1350081101,1602979228],
“Sample2”: [-319106570,811700988,797039,-1680733532],
“metadata”: {
“arraySize”:4,
“numSamples”:2
}
}
Sample2 has consecutive inversions at index 1 and 2, and running
./sortedverification SampleExample.json
prints the contents of a JSON object to the screen (i.e. to stdout):
{
“Sample2”:{
“ConsecutiveInversions”:{
“1”:[
811700988,
797039 ],
“2”:[
797039,
-1680733532
]
},
“sample”:[
-319106570,
811700988,
797039,
-1680733532
] },
“metadata”:{
3

} }
“arraySize”:4,
“file”:”SampleExample.json”,
“numSamples”:2,
“samplesWithInversions”:1
Sample1 has no inversions so its data is not printed to the JSON output above. Notice that if the consecutive inversions of a sample are added to the JSON object, the sample data (the array) is also added to the JSON object.
Executable #2
Executable Name: consistentresultverification
Source: consistentresultverification.cxx
Usage: consistentresultverification file1.json file2.json.
This program takes two command-line arguments file1.json and file2.json that contain JSON objects representing the output of two sorting algorithms, and veri􏰁es that these 􏰁les represent the same samples or reports their di􏰀er- ences.
I have copied SampleExample.json to AlmostSampleExample.json and mod- i􏰁ed the second and third entries of Sample1 in AlmostSampleExample.json. These di􏰀erences are output when I run
./consistentresultverification.sh SampleExample.json AlmostSampleExample.json
The program outputs the following:
{
“Sample1”: {
“AlmostSampleExample.json”: [
-319106570,
8117009,
13500811,
1602979228
],
“Mismatches”: {
“1”: [
811700988,
8117009 ],
“2”: [
1350081101,
13500811
] },
4

“SampleExample.json”: [
-319106570,
811700988,
1350081101,
1602979228
] },
“metadata”: {
“File1”: {
“arraySize”: 4,
“name”: “SampleExample.json”,
“numSamples”: 2
}, “File2”: {
“arraySize”: 4,
“name”: “AlmostSampleExample.json”,
“numSamples”: 2
},
“samplesWithConflictingResults”: 1
}
}
The metadata 􏰁eld now contains information about the 􏰁les being read in. The key Sample1 has information because it di􏰀ers between SampleExample.json and AlmostSampleExample.json. Its value contains the sample from each 􏰁le along with the di􏰀erences between the asmples. Di􏰀erences are listed in the Mismatches key, which contains a list of positions that mismatch and their contents. Note that the key-value pair “1”: [ 811700988, 8117009 ] exists because the second entry of Sample1 in SampleExample.json is 811700988 and AlmostSampleExample.json is 8117009.
Executable #3
Executable Name: timealgorithms Source: timealgorithms.cxx Usage: timealgorithms file.json
This program takes the name of a JSON 􏰁le as a command-line argument (file.json) that represents a collection of arrays to be sorted (an input 􏰁le for the sorting algorithms) and runs InsertionSort, MergeSort, and QuickSort on all samples in the 􏰁le, measures various statistics, and prints these statistics to a CSV 􏰁le.
Do not implement your own versions of the algorithm. Use the code given in inerstionsort.cpp, mergesort.cpp, and quicksort.cpp. Slight vari- ations of these algorithms will not work with the autograder, as you will be gathering speci􏰁c statistics about how the algorithms behave.
Collect the following statistics:
5

Running Time: i.e. wallclock time. I used clock and CLOCKS_PER_SEC from the library for this. The autograder won’t check this 􏰁eld; so the only important part about this 􏰁eld is your ability to get a sense of which algorithm is fastest on a given input.
Number of Comparisons: A count of how often an algorithm compares at least one element from the array it is sorting to something else. The following lines of code both count as a single comparison:
(*numbers)[i] < (*numbers)[j] (*numbers)[i] < a You will need to add lines of code to the sorting algorithms to achieve this. If necessary, take lazy evaluation into account. Number of memory accesses: A count of how often an algorithm accesses the array it is sorting. In the above example, the 􏰁rst line counts as two memory accesses while the second line counts as one. If necessary, take lazy evaluation into account. These statistics are then printed to the screen in CSV format (to save to a 􏰁le, use output redirection). Your header row for your CSV 􏰁le must have the following column names (see TimeOutputExample.csv for an example): Sample: The name of the sample that pertains to this row's statistics (e.g. Sample1) InsertionSortTime: The wallclock time of running InsertionSort on this row's sample InsertionSortCompares: ThenumberofcomparesusedwhenrunningInsertionSort on this row's sample InsertionSortMemaccess: The number of memory accesses when running InsertionSort on this row's sample MergeSortTime: The wallclock time of running MergeSort on this row's sam- ple MergeSortCompares: ThenumberofcomparesusedwhenrunningMergeSort on this row's sample MergeSortMemaccess: ThenumberofmemoryaccesseswhenrunningMergeSort on this row's sample QuickSortTime: The wallclock time of running QuickSort on this row's sam- ple QuickSortCompares: ThenumberofcomparesusedwhenrunningQuickSort on this row's sample QuickSortMemaccess: ThenumberofmemoryaccesseswhenrunningQuickSort on this row's sample 6 4 Files To Submit Submit the following 􏰁les for your program: Makefile, mergesort.cpp, mergesort.h, consistentresultverification.cxx, quicksort.cpp, createdata.cxx, quicksort.h, insertionsort.cpp, sortedverification.cxx, insertionsort.h, timealgorithms.cxx. Your program must compile on the CSIF using make without warnings. Code that compiles with warnings will lose 5%. 7