DS2000
Spring 2020
HW6 (File processing)
Assigned: February 14 ♡, 2020
Deadline: February 21, 2020 at 9:00am
Please read this handout on files before you start. To submit your solution, compress all files together into one .zip file. Log in to handins.ccs.neu.edu with your Khoury account, click on the appropriate assignment, and upload that single zip file. You may submit multiple times right up until the deadline; we will grade only the most recent submission.
Your solution will be evaluated according to our grading rubric (https://course.ccs.neu.edu/ds2000/ds2000rubric.pdf). The written component accounts for 15% of your overall HW6 score; the programming component for 85%. We’ll give written feedback when appropriate as well; please come to office hours with any questions about grading.
You are permitted two “late day” passes in the semester; each one allows you to submit a homework up to 24 hours late. You may apply both to this homework, or split them between two homeworks. The Hand-in Server automatically applies one late-day token for every 24 hours (or part of 24 hours) you submit past the deadline.
Written Component (15% of this HW)
● Filename: written.txt
This time, you’ll actually have to do the written part after you’ve written your code. Submit a text file with the answers to the following questions:
1. In how many weeks was the entire state of California in some kind of drought condition (D0 or higher; no part of the state categorized as “no drought condition”)?
2. In how many weeks was Wyoming doing pretty well — 95% or more of the state in “no drought condition” or just “abnormally dry”?
3. Which state had the most weeks in “exceptional” drought, i.e., weeks in which more than 10 percent of the state was in D4 condition?
For full credit on the written component, you must submit the code you used to answer these questions (i.e., demonstrate that you didn’t get the answers by looking through the files by hand or googling).
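One way to structure that code is a small counting helper that takes a yes/no test and applies it to every week. This is only a sketch: the name count_weeks is hypothetical, the rows are invented values (not taken from the provided files), and each row is assumed to hold the six percentage columns described under About the Data, in order.

```python
def count_weeks(rows, condition):
    """Count the rows (weeks) for which condition(row) is true."""
    return sum(1 for row in rows if condition(row))


# Invented example rows, NOT real data.
# Assumed column order: [None, D0-D4, D1-D4, D2-D4, D3-D4, D4]
rows = [
    [0.0, 100.0, 80.0, 40.0, 10.0, 0.0],
    [5.0, 95.0, 70.0, 30.0, 5.0, 0.0],
    [0.0, 100.0, 90.0, 50.0, 20.0, 12.0],
]

# Weeks where no part of the state is drought-free (None column is 0).
fully_in_drought = count_weeks(rows, lambda row: row[0] == 0)

# Weeks where more than 10 percent of the state is in D4.
exceptional = count_weeks(rows, lambda row: row[5] > 10)

print(fully_in_drought, exceptional)
```

Passing the test in as a function means the same helper can serve all three questions; only the condition changes.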
Programming Component
● Filename(s): up to you this time. Full credit for readability only if functions and driver are in separate files and well-organized.
For this assignment, we’ve gathered a bunch of data related to drought conditions in the western United States. You’ll read the data in from the provided files and generate a bar chart representing the drought conditions in a given state over a chosen number of weeks (1-50).
Here’s an example of what you’ll produce.
About the Data
We gathered the data used for this assignment from the US Drought Monitor (https://droughtmonitor.unl.edu/). Spend some time reviewing the website and reading about what kind of data is stored and what the drought levels (None; D0 — D4) mean.
We’ve provided 6 comma-separated values (CSV) files, which you’ll download and work with:
● arizona.csv
● california.csv
● colorado.csv
● nevada.csv
● newmexico.csv
● wyoming.csv
Each file contains drought-condition data for that state for the 50 weeks from 1/30/18 to 1/9/19. Each row represents one week’s worth of data.
The first row contains the column headers. The columns are organized as follows:
● Week (date of the starting week).
● None. Percent of the state with no drought condition.
● D0-D4. Percent of the state with drought condition D0 or worse.
● D1-D4. Percent of the state with drought condition D1 or worse.
● D2-D4. Percent of the state with drought condition D2 or worse.
● D3-D4. Percent of the state with drought condition D3 or worse.
● D4. Percent of the state with drought condition D4.
● DSCI. Ignore this one.
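As a rough sketch of reading this layout with Python's built-in csv module: the function name read_drought_rows is hypothetical, the header text and date format in the sample are assumptions about the files, and the sample row is invented. The code relies only on column position, so the exact header wording does not matter.

```python
import csv
import io


def read_drought_rows(file_obj):
    """Return a list of (week, percents) pairs, one per data row.

    percents holds six floats in file order:
    [None, D0-D4, D1-D4, D2-D4, D3-D4, D4]. The DSCI column is dropped.
    """
    reader = csv.reader(file_obj)
    next(reader)  # skip the header row
    rows = []
    for record in reader:
        week = record[0]
        percents = [float(value) for value in record[1:7]]
        rows.append((week, percents))
    return rows


# Invented stand-in for a file, so the sketch runs without the real data.
sample = io.StringIO(
    "Week,None,D0-D4,D1-D4,D2-D4,D3-D4,D4,DSCI\n"
    "2018-01-30,10.0,90.0,60.0,30.0,10.0,0.0,190\n"
)
rows = read_drought_rows(sample)
print(rows)
```

With a real file, the same function could be called as read_drought_rows(f) inside a block such as with open("california.csv", newline="") as f.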
The data is not quite in the form we want for this visualization, so you’ll need to manipulate the drought percentages. Apart from the None column, the percentages are cumulative: each column counts the part of the state at that drought level or worse. We want to render individual, non-cumulative info:
● Percent of the state with drought condition D0, but not D1, D2, D3, or D4.
● Percent of the state with drought condition D1, but not D0, D2, D3, or D4.
● …and so on.
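The conversion above comes down to subtracting each column's next-worse neighbor: D0 alone is (D0-D4) minus (D1-D4), D1 alone is (D1-D4) minus (D2-D4), and D4 stays as it is. A minimal sketch, where the function name to_individual and the sample numbers are invented for illustration:

```python
def to_individual(cumulative):
    """Convert [D0-D4, D1-D4, D2-D4, D3-D4, D4] to [D0, D1, D2, D3, D4]."""
    individual = []
    for i, value in enumerate(cumulative):
        # The next column counts everything strictly worse than level i;
        # the last column (D4) has nothing worse, so subtract zero.
        worse = cumulative[i + 1] if i + 1 < len(cumulative) else 0.0
        individual.append(value - worse)
    return individual


# Invented values: 90% at D0 or worse, 60% at D1 or worse, and so on.
example = to_individual([90.0, 60.0, 30.0, 10.0, 0.0])
print(example)  # [30.0, 30.0, 20.0, 10.0, 0.0]
```

Together with the None column, the individual values for a week should add up to roughly 100. With real data the subtraction may leave tiny floating-point leftovers, so rounding before display may be worthwhile.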
The State + Visualization
The first thing you’ll do is prompt the user for which state they want to see. Present a menu with different state options, in alphabetical order. Once they pick a state, have the user pick a number of weeks to display, between 1 and 50 inclusive. Like this:
Use Python Turtle to draw each row from the CSV file as a bar with width 5 pixels. Stack from the bottom up, with the following colors (these colors all exist with the names given in Python turtle):
● None: gray
● D0: yellow
● D1: wheat
● D2: orange
● D3: red
● D4: maroon
Once you’ve drawn one date, move the turtle over to the right and do the next one. Repeat until all requested weeks are represented.
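The stacking can be sketched as one filled rectangle per drought level, each starting where the previous one ended. This is one possible layout, not the reference solution: the function names and the SCALE value are assumptions, and pen stands for a turtle.Turtle() object created by your driver. The percentages passed in are the individual (non-cumulative) values.

```python
BAR_WIDTH = 5  # pixels, as specified in the handout
SCALE = 2      # assumption: pixels of height per percentage point
COLORS = ["gray", "yellow", "wheat", "orange", "red", "maroon"]


def draw_segment(pen, x, y, height, color):
    """Draw one filled rectangle with its lower-left corner at (x, y)."""
    pen.penup()
    pen.goto(x, y)
    pen.setheading(0)  # face right so the first side is the bottom edge
    pen.color(color)   # sets both the outline and the fill color
    pen.pendown()
    pen.begin_fill()
    for side in (BAR_WIDTH, height, BAR_WIDTH, height):
        pen.forward(side)
        pen.left(90)
    pen.end_fill()


def draw_bar(pen, x, base_y, percents):
    """Stack one week's segments from the bottom up.

    percents is [None, D0, D1, D2, D3, D4], non-cumulative.
    Returns the y coordinate of the top of the bar.
    """
    y = base_y
    for percent, color in zip(percents, COLORS):
        height = percent * SCALE
        if height > 0:  # skip empty levels
            draw_segment(pen, x, y, height, color)
        y += height
    return y
```

A driver would import turtle, create pen = turtle.Turtle(), and call draw_bar once per week, increasing x by BAR_WIDTH each time so the bars sit side by side.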
Requirements: For full credit, your program must:
● Give the user a menu of possible states to draw. If they enter an invalid option, re-prompt them until they give you a correct one. Same thing with the number of weeks to draw.
● Render the bar charts using Turtle. Your sizes/setup don’t need to be identical to our example, but they need to convey the same information.
● Draw one bar per week, starting at the beginning (January 2018) and going week-by-week until you’ve provided the amount of data the user requested.
● Define your functions in a separate file from your driver.
● Submit any functions/files you used to answer the questions in the written component.
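The re-prompting requirement can be handled with a loop that only returns once the input is valid. A minimal sketch, assuming a numbered menu (the handout's own menu format may differ); the function names and the read parameter are hypothetical, with read defaulting to input so the logic can be exercised without a keyboard.

```python
STATES = ["Arizona", "California", "Colorado", "Nevada", "New Mexico", "Wyoming"]


def ask_int(prompt, low, high, read=input):
    """Keep asking until the user enters a whole number in [low, high]."""
    while True:
        answer = read(prompt).strip()
        try:
            value = int(answer)
        except ValueError:
            value = None  # not a number at all
        if value is not None and low <= value <= high:
            return value
        print(f"Please enter a whole number from {low} to {high}.")


def filename_for(state):
    """Map a menu name to its data file, e.g. 'New Mexico' -> 'newmexico.csv'."""
    return state.lower().replace(" ", "") + ".csv"


def choose_state_and_weeks(read=input):
    """Show the menu, then return (state name, number of weeks)."""
    for number, state in enumerate(STATES, start=1):
        print(f"{number}. {state}")
    choice = ask_int("Which state? ", 1, len(STATES), read)
    weeks = ask_int("How many weeks (1-50)? ", 1, 50, read)
    return STATES[choice - 1], weeks
```

Because both prompts share ask_int, the validation rule lives in one place, and the driver only needs to call choose_state_and_weeks().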
AMAZING points: These final two points may be awarded if you’ve completed the rest of the assignment perfectly and blown us away with…
● Add to your written.txt file one or two questions of your own, and answer them. These can be any questions you like, as long as they are (1) relevant to the data at hand, and (2) meaningful.