Assignment 1: Rent-a-Bike
Due: Saturday November 10th before 11:00pm
Partnerships:
You may work alone or in a group of two. Declare your partnership on MarkUs BEFORE you begin coding.
Once a partnership is set up, I will not change groupings (barring extraordinary circumstances), so make sure it is someone you want to work with.
It is often best to work with your partner in the same room, rather than splitting up the assignment and working separately. When partners work separately, neither knows what is happening in the other parts of the assignment, which can lead to bugs when the code is put together (and a much longer time to finish!). Try working together in the same room and using the driver/navigator method we use in lab. Previous experience has shown that this leads to finishing the assignment more quickly, since both people communicate their thoughts as the code is written and can help each other out when needed.
Advice:
Start early! This assignment will take a considerable amount of time to complete. Under no circumstances should you leave this to the last few days before it is due.
It is recommended that you print out this handout and highlight key points. Make sure you understand what each part of the assignment is asking you to do, and ask questions if you are stuck on something. Don’t forget that we have a discussion board!
Remember that the Teaching Labs in Bahen are open 24/7 for students taking CSC courses and you can work on the assignment in any of the lab rooms (if they’re not taken up by a class).
Before you start: A warning about academic offenses
The CS Department has software that detects similarities between different submissions. It also checks for similarities to code found online and to submissions of similar assignments from previous terms and courses.
As stated in the course information sheet, you must hand in your own work. You should only show and discuss your code with your partner (if you choose to have one), the CSC120 TAs, and the instructor. Do not submit code that is not yours or that you found somewhere. Do not show your solution to other students in the course, and do not post your code on the discussion board.
Please see the syllabus for information about academic offenses – they are taken very seriously by the University.
Introduction
Reminder: If you’re working with a partner, declare your partnership on MarkUs before you start coding, and make sure the person who is invited to form a team accepts the invitation.
Toronto’s bike share network debuted in 2011, offering rental bikes to Torontonians and visitors in the downtown core. This network consists of hundreds of docking stations scattered around downtown. Bikes can be rented from any docking station and returned to any docking station in the city. In this assignment, you will write several functions to help manage and track bike rentals across this network. Using real data from Toronto’s bike share system, your functions will simulate bike rentals and returns as well as keep track of the current state of the network and even provide directions to riders.
The data that you will work with is provided by the Toronto bike share network. The data contains information about the docking stations, such as the location of the station and how many bikes are currently available. More information about the data provided and where it comes from is given later in this handout.
The purpose of this assignment is to give you practice using the programming concepts that you have seen in the course so far, including (but not limited to) strings, lists and list methods, and loops.
This handout explains the problem you are to solve, and the tasks you need to complete for the assignment. Please read it carefully.
Files to Download
Please download the Assignment 1 Data Files and extract the zip archive. A description of each of the files that we have provided is given in the paragraphs below:
Starter code: bikes.py
The bikes.py file contains some constants, and a couple of complete helper functions that you may use. You must not modify the provided helper functions.
The bikes.py file also contains function headers and docstrings for the A1 functions to which you are required to add function bodies. For each function, read the header and docstring (especially the examples) to learn what task the function performs. Doing so may help you to determine what you need to do for each required function. To gain a better understanding of each function, you may want to add another example to the docstring.
Data: stations.csv
The stations.csv file contains bike share data in comma-separated values (CSV) format. You must not modify this file.
Checker: checker.py
We have provided a checker program (checker.py) that tests two things:
- whether your functions have the correct parameter and return types, and
- whether your code follows the Python and CSC120 style guidelines.
The checker program does not test the correctness of your functions, so you must do that yourself.
More details on the checker are found later in this handout.
The Data
For this assignment, you will use data from a Comma Separated Value (CSV) file named stations.csv. Each row of this file contains the following information about a single docking station:
- station ID: the unique identification (ID) number of the station
- name: the name of the station (not necessarily unique)
- latitude: the latitude of the station location
- longitude: the longitude of the station location
- capacity: the total number of bike docks (empty or with bike) at the station
- bikes available: the number of bikes currently available to rent at the station
- docks available: the number of docks at the station that currently do not have a bike attached to them
- is renting: whether or not a station is currently allowing bike rentals
- is returning: whether or not a station is currently allowing bike returns
Note: While the sum of the number of bikes available at a station and the number of docks available at a station will usually equal the station’s capacity, this need not be the case. When a bike or a dock is broken, the sum of the two availability numbers will not match the capacity.
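The note above can be checked mechanically. The sketch below is only an illustration: the index names CAPACITY, BIKES_AVAILABLE and DOCKS_AVAILABLE are hypothetical (the constants in the starter code may be named differently), the rows are assumed to follow the field order listed above with the numeric values already stored as ints, and the second row is invented to show a mismatch.

```python
# Hypothetical column positions, following the field order listed above.
CAPACITY = 4
BIKES_AVAILABLE = 5
DOCKS_AVAILABLE = 6


def has_mismatch(station: list) -> bool:
    """Return True if the bikes and docks available at station do not
    add up to its capacity, which suggests a broken bike or dock.
    """
    total = station[BIKES_AVAILABLE] + station[DOCKS_AVAILABLE]
    return total != station[CAPACITY]


# The first row uses values from the handout's sample; the second is made up.
consistent = [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954,
              31, 20, 11, True, True]
mismatched = [9999, 'Made-up Station', 43.0, -79.0, 10, 4, 5, True, True]
```

With these rows, has_mismatch(consistent) is False because 20 + 11 equals 31, while has_mismatch(mismatched) is True because 4 + 5 does not equal 10.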
We have provided a function named csv_to_list, which reads a CSV file and returns its contents as a List[List[str]]. As you develop your program, you can use the csv_to_list function to produce a larger data set for testing your code. See the main block at the end of bikes.py for an example.
Your Tasks
Imagine that it is your job to manage Toronto’s bike share system. As the manager, you need to know everything about the system. But, there are hundreds of docking stations, which is way too many to keep track of in your head. To make your life easier, you will write Python functions to help you manage the system.
Your functions will fall into three categories: functions for data cleaning, functions for data queries, and functions for data modification.
Data cleaning
We provided a function named csv_to_list that reads data from a CSV file and returns it in a List[List[str]]. Here is a sample of the type of list returned:
[['7000', 'Ft. York / Capreol Crt.', '43.639832', '-79.395954', '31', '20', '11', 'True', 'True'],
['7001', 'Lower Jarvis St / The Esplanade', '43.647992', '-79.370907', '15', '5', '10', 'True', 'True']]
Notice that all of the data in the inner lists are represented as strings. You are to write the function clean_data, which should make modifications to the list according to the following rules:
- If and only if a string represents a whole number (ex: '3'), convert it to an int
- If and only if a string represents a number that is not a whole number (ex: '3.14'), convert it to a float
- If and only if a string is either 'True' or 'False', convert it to the appropriate bool (True or False)
- If and only if a string is 'null' or the empty string, convert it to None
- Otherwise, leave it as a str
After applying the clean_data function to the example list, it should look like:
[[7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954, 31, 20, 11, True, True],
[7001, 'Lower Jarvis St / The Esplanade', 43.647992, -79.370907, 15, 5, 10, True, True]]
Before you write the clean_data function body, please note:
- you must not use the built-in function eval, and
- this function is one of the more challenging functions in A1, so we suggest that you don’t start with it.
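To make the conversion rules above concrete, here is one possible way to apply them to a single string without eval. This is a sketch under stated assumptions, not the required solution: the names convert_value and clean_in_place are hypothetical, only a leading minus sign is treated as part of a number, and a string such as '3.0' is treated as a float because it contains a decimal point.

```python
def convert_value(text: str):
    """Return text converted according to the rules above.

    Assumptions of this sketch: numbers may start with '-', and any
    string with exactly one '.' among its digits becomes a float.
    """
    if text == 'True':
        return True
    if text == 'False':
        return False
    if text == 'null' or text == '':
        return None

    # Ignore one leading minus sign while checking the characters.
    unsigned = text
    if text.startswith('-'):
        unsigned = text[1:]

    if unsigned.isdecimal():
        return int(text)
    if unsigned.count('.') == 1 and unsigned.replace('.', '').isdecimal():
        return float(text)
    return text


def clean_in_place(data: list) -> None:
    """Replace every string in the inner lists of data with its
    converted value. The list is modified; nothing is returned.
    """
    for row in data:
        for i in range(len(row)):
            row[i] = convert_value(row[i])
```

Note that a station name such as 'Ft. York / Capreol Crt.' contains periods but is left as a str, because removing the periods does not leave only digits.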
Function name (Parameter types) -> Return type |
Full Description |
---|---|
clean_data (List[list]) -> None |
The parameter represents a list of list of strings. The list could have the format of stations data, but is not required to. See the starter code docstring for some examples.
Modify the parameter so that strings that represent whole numbers are converted to |
Data queries
Once the data has been cleaned, you will write the following functions that you can use to extract information from the data. All the examples shown below assume that you are calling the function on the cleaned example list shown above.
Function name (Parameter types) -> Return type |
Full Description |
---|---|
get_station_info (int, List[list]) -> list |
The first parameter represents a station ID number and the second parameter represents cleaned stations data.
Return a list containing the name, the number of bikes available, and the number of docks available (in this order), for the station whose station ID is the first parameter. Precondition: the station ID will appear in the cleaned stations data. |
get_total (int, List[list]) -> int |
The first parameter represents an index and the second parameter represents cleaned stations data.
Return the sum of the values at the given index in each inner list of the cleaned stations data. Precondition: the items in the list at the given index position are integers. |
get_station_with_max_bikes (List[list]) -> int |
The parameter represents cleaned stations data.
Return the station ID of the station that has the most bikes available. If there is a tie for the most available, return the station ID that appears first in the stations list. Precondition: the cleaned stations data will contain at least one station. |
get_stations_with_n_docks (int, List[list]) -> List[int] |
The first parameter represents a required minimum number of available docks and the second parameter represents cleaned stations data.
Return a list of the station IDs of stations that have at least the required minimum number of available docks. The station IDs should appear in the same order as in the given stations data list. Precondition: the required minimum number of available docks will be >= 0. |
get_direction (int, int, List[list]) -> str |
The first two parameters represent station ID numbers and the third represents cleaned stations data.
Return a string that contains the direction to travel to get from the first station to the second. The string should contain one of For example, if the first ID is 7000 and the second is 7001, the function should return Precondition: the two station ID numbers will appear in the cleaned stations data. |
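Two of the ideas in the table above, summing a column and breaking ties in favour of the earliest station, can be sketched as follows. The function names total_at and first_id_with_most_bikes and the index names ID and BIKES_AVAILABLE are hypothetical, chosen for this illustration only; the starter code defines its own headers and constants.

```python
# Hypothetical column positions in a cleaned station row.
ID = 0
BIKES_AVAILABLE = 5

SAMPLE = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954,
     31, 20, 11, True, True],
    [7001, 'Lower Jarvis St / The Esplanade', 43.647992, -79.370907,
     15, 5, 10, True, True],
]


def total_at(index: int, stations: list) -> int:
    """Return the sum of the values at index in each inner list."""
    total = 0
    for station in stations:
        total = total + station[index]
    return total


def first_id_with_most_bikes(stations: list) -> int:
    """Return the ID of the station with the most bikes available.

    Using > rather than >= keeps the earliest station when there is a tie.
    """
    best = stations[0]
    for station in stations:
        if station[BIKES_AVAILABLE] > best[BIKES_AVAILABLE]:
            best = station
    return best[ID]
```

On SAMPLE, total_at(BIKES_AVAILABLE, SAMPLE) adds 20 and 5, and first_id_with_most_bikes(SAMPLE) picks station 7000.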
Data modification
The functions that we have described up to this point have allowed us to clean our data and extract specific information from it. Now we will describe functions that let us change the data.
Function name (Parameter types) -> Return type |
Full Description |
---|---|
rent_bike (int, List[list]) -> bool |
The first parameter represents a station ID and the second represents cleaned stations data.
A bike can be rented from a station if and only if:
If the conditions above are met, this function successfully rents a single bike from the station. A successful bike rental requires updating the Precondition: the station ID will appear in the cleaned stations data. |
return_bike (int, List[list]) -> bool |
The first parameter represents a station ID and the second represents cleaned stations data.
A bike can be returned to a station if and only if:
If the conditions above are met, this function successfully returns a single bike to the station. A successful bike return requires updating the Precondition: the station ID will appear in the cleaned stations data. |
balance_all_bikes (List[list]) -> int |
The parameter represents cleaned stations data.
Modify the stations data so that the percentage of bikes available at each station is as close as possible to the overall percentage of bikes available across all stations. To calculate the overall percentage of bikes available across all stations, include all stations, regardless of whether they are currently renting or returning. Return the difference between the number of bikes rented and the number returned after completing this modification. If more bikes were returned than were rented, the function should produce a negative number. (Note: this means that this function could add or remove bikes from the overall bike network.)
To illustrate this, let’s consider our cleaned example list from before. This list contains two stations, one that is 65% full (20 bikes available out of a 31 dock capacity) and one that is 33% full (5 bikes available out of a 15 dock capacity). We want both of these docking stations to have a percentage available that is as close as possible to the percentage available across all stations. In our example, 20+5 bikes out of 31+15 docks gives a goal percentage of 54%. For each station, based on its capacity, we calculate how close we can get to the goal percentage. With the cleaned example list, we are aiming for 17 bikes in the first station (54% of 31 is 17, after rounding to a whole number of bikes) and 8 in the other station (54% of 15 is 8, after rounding to a whole number of bikes).
Now, for each station, we rent and/or return enough bikes to reach the target. As with the other data modification functions, you should remove a bike from a station if and only if the station is renting and there is a bike available to rent, and you should return a bike if and only if the station is allowing returns and there is a dock available. Keep track of the overall number of bikes rented and returned.
This function is to return the difference (a positive number if more bikes were rented than returned, 0 if the same number were rented as returned, and a negative number if fewer bikes were rented than returned). For the example above, 3 bikes were rented from one station and 3 were returned to the other, so the function returns 0. This function must be able to redistribute the bikes no matter how many stations are in the cleaned stations data. |
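The arithmetic in the balance_all_bikes example can be reproduced with a short calculation. This sketch only computes the target number of bikes per station; it does not rent or return anything and ignores whether a station is renting or returning. The name target_bike_counts and the index names are hypothetical, and the use of the built-in round is an assumption about how rounding to a whole number of bikes is done.

```python
# Hypothetical column positions in a cleaned station row.
CAPACITY = 4
BIKES_AVAILABLE = 5


def target_bike_counts(stations: list) -> list:
    """Return the target number of bikes for each station, in order."""
    total_bikes = 0
    total_capacity = 0
    for station in stations:
        total_bikes = total_bikes + station[BIKES_AVAILABLE]
        total_capacity = total_capacity + station[CAPACITY]

    # Overall fraction of bikes available, e.g. 25 / 46 in the example.
    goal = total_bikes / total_capacity

    targets = []
    for station in stations:
        targets.append(round(goal * station[CAPACITY]))
    return targets


example = [
    [7000, 'Ft. York / Capreol Crt.', 43.639832, -79.395954,
     31, 20, 11, True, True],
    [7001, 'Lower Jarvis St / The Esplanade', 43.647992, -79.370907,
     15, 5, 10, True, True],
]
```

For the example list, the goal is 25 / 46 (about 54%), so the targets are 17 and 8: three bikes leave the first station (20 down to 17) and three arrive at the second (5 up to 8), for a difference of 0.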
CSC120 A1 Checker
We are providing a checker module (checker.py) that tests two things:
- whether your functions have the correct parameter and return types, and
- whether your code follows the Python and CSC120 style guidelines.
To run the checker, open checker.py and run it. Note: the checker file should be in the same directory as your bikes.py, as provided in the starter code zip file.
If the checker passes:
- Your function parameters and return types match the assignment specification. This does not mean that your code works correctly in all situations. We will run additional tests on your code once you hand it in, so be sure to thoroughly test your code yourself before submitting.
- Your code follows the style guidelines.
If the checker fails, carefully read the message provided:
- It may have failed because one or more of your parameter or return types does not match the assignment specification, or because a function is misnamed. Read the error message to identify the problematic function, review the function specification in the handout, and fix your code.
- It may have failed because your code did not follow the style guidelines. Review the error description(s) and fix the code style. Please see the PyTA documentation for more information about errors.
Make sure the checker passes before submitting.
Testing your Code
It is strongly recommended that you test each function as you write it. As usual, follow the Function Design Recipe (we’ve done the first couple of steps for you). Once you’ve implemented a function, run it on the examples in the docstring.
Here are a few tips:
- Be careful that you test the right thing. Some functions return values; others modify the data in-place. Be clear on what the functions are doing before determining whether your tests work.
- Can you think of any special cases for your functions? Test each function carefully.
- Once you are happy with the behaviour of a function, move to the next function, implement it, and test it.
Remember to run the checker!
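The first tip above, about testing the right thing, is easiest to see with two tiny functions that are not part of the assignment. Both names are hypothetical: doubled returns a new list, while double_in_place changes its argument and returns None, so each one needs a different kind of check.

```python
def doubled(values: list) -> list:
    """Return a new list with each value doubled; values is unchanged."""
    result = []
    for value in values:
        result.append(value * 2)
    return result


def double_in_place(values: list) -> None:
    """Double each item of values directly. Nothing is returned."""
    for i in range(len(values)):
        values[i] = values[i] * 2


# Testing a function that returns a value: check the return value.
original = [1, 2, 3]
returned = doubled(original)

# Testing a function that modifies its argument: call it first,
# then check the argument itself rather than the return value.
changed = [1, 2, 3]
nothing = double_in_place(changed)
```

Here returned is [2, 4, 6] and original is still [1, 2, 3], while nothing is None and changed has become [2, 4, 6].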
Additional requirements
- Do not add statements that call print, input, or open, or use an import statement.
- Do not use any break or continue statements. We are imposing this restriction (and we have not even taught you these statements) because they are very easy to abuse, resulting in terrible code.
- Do not modify or add to the import statements provided in the starter code.
Marking
These are the aspects of your work that will be marked for Assignment 1:
- Correctness (80%): Your functions should perform as specified. Correctness, as measured by our tests, will count for the largest single portion of your marks. Once your assignment is submitted, we will run additional tests, not provided in the checker. Passing the checker does not mean that your code will earn full marks for correctness.
- Coding style (20%): Make sure that you follow Python style guidelines that we have introduced and the Python coding conventions that we have been using throughout the semester. Although we don’t provide an exhaustive list of style rules, the checker tests for style are complete, so if your code passes the checker, then it will earn full marks for coding style with one exception: docstrings may be evaluated separately. For each occurrence of a PyTA error, a 1 mark (out of 20) deduction will be applied. For example, if a C0301 (line-too-long) error occurs 3 times, then 3 marks will be deducted.
What to Hand In
The very last thing you do before submitting should be to run the checker program one last time. Otherwise, you could make a small error in your final changes before submitting that causes your code to receive zero for correctness.
Submit bikes.py on MarkUs by following the instructions on the course website. Remember that spelling of filenames, including case, counts: your file must be named exactly as above.
Getting data from the web (Optional)
In this assignment, you worked with real world data. In this section, we’ll tell you a bit more about this dataset. Note, this part is optional and not for any marks, so read it only to satisfy your curiosity!
The City of Toronto Bike Share website provides data about stations in a file format called JSON. We used two of the City’s bike share JSON files, one with station information and another with station status.
Although we could have manually downloaded the files from the bike share website, we wrote a program to do that for us. We are computer scientists after all! Our program, download_data.py, reads the data from the bike share website and saves it to two JSON files (current_info.json and current_status.json) on our local computer. Once we have those two files, we run a second program, prepare_data.py, to convert the data from the JSON files to CSV format, remove some fields, and merge data about the same station into a single row.
Although the download_data.py and prepare_data.py files may be a bit challenging to read and understand at the moment, you should be able to write this type of code yourself by the end of the course. In the meantime, you can use our programs to get the most current bike share data.
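For the curious, the merge step described above (combining station information and station status into one CSV row per station) might look roughly like the sketch below. It is not the contents of prepare_data.py: the JSON layout, the field names, and the function name merge_to_csv are all assumptions made for this illustration, and the data is embedded as text so that nothing is downloaded. It also uses import statements, which are fine here but not allowed in your bikes.py.

```python
import csv
import io
import json

# Hypothetical JSON text standing in for the two downloaded files.
INFO_JSON = '''
{"data": {"stations": [
  {"station_id": "7000", "name": "Ft. York / Capreol Crt.",
   "lat": 43.639832, "lon": -79.395954, "capacity": 31}
]}}
'''
STATUS_JSON = '''
{"data": {"stations": [
  {"station_id": "7000", "num_bikes_available": 20,
   "num_docks_available": 11, "is_renting": 1, "is_returning": 1}
]}}
'''


def merge_to_csv(info_text: str, status_text: str) -> str:
    """Return CSV text with one row per station, combining both inputs."""
    info = json.loads(info_text)['data']['stations']
    status = json.loads(status_text)['data']['stations']

    # Index the status records by station ID so they can be looked up.
    status_by_id = {}
    for record in status:
        status_by_id[record['station_id']] = record

    output = io.StringIO()
    writer = csv.writer(output, lineterminator='\n')
    for station in info:
        current = status_by_id[station['station_id']]
        writer.writerow([
            station['station_id'], station['name'],
            station['lat'], station['lon'], station['capacity'],
            current['num_bikes_available'],
            current['num_docks_available'],
            bool(current['is_renting']), bool(current['is_returning']),
        ])
    return output.getvalue()
```

Calling merge_to_csv(INFO_JSON, STATUS_JSON) produces a single line in the same field order as the rows described in The Data section.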