CSE 231
Spring 2021
Programming Project 05
This assignment is worth 40 points (4.0% of the course grade) and must be completed and turned in before 11:59 on Monday, March 1, 2021. This assignment will give you more experience with strings and functions.
Assignment Overview
An important question about the pandemic is the death rate, which is often calculated as deaths per million of population. That rate is one metric for how well a country has responded to the pandemic. Once you have the ability to read files in Python, you can examine raw data to answer such questions.
Program Specifications
In this project, you will read a file of data with total Covid deaths and the population of countries. You will then calculate the total deaths per million for countries. Assigning deaths to Covid is not consistently handled among countries and often not consistently within countries. However, there is more consistency with first-world countries so we will extract data for the G20 countries. (There isn't anything special about the G20 except that it makes the assignment more interesting.)
Since we are dealing with real data there will be complications. For example, there are columns of data that we aren't interested in, such as the deaths in the last seven days, and there are rows of data that we aren't interested in, such as non-G20 countries. There is one header row that describes the columns; it is not data, so we need to ignore it. Also, some data is in the thousands with commas, and Python cannot directly convert a number with commas into a Python number such as an int or float. You will be asked to write a function to handle commas.
Deaths-per-Million high-level algorithm
Your high-level algorithm will be:
1. Open an input file for reading
2. Open a different file for writing
3. Write header information to the output file.
4. Loop through the input file, reading it line by line
a. Call a function to process each line
i. Call a function to handle commas
b. Calculate deaths/million-population
c. If the data is for a G20 country, display it and write it to the output file
5. Close both files
6. Display which countries have deaths-per-million worse than the US
File Specification
The input file named data.txt has one header line and five fixed-field columns.
Column 0 is a country name (in a 25-character field), columns 1, 2, and 3 are integers, and column 4 is a float. The numbers are in 10-character fields. A file such as data.txt with fixed-field columns allows you to use string slicing to extract individual values.
Your program must also meet the following specifications:
1. You must have and use at least these four functions; more are fine. A proj05.py file with function stubs is provided.
a. def open_file(ch) -> file_pointer
i. If ch is 'w', open a file for writing; otherwise open a file for reading. Repeatedly prompt for a file name until a file is successfully opened. Use try-except. Usually we would use except FileNotFoundError, but opening a file for writing can fail for other reasons as well, so catch the more general file error with except IOError.
If ch is 'w', the prompt should be
'Enter a file name for writing: '
otherwise the prompt should be
'Enter a file name for reading: '
ii. Parameters: ch
iii. Returns: file_pointer
iv. Display: prompt and error message as appropriate
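One way the open_file specification might be sketched is shown below. This is an illustration under assumptions, not the reference solution: the error text 'Error opening file.' is taken from the sample interaction, and the local names prompt, mode, and file_name are hypothetical.

```python
def open_file(ch):
    """Prompt until a file opens; 'w' means writing, anything else means reading."""
    if ch == 'w':
        prompt = 'Enter a file name for writing: '
        mode = 'w'
    else:
        prompt = 'Enter a file name for reading: '
        mode = 'r'

    while True:
        file_name = input(prompt)
        try:
            # open() raises an IOError subclass if the file cannot be opened
            return open(file_name, mode)
        except IOError:
            print('Error opening file.')
```

A call such as fp = open_file('r') keeps asking until a readable file name is entered; the caller is responsible for closing the returned file.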
b. def handle_commas(s,T) -> int or float or None
i. The parameters are s, a string, and T, a string. The expected value of T is the word "int" or "float"; any other value returns None. If the value of T is "int", the string s will be converted to an int and that int value will be returned; similarly for "float". In both cases, you have to remove the commas from s before converting to int or float. If the value of s cannot be converted to an int or float, None will be returned (hint: use try-except).
ii. Parameters: str, str
iii. Returns: int or float or None
iv. Display: nothing
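A minimal sketch of handle_commas that follows the description above is given here. It is one possible reading of the specification rather than the graded solution; the test values in the loop are the ones listed in the sample function test.

```python
def handle_commas(s, T):
    """Remove commas from s, then convert it to the type named by T."""
    s = s.replace(',', '')
    try:
        if T == 'int':
            return int(s)
        if T == 'float':
            return float(s)
    except ValueError:
        # s could not be converted (for example 'aaa', or '1234.56' as an int)
        return None
    return None  # T was neither 'int' nor 'float'


# Inputs taken from the "Function Test handle_commas()" sample
for s, T in [('5', 'int'), ('1,234', 'int'), ('1,234.56', 'float'),
             ('5.3', 'xxx'), ('aaa', 'int'), ('1,234.56', 'int')]:
    print(s, T, handle_commas(s, T))
```

Because int() and float() ignore surrounding whitespace, a padded field such as '     3,596' can be passed in without stripping it first.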
c. def process_line(line) -> str, int, float
i. The parameter is a string which is a line from the file. Extract the country
(string in a 25-character field), deaths (int in a 10-character field), and population (float in a 10-character field from the END of the line). Strip leading and trailing spaces from the country name. Return the country, deaths, and population. Use the function handle_commas to process the numbers for deaths and population.
ii. Parameters: str
iii. Returns: str, int, float
iv. Display: nothing
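The slicing described for process_line might look like the sketch below. Several details are assumptions: the sample line is built here with right-aligned 10-character fields (the real alignment in data.txt may differ), the trailing newline is removed before taking the last 10 characters, and the compact handle_commas is only a stand-in so the block runs on its own.

```python
def handle_commas(s, T):
    """Compact stand-in for the function specified in item b."""
    converters = {'int': int, 'float': float}
    try:
        return converters[T](s.replace(',', ''))
    except (KeyError, ValueError):
        return None


def process_line(line):
    """Return (country, deaths, population) from one fixed-field line."""
    country = line[:25].strip()                    # 25-character name field
    deaths = handle_commas(line[25:35], 'int')     # first 10-character number
    # Assumption: drop the newline so the last 10 characters are the last field
    population = handle_commas(line.rstrip('\n')[-10:], 'float')
    return country, deaths, population


# Hypothetical line using the China values from the sample function test
sample_line = "{:<25s}{:>10s}{:>10s}{:>10s}{:>10s}\n".format(
    "China", "4,746", "3", "0", "1,397.72")
print(process_line(sample_line))
```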
d. def main() Your main algorithm will be as follows calling the above functions.
i. Call open_file to open an input file for reading
ii. Call open_file to open a different file for writing
iii. Write header information (2 lines) to the output file and to the console.
– The header information in the first line is: 'Country','Deaths','Population','Death Rate'
– The header information in the second line is: '','','Millions','per Million'
For both lines, use the following formatting for the headers (also provided in the strings.txt file):
"{:<20s}{:>10s}{:>14s}{:>14s}"
iv. Loop through the input file, reading it line by line
1. Call a function to process each line
a. Call a function to handle commas
2. Calculate deaths/million-population
3. If the data is for a G20 country, display it and write it to the output file. Use the following string formatting for the numbers (also provided in the strings.txt file):
"{:<20s}{:>10,d}{:>14,.2f}{:>14,.2f}"
v. Close both files
vi. Display which countries have deaths-per-million worse than the US
To determine which countries are worse than the US we provide the deaths-per-million for the US (1277.10). Note that the US rate is rounded to 2 digits after the decimal point. Start with an empty string before the loop and build a comma-separated string of the countries that you will display after the loop. Note that this item is not written to the output file. It is only written to the console.
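The rate calculation, row formatting, and string building in steps iii through vi could be sketched roughly as follows. The records list is a hypothetical stand-in for values returned by process_line (the numbers come from the sample output), and rounding the computed rate to two decimals before comparing is an assumption based on the note that the US rate is rounded. Without that rounding, the USA row itself (419,196 / 328.24, about 1277.102) would compare as higher than 1277.10.

```python
US_RATE = 1277.10
HEADER_FMT = "{:<20s}{:>10s}{:>14s}{:>14s}"
ROW_FMT = "{:<20s}{:>10,d}{:>14,.2f}{:>14,.2f}"

# Hypothetical (country, deaths, population in millions) values
records = [("United Kingdom", 98339, 66.83),
           ("Italy", 85881, 60.30),
           ("USA", 419196, 328.24),
           ("Mexico", 150273, 127.58)]

print(HEADER_FMT.format('Country', 'Deaths', 'Population', 'Death Rate'))
print(HEADER_FMT.format('', '', 'Millions', 'per Million'))

worse = ''  # comma-separated names, built up inside the loop
for country, deaths, population in records:
    rate = deaths / population  # population is already in millions
    print(ROW_FMT.format(country, deaths, population, rate))
    # Assumption: compare the rate rounded to 2 decimals, like US_RATE
    if round(rate, 2) > US_RATE:
        if worse:
            worse += ', '
        worse += country

print("Countries with higher death rates than USA per million.")
print(worse)
```

In the full program the same formatted strings would also be written to the output file, while the final two lines go only to the console.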
Deliverables
The deliverable for this assignment is the following file:
proj05.py (your source code solution)
Be sure to use the specified file name and to submit it for grading via Mimir before the project deadline.
Notes and Hints:
1. To clarify the project specifications, sample output is appended to the end of this document.
2. Items 1-9 of the Coding Standard will be enforced for this project. Note the change to include more items.
3. You can test functions separately, which can be a huge advantage in developing correct code faster! If you cannot figure out how to do that, ask your TA for guidance.
4. We provide a string with the names of G20 countries. Hint: use the string operator in to check whether a country is in the G20.
G20 = "Argentina, Australia, Brazil, Canada, China, France, Germany, India, Indonesia, Italy, Japan, South Korea, Mexico, Russia, Saudi Arabia, South Africa, Turkey, United Kingdom, USA, European Union"
5. We also provide a constant for the US death rate per million. Hint: use this constant to look for the countries that have death rates higher than the US. Note that the constant is rounded to 2 digits after the decimal point.
US_RATE = 1277.10
6. Do not hard code your solutions; the result is a zero. Hard coding means that your program is written for the specific tests rather than being a generic solution.
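The membership test from hint 4 can be illustrated with a short sketch. The G20 string is the one provided above, split across source lines only for readability; the probe names and the stricter split-based check are illustrative additions, not part of the assignment.

```python
G20 = ("Argentina, Australia, Brazil, Canada, China, France, "
       "Germany, India, Indonesia, Italy, Japan, South Korea, Mexico, "
       "Russia, Saudi Arabia, South Africa, Turkey, United Kingdom, "
       "USA, European Union")

# The in operator on strings is a substring test
for country in ("Italy", "Tunisia", "Korea"):
    print(country, country in G20)

# A stricter, whole-name check (an optional variation)
names = G20.split(", ")
print("Korea" in names, "South Korea" in names)
```

Since in looks for substrings, a partial name such as "Korea" also matches; that is harmless when the names come from the data file, but it is worth knowing when testing by hand.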
Sample Interaction:
Function Test handle_commas()
s,T: 5 int
Instructor: 5
Student : 5
--------------------
s,T: 5.3 float
Instructor: 5.3
Student : 5.3
--------------------
s,T: 1,234 int
Instructor: 1234
Student : 1234
--------------------
s,T: 1,234.56 float
Instructor: 1234.56
Student : 1234.56
--------------------
s,T: 5.3 xxx
Instructor: None
Student : None
--------------------
s,T: aaa int
Instructor: None
Student : None
--------------------
s,T: 1,234.56 int
Instructor: None
Student : None
Function Test process_line()
line: Tunisia 3,596 336 35 11.69
Instructor: Tunisia 3596 11.69
Student : Tunisia 3596 11.69
--------------------
line: China 4,746 3 0 1,397.72
Instructor: China 4746 1397.72
Student : China 4746 1397.72
--------------------
line: India 140,958 2,836 385 1,366.42
Instructor: India 140958 1366.42
Student : India 140958 1366.42
Test Case 1
Enter a file name for reading: data.txt
Enter a file name for writing: outfile.txt
Country                 Deaths    Population    Death Rate
                                    Millions   per Million
United Kingdom          98,339         66.83      1,471.48
Italy                   85,881         60.30      1,424.23
USA                    419,196        328.24      1,277.10
Mexico                 150,273        127.58      1,177.87
France                  72,590         67.06      1,082.46
Argentina               47,034         44.94      1,046.60
Brazil                 217,664        211.05      1,031.34
South Africa            41,117         58.56        702.13
Germany                 53,127         83.13        639.08
Canada                  18,868         37.59        501.94
Russia                  68,841        144.37        476.84
Turkey                  25,210         83.43        302.17
Saudi Arabia             6,355         34.27        185.44
India                  153,587      1,366.42        112.40
Indonesia               28,132        270.63        103.95
Japan                    5,193        126.26         41.13
Australia                  909         25.36         35.84
South Korea              1,371         51.71         26.51
China                    4,807      1,397.72          3.44
Countries with higher death rates than USA per million.
United Kingdom, Italy
Test Case 2
Enter a file name for reading: xxx
Error opening file.
Enter a file name for reading: yyyy.txt
Error opening file.
Enter a file name for reading: data.txt
Enter a file name for writing: xxx/xxx
Error opening file.
Enter a file name for writing: outfile.txt
Country                 Deaths    Population    Death Rate
                                    Millions   per Million
United Kingdom          98,339         66.83      1,471.48
Italy                   85,881         60.30      1,424.23
USA                    419,196        328.24      1,277.10
Mexico                 150,273        127.58      1,177.87
France                  72,590         67.06      1,082.46
Argentina               47,034         44.94      1,046.60
Brazil                 217,664        211.05      1,031.34
South Africa            41,117         58.56        702.13
Germany                 53,127         83.13        639.08
Canada                  18,868         37.59        501.94
Russia                  68,841        144.37        476.84
Turkey                  25,210         83.43        302.17
Saudi Arabia             6,355         34.27        185.44
India                  153,587      1,366.42        112.40
Indonesia               28,132        270.63        103.95
Japan                    5,193        126.26         41.13
Australia                  909         25.36         35.84
South Korea              1,371         51.71         26.51
China                    4,807      1,397.72          3.44
Countries with higher death rates than USA per million.
United Kingdom, Italy
Scoring Rubric
Computer Project #05 Scoring Summary
General Requirements
______ 5 pts Coding Standard 1-9
(descriptive comments, function header, etc…)
Implementation:
__0__ (7 pts) open_file (manual grading)
__0__ (7 pts) Function Test handle_commas
__0__ (7 pts) Function Test process_line
__0__ (5 pts) Test 1
__0__ (5 pts) Test 2
__0__ (4 pts) Test 3 (same as Test 1 but checking if output file is correct)