CMPT 120, Fall 2018, Project Page 1 of 6 Instructor: Diana Cukierman
FINAL ASSIGNMENT: 6-49 analysis! DESCRIPTION, INCLUDING SUBMISSION INSTRUCTIONS
Image from https://www.wclc.com/games/lotto-649.htm
READ ALL THIS DOCUMENT
This assignment may be done in a team of up to 3 people (you may also work in a team of 2 or individually if you prefer). While you may discuss generalities about this exercise with colleagues from other teams, you cannot develop the same code, share code among teams, or obtain code from other sources.
Being a team exercise, it places a big responsibility on each individual. You want to respect and be honest with your partners and with yourself: DO YOUR SHARE, and BE KNOWLEDGEABLE ABOUT THE WHOLE ASSIGNMENT. Working on this assignment is also a way for you to prepare for the final exam.
Deadline: Monday December 3, 11:59 PM – LAST DAY OF CLASSES. No extensions are possible. No submissions will be accepted after a solution is posted.
As in previous assignments, if you work as a TEAM you need to JOIN a Canvas team associated with this assignment BEFORE SUBMITTING. ONLY ONE TEAM MEMBER SHOULD SUBMIT.
A. PROBLEM SOLVING
1. FIRST: Read the Problem Solving Suggestions document. IT IS BRIEF AND HIGHLY RECOMMENDED.
2. READ BOTH THIS DOCUMENT AND THE SAMPLE RUNS. UNDERSTAND WHAT YOU ARE ASKED TO DO AND WHAT YOU ARE ASKED TO SUBMIT.
3. START WORKING ON THIS AS SOON AS POSSIBLE
B. GENERAL PROGRAM DESCRIPTION
You are asked to implement a Python program which analyzes data from past Lotto 6-49 draw results. The data will be provided in a file combining real data (publicly available) with some fictional (simplified) data. The user will be offered a choice to run the program to process all the draws in the input file or only selected draws in the file. Given the user’s choice, the program will create an (output) file with some results and show several statistics.
More details are provided below. Check the sample runs; they complement the problem description.
The file with the draws data is a CSV (Comma Separated Values) file in a specific format as detailed below. Concrete CSV data files to help you test your program are provided; these files contain partial data from actual historical 6-49 draws (this is public information), with added random data, formatted for this assignment. You may create different data files to best test your program as long as you follow the prescribed formatting. When the teaching team marks your submission, different data files will be used, but they will follow the same format. Your program should produce a CSV file as output, with a specific format, and show results to the user, as described below.
C. DEVELOPING YOUR PROGRAM IN STAGES
It is recommended that you develop this program gradually and test it often.
Plan the whole top-level idea, the data structures and the main variables that you will use first, but implement the program in stages.
The following are recommended stages. Aim to have partial parts running before adding features. Back up often. Keep files with partial versions.
Stage I:
Process ALL the draws data, only do the most basic stats, do a basic user interaction but do not yet validate the user’s input. To start you could also create a partial output file only, for example just with dates and averages (more details below).
Stage II. Incorporate more features, test them:
a. Validations of the user input
b. Provide the option of processing selected draws. (aim to re-use code from processing all)
c. Provide additional statistics
d. Create the output file with all the values needed
Stage III.
If you have time and interest, work on bonus features.
D. Lotto 6-49 game BRIEF DESCRIPTION AND SIMPLIFYING ASSUMPTIONS FOR THIS ASSIGNMENT
Players can bet on (guess and pay for) up to 6 numbers, and one extra bonus number (i.e. 7 numbers in total). Numbers are from 1 to 49. These 7 numbers are drawn on the date of the draw. There is a jackpot (amount of money) available for each draw. Different numbers of people may guess more or fewer of the drawn numbers; those who guess correctly are winners.
For this assignment we only manipulate the drawn numbers (not the individual bets).
We are making a (big) simplifying assumption that winners are paid equally given the money available in the jackpot. (The amount of money that people win is calculated quite differently in reality, and takes into account many other aspects, not relevant here.)
In the data files provided for this assignment, the dates and numbers drawn are taken from public real data. The jackpot amounts and the numbers of winners are random and fictitious data.
E. FILES DESCRIPTIONS
1) Input to the program, IN_data_drawsN.csv
• The file includes the draws data with N lines (N draws)
• The data will be provided sorted by date
• For example, IN_data_draws3.csv contains the data of 3 draws:
• Each draw includes the draw date, 7 numbers, the jackpot and number of winners: the first 6 numbers are the regular drawn numbers, the 7th number is the bonus number, the jackpot is the amount of money available to pay, the last number is the number of people who won.
This data will be provided in a CSV file format. A Python function is provided that reads the CSV file and creates a list of lists with the information. (Each inner list has the data of one draw.) More details below.
2) Output from the program. OUT_resultsN.csv
The output file will have one line per draw. Each line will have:
• date (original format),
• a list representing a distribution of numbers in ranges, and
• average paid per winner
The distribution of numbers in ranges is a list of 5 values, each value representing how many numbers (drawn on that date, and including the bonus number) fall in the ranges (0,10], (10,20], (20,30], (30,40], (40,50). The mathematical notation (a,b] is used to represent a range where a is not included and b is included.
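The range counting described above can be sketched as follows. This is only an illustration: the function name count_numbers_in_ranges is hypothetical (it is not part of the provided code), and it assumes the numbers have already been converted to integers between 1 and 49.

```python
def count_numbers_in_ranges(numbers):
    """Count how many numbers fall in (0,10], (10,20], (20,30], (30,40], (40,50).

    Hypothetical helper; assumes every number is an int from 1 to 49.
    """
    counts = [0, 0, 0, 0, 0]
    for n in numbers:
        # (n - 1) // 10 maps 1..10 to 0, 11..20 to 1, ..., 41..49 to 4,
        # so the upper end of each range is included and the lower end is not.
        index = (n - 1) // 10
        counts[index] = counts[index] + 1
    return counts


# The 7 numbers (6 regular + bonus) of the 25-Nov-15 example draw
print(count_numbers_in_ranges([1, 2, 3, 4, 29, 48, 21]))  # [4, 0, 2, 0, 1]
```

Subtracting 1 before the integer division is what places 10, 20, 30 and 40 in the lower range, as the (a,b] notation requires.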
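Example input data (IN_data_draws3.csv), one draw per row:
Date        N1  N2  N3  N4  N5  N6  Bonus  Jackpot  Winners
25-Nov-15    1   2   3   4  29  48     21     1500        2
1-Apr-17    27  40  41  42  45  46     20     2500        0
17-Apr-18   16  17  35  43  44  46     49    55500        3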
Following the previous example, the output file would be:
25-Nov-15,[4,0,2,0,1],750.0
1-Apr-17,[0,1,1,1,4],0
17-Apr-18,[0,2,0,1,4],18518.33
Notice that if there are no winners the average paid is considered to be 0.
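A minimal sketch of the average paid per winner, under the simplifying assumption that the jackpot is split equally. The function name average_paid_per_winner is hypothetical, and rounding to two decimals is an assumption based on how the values appear in the example output.

```python
def average_paid_per_winner(jackpot, winners):
    """Return the jackpot split equally among the winners, or 0 if nobody won.

    Hypothetical helper; rounding to 2 decimals is an assumption.
    """
    if winners == 0:
        # Avoids dividing by zero; the description defines this case as 0.
        return 0
    return round(jackpot / winners, 2)


print(average_paid_per_winner(1500, 2))  # 750.0
print(average_paid_per_winner(2500, 0))  # 0
```

For the third example draw, 55500 / 3 evaluates to 18500.0 rather than the 18518.33 listed in the example output, so check the sample runs for the intended value.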
F. HOW DATA IS READ FROM AND WRITTEN TO THE FILES
Python code is provided to read data from a CSV file. You should incorporate this function verbatim in your code, including the comments provided. YOU SHOULD USE THIS FUNCTION. When your program calls this function, it will return a list of lists, where each inner list contains the data associated with one draw, each value as a string, in the same order as in the file.
Following the previous example, when called with the file name as argument, the provided function read_csv_into_list_of_lists(…) will return:
[ ['25-Nov-15', '1', '2', '3', '4', '29', '48', '21', '1500', '2'],
  ['1-Apr-17', '27', '40', '41', '42', '45', '46', '20', '2500', '0'],
  ['17-Apr-18', '16', '17', '35', '43', '44', '46', '49', '55500', '3'] ]
You are recommended to further process this list to best access the information. See comments provided about a recommended function: convert_lall_to_separate_lists(…)
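One possible way to process the list of lists further is sketched below. This is not the recommended function from the provided code (whose exact parameters are described in its comments); split_draws is a hypothetical name, and treating the jackpot as a whole number is an assumption.

```python
def split_draws(draws_as_strings):
    """Separate the list of draws into four parallel lists.

    Hypothetical helper. Each draw is expected to be
    [date, n1, n2, n3, n4, n5, n6, bonus, jackpot, winners], all as strings.
    """
    dates = []
    numbers = []   # one inner list of 7 ints per draw (6 regular + bonus)
    jackpots = []
    winners = []
    for draw in draws_as_strings:
        dates.append(draw[0])
        drawn = []
        for text in draw[1:8]:
            drawn.append(int(text))
        numbers.append(drawn)
        jackpots.append(int(draw[8]))   # assumption: jackpots are whole numbers
        winners.append(int(draw[9]))
    return dates, numbers, jackpots, winners


example = [['25-Nov-15', '1', '2', '3', '4', '29', '48', '21', '1500', '2'],
           ['1-Apr-17', '27', '40', '41', '42', '45', '46', '20', '2500', '0']]
dates, numbers, jackpots, winners = split_draws(example)
print(dates)       # ['25-Nov-15', '1-Apr-17']
print(numbers[0])  # [1, 2, 3, 4, 29, 48, 21]
```

Keeping the lists parallel means position i in each list refers to the same draw, which makes later calculations simpler.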
The functions append_1_draw_to_output_list(…) and write_list_of_output_lines_to_file(…) are also provided. These should be useful functions to allow you to produce the output file. Check the assumptions in the code provided.
G. OPTIONS PROVIDED TO THE USER
The user will be allowed to process all the data or selected data. The selection will be done for a month (provided by the user). (Only as a bonus, and after you have done the whole assignment, you may want to explore offering the user the option of selecting by day of the week.)
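Selecting the draws of one month could be sketched as below. The names MONTH_NUMBERS, month_of and positions_in_month are hypothetical, and the sketch assumes that dates always look like '25-Nov-15', with a three-letter English month abbreviation in the middle.

```python
# Assumption: the month always appears as one of these abbreviations.
MONTH_NUMBERS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
                 "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def month_of(date_text):
    """Return the month number (1-12) of a date such as '25-Nov-15'."""
    parts = date_text.split("-")   # ['25', 'Nov', '15']
    return MONTH_NUMBERS[parts[1]]


def positions_in_month(dates, month):
    """Return the positions of the draws that took place in the given month."""
    positions = []
    for i in range(len(dates)):
        if month_of(dates[i]) == month:
            positions.append(i)
    return positions


print(positions_in_month(["25-Nov-15", "1-Apr-17", "17-Apr-18"], 4))  # [1, 2]
```

Returning positions rather than copies of the data lets the same processing code be reused for both the ALL and the selected case.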
See the sample runs. No sample runs are provided for the day of the week bonus option.
H. OUTPUT TO BE SHOWN TO THE USER:
See the sample runs.
While you develop your program you are recommended to include additional trace printing. Comment out this extra trace printing when you submit.
The text information that your program shows to the user should be analogous to what is shown in the sample runs, and in the same format, including the TRACE printing.
Some of the statistical results are recommended to be calculated as the processing takes place. You are highly recommended to plan which variables and structures (e.g. lists) you need, to accomplish these calculations. Statistics should be self-explanatory from the sample runs. Ask if in doubt.
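The idea of calculating statistics as the processing takes place can be illustrated with the sketch below. The two statistics shown (how many times each number was drawn, and the largest jackpot) are hypothetical examples chosen only to show the pattern; the statistics actually required are the ones in the sample runs. The function name accumulate_stats is also hypothetical.

```python
def accumulate_stats(numbers_per_draw, jackpots):
    """Go through the draws once, updating the statistics draw by draw.

    Hypothetical helper. numbers_per_draw[i] holds the 7 ints of draw i,
    and jackpots[i] holds its jackpot.
    """
    times_drawn = [0] * 50   # position n counts number n; position 0 is unused
    max_jackpot = 0
    for i in range(len(jackpots)):
        for n in numbers_per_draw[i]:
            times_drawn[n] = times_drawn[n] + 1
        if jackpots[i] > max_jackpot:
            max_jackpot = jackpots[i]
    return times_drawn, max_jackpot


numbers_per_draw = [[1, 2, 3, 4, 29, 48, 21],
                    [27, 40, 41, 42, 45, 46, 20],
                    [16, 17, 35, 43, 44, 46, 49]]
times_drawn, max_jackpot = accumulate_stats(numbers_per_draw, [1500, 2500, 55500])
print(times_drawn[46])  # 2
print(max_jackpot)      # 55500
```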
I. REQUIREMENTS IN DETAIL (anything required will get points)
Requirements – general and validation
a) The program should have a dialog and options analogous to what is presented in the sample runs
b) The program should ask the user for the file names
c) The results obtained by your program with the data files provided should be the same as the results shown in the sample runs.
d) The graphic component is optional, for bonus points. It is suggested that you use turtle functions for this bar graph.
e) Your program should work well with other data files as long as the formatting conventions in the data files are respected
f) The program should validate that the user types numbers when asked to do so (e.g. month numbers) and that they are within a valid range. The program should also validate that the options ALL, SEL or END are typed correctly.
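The validations in the last requirement could be sketched as follows. All names and prompt texts here are hypothetical (the sample runs define the actual dialog), and accepting the options in any letter case is an assumption. The read_line parameter exists only so that the function can be tried without typing; by default it is the usual input function.

```python
def is_valid_month(text):
    """True if text is a whole number between 1 and 12."""
    text = text.strip()
    # isdecimal() is False for '', '-3', '4.5' and 'abc', so int() is safe here.
    return text.isdecimal() and 1 <= int(text) <= 12


def is_valid_option(text):
    """True if text is ALL, SEL or END (assumption: letter case is ignored)."""
    return text.strip().upper() in ["ALL", "SEL", "END"]


def ask_for_month(read_line=input):
    """Keep asking until the user types a valid month; return it as an int."""
    text = read_line("Month (1-12): ")
    while not is_valid_month(text):
        print("Please type a number between 1 and 12.")
        text = read_line("Month (1-12): ")
    return int(text)


print(is_valid_month("13"))    # False
print(is_valid_option("sel"))  # True
```

The loop repeats only while the input is invalid, so no break statement is needed.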
Requirements – coding details and style
f) Your program should have at least 5 “fruitful” or “productive” functions
g) Your program should have at least 3 “void functions”
h) Your program should have at least 5 functions receiving parameters (and so that the parameters are correctly used inside the function) (these functions may be productive or void)
i) Your program should have a reasonable main level which shows the general structure of the program and calls functions. The main level can be the program top level or it can be inside a “main” function.
j) You may use some variables as global (i.e. defined at the top level and not passed as parameters). The reason to have these variables as global would be that they are frequently used by many functions. Yet, given the requirements above, you will have to use variables in functions that are not global.
k) All the global variables used in the program should be initialized (at least with a fictitious value) at the top of the code (even before the functions are defined), including a brief comment on what each variable's role is. No comments are needed if the names of the variables are self-explanatory.
l) Name your variables and functions appropriately
m) At the top of the program file include as a comment the authors' names and the dates of the versions
n) Include comments, with general descriptions of functions, special situations being true at a certain place in the program, etc. On the other hand, do not include redundant comments. For example, the statement “i = i + 1” does not need the comment “i is increased by 1”. Keep in mind that good naming of variables and functions reduces the need for comments.
o) Include “Trace printing” as you debug your code. When you submit your solution, comment out extra tracing prints. However, you need to leave in your program trace printing analogous to the sample runs.
p) Bonus points will be given if you do not break loops (with break or return statements), and rather use while statements with one or more conditions.
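As an illustration of the last item, a search that might otherwise be written with break or with a return inside the loop can use a while statement with two conditions. The function name and the task (finding the first draw without winners) are hypothetical and are used only to show the pattern.

```python
def first_draw_without_winners(winners):
    """Return the position of the first draw with 0 winners, or -1 if none.

    Hypothetical example; winners is a list with one int per draw.
    """
    position = 0
    found = False
    # The loop stops when the list is finished OR when a match is found,
    # so neither break nor a return inside the loop is needed.
    while position < len(winners) and not found:
        if winners[position] == 0:
            found = True
        else:
            position = position + 1
    if found:
        return position
    return -1


print(first_draw_without_winners([2, 0, 3]))  # 1
print(first_draw_without_winners([2, 3]))     # -1
```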
Requirement – clarification file (txt) (team.txt or individual.txt)
q) You need to submit an admin file (as in previous assignments)
r) Name your file “team.txt” when you are a group of 2 or 3 members. In this case you need to include the group members' names and clarify how you distributed tasks among the team members. If you are working individually, name the file “individual.txt”. In this latter case, the file may minimally include comments about how you worked on this exercise.
q) It would be useful for you to keep track of the time you spend on this exercise. You are asked, but not required, to share this information, including the total hours dedicated to different tasks or in general. If it is team work, include the total time considering all the team members. If you submit this information, include it in the admin file.
Requirement – Flowchart for top level
r) Submit a flowchart describing only the main/top level, possibly referring to some of the global variables. You do NOT need to do a flowchart for all the details!! You may use Flowgorithm, or draw the flowchart by hand and take a picture or capture the screen, and submit a jpg or png file.
J. WHAT YOU ARE PROVIDED
a. This description
b. A general problem solving description
c. Code with functions (the file CMPT 120 final assig – CODE PROVIDED.py)
d. Sample runs
e. Input data files for you to test
K. WHAT YOU NEED TO SUBMIT
a. The code (Python file) of your final submission (with the trace printing of the data files visible)
b. Flowchart of your main level program
c. Your own sample run: A copy of the output produced when you run the program with 3 draws, for the options that you have implemented
d. Captured screens of the associated turtle graphics (if you implemented them)
e. The admin text file (team.txt or individual.txt)
If you have any questions consult with the Teaching Team.
Make sure that you check email and Canvas announcements in case additional clarifications are provided.
End of description of the final assignment