CSC121H Assignment 1: Election Systems
Updates/FAQ
- Feb. 24 – Constants: Your code must use the constants provided in
election_functions.R
. You cannot assume that the indexes for the parties will be the same as given for every tests we run on your code. We will be changing the value of the index constants (for example,kLiberalIndex
) when testing your code to ensure you have used the constants properly. - Mar. 1 – Election Functions Length and Design: There have been some students who showed me their election functions and they are too long, with too many uneccesary if-statements and loops, often repeating a lot of code to do similar tasks. This is bad design, and you will lose marks if your function is too long and convoluted. The best advice I could give is to think about how you can use the constants (especially the dictionaries) to make your function short and elegant. This goes for all of the functions you write – design matters!
- Mar. 5 –
PrintCountryResults()
docstring change inelection.R
: The docstring should say# Given a four-element vector representing...
, not a four-element list. Many of you have noticed this already, and it should not affect your code to much. I have updated the starter code, but you don’t have to add the changed docstring to your solution. - Mar. 6 –
GetFPPBallotsFromFile()
docstring change and advice inelection.R
:
There was some confusion when I wrote that the function should return a ‘list of ballot lists’. The first ‘list’ meant to mean the R list type, and the second ‘lists’ reffered to a collection of ballots. I made it clear in the second comment that these should be vectors of ballots in the case of FPP, but just in case, I have changed the starter code to reflect this. Again, you don’t have to add the docstring change to your solution.Also, some students in office hours had issues with knowing when a line has a ballot versus when a line has a non-ballot string. Because of the way the ballots are purposefully formatted, I suggest to use the
strsplit()
function in some way – it will likely be of help. Note that you cannot use the%in%
operator to check if a string is inside another string – it will not work!
Also note that your code must read the file line-by-line as in the starter code. Any other method of reading the file will not be accepted for this assignment.
Due: Friday, March 9th by 11:00pm
You may work alone or in a group of two. Declare your partnership on MarkUs BEFORE you begin coding.
Once a partnership is set up, I will not change groupings (barring extraordinary circumstances), so make sure it is someone you want to work with.
You have one grace point in this course. This will allow you to extend the due date by 24 hours. If you use one for A1, you can not use it for A2 (both partners must have a grace point to extend the due date), so be wise in your decision.
It is often best to work with your partner in the same room, rather than splitting up the assignment and letting each person work on their own. This is because when working alone, each person doesn’t know what’s happening on other parts of the assignment, and that can lead to issues and bugs when putting the code together (and, a much longer time to finish it!). Try working together in the same room and using the driver/navigator method we use in lab. Previous experience has shown that this will lead to finishing the assignemnt more quickly, since both people will be communicating their thoughts as the code is written and you can help each other out when needed.
Start early! This assignment will take a considerable amount of time to complete. Under no circumstances should you leave this to the last few days before it is due.
It is recommended that you print out this handout and highlight key points. Make sure you understand what each part of the assignment is asking you to do, and ask questions if you are stuck on something. Don’t forget that we have a discussion board!
Remember that the Teaching Labs in Bahen are open 24/7 for students taking CSC courses and you can work on the assignment in any of the lab rooms (if they’re not taken up by a class).
Before you start: A warning about academic offenses
The CS Department has software that is used to compare similaraties between different submissions, and checks for similaties to code found online.
As stated in the course information sheet, you must hand in your own work. You should only discuss your code with your partner (if you choose to have one), the CSC121 TAs, and the instructor. Do not submit code that is not yours or that you found somewhere. Do not show your solution to other students in the course, and do not post your code on the discussion board.
Code Restrictions
You must not use any R features or functions that have not been covered in the course. In particular, you must not use break
, next
, and repeat
.
Background: The 2018 Canadian Election
Due to a political stalemate between the four political parties that make up the federal government of Canada, it was decided unanimously that a new election would be held next month to find out the will of the people of Canada, and to try and restore faith in democracy.
The Canadian Elections Agency has decided that 2018 will be the first year that elections are counted using a computer, and you have been given the honour of creating a program that does that!
However, there has been pressure to change the way elections are run in Canada, and you have been asked to prepare a flexible system to handle different Election Systems.
An Election System is a method (or algorithm) that aims to provide the winner of an election, given a list of candidates and a set of ballots. For example, you may be familiar with the First Past the Post system: each voter selects one candidate, and the candidate with the most votes is elected, regardless of their percentage vote share.
The Canadian Elections Agency has decided that for the 2018 election, there will be four types of possible election systems to choose from:
- First Past the Post
- Yes or No
- Preference Points
- Ordered Rank
Each of these systems have different types of ballots.
How Elections in Canada Work:
Canada is, for the purpose of elections, divided into 338 geographical areas known as ridings. In a standard Canadian election, each political party nomiantes a candidate to run in each riding. The voters in each riding elect one of these candidates to the government using (at the present moment) the First Past the Post election system. These elected Members of Parliament (MPs) get one of the 338 seats in the House of Commons in Ottawa.
Four of the five parties who currently make up the House of Commons put candidates on the ballot in all 338 ridings. To simplify the assignment, you will only have to include four of these parties:
For this assignment, we will not differentiate candidates from their parties: All ballots will include only the party names.
Starter Code
Download these file and put them in a folder on your computer for this assignment. We recommend you call the folder something like ‘a1’.
- election_functions.R
- election.R
- ballot data zip file (extract the files in this zip file to the same place as the R files for this assignment)
The Assignment
Your tasks
- Write the functions to run each of the four election systems. In the table below, we specify the name, types of the input and output, and the behaviour.
- Write a set of test cases for two of these functions:
YesOrNo
andPreferencePoints
. - Write a program that allows the user to choose an Election System and some ballot data, and produce the results of the election on either one specific riding or on the entire country using your functions.
Constants
As we learned in class, Constants are variables whose values do not change for the entire run of the program. Recall that constant variable names start with a ‘k’. The code that you write for this assignment will rely on several constants, which you can find near the top of the election_functions.R starter code.
A description of these constants is below:
The party names are strings: "LIBERAL"
, "CPC"
, "NDP"
, and "GREEN"
- For each party, the index where that party’s data will appear in a 4-element vector (we’ll work with vectors of this sort throughout the assignment). Each of these is :
kLiberalIndex
kCpcIndex
kNdpIndex
kGreenIndex
kPartyIndexes
is a vector of numbers: the indices used to represent the 4 parties. This order determines the order for the Yes or No ballots and Preference Points ballots, described in the functions below. This is the value of this constant:c(kLiberalIndex, kCpcIndex, kNdpIndex, kGreenIndex)
kNameToIndex
is a dictionary in which each key is a party’s name and each value is that party’s index.kIndexToName
is a dictionary in which each key is a party’s index (as strings, not numbers, since names must be strings) and each value is that party’s name.
Types of Ballots
Name | Description | Example of a single ballot |
---|---|---|
First Past the Post | One party is selected per ballot. | "LIBERAL" |
Yes or No | For each party, voters indicate whether they like the party with a simple “YES” or “NO” vote. Voters indicate their choice for all parties. The order of these corresponds to the order in kPartyIndexes . |
c("YES", "NO", "YES", "YES") |
Preference Points | For each party, a number of points from 1 to 5 is assigned by the voter. Higher points indicate a higher ‘prefernece rating’. The points do not have to be unique. The order of these corresponds to the order in kPartyIndexes . |
c(3, 5, 1, 3) |
Ordered Rank | Parties are placed in ranked order of preference. In the example, 'CPC' is the voter’s first choice, 'GREEN' is second, 'LIBERAL' is third and 'NDP' is fourth. |
c("CPC", "GREEN", "LIBERAL", "NDP") |
Part 1: The Election System Functions
In the file election_functions.R
, you must write the four election system functions specified below.
One helper function has been included in the starter code, IndexOfMax(L)
, which you may use anywhere in your functions.
Other than the definitions of constants and possibly some helper functions, there should be no statements outside of functions.
Running election_functions.R
should not produce any output or expect any input from a user in the console. Similarly, none of your functions should produce output or expect input from a user in the console. You should test your code in separate files (like we’ve done in lab) so that you don’t accidentaly leave statements outside of functions in election_functions.R
.
Note that although there are four different functions, they do similar things, and so you should be able to use similar logic in at least some parts of each function. Don’t overthink it, and don’t write really long functions! Simplicity is key.
Function name Argument types, Return type |
Description |
---|---|
FirstPastThePost
Argument Type(s): string vector |
The argument is a vector of single-party ballots for a single riding.
Precondition: the vector argument is not empty. In First Past the Post, the party that receives the most votes wins the seat. We have provided the header and docstring for this function in the starter code. Return a list where the first element is the name of the winning party according to First Past the Post, and the second element is a four-element vector that contains the number of ballots for each party. The order of the vector elements corresponds to the order of the parties in FirstPastThePost(c("LIBERAL", "LIBERAL", "NDP")) [[1]] [1] "LIBERAL" [[2]] [1] 0 1 2 0 The output type for the other three functions is the same, so we won’t rewrite it. The other functions have a different input type though, since they use a different type of ballot. |
YesOrNo
Argument Type(s): List of string vectors |
The argument is a list of 4-element string vectors that represent Yes or No ballots for a single riding; the order of the vector elements corresponds to the order of the parties in kPartyIndexes .
Precondition: the list argument is not empty. In the Yes or No system, the party that receives the most Return a list where the first element is the name of the winning party according to the Yes or No system and the second element is a four-element vector that contains the number of |
PreferencePoints
Argument Type(s): List of numeric vectors |
The argument is a list of 4-element numeric vectors that represent Yes or No ballots for a single riding; the order of the vector elements corresponds to the order of the parties in kPartyIndexes .
Precondition: the list argument is not empty. In the Preference Points system, the number of points for each party is summed. The party that receives the most points wins the seat. Return a list where the first element is the name of the winning party according to the Preference Points system and the second element is a four-element vector that contains the total number of points for each party. The order of the vector elements corresponds to the order of the parties in |
OrderedRank
Argument Type(s): List of string vectors |
The argument is a list of 4-element vector that represent Ordered Rank ballots for a single riding.
Precondition: the list argument is not empty. The Ordered Rank is determined by assigning points according to ranking. A party gets 3 points for each first-choice ranking, 2 points for each second-choice ranking and 1 point for each third-choice ranking. (No points are awarded for being ranked fourth.) For example, the rank ballot shown in the ballot example chart above would contribute 3 points to the Conservative count, 2 points to the Green count and 1 point to the Liberal count. The party that receives the most points wins the seat. Return a list where the first element is the name of the winning party according to the OrderedRank system and the second is a four-element vector that contains the total number of points for each party. The order of the vector elements corresponds to the order of the parties in |
Part 2: Test Cases for Yes or No and Preference Points
In files called test_yes_or_no.R
and test_preference_points.R
, write test cases to thoroughly test the corresponding functions.
Put comments above each test case indicating the purpose of that test case.
You may format the console output with print statements, or using cat in whichever way you like.
Part 3: Election Day!
In this part of the assignment, you will write a program that uses the functions you just wrote to show what would happen when different election systems are used for a set of ballots in an election. Your program must read ballot data in from specified files, and then repeatedly ask the user to specify an election system and a riding number. It should then run an election for the specified riding using the specified election system. If the user gives all
as the riding number, the election is run for all the ridings in the input data file, and a seat total is given for each party.
Here are the specific requirements:
-
- Your program must be added to the bottom of the file election.R, after any helper functions that you define.
- It must source election_functions.R and use the four functions specified above to evaluate the ballots. The sourcing is done for you in the starter code (make sure all files are in the same folder on your computer). Do not delete the starter code.
- When election.R is run, your program should produce the output as described below. We have provided some helper functions in the starter code that print output to the console to make this easier.
You must use these helper functions. - To get ballot data, you will have to read from the data files we provide (and your own if you choose to make them for testing). We have provided an example helper function header for the function
GetFPPBallotsFromFile(ballotFile)
, which should read a ballot data file and create an appropriate R list to hold the ballots. We have added some code that reads from files (as we saw in lecture). You must complete the helper funciton, and create your own similar helper functions to get ballots for the other election systems. You may use the one provided as a template for your others, and its up to you how many helper functions you want to have. Make sure you write proper docstrings for any functions you write with good style, spelling, and grammar. - Although you will want to develop and test your code on different input data files, the version of elections.R that you submit should read from the files named:
- fpp_ballots_large.txt
- yesno_ballots_large.txt
- points_ballots_large.txt
- rank_ballots_large.txt
- as described below.
-
- They are assigned to constants in
election.R
- .
- Your code must be well-designed and documented and must follow the style guidelines for this course.
Input Format
Working with real data is messy and computer scientists often find themselves in a situation where they need to parse real data files where the specification either isn’t provided or isn’t correct. For this assignment we want you to work out how to extract the information you need by looking at the data files we have provided. It is important to note that while ridings are approximately the same size, the number of votes cast per riding varies with the riding. For each of the four ballot types, the provided data files (from the datafiles.zip file) include:
- a small file with eight ridings
- a medium file with 338 ridings and approximately 100 voters per riding
- and a large file with 338 ridings and approximately 1000 voters per riding
Your program must run without error and without any changes on all of these files (and other files that follow this format).
Output Format
Here is a sample transcript of sourcing election.R using the medium input files described above. The program’s console output is in blue and the user’s input is in black bold. When you run your program and give the same user input, your program should produce exactly this transcript, including whitespace (although you may choose to break ties differently than we did). In the starter code, we have provided some helper functions that print results in this format.
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
FPP
Holding Election with the FPP system.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): 57
Results for Riding 57
=====================
The seat is won by the CPC candidate.
The distribution of the results is as follows:
LIBERAL: 18
CPC: 41
NDP: 36
GREEN: 7
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
YesNo
Holding Election with the Yes or No system.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): all
Seat Assignments for the Country
================================
LIBERAL: 46
CPC: 61
NDP: 206
GREEN: 25
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
Rank
Holding Election with the Ordered Rank system.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): 11
Results for Riding 11
=====================
The seat is won by the NDP candidate.
The distribution of the results is as follows:
LIBERAL: 165
CPC: 98
NDP: 180
GREEN: 133
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
Points
Holding Election with the Preference Points system.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): all
Seat Assignments for the Country
================================
LIBERAL: 260
CPC: 52
NDP: 3
GREEN: 23
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
YesNo
Holding Election with the Yes or No system.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): 10000
Invalid choice. Enter a riding between 1 and 338, or all.
Which riding would you like to see results for? (Enter a number between 1 and 338, or all.): 1
Results for Riding 1
=====================
The seat is won by the NDP candidate.
The distribution of the results is as follows:
LIBERAL: 36
CPC: 42
NDP: 49
GREEN: 24
Select an election system or Q to quit: FPP, YesNo, Points, Rank, Q
not_a_choice
Invalid choice. You must select from these options: FPP, YesNo, Points, Rank, Q
q
Invalid choice. You must select from these options: FPP, YesNo, Points, Rank, Q
Q
Dealing with Ties
In a real election with much larger numbers of ballots, ties are not common. However when you use the smaller data sets (even the large one we provided), some ties occur. You should not worry about them in your code. When two parties are tied, you may choose either one. Because of the ties, your numbers may not be identical to the transcript above. The tests we use to check the correctness of your functions will not result in any ties at any point in the algorithms.
Testing
Finally, be sure to test your program in elections.R with different user input (not in separate files, since you just need to source elections.R to run it) and be very careful that the output matches the description in this handout.
Marking
Your assignment will be evaluated using these three criteria:
- Correctness: Your code should perform as specified, and election_functions.R should contain each of the four required functions. Remember that spelling of file and function names, including case, is important, and that functions should have exactly the number of arguments as specified, so be careful.
We will separately test your code for each of the four required functions. Then we will use correct implementations of these functions by replacing your election_functions.R with our solution, and test your election.R. This way you can get marks for correctly writing the program in
election.R
even if your implementation of one of the election systems has a flaw.Correctness, as evaluated by our tests and the TAs, will count for 70% of the assignment marks.
- Your Test Cases: We will mark your two test cases files for correctness (do they run?), for coverage (do they test the code thoroughly without overlap?).
The test cases are worth a total of 10%. - Style and Design: Your code should be well documented with docstrings as well as internal comments, and it should adhere to the R Style Guide for this course. You should describe complicated pieces of code using internal (
#
) comments, but do not go overboard and comment every line – you will also lose marks if you have too many comments.Helper functions: Your program should be broken down into functions, both to avoid repetitive code and to make the program easier to read. Although we have specified some of the functions for this assignment, there are lots of opportunities for you to design your own functions to help your code be cleaner by getting rid of repetitive code. We will also be looking at how you use them when grading your assignment. Each helper function should have a coherent purpose and should use arguments, not environment variables, to get all the information it requires from the function caller. For each function you write, write an appropriate docstring. (We have written some docstrings for you, do not change them).
Style and design will count for 20% of the assignment marks.
IMPORTANT NOTE: Make sure all of your .R files do not have any errors when you ‘source’ them.
If any errors occur when sourcing the file, you may get zero. If we have to edit your code a bit to get your file to source properly, you will lose marks from your final score. Having to make substantial edits to your code to make it source without errors will result in a zero.
A checklist for when you are ready to submit
- Did you reset the constants for the input files in
election.R
to this?kFPPFilename <- "fpp_ballots_large.txt" kYesNoFilename <- "yesno_ballots_large.txt" kPointsFilename <- "points_ballots_large.txt" kRankFilename <- "rank_ballots_large.txt"
- Did you address all the R Style Guidelines?
- Does your code source without errors?
- Did you run your own tests one last time?
- Did you write your name(s) and UTORid(s) at the top of each .R file?
Once you have done all of the things listed above, submit to MarkUs these four files:
election_functions.R
election.R
test_yes_or_no.R
test_preference_points.R