CSE 231: Programming Project 06 (Fall 2018)
This assignment is worth 45 points (4.5% of the course grade) and must be completed and turned in before 11:59 PM on Monday, October 22, 2018.

Assignment Overview

(learning objectives)
This assignment will give you more experience with the use of:

1. Lists
2. Functions
3. Iteration
4. Strings

The goal of this project is to analyze census data.

Assignment Background

Immigration is in the news. One of the powerful things you can do with Python is investigate data related to current news. One question might be: how many immigrants are in the US? To answer it, we go to the US Census, a count mandated by the Constitution (www.census.gov and https://factfinder.census.gov/). Between the required 10-year censuses, the Census Bureau conducts surveys that collect all sorts of data. Here we consider the percentage of residents who are native born, naturalized citizens, and non-citizens. (If you are curious, the data file has data on a wide variety of topics: income broken down into many categories, average rent, education levels, etc.) The data in the 2016 file is by state, so there are 51 rows of data (this includes Washington D.C., but not territories such as Puerto Rico). There are hundreds of columns; we are only interested in a few.

There is a second file of data from the year 2000 that we use for comparison of total US counts. It is structured differently than the 2016 data, so our desired information is in different columns with different headers.

Project Description

Your program must meet the following specifications:

  1. At program start, prompt the user for the files to be analyzed: year 2016 first, then year 2000.
  2. You have to use the eight functions in the provided proj06.py skeleton. You have to implement and use the following functions:

    1. open_file(): Returns the file pointer to the file opened by asking the user for the file name. Error checking is required.

    2. find_index(header_list,s): Takes two arguments. The first is the header row split into a list of strings. The second argument is a string. Returns the index (int) of the string s in the header row list. This index is the column index of the string in the data file.

    3. read_2016_file(fp): Takes a file pointer as an argument and returns a sorted list of tuples of data in the file. The tuples are sorted on the last item in the tuples; itemgetter is useful for sorting on an item that is not first in the tuple. Sorting is smallest to largest. Percentages are with respect to total residents, i.e. native + naturalized + non-citizen. Each value's column is located by its header string (for example, EST_VC197): pass that string to the find_index function, which returns the column index (this allows us to get the right data out of different files). Each tuple has six items in this order:

      1. state (str) found at column index 2
      2. count of native-born residents (int) found in column EST_VC197
      3. count of naturalized citizens (int) found in column EST_VC201
      4. ratio of naturalized citizens to total residents (float)
      5. count of non-citizens (int) found in column EST_VC211
      6. ratio of non-citizens to total residents (float)

    4. read_2000_file(fp): Takes a file pointer as an argument. Returns one tuple of values in this order:

      1. total population (int) found in column HC01_VC02
      2. count of native-born residents (int) found in column HC01_VC03
      3. count of naturalized citizens (int) found in column HC01_VC05
      4. count of non-citizens (int) found in column HC01_VC06

    5. calc_totals(data_sorted): Takes the sorted list of 2016 data returned from the read_2016_file function and returns one tuple of values in this order:

      1. total count of native-born residents (int)
      2. total count of naturalized citizens (int)
      3. total count of non-citizens (int)
      4. total residents, i.e. the sum of native-born + naturalized + non-citizen

    6. make_lists_for_plot(native_2000,naturalized_2000,non_citizen_2000,native_2016,naturalized_2016,non_citizen_2016): Takes six integers as arguments and returns one tuple of three lists, in this order. (This is a trivial function to ensure your data is organized for plotting.)

      1. [ native_2000, native_2016]
      2. [ naturalized_2000, naturalized_2016]
      3. [ non_citizen_2000, non_citizen_2016]

    7. plot_data(List1, List2, List3): This function is written for you. You just have to pass the correct lists for native, naturalized and non-citizens. Each list should have 2 entries: the counts for the years 2000 and 2016, in that order.

    8. main(): Takes no input. Returns nothing. Call the functions from here. Call plot_data only if the user answers “yes” to the plot prompt.

    The format string for the table header is:

    {:<20s}{:>15s}{:>17s}{:>22s}{:>16s}{:>22s}
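To illustrate the error checking required of open_file, here is a minimal sketch. The prompt text is taken from the sample output; the wording of the error message and the choice to keep prompting until a file opens are assumptions, since this handout does not spell them out.

```python
def open_file():
    """Prompt for a file name until a file opens; return the file pointer."""
    while True:
        filename = input("Enter a file name: ")
        try:
            return open(filename, "r")
        except FileNotFoundError:
            # Assumed wording: the handout does not specify the error message.
            print("Error: file not found. Please try again.")
```

In main() this would be called twice, first for the 2016 file and then for the 2000 file.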

Assignment Deliverable

The deliverable for this assignment is the following file:

proj06.py – the source code for your Python program

Be sure to use the specified file name and to submit it for grading via Mimir before the project deadline.

Assignment Notes

  1. To clarify the project specifications, sample output is appended to the end of this document.
  2. Items 1-9 of the Coding Standard will be enforced for this project.
  3. We provide a proj06.py program for you to start with.
  4. You do not need to use dictionaries for this project, but you are allowed to.
  5. If you “hard code” answers, you will receive a grade of zero for the whole project. An example of hard coding is to simply print an average rather than calculating an average and then printing the calculated average.

Test Cases

Test 1

Enter a file name: ACS_16_1YR_S0201_with_ann.csv
Enter a file name: DEC_00_SF4_QTP14_with_ann.csv

2016 Population: Native, Naturalized, Non-Citizen

State                        Native      Naturalized   Percent Naturalized     Non-Citizen   Percent Non-Citizen
West Virginia             1,799,332           31,770                  1.7%          16,704                  0.9%
Montana                   1,020,627           21,893                  2.1%          10,308                  1.0%
Mississippi               2,929,960           58,766                  1.9%          35,300                  1.2%
Maine                     1,280,711           50,768                  3.7%          22,722                  1.7%
Vermont                     596,371           28,223                  4.4%          11,859                  1.9%
North Dakota                733,365           24,588                  3.2%          14,509                  1.9%
Wyoming                     566,645           18,856                  3.2%          11,460                  1.9%
Ohio                     11,100,781          513,592                  4.3%         243,278                  2.1%
Alabama                   4,699,671          163,629                  3.3%         104,018                  2.1%
Missouri                  5,843,798          249,202                  4.0%         130,394                  2.1%
Kentucky                  4,281,010          155,964                  3.4%          95,051                  2.1%
South Dakota                834,528           30,926                  3.5%          19,944                  2.3%
Louisiana                 4,491,745          189,921                  4.0%         110,233                  2.3%
New Hampshire             1,258,486           76,309                  5.6%          35,906                  2.6%
Wisconsin                 5,490,165          288,544                  4.9%         157,189                  2.6%
South Carolina            4,723,155          237,964                  4.7%         143,205                  2.8%
Tennessee                 6,331,173          320,021                  4.7%         202,097                  2.9%
Iowa                      2,974,504          160,189                  5.0%          98,173                  3.0%
Arkansas                  2,849,656          138,592                  4.5%          94,308                  3.1%
Indiana                   6,283,884          349,169                  5.1%         209,741                  3.1%
Michigan                  9,266,021          662,279                  6.5%         319,280                  3.1%
Pennsylvania             11,913,314          870,913                  6.6%         416,183                  3.2%
Alaska                      684,530           57,364                  7.5%          24,592                  3.2%
Idaho                     1,585,187           97,953                  5.6%          60,054                  3.4%
Oklahoma                  3,695,147          228,414                  5.6%         150,657                  3.7%
Minnesota                 5,067,516          452,436                  7.9%         222,963                  3.9%
Kansas                    2,701,767          205,522                  6.8%         124,266                  4.1%
Nebraska                  1,772,979          134,137                  6.7%          89,161                  4.5%
North Carolina            9,357,150          789,638                  7.4%         476,498                  4.5%
Delaware                    862,674           89,391                  9.0%          45,308                  4.5%
Utah                      2,798,884          252,333                  7.9%         158,751                  4.9%
Oregon                    3,699,248          394,217                  9.1%         219,295                  5.1%
New Mexico                1,882,609          198,406                  9.0%         119,746                  5.4%
Georgia                   9,272,059        1,038,312                  9.5%         611,220                  5.6%
Colorado                  4,995,812          544,733                  9.3%         328,738                  5.6%
Virginia                  7,380,639        1,031,169                 11.6%         500,545                  5.6%
Rhode Island                907,946          148,480                 13.2%          66,770                  5.9%
Connecticut               3,062,402          514,050                 13.5%         243,777                  6.4%
Illinois                 11,018,065        1,783,474                 13.0%         899,853                  6.6%
Washington                6,267,606        1,020,394                 13.0%         538,259                  6.9%
District of Columbia        590,539           90,631                 12.4%          50,628                  6.9%
Maryland                  5,094,577          921,870                 14.2%         453,906                  7.0%
Arizona                   5,996,188          934,883                 12.5%         535,847                  7.2%
Massachusetts             5,687,897        1,123,882                 15.3%         531,121                  7.2%
Hawaii                    1,166,072          262,485                 17.0%         113,088                  7.3%
Florida                  16,375,928        4,236,511                 18.8%       1,901,686                  8.4%
New Jersey                6,928,384        2,016,085                 20.5%         912,497                  9.3%
New York                 15,209,174        4,536,115                 20.8%       2,022,336                  9.3%
Nevada                    2,353,259          586,799                 18.1%         308,088                  9.5%
Texas                    23,132,676        4,729,920                 15.3%       2,981,306                  9.7%
California               28,572,354       10,677,663                 24.0%       5,308,155                 11.9%
----------------------------------------------------------------------------------------------------------------
Total 2016              279,388,170       43,739,345                 12.7%      22,500,973                  6.5%
Total 2000              250,314,017       12,542,626                  4.5%      18,565,263                  6.6%

Do you want to plot? no
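The table in Test 1 can be produced with the header format string given in the project description. The sketch below pairs it with a row format that is an assumption: it reuses the header's column widths and adds thousands separators and one-decimal percentages so that the result resembles the sample output. The column labels and the numbers come from the sample output and the function tests.

```python
# Header format string given in the project description.
HEADER_FMT = "{:<20s}{:>15s}{:>17s}{:>22s}{:>16s}{:>22s}"
# Assumed row format: same widths, with "," grouping and one-decimal percents.
ROW_FMT = "{:<20s}{:>15,d}{:>17,d}{:>22.1%}{:>16,d}{:>22.1%}"

header = HEADER_FMT.format("State", "Native", "Naturalized",
                           "Percent Naturalized", "Non-Citizen",
                           "Percent Non-Citizen")

# One state tuple, as returned by read_2016_file in the function test.
alabama = ('Alabama', 4699671, 163629, 0.03294111631266611,
           104018, 0.02094047532290061)
row = ROW_FMT.format(*alabama)

# 2016 totals: percentages are relative to native + naturalized + non-citizen.
native, naturalized, non_citizen = 279388170, 43739345, 22500973
total = native + naturalized + non_citizen
total_row = ROW_FMT.format("Total 2016", native, naturalized,
                           naturalized / total, non_citizen,
                           non_citizen / total)

print(header)
print(row)
print("-" * len(header))
print(total_row)
```

Note that the "%" presentation type multiplies the ratio by 100 itself, so the ratios stored in the tuples can be passed in unchanged.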

Test 2 (Test 1 with plotting)

Function Test 2: find_index

s = "abc"
lst = ['xxx','yyy','abc','mmm','a','123','oops']
student_index = find_index(lst,s)
instructor_index = 2

s = 'x'
student_index = find_index(lst,s)
instructor_index = None
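The two cases above pin down the expected behavior: the match must be exact ('x' does not match 'xxx'), and a missing string yields None. One possible sketch that satisfies both cases follows; the explicit loop is a choice made here, not something the skeleton is known to require.

```python
def find_index(header_list, s):
    """Return the index of the first exact match of s, or None if absent."""
    for i, name in enumerate(header_list):
        if name == s:
            return i
    return None


lst = ['xxx', 'yyy', 'abc', 'mmm', 'a', '123', 'oops']
print(find_index(lst, "abc"))
print(find_index(lst, "x"))
```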

Function Test: read_2016_file

fp = open("ACS_tiny.csv")
student_tup = read_2016_file(fp)
instructor_tup = [('Alabama', 4699671, 163629, 0.03294111631266611, 104018, 0.02094047532290061), ('Alaska', 684530, 57364, 0.07484024496207367, 24592, 0.032084082422901394)]
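A sketch of how read_2016_file could produce the tuples above is shown below. It rests on several assumptions that this handout does not confirm: the first row of the file holds the column codes, rows whose counts are not integers (such as a row of descriptive labels) can be skipped, and the csv module is acceptable for splitting rows. The in-memory file and its GEO.* header names are hypothetical; only the counts come from the function test.

```python
import csv
import io
from operator import itemgetter


def find_index(header_list, s):
    """Return the index of s in header_list, or None if it is absent."""
    for i, name in enumerate(header_list):
        if name == s:
            return i
    return None


def read_2016_file(fp):
    """Return a list of six-item tuples sorted on the non-citizen ratio."""
    reader = csv.reader(fp)
    header = next(reader)  # assumed: the first row holds the column codes
    native_i = find_index(header, "EST_VC197")
    naturalized_i = find_index(header, "EST_VC201")
    non_citizen_i = find_index(header, "EST_VC211")

    data = []
    for row in reader:
        try:
            native = int(row[native_i])
            naturalized = int(row[naturalized_i])
            non_citizen = int(row[non_citizen_i])
        except (ValueError, IndexError):
            continue  # assumed: skip label rows and short or blank rows
        total = native + naturalized + non_citizen
        data.append((row[2], native, naturalized, naturalized / total,
                     non_citizen, non_citizen / total))
    # itemgetter(-1) sorts on the last item, smallest to largest.
    return sorted(data, key=itemgetter(-1))


# Hypothetical two-state file standing in for ACS_tiny.csv.
sample = io.StringIO(
    "GEO.id,GEO.id2,GEO.display-label,EST_VC197,EST_VC201,EST_VC211\n"
    "Id,Id2,Geography,Native,Naturalized,Not a citizen\n"
    "x,02,Alaska,684530,57364,24592\n"
    "x,01,Alabama,4699671,163629,104018\n"
)
for record in read_2016_file(sample):
    print(record)
```

Alaska appears first in the hypothetical file but second in the result, because Alabama has the smaller non-citizen ratio.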

Function Test: read_2000_file

fp = open("DEC_tiny.csv")
student_tup = read_2000_file(fp)
instructor_tup = (281421906, 250314017, 12542626, 18565263)
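The same column-lookup idea applies to the 2000 file. The sketch below assumes that the first row holds the column codes and that the first row with integer counts is the one wanted; neither detail is confirmed by this handout. The in-memory file, its column order, and its label row are hypothetical, and the counts are the ones from the function test.

```python
import csv
import io


def find_index(header_list, s):
    """Return the index of s in header_list, or None if it is absent."""
    for i, name in enumerate(header_list):
        if name == s:
            return i
    return None


def read_2000_file(fp):
    """Return (total, native, naturalized, non_citizen) as ints."""
    reader = csv.reader(fp)
    header = next(reader)  # assumed: the first row holds the column codes
    columns = [find_index(header, code)
               for code in ("HC01_VC02", "HC01_VC03", "HC01_VC05", "HC01_VC06")]
    for row in reader:
        try:
            return tuple(int(row[i]) for i in columns)
        except (ValueError, IndexError):
            continue  # assumed: skip label rows and short or blank rows
    return None  # no usable data row was found


# Hypothetical file standing in for DEC_tiny.csv; columns deliberately shuffled.
sample = io.StringIO(
    "GEO.display-label,HC01_VC06,HC01_VC02,HC01_VC03,HC01_VC05\n"
    "Geography,Not a citizen,Total population,Native,Naturalized citizen\n"
    "United States,18565263,281421906,250314017,12542626\n"
)
print(read_2000_file(sample))
```

Because the columns are located by header string, the shuffled column order in the hypothetical file does not affect the order of the returned tuple.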

Function Test: calc_totals

sorted_tup = [('Alabama', 4699671, 163629, 0.03294111631266611, 104018, 0.02094047532290061), ('Alaska', 684530, 57364, 0.07484024496207367, 24592, 0.032084082422901394)]
student_tup = calc_totals(sorted_tup)

instructor_tup = (5384201, 220993, 128610, 5733804)
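The expected tuple can be checked by hand: 4699671 + 684530 = 5384201 native-born, 163629 + 57364 = 220993 naturalized, 104018 + 24592 = 128610 non-citizens, and the three totals sum to 5733804. A minimal sketch that computes the totals this way follows; the use of sum() with generator expressions is one choice among several.

```python
def calc_totals(data_sorted):
    """Return (native, naturalized, non_citizen, total) summed over all states."""
    # Tuple layout: (state, native, naturalized, ratio, non_citizen, ratio).
    native = sum(record[1] for record in data_sorted)
    naturalized = sum(record[2] for record in data_sorted)
    non_citizen = sum(record[4] for record in data_sorted)
    return native, naturalized, non_citizen, native + naturalized + non_citizen


sorted_tup = [('Alabama', 4699671, 163629, 0.03294111631266611,
               104018, 0.02094047532290061),
              ('Alaska', 684530, 57364, 0.07484024496207367,
               24592, 0.032084082422901394)]
print(calc_totals(sorted_tup))
```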

Function Test: make_lists_for_plot

a,b,c,d,e,f = 1,2,3,4,5,6
student_tup = make_lists_for_plot(a,b,c,d,e,f)
instructor_tup = ([1, 4], [2, 5], [3, 6])
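The test shows the pairing: the first three arguments are the year 2000 counts and the last three are the 2016 counts, and each returned list holds one category for both years. A sketch consistent with that test:

```python
def make_lists_for_plot(native_2000, naturalized_2000, non_citizen_2000,
                        native_2016, naturalized_2016, non_citizen_2016):
    """Pair each category's 2000 and 2016 counts, ready for plot_data."""
    return ([native_2000, native_2016],
            [naturalized_2000, naturalized_2016],
            [non_citizen_2000, non_citizen_2016])


a, b, c, d, e, f = 1, 2, 3, 4, 5, 6
print(make_lists_for_plot(a, b, c, d, e, f))
```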

Grading Rubric

Computer Project #06 Scoring Summary

General Requirements:
  ( 4 pts) Coding Standard 1-9
           (descriptive comments, function headers, etc…)
Implementation:
  ( 4 pts) open_file function (no Mimir test)
           -2 Did not use try/except
  ( 4 pts) find_index function
  ( 8 pts) read_2016_file function
  ( 5 pts) read_2000_file function
  ( 5 pts) calc_totals function
  ( 2 pts) make_lists_for_plot function
  ( 9 pts) Test1
  ( 4 pts) Test2 Draws Bar Chart (no Mimir test)

Note: hard coding an answer earns zero points for the whole project
-10 points for not using main()