MET CS 688 Assignment 6
Please follow the submission requirements at the end of the assignment!
Note this is a 50-point assignment. The other 50 points come from a quiz you will take.
In this assignment, creating a data visualization (problems 1d, 1e, and 2d) means making a plot or graph. Show that you understand which type of chart best suits the problem at hand; it may help to explain why you chose it. You may create the chart using googleVis, ggplot, or any other R code that produces high-quality data visualizations.
• All data visualizations that you create must follow good data visualization principles. You are held accountable for ALL of the data visualization principles given in the module. This includes, but is not limited to:
• All data visualizations that you create must have meaningful titles and axis labels.
• If you use different colors, line styles, symbols, etc., you must also create a legend.
• Ensure that your labels are not cut off and that you are not missing labels.
• Ensure that all information added to the plot is clearly readable.
• Making good data visualizations usually requires you to adjust plot settings. That is, don’t use the defaults!
Part 1 (30 points)
Load the Glitch data, as explained in the lecture. Do the following additional work:
• Using the aggregate() function, or appropriate code from the tidyverse, compute a data frame of the total number of players joining each month. Name the columns Month and Joining.
• Using the aggregate() function, or appropriate code from the tidyverse, compute a data frame of the total number of players departing each month. Name the columns Month and Departing.
• Merge the two data frames by the Month column, using the merge() function or appropriate code from the tidyverse. You should get 14 rows and 3 columns.
• Create a graph to compare the trends in the joining and departing data. Ensure the data are sorted by time, so December is before January, January is before February, etc. Explain what you learn from your graph.
• Modify your graph from part (d) in some way that focuses on and/or identifies the months with the least departures (you don’t want to lose players). If it helps, you may remove the joining data as part of your modifications.
• You will need to do some research to change some aspect of the plot to create focus on these months. If using googleVis, these links could help:
• https://stackoverflow.com/questions/44573977/annotate-point-ant-text-to-r-googlevis-gvisareachart
• https://cran.r-project.org/web/packages/googleVis/vignettes/googleVis_examples.html
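As a hint for the aggregate and merge steps above, here is a minimal sketch in base R on made-up data. The data frame toy, its three months, and all of its numbers are hypothetical and are not the Glitch data; the sketch only illustrates how aggregate(), merge(), and a factor with explicit levels fit together.

```r
# Toy data (hypothetical; NOT the Glitch data set): one row per week.
toy <- data.frame(
  Month     = c("Dec", "Dec", "Jan", "Jan", "Feb"),
  Joining   = c(10, 5, 8, 2, 7),
  Departing = c(1, 2, 4, 0, 3),
  stringsAsFactors = FALSE
)

# Total per month; the formula interface keeps the column names as written.
joining   <- aggregate(Joining ~ Month, data = toy, FUN = sum)
departing <- aggregate(Departing ~ Month, data = toy, FUN = sum)

# Join the two summaries on the shared Month column.
both <- merge(joining, departing, by = "Month")

# merge() sorts the key alphabetically, so restore time order
# by giving Month explicit factor levels and sorting on them.
both$Month <- factor(both$Month, levels = c("Dec", "Jan", "Feb"))
both <- both[order(both$Month), ]
rownames(both) <- NULL
print(both)
```

The factor step matters for plotting as well: most plotting functions draw a categorical axis in level order, so the levels you set here decide whether December appears before January.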
Part 2 (20 points)
Using the R SportsAnalytics API, do the following:
• Retrieve the NBA data for the 18-19 season.
• Which player has the best field goal percentage? Require more than 300 shots made. (Why 300? See https://stats.nba.com/help/statminimums/)
• Show the top 10 players in terms of TotalPoints, arranged from the highest to lowest.
• Create five data visualizations of your choice that highlight interesting elements in this dataset.
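The filtering and ranking steps above follow a common pattern, sketched here in base R on made-up data. The data frame players, the player names, and every number are hypothetical; the column names FieldGoalsMade, FieldGoalsAttempted, and TotalPoints are assumptions about what the retrieved table contains, so check them against your own data before reusing anything.

```r
# Toy stand-in for a player statistics table (hypothetical names and numbers).
players <- data.frame(
  Name                = c("A", "B", "C", "D"),
  FieldGoalsMade      = c(400, 120, 350, 500),
  FieldGoalsAttempted = c(800, 150, 500, 1250),
  TotalPoints         = c(1000, 300, 900, 1400),
  stringsAsFactors = FALSE
)

# Percentage = made / attempted, expressed out of 100.
players$FGPct <- 100 * players$FieldGoalsMade / players$FieldGoalsAttempted

# Apply the minimum-shots rule first, then pick the highest percentage.
eligible <- subset(players, FieldGoalsMade > 300)
best <- eligible[which.max(eligible$FGPct), ]

# Rank by points, highest first; head() caps the number of rows kept
# (2 here because the toy table is tiny).
top <- head(players[order(players$TotalPoints, decreasing = TRUE), ], 2)

print(best)
print(top)
```

Note that player B has the highest raw percentage in the toy table but is excluded by the minimum, which is exactly why the threshold is applied before ranking.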
SUBMISSION REQUIREMENTS:
• Create a Word, PDF, or Rmd document. If you use Rmd, make sure to save the output as a PDF.
• For each question, state the question you are answering. Then answer the question by explaining in sentences (in English, not in R or other languages) what you did to get to the answer. You may include screenshots and/or copy-paste of key lines of code and the corresponding output in your answer. (If you are using Rmd, this means you must generally use echo=FALSE and/or include=FALSE for the body of the document.)
• Full code should be included as an Appendix to your Word or PDF document. Coding must be in R. Do NOT include full code in the main part of your document.
• Please ensure that a Word or PDF file is the first file in your submission.
• You may also separately upload your R and/or Rmd code to Blackboard.
• If your facilitator tells you to submit the files differently than the above guidelines, you are expected to respect your facilitator’s wishes starting on the next assignment.
• Facilitators can deduct up to 20% if you fail to follow these requirements (more if the questions are not actually answered).
• Facilitators can deduct 5% for each day the assignment is late. You may submit one (and only one) of the six assignments up to three days late with no penalty, but all other late assignments will be penalized.
• Unless your facilitator or the professor agrees, your assignment will not be graded if it is more than 3 days late (e.g., no credit will be given after Friday at 6 AM Boston time). The professor will usually ask the facilitator to make the decision but in rare cases (<1% of the time) has overridden a facilitator. Do not expect the professor to override in most cases.