School of Information Management INFO 6270 Introduction to Data Science Winter 2019/2020
Instructor: Colin Conrad
Individual Project – A Data Science Project to Call Your Own
OVERVIEW AND INSTRUCTIONS
The primary objective of INFO 6270 is to introduce you to data science and prepare you to apply data science techniques to challenges faced in your target work environment. As managers (broadly construed), you must be able not only to work with code but also to apply these technical skills to solve challenging problems and communicate the results effectively. To assess your proficiency, you will complete a comprehensive individual project.
Using an open data repository and the Python Jupyter notebook framework that we have used throughout this class, conduct and document your own analysis. Your document must meet the following requirements:
1. Document must outline data analysis using Python 3 and document markup;
2. Document must leverage data retrieved from at least one organized repository;
3. Document must contain at least 50% original code;
4. Document must cite supporting literature, data, and code used from external sources using a distinct References section. References should loosely adhere to the APA citation style.
5. Analysis must use one or more of the following:
a. A SQL or SQLite database;
b. An application programming interface;
c. A pandas DataFrame or NumPy array.
6. Analysis must use one or more of the following:
a. Statistical analysis;
b. Association rule mining;
c. Machine learning.
7. Document must contain at least one data visualization generated by Python code.
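As a rough illustration of how requirements 5 and 6 can fit together in a notebook cell, the sketch below loads a few rows into an in-memory SQLite database and computes simple descriptive statistics per group. It is only a minimal example under stated assumptions: the `visits` table, the branch names, the visitor counts, and the helper functions `load_visits` and `summarize` are all hypothetical and stand in for whatever open data and research question you choose. It uses only the Python 3 standard library.

```python
import sqlite3
import statistics

# Hypothetical data: made-up daily visitor counts for two fictional branches.
# In a real project these rows would come from an open data repository.
ROWS = [
    ("North", 120), ("North", 135), ("North", 128),
    ("South", 90), ("South", 104), ("South", 97),
]


def load_visits(rows):
    """Create an in-memory SQLite database and insert the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (branch TEXT, visitors INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    conn.commit()
    return conn


def summarize(conn):
    """Return {branch: (mean, sample standard deviation)} of visitor counts."""
    summary = {}
    branches = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT branch FROM visits ORDER BY branch"
        )
    ]
    for branch in branches:
        values = [
            row[0]
            for row in conn.execute(
                "SELECT visitors FROM visits WHERE branch = ?", (branch,)
            )
        ]
        summary[branch] = (statistics.mean(values), statistics.stdev(values))
    return summary


conn = load_visits(ROWS)
summary = summarize(conn)
for branch, (mean, sd) in summary.items():
    print(f"{branch}: mean={mean:.1f}, sd={sd:.1f}")
conn.close()
```

A full submission would also need a visualization generated by Python code (requirement 7) and a References section crediting the data source and any borrowed code.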
When complete, you will have the option to submit your work to the nascent SIM data repository. Contributions to this repository will help advance SIM education, research and industry projects. Contributions to the SIM data science repository must adhere to the MIT License; you will be recognized for your work, though anyone can use your code for future projects.
DEADLINE & SUBMISSION
Completion: Individual
Due date: Monday, April 13th, 11:55 pm.
Submit: Upload your Jupyter notebooks to GitHub (recommended) or Brightspace.
ASSESSMENT
You will be assessed using the following rubric.
Rating levels: Excellent (85%-100%), Very Good (80%-84%), Good (75%-79%), Satisfactory (70%-74%), Unsatisfactory (0%-69%)

Analysis quality (25 points)
Excellent: The project reflects analysis that adheres to all of the project requirements. The analysis is conducted soundly and reflects warranted conclusions to the specified research question.
Very Good: The project reflects analysis that adheres to all of the project requirements. The analysis conducted generally reflects conclusions to the specified research question.
Good: The project reflects analysis that adheres to most of the project requirements. The analysis conducted generally reflects conclusions to the specified research question.
Satisfactory: The project reflects analysis that adheres to most of the project requirements.
Unsatisfactory: The project does not reflect analysis that adheres to the project requirements.

Documentation (10 points)
Excellent: Documentation is consistently provided using Jupyter markup. An untrained reader would be able to clearly understand what the code does, the problem it solves, and its usage rights.
Very Good: Documentation is consistently provided using Jupyter markup. An untrained reader would be able to generally understand what the code does, the problem it solves, and its usage rights.
Good: Some documentation is provided using Jupyter markup. An untrained reader would be able to generally understand the code and the problem it solves.
Satisfactory: Some documentation is provided using Jupyter markup.
Unsatisfactory: No documentation is provided in the submitted document.

Solution (5 points)
Excellent: The conducted analysis either solves an original problem or contributes to extant analysis in a substantial way.
Very Good: The conducted analysis solves a problem and makes some contribution to extant analysis.
Good: The conducted analysis solves a problem by replicating an existing solution.
Satisfactory: The conducted analysis makes progress towards solving a problem.
Unsatisfactory: The conducted analysis does not make progress towards solving a problem.