• [30 points in total]
This dataset comprises individual credit card transactions by employees of the State of Oklahoma in the U.S. Please use this dataset to answer the questions below. Answering this question requires cleaning and manipulating the dataset using either Python or other tools. Please document any methods you use to clean the messy data.
• [3 points] What is the total amount of spending captured in this dataset?
• [3 points] How much was spent at WW GRAINGER?
• [3 points] How much was spent at WM SUPERCENTER?
• [3 points] What is the standard deviation of the total monthly spending in the dataset?
• [8 points] Please describe the process you would follow to build a model on this dataset to make predictions about the U.S. public company performance measured by the stock market. Please note this is a hypothetical only – there is no need to build an actual model. You only need to show your detailed ideas here.
• [5 points] What biases might this dataset have if you tried to use it to model stock returns prediction?
• [5 points] By data exploration, do you have any other interesting observations about this dataset pattern? This is an open-end question and you are free to go to Internet such as Yahoo finance to download any other time-series which may help for this question. Feel free to paste figures and tables here if needed.