
Assessed Learning Outcomes

This second assessment tests your ability to: build a predictive model based on probability distributions; analyse a dataset to gain an understanding of the data from within Python; build a regression model and evaluate it; and communicate your findings on your predictive model.

How to submit

For this assignment, you need to submit the following:

1. A short report (in .pdf) on your findings in exploring the given datasets, a description of your models and their evaluations, as well as any decisions or actions that may be taken following your analyses.

2. The Python source code written to complete the tasks set in the paper. It is recommended to submit two Python code files, say Task1_sol.py and Task2_sol.py, one for each of the two problems you have solved.

3. A signed coursework cover

1 Predicting the Winner

Consider the Premier League dataset, which records the results of the Premier League matches thus far during the current season 2017/18. The data include Full Time Result, Full Time Home/Away Team Goals, Half Time Home/Away Team Goals and other types of variables. Please have a look at the given notes along with the dataset in order to understand the abbreviations used in the dataset.

Objective: Using the given dataset, we would like to build a model that can predict the winning team of the next Premier League match between Manchester United and Manchester City by using simulation and a historical dataset.

Manchester City vs Manchester United

Analyse the given dataset in order to show the following:

1. Check for missing values in the dataset, drop columns that may be irrelevant to the problem of predicting the winning team and provide a descriptive analysis of the dataset.

2. Extract all home and away matches played by both teams as well as the number of goals scored or conceded. You may present the results in the form of data frames.

3. To get a better picture of how Manchester United and Manchester City stack up against each other, juxtapose the teams’ offensive and defensive performance data. For that purpose, plot the goal-score frequencies of Manchester City’s away offense against Manchester United’s home defense, and of Manchester City’s home offense against Manchester United’s away defense.
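The three analysis steps above could be sketched with pandas roughly as follows. This is only an illustration under assumptions: the column names HomeTeam, AwayTeam, FTHG, FTAG and FTR are guesses at the abbreviations explained in the dataset notes, the team labels "Man City" and "Man United" are assumed, and the small inline table is made-up toy data standing in for the real file.

```python
import pandas as pd

# Toy stand-in for the real dataset (made-up rows; column names are assumed).
matches = pd.DataFrame({
    "HomeTeam": ["Man City", "Man United", "Arsenal", "Man City", "Chelsea"],
    "AwayTeam": ["Arsenal", "Chelsea", "Man United", "Chelsea", "Man City"],
    "FTHG": [3, 1, 1, 1, 0],
    "FTAG": [1, 0, 3, 0, 1],
    "FTR": ["H", "H", "A", "H", "A"],
})

# Step 1: count missing values per column and summarise the numeric columns.
missing_per_column = matches.isna().sum()
summary = matches.describe()


def team_matches(df, team):
    """Return (home, away) data frames with goals scored and conceded by `team`."""
    home = df.loc[df["HomeTeam"] == team, ["AwayTeam", "FTHG", "FTAG"]]
    home = home.rename(
        columns={"AwayTeam": "Opponent", "FTHG": "Scored", "FTAG": "Conceded"}
    )
    away = df.loc[df["AwayTeam"] == team, ["HomeTeam", "FTAG", "FTHG"]]
    away = away.rename(
        columns={"HomeTeam": "Opponent", "FTAG": "Scored", "FTHG": "Conceded"}
    )
    return home.reset_index(drop=True), away.reset_index(drop=True)


def goal_frequency(goals):
    """Relative frequency of each goal count, indexed by number of goals."""
    return goals.value_counts(normalize=True).sort_index()


# Step 2: home and away matches for each team.
city_home, city_away = team_matches(matches, "Man City")
united_home, united_away = team_matches(matches, "Man United")

# Step 3: juxtapose City's away offense with United's home defense.
comparison = pd.DataFrame({
    "City away scored": goal_frequency(city_away["Scored"]),
    "United home conceded": goal_frequency(united_home["Conceded"]),
}).fillna(0.0)
print(comparison)
```

With matplotlib installed, `comparison.plot(kind="bar")` would draw the two frequency distributions side by side; the same pattern applies to City's home offense against United's away defense.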

Simulation

To predict the winning team, we can create fantasy games with the objective of estimating the probability that one team will beat another.

1. Use empirical distributions of goals scored by the two teams to predict the winning team by simulating a sufficiently large number of fantasy games between both teams. Handle possible draws appropriately.

2. A balanced simulation should consider both the offensive and defensive performance of each team. Perform a balanced simulation of the match between both teams in order to predict the winning team. Handle possible draws appropriately.

3. Choose and justify theoretical probability distributions to simulate Manchester City – Manchester United’s games as paired random drawings. Then, execute a balanced simulation of offensive and defensive performance with your probability models for goals scored. Handle possible draws appropriately.
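One possible shape for these simulations is sketched below with NumPy. Everything in it is an assumption made for illustration: the goal counts are invented, "balanced" is interpreted as averaging a team's offensive draw with its opponent's defensive draw, the Poisson distribution is just one candidate theoretical model that would still need justifying against the data, and draws are handled by reporting them as a third outcome rather than by replaying the game.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable


def summarise(goals_a, goals_b):
    """Turn paired simulated scores into win/draw/loss probabilities."""
    return {
        "a_wins": float(np.mean(goals_a > goals_b)),
        "draw": float(np.mean(goals_a == goals_b)),
        "b_wins": float(np.mean(goals_a < goals_b)),
    }


def simulate_empirical(scored_a, conceded_b, scored_b, conceded_a, n_games=10_000):
    """Balanced simulation by resampling observed goal counts.

    Each team's score in a fantasy game is the average of one draw from its
    own goals scored and one draw from the goals its opponent conceded.
    """
    offence_a = rng.choice(scored_a, size=n_games)
    defence_b = rng.choice(conceded_b, size=n_games)
    offence_b = rng.choice(scored_b, size=n_games)
    defence_a = rng.choice(conceded_a, size=n_games)
    return summarise((offence_a + defence_b) / 2, (offence_b + defence_a) / 2)


def simulate_poisson(scored_a, conceded_b, scored_b, conceded_a, n_games=10_000):
    """Same idea with Poisson draws whose rates are balanced sample means."""
    rate_a = (np.mean(scored_a) + np.mean(conceded_b)) / 2
    rate_b = (np.mean(scored_b) + np.mean(conceded_a)) / 2
    return summarise(rng.poisson(rate_a, n_games), rng.poisson(rate_b, n_games))


# Made-up goal counts purely for illustration (not taken from the dataset).
city_scored, city_conceded = [3, 2, 4, 1, 3], [0, 1, 0, 1, 0]
united_scored, united_conceded = [1, 2, 0, 2, 1], [1, 0, 2, 1, 1]

empirical = simulate_empirical(
    city_scored, united_conceded, united_scored, city_conceded
)
poisson = simulate_poisson(
    city_scored, united_conceded, united_scored, city_conceded
)
print("empirical:", empirical)
print("poisson:  ", poisson)
```

Dropping the `conceded_*` draws from `simulate_empirical` would give the simpler offence-only simulation of item 1. Other draw policies, such as splitting the draw probability between the teams, are equally defensible as long as the report states which one was used.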

2 Predicting a Stock price

Consider the AAPL stock dataset, which records the daily AAPL stock prices from the 1980s to date. The data include the Open – the price when the market opens, High – the highest price of the day, Low – the lowest price of the day, Volume – the number of shares traded, Close – the price when the market closes, and the Adjusted close – the stock price adjusted to account for any stock splits that have occurred.

Objective: Using the given dataset, we would like to build a model that can predict the price of the stock for the next five days.

1. Create the time series of the given stock prices. You should consider the adjusted close prices. Comment on the graph obtained, pointing out trends or possible sharp price changes.

2. Construct a predictive model of stock prices with any predictors you feel are relevant. You may introduce additional attributes into the dataset, e.g. moving averages (see https://en.wikipedia.org/wiki/Moving_average), Bollinger bands (see https://en.wikipedia.org/wiki/Bollinger_Bands), etc. Justify why your model is appropriate to use.

3. Write down the mathematical equation of your fitted model and evaluate your model. Make sure to withhold a subset of the data for testing. You should aim for a model with high accuracy.

4. Include in your report a discussion of whether you could make any money with your predictive model.
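As a rough illustration of items 1 to 3, the sketch below builds moving-average and Bollinger-band features, withholds the most recent rows for testing, and fits a linear regression by ordinary least squares. It rests on several assumptions: a synthetic random walk replaces the real AAPL file, the column name "Adj Close", the 20-day window, the 80/20 split and the choice of predictors are all illustrative, and a linear model is only one of many that could be justified.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices stand in for the real AAPL file (assumption).
rng = np.random.default_rng(seed=1)
dates = pd.bdate_range("2017-01-02", periods=300)
prices = pd.DataFrame(
    {"Adj Close": 100 + np.cumsum(rng.normal(0.1, 1.0, size=len(dates)))},
    index=dates,
)

window, horizon = 20, 5  # 20-day indicators, 5-day-ahead target (illustrative)
rolling = prices["Adj Close"].rolling(window)
features = pd.DataFrame({
    "close": prices["Adj Close"],
    "sma": rolling.mean(),            # simple moving average
    "band_width": 4 * rolling.std(),  # upper minus lower Bollinger band (2 sd each side)
})
features["target"] = prices["Adj Close"].shift(-horizon)  # price 5 trading days later
data = features.dropna()

# Chronological split: the last 20% of rows are withheld for testing.
split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]

predictors = ["close", "sma", "band_width"]


def design_matrix(frame):
    """Prepend a column of ones so the model has an intercept."""
    return np.column_stack([np.ones(len(frame)), frame[predictors].to_numpy()])


# Ordinary least squares: target = b0 + b1*close + b2*sma + b3*band_width.
coefficients, *_ = np.linalg.lstsq(
    design_matrix(train), train["target"].to_numpy(), rcond=None
)

predicted = design_matrix(test) @ coefficients
rmse = float(np.sqrt(np.mean((test["target"].to_numpy() - predicted) ** 2)))
print("coefficients (b0, b1, b2, b3):", coefficients)
print("test RMSE:", rmse)
```

The printed coefficients are the numbers that would go into the written model equation. The split is chronological rather than random because shuffling time-series rows lets future prices leak into training; a stricter version would also leave a gap of `horizon` rows between the training and test sets.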

Mark Scheme

The following areas are assessed:

1. Man. City vs. Man. U. model + justification + evaluation [30 marks]

2. AAPL Stock prices prediction model + justification + evaluation [30 marks]

3. Quality of coding [20 marks]

4. Writing a report (up to 5 pages including graphs) interpreting the results [20 marks]

Indicative weights on the assessed learning outcomes are given above. The following is a guide for the marking:

First (≥ 70 to 100 marks): A complete coverage of data science techniques exploring the dataset; both predictive models are detailed and well justified, along with the evaluation of the regression model and perhaps an attempt to evaluate how good your model for finding the winning team is; and a well-written and well-structured report on the results obtained from the datasets and any decisions that may be recommended.

Second Upper (≥ 60 to 69 marks): A good coverage of data science techniques exploring the dataset; both predictive models are justified with an appreciable accuracy for the regression model; and a well-structured narrative on the results obtained from the datasets and any decisions that may be recommended.

Second Lower (≥ 50 to 59 marks): Some techniques used for model building and evaluation are overlooked; at least one predictive model, partially justified and with an appreciable accuracy, is given; and a good narrative of the findings about the dataset with few deficiencies.

Third (≥ 40 to 49 marks): Essential data science techniques are covered; at least one predictive model is given with some justification; and a written report describing some of the work done.

Fail (≤ 39 marks): The submission does not satisfy the pass criteria, but will still receive some marks in most cases.

Non-submission: A mark of 0 will be awarded.