MAT005
Coursework 2020
Time Series and Forecasting
Dr Tracey England
Background to the coursework
• The manager of a local medical walk-in centre has no experience of time series or forecasting, and has asked for your help.
• The aim of this coursework is to analyse the number of patients that attend a local walk-in centre and predict the future number of patients.
• The centre manager is keen to know how many patients will come into the walk-in centre each day over the next week.
• The centre is open 7 days a week.
What forecasts are they interested in?
The manager of the GP walk-in centre would like to know:
• The number of patients that will come into the centre (each day) over the next 7 days.
• Whether there is any pattern in the data, as this will help the manager plan the staff rota accordingly.
The data
• The data is available on Learning Central (19/20-MAT005 Time Series and Forecasting under the Assessment section) within an Excel spreadsheet called TimeSeriesCourseworkData19_20.xls.
• The Data worksheet lists the number of patients that come in to the walk-in centre each day. The data covers the time period between 1st April 2015 and 31st March 2019.
• Use the data to predict the daily number of patients between 1st April 2019 and 7th April 2019. If you are able to predict accurately beyond that, please do.
What do you need to do in your analysis
• A preliminary analysis of the data including both numerical and graphical summaries.
• Examine the components of the time series: the underlying trend, seasonality and error and produce a decomposition plot.
• Investigate a selection of time series models to see which model provides a good fit to the observed data.
Baseline & simple approaches, including: Naïve, Mean, Moving Average, Simple Linear Regression.
Complex approaches including: SES, Holt Linear, Holt Winters, Multiple Linear Regression, ARIMAs.
• Remember to include the appropriate error statistics and graphical comparisons for each forecasting model.
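As a rough illustration of the simplest baseline approaches listed above, the Python sketch below produces 7-day forecasts with the Naïve, Mean and Moving Average methods. The series `counts` is made-up data for illustration only (it is not the coursework data), and the function names and the 7-day window are assumptions chosen for this example.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Naive method: repeat the last observed value for every future day."""
    return [history[-1]] * horizon


def mean_forecast(history, horizon):
    """Mean method: repeat the mean of all observed values."""
    return [mean(history)] * horizon


def moving_average_forecast(history, horizon, window=7):
    """Moving average: repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon


# Hypothetical daily patient counts for two weeks (illustration only).
counts = [120, 110, 105, 100, 115, 140, 150,
          125, 112, 108, 102, 118, 145, 155]

horizon = 7  # forecast the next 7 days
methods = [
    ("Naive", naive_forecast),
    ("Mean", mean_forecast),
    ("Moving average", moving_average_forecast),
]
for name, method in methods:
    forecast = method(counts, horizon)
    print(name, [round(value, 1) for value in forecast])
```

All three methods give a flat forecast, so none of them can reproduce a day-of-week pattern; that is what makes them useful as a yardstick for the more complex models.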
Sections required within the poster
1. An appropriate title for the poster. Please remember to include your name and student number.
2. An introduction to the problem and how you have decided to tackle it.
3. Numerical Summaries which describe the variation within the data.
4. Graphical Summaries (e.g. time plot, seasonal plot, scatter plot)
5. Decomposition of the data to examine the trend, seasonality and error.
6. Baseline model (e.g. Naïve)
7. Extrapolation Models (e.g. SES, Holt Linear, Holt Winters)
8. Regression (Simple Linear Regression, Multiple Linear Regression)
9. ARIMAs including an examination of autocorrelation.
10. Summary of Error Statistics for each method (training & test sets, overall); e.g. MSE, MAPE
11. Summary of 7-day forecasts
12. Conclusions & recommendations
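The two error statistics named in item 10 can be computed as in the Python sketch below. It uses the usual textbook definitions of MSE and MAPE; the function names and the example numbers are assumptions for illustration, not part of the coursework data.

```python
def mse(actual, predicted):
    """Mean squared error: the average of the squared forecast errors."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent.

    MAPE is undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    percentage_errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(percentage_errors) / len(actual)


# Hypothetical observed counts and forecasts (illustration only).
observed = [100, 200]
forecast = [110, 190]
print("MSE:", mse(observed, forecast))
print("MAPE (%):", mape(observed, forecast))
```

Calculating each statistic separately on the training and test sets, as item 10 asks, shows whether a model that fits the past well also forecasts well.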
Some helpful hints
• Please remember to use an initialisation set (first 70%) and a test set (remaining 30%) when developing your models.
• Please note that, as this is real-world data, your models may not fit it perfectly; you are looking for the model that gives a realistic fit to the data and will provide believable projections beyond the end of the data set. You might need to clean the data.
• When you are describing your preliminary analysis, and the models you have used to produce your forecasts, explain how confident you are in your forecasts and why. Discuss the difficulties you had with the data and / or fitting the models. It makes each project individual. I am not expecting everyone to tackle this in the same way.
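The 70% / 30% split mentioned in the first hint can be sketched in Python as below. Because the observations are ordered in time, the split is taken in date order rather than at random. The function name and the rounding rule for the split point are assumptions for this example.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered series into an initialisation set and a test set.

    The first `train_fraction` of the observations form the initialisation
    set and the remainder form the test set. The order is preserved, so the
    test set always lies after the initialisation set in time.
    """
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    split_point = round(len(series) * train_fraction)
    return series[:split_point], series[split_point:]


# Hypothetical series of ten daily counts (illustration only).
daily_counts = [120, 110, 105, 100, 115, 140, 150, 125, 112, 108]
initialisation_set, test_set = chronological_split(daily_counts)
print(len(initialisation_set), len(test_set))
```

Models are fitted on the initialisation set only, and the test set is kept back to check how well the forecasts hold up on data the model has not seen.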
Computer Software Packages
• Excel
• ‘R’
• Python?
• Other?
• A mixture
• PowerPoint for the poster – you will find it easier than using Word.
• Please keep a copy of all your files in case we need to see them.
Deadline
• The assignment must be handed in to the Maths school office by 2pm on Thursday 26th March 2020. A copy of these instructions can be found on Learning Central (19/20-MAT005 Time Series and Forecasting under the Assessment section).
• You are asked to produce an A3 poster to describe the analysis you have carried out and the results you have obtained.
• Please keep an electronic copy of your analysis and the poster in case we want to see the electronic files.
Finally
• Plagiarism will not be accepted, and if discovered will result in both students failing the coursework.
• No extensions to the deadline will be allowed.
• Don’t leave the coursework until the last minute – forecasting always takes longer than you think.
• Use it as practice for techniques that you might need during your dissertation or in a future job.
When you can expect to get feedback
• We will aim to mark all the coursework by the start of the week beginning 20th April 2020
• The provisional marks will be released during that week
• Comments / feedback will be captured and can be fed back as required
• Module marks will be fed into the Exam Board
Any questions?