QBUS6840 Predictive Analytics Lecture 2: Data Patterns, Graphing, Time Series Components, and Forecast Accuracy
Discipline of Business Analytics
The University of Sydney Business School
Forecasting
- Definitions/terminologies
- Forecasting problems and data
- Quantitative/data-driven vs. qualitative methods
- Forecasting principles/steps
Table of contents
Data and plotting
Components of Time Series
Naïve Forecasting Methods
Prediction Errors and Measures
Evaluating forecast accuracy
Online textbook Chapter 2:
https://otexts.com/fpp2/graphics.html and Chapter 3:
https://otexts.com/fpp2/simple-methods.html
Data carry information/knowledge: image, video, sound, text, preference, categorical, numerical, or any form you can think of.
Traditionally, data were collected to prove some "hypotheses", to infer knowledge or causality:
- A pharmaceutical company conducts a study on the effectiveness of its COVID-19 vaccine candidates through lengthy clinical trials.
- A company uses face-to-face interviews to ask how much people would be willing to pay for a product.
Today, data are often collected in a passive way thanks to digital technology, and can be used for near real-time decision making:
- data-driven drug development (e.g. to identify candidate compounds)
- dynamic pricing to quickly adapt to demand and supply (e.g. Uber's surge pricing)
Plotting data is the first step in any data analysis task, including predictive analytics tasks.
- Visualize many features of the data: trend, correlation/relationships, etc.
- Gain insights into the data
The type of the data determines the type of graphing technique.
- Two popular ones: the time series plot and the scatter plot
Data Graphing: Some examples
Hans Rosling's video: 200 Countries, 200 Years, 4 Minutes
Google Charts
https://developers.google.com/chart/interactive/docs/
Good graphs convey both the patterns and the randomness in the data: message + noise.
https://developers.google.com/chart/interactive/docs/gallery
Python Plotting
Almost all types of plots can be produced with matplotlib in Python.
You use "import matplotlib.pyplot as plt" to import the plotting functionality.
Always follow these steps:
1. Prepare data: load the data from a file and/or process it into the form needed for plotting
2. Define a drawing window: size, subplots, etc. (or use the default settings via plt.plot())
3. Use the main plotting functions plot and/or scatter, etc., depending on which plots are needed
Please see Tutorials 1 and 2.
Many examples online, e.g. the matplotlib gallery.
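As an illustration of these three steps, here is a minimal sketch using a synthetic monthly series (the data and labels are made up for illustration; the tutorials use the actual datasets):

```python
import numpy as np
import matplotlib.pyplot as plt

# 1. Prepare data: here a synthetic monthly series with trend, seasonality
#    and noise; in the tutorials this step would instead load a dataset.
t = np.arange(120)                                   # 10 years of monthly observations
y = (50 + 0.3 * t                                    # upward trend
     + 10 * np.sin(2 * np.pi * t / 12)               # seasonal pattern with M = 12
     + np.random.normal(scale=3, size=t.size))       # irregular component

# 2. Define a drawing window: size and (optionally) subplots.
fig, ax = plt.subplots(figsize=(10, 4))

# 3. Use the main plotting function.
ax.plot(t, y, linewidth=1)
ax.set_xlabel("Month index")
ax.set_ylabel("Value")
ax.set_title("A synthetic monthly time series")
plt.show()
```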
Time Series Plots
Time Series Components
See an example in Lecture02 Example01.py
The Trend Component
Reflects the long-run growth or decline in the time series.
What type of trend? The apparent trend may depend on the length of the observed time series, so we must extrapolate with care.
The Cyclical Component
- Slow rises and falls that are not in a regularly repeating pattern; no fixed period
- Often related to "business cycles"
- Thus, very difficult to model!
Figure: The data exhibit a cyclic pattern every 8-10 years
The Seasonal Component
- Rises and falls that are in a regular, repeating pattern, on a seasonal basis such as months of the year or days of the week
- There is a fixed seasonal period/frequency, denoted by M
Example: Time series plot
Example: Seasonal Plot
See an example in Lecture02 Example02.py
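Lecture02 Example02.py is not reproduced here; the sketch below shows one common way to build a seasonal plot (one line per year, months on the horizontal axis) with pandas and matplotlib. The data are synthetic and only for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative monthly data indexed by date (replace with the lecture's dataset).
idx = pd.date_range("2015-01-01", periods=60, freq="MS")
y = pd.Series(100 + 15 * np.sin(2 * np.pi * idx.month / 12)
              + np.random.normal(scale=5, size=len(idx)), index=idx)

# Reshape so rows are months (1..12) and columns are years,
# then plot one line per year: this is the seasonal plot.
seasonal = y.groupby([y.index.month, y.index.year]).mean().unstack()
seasonal.plot(figsize=(10, 4))
plt.xlabel("Month")
plt.ylabel("Value")
plt.title("Seasonal plot: one line per year")
plt.show()
```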
Difference between Seasonal and Cyclic Patterns
- Cyclic: the rises and falls are not of a fixed frequency
- Seasonal: the rises and falls are associated with the calendar, and the frequency M is unchanging (every 12 months, 7 days, etc.)
Figure: The data exhibit both a cyclic pattern and a seasonal pattern.
The Irregular Component
- Irregular fluctuations follow no visualizable pattern; statistical models are needed to capture them
- Assumed unexplainable
- Might be "unusual" events: earthquakes, accidents, hurricanes, wars, strikes
- Or just random variations: e.g. noise, customer behaviors
Example: Tasty Cola sales (in hundreds of cases)
The Upward Trend
Seasonal Patterns:
Septembers
Irregular Fluctuations
… after the seasonal component is removed
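The Tasty Cola figures are not reproduced here, but the idea of separating trend, seasonal and irregular components can be sketched with statsmodels' seasonal_decompose (an assumption: the lecture code may use a different routine or dataset; the series below is synthetic).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Illustrative monthly series with trend, seasonality (M = 12) and noise.
idx = pd.date_range("2016-01-01", periods=72, freq="MS")
y = pd.Series(200 + 2 * np.arange(72)
              + 30 * np.sin(2 * np.pi * idx.month / 12)
              + np.random.normal(scale=8, size=72), index=idx)

# Additive decomposition: Y_t = trend_t + seasonal_t + irregular_t
result = seasonal_decompose(y, model="additive", period=12)
result.plot()                       # panels: observed, trend, seasonal, residual
plt.show()

# The series "after the seasonal component is removed":
deseasonalised = y - result.seasonal
```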
We denote a time series of length T (or N) as
Y = {Y1, Y2, …, YT }, or Y1:T
where Yt is the observation at time point t.
The forecast of YT+h based on data Y1:T is denoted as ŶT+h|T :
- h = 1, 2, … is called the (forecast) horizon
- ŶT+h|T is an h-step-ahead forecast.
- Sometimes, we simply write ŶT+h for ŶT+h|T .
The estimation error (residual) is
et = Yt − Ŷt
The seasonal period is denoted by M and the general seasonal index by m (taking one of the values 1, …, M).
Naïve forecasting method: Most Recent Value
ŶT+h = YT , h = 1, 2, …
This is the baseline model to which all forecast models should be compared.
Monthly beer consumption in Australia
Figure: Monthly beer consumption in Australia, naive forecast (multi-step-ahead). Mean error measures: −43.8, 43.8, −32.7, 32.7.
Figure: Monthly beer consumption in Australia, naive forecast (one-step-ahead). Mean error measures: −3.6, 22.1, −3.7, 15.9.
Seasonal Naïve Method: Most Recent Season's Value
Ŷt+h = Yt+h−M , h = 1, 2, …
Drift Method
A variant of the naïve method that allows the forecasts to change over time.
The amount of change over time, called drift, is the average change
seen in the historical data.
Given a (historical) time series
Y1:T = {Y1, Y2, …, YT},
the drift method defines the forecast for time point T + h as
ŶT+h := YT + (h / (T − 1)) Σ_{t=2}^{T} (Yt − Yt−1),
i.e., adding h times the average change to the most recent observation YT.
It can be proved that
ŶT+h = YT + h (YT − Y1) / (T − 1).
This is equivalent to drawing a line between the first and last observations, and using that line to forecast for times after T.
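A minimal sketch of the three simple methods discussed above, assuming the history is a one-dimensional numpy array (the function names are mine, not from the lecture code):

```python
import numpy as np

def naive_forecast(y, h):
    """Most recent value: Yhat_{T+h} = Y_T for every horizon h."""
    return np.full(h, y[-1])

def seasonal_naive_forecast(y, h, M):
    """Most recent season's value: Yhat_{T+h} = Y_{T+h-M}."""
    return np.array([y[len(y) - M + (i % M)] for i in range(h)])

def drift_forecast(y, h):
    """Drift method: Yhat_{T+h} = Y_T + h * (Y_T - Y_1) / (T - 1)."""
    T = len(y)
    drift = (y[-1] - y[0]) / (T - 1)
    return y[-1] + drift * np.arange(1, h + 1)

y = np.array([10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0])
print(naive_forecast(y, 3))              # repeats the last observation
print(seasonal_naive_forecast(y, 3, 4))  # repeats the last season (M = 4)
print(drift_forecast(y, 3))              # adds the average change at each step
```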
Google stock price forecast
The forecast error
Let Ŷt = Ŷt|t−1 be a one-step-ahead forecast of Yt, based on data Y1:t−1. The forecast error is the difference between Yt and its forecast Ŷt.
- It depends on how we define the "difference".
For numerical data, the forecast error is often defined as
et = Yt − Ŷt
For categorical data, the forecast error is measured in terms of disagreement:
et = 0 if Yt = Ŷt, and et = 1 if Yt ≠ Ŷt.
MUST have actual data to compute errors!
Forecast accuracy measures
A forecast method is unbiased if:
E(et) = 0 ⇐⇒ E(Yt) = E(Ŷt)
which implies that the average error over the sample is close to zero:
(1/T) Σt (Yt − Ŷt) ≈ 0.
Is this a good criterion to assess forecast accuracy? WHY?
Answer: should we choose the method whose sum of errors (i.e. average error) is closest to 0? Desirable, but what about the SIGN of the errors?
Accuracy measures
Mean Absolute Deviation (MAD)
MAD = (1/T) Σt |Yt − Ŷt|
MAD is the average distance between the actual values and the forecasts, i.e. the average absolute forecast error.
Mean Squared Error (MSE)
MSE = (1/T) Σt (Yt − Ŷt)²
MSE is like a forecast variance, if forecasts are unbiased.
Root Mean Squared Error (RMSE) = the square root of MSE.
MAD and RMSE are in original units of data. MSE penalises
large errors more than MAD.
Accuracy measures
Mean Absolute Percentage Error (MAPE)
MAPE = (100/T) Σt |(Yt − Ŷt) / Yt| (in %)
Forecast errors are expressed as percentages of the actual data points, e.g. 10%. Very popular in business forecasting.
E.g., MAPE = 10% means that on average the forecast error is 10% of the data value.
- Sometimes, knowing that the MAPE is, say, 10% can be more valuable than knowing that the error is, e.g., 12 megalitres (as with MAD or RMSE)
Cannot be used if any Yt = 0.
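The four measures can be written as short numpy functions; a sketch under the formulas above (the function names and the small example arrays are illustrative):

```python
import numpy as np

def mad(y, yhat):
    """Mean Absolute Deviation: average of |Y_t - Yhat_t| (same units as Y)."""
    return np.mean(np.abs(y - yhat))

def mse(y, yhat):
    """Mean Squared Error: average of (Y_t - Yhat_t)^2; penalises large errors."""
    return np.mean((y - yhat) ** 2)

def rmse(y, yhat):
    """Root Mean Squared Error: square root of MSE (same units as Y)."""
    return np.sqrt(mse(y, yhat))

def mape(y, yhat):
    """Mean Absolute Percentage Error, in %; undefined if any Y_t = 0."""
    return 100 * np.mean(np.abs((y - yhat) / y))

y    = np.array([100.0, 110.0, 120.0, 130.0])
yhat = np.array([ 98.0, 115.0, 118.0, 135.0])
print(mad(y, yhat), mse(y, yhat), rmse(y, yhat), mape(y, yhat))
```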
Figure: Monthly beer consumption in Australia, naive forecast (multi-step-ahead). Mean error measures: −43.8, 43.8, −32.7, 32.7.
Figure: Monthly beer consumption in Australia, naive forecast (one-step-ahead). Mean error measures: −3.6, 22.1, −3.7, 15.9.
Which measure is best?
MAD:
- same units as Y
- does not heavily penalise a small number of large errors
MSE:
- harder to interpret
- heavily penalises large errors
- RMSE has the same units as Y
MAPE:
- measures percentage error
All can be used simultaneously. Report the measure(s) that the decision maker/manager can BEST understand. Are one or two large errors highly UNDESIRABLE? Yes? Use MSE. No? Use MAD or MAPE.
Which measure is best?
MAD, MSE, RMSE and MAPE are all suitable for numerical data.
For categorical or directional forecasting, percentage agreement and/or percentage disagreement are often used.
e.g. how many elections has this model correctly forecast,
compared to the total? Ans. 3
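A small sketch of percentage agreement/disagreement for categorical (e.g. direction up/down) forecasts; the labels below are purely illustrative:

```python
import numpy as np

actual   = np.array(["up", "up", "down", "up", "down", "down"])
forecast = np.array(["up", "down", "down", "up", "up", "down"])

agreement = np.mean(actual == forecast)              # fraction of correct forecasts
print("Percentage agreement   :", 100 * agreement)          # 4/6 correct -> 66.7%
print("Percentage disagreement:", 100 * (1 - agreement))
```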
Training and test sets
An objective way of assessing a model's forecast accuracy is to use a test set to evaluate its forecasts.
It is often not enough to look at how well a model fits the historical data. We can only determine the accuracy of forecasts by considering how well a model performs on new data.
- A model which fits the data well does not necessarily forecast well.
- A near perfect fit can always be obtained by using a model with enough parameters.
- Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.
Occam’s Razor: Simpler methods tend to work well on average
in practice.
Training and test sets
When choosing models, it is common to use a portion of the
available data for fitting, and use the rest of the data for testing
the model. Then the testing data can be used to measure how
well the model is likely to forecast on new data.
The size of the test set is typically about 20% of the total
sample, although this value depends on how long the sample is
and how far ahead you want to forecast. The size of the test set
should ideally be at least as large as the maximum forecast
horizon required.
Time series references often call the training set the “in-sample
data” and the test set the “out-of-sample data”
Refresher: cross validation for cross-sectional data
Consider leave-one-out cross validation
1. Select observation i (leave one out) for the test set, and use the remaining observations as the training set.
2. Compute the error on the test observation.
3. Repeat the above steps for i = 1, 2, …, T, where T is the total number of observations.
4. Compute the forecast accuracy measures based on the errors.
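For the cross-sectional refresher, a minimal sketch using scikit-learn's LeaveOneOut splitter (the linear model and the synthetic data are only for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

# Illustrative cross-sectional data: one predictor, one response.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 1))
y = 2.0 + 3.0 * X[:, 0] + rng.normal(scale=0.5, size=30)

errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    errors.append(y[test_idx][0] - pred[0])   # one error per left-out observation

errors = np.array(errors)
print("LOOCV MAD :", np.mean(np.abs(errors)))
print("LOOCV RMSE:", np.sqrt(np.mean(errors ** 2)))
```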
Cross-validation for time series data
Two approaches: sliding window and expanding window
- Chronological testing, since order matters!
- Use consecutive time steps (with a fixed or variable length) for training/building a model
- Test using the next one or a few observations
- Average over all passes: report this, or use it to compare models
Testing: we could use one-step-ahead or h-step-ahead forecasts.
Example: using an expanding window with one-step-ahead forecasts
Figure: The blue dots are training data; the red dots are test data.
Example: using an expanding window with four-step-ahead forecasts
Figure: The blue dots are training data; the red dots are test data. h = 4.
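A sketch of expanding-window cross-validation with one-step-ahead forecasts, using the naive method as the forecaster (the minimum training length of 12 and the synthetic data are assumptions for illustration):

```python
import numpy as np

def expanding_window_cv(y, min_train=12):
    """One-step-ahead expanding-window CV with the naive forecast Yhat_{t+1} = Y_t."""
    errors = []
    for t in range(min_train, len(y)):
        train = y[:t]             # all observations up to time t (expanding window)
        forecast = train[-1]      # naive one-step-ahead forecast
        errors.append(y[t] - forecast)
    return np.array(errors)

# Illustrative data: a trending series with noise.
rng = np.random.default_rng(1)
y = 100 + 0.5 * np.arange(60) + rng.normal(scale=2, size=60)

e = expanding_window_cv(y)
print("CV MAD :", np.mean(np.abs(e)))
print("CV RMSE:", np.sqrt(np.mean(e ** 2)))
```

A sliding window would instead train on y[t - window : t], so that the oldest observations are dropped as the window moves forward.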