Week-2 The Forecaster’s Toolbox
Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman
Business Forecasting Analytics
ADM 4307 – Fall 2021
The Forecaster’s Toolbox
Ahmet Kandakoglu, PhD
20 September, 2021
Outline
• A tidy forecasting workflow
• Some simple forecasting methods
• Naïve method
• Seasonal naïve method
• Average method
• Drift method
• Residual diagnostics
• Evaluating forecasting accuracy
• Prediction intervals
• Questions
Fall 2021 ADM 4307 Business Forecasting Analytics 2
A Tidy Forecasting Workflow
• The process of producing forecasts for time series data can be broken down into a
few steps
• To illustrate the process, we will fit linear trend models to national GDP data stored in
global_economy
Data Preparation (Tidy)
• The first step in forecasting is to prepare data in the correct format.
• This process may involve loading in data, identifying missing values, filtering the time series,
and other pre-processing tasks.
library(fpp3)

# Add GDP per capita to the global_economy tsibble
gdppc <- global_economy %>%
  mutate(GDP_per_capita = GDP / Population) %>%
  select(Year, Country, GDP, Population, GDP_per_capita)

> gdppc
# A tsibble: 15,150 x 5 [1Y]
# Key: Country [263]
    Year Country             GDP Population GDP_per_capita
 1  1960 Afghanistan  537777811.    8996351           59.8
 2  1961 Afghanistan  548888896.    9166764           59.9
 3  1962 Afghanistan  546666678.    9345868           58.5
 4  1963 Afghanistan  751111191.    9533954           78.8
 5  1964 Afghanistan  800000044.    9731361           82.2
 6  1965 Afghanistan 1006666638.    9938414          101.
 7  1966 Afghanistan 1399999967.   10152331          138.
 8  1967 Afghanistan 1673333418.   10372630          161.
 9  1968 Afghanistan 1373333367.   10604346          130.
10  1969 Afghanistan 1408888922.   10854428          130.
# … with 15,140 more rows
Plot the Data (Visualize)
• An essential step in understanding the data.
• Looking at your data allows you to identify common patterns, and subsequently specify an
appropriate model.
gdppc %>%
  filter(Country == "Sweden") %>%
  autoplot(GDP_per_capita) +
  labs(y = "$US", title = "GDP per capita for Sweden")
Define a Model (Specify)
• There are many different time series models that can be used for forecasting.
• In this case the model function is TSLM() (time series linear model), the response variable is
GDP_per_capita and it is being modelled using trend()
TSLM(GDP_per_capita ~ trend())
Train the Model (Estimate)
• Once an appropriate model is specified, we next train the model on some data.
• One or more models can be trained using the model() function.
• A mable is a model table; each cell corresponds to a fitted model.
fit <- gdppc %>%
  model(trend_model = TSLM(GDP_per_capita ~ trend()))

> fit
# A mable: 263 x 2
# Key: Country [263]
   Country             trend_model
 1 Afghanistan         <TSLM>
 2 Albania             <TSLM>
 3 Algeria             <TSLM>
 4 American Samoa      <TSLM>
 5 Andorra             <TSLM>
 6 Angola              <TSLM>
 7 Antigua and Barbuda <TSLM>
 8 Arab World          <TSLM>
 9 Argentina           <TSLM>
10 Armenia             <TSLM>
# … with 253 more rows
Check Model Performance (Evaluate)
• Once a model has been fitted, it is important to check how well it has performed on the data.
• There are several diagnostic tools available to check model behavior.
• Accuracy measures allow one model to be compared against another.
Produce Forecasts (Forecast)
• With an appropriate model specified, estimated and checked, it is time to produce the
forecasts using forecast().
• Specify the number of future observations to forecast. For example, forecasts for the next 3
years can be generated using h = 3. We can also use natural language; e.g., h = "3 years".
• A fable is a forecast table with point forecasts and distributions.
> fit %>% forecast(h = "3 years")
# A fable: 789 x 5 [1Y]
# Key: Country, .model [263]
   Country        .model       Year    GDP_per_capita  .mean
 1 Afghanistan    trend_model  2018      N(526, 9653)   526.
 2 Afghanistan    trend_model  2019      N(534, 9689)   534.
 3 Afghanistan    trend_model  2020      N(542, 9727)   542.
 4 Albania        trend_model  2018   N(4716, 476419)  4716.
 5 Albania        trend_model  2019   N(4867, 481086)  4867.
 6 Albania        trend_model  2020   N(5018, 486012)  5018.
 7 Algeria        trend_model  2018   N(4410, 643094)  4410.
 8 Algeria        trend_model  2019   N(4489, 645311)  4489.
 9 Algeria        trend_model  2020   N(4568, 647602)  4568.
10 American Samoa trend_model  2018  N(12491, 652926) 12491.
# … with 779 more rows
Visualizing Forecasts
fit %>%
  forecast(h = "3 years") %>%
  filter(Country == "Sweden") %>%
  autoplot(gdppc) +
  labs(y = "$US", title = "GDP per capita for Sweden")
Some Simple Forecasting Methods
• Some forecasting methods are very simple and surprisingly effective.
• Here are four methods that we will use as benchmarks for other forecasting
methods:
• Average method
• Naïve method
• Seasonal naïve method
• Drift method
Average Method
• Forecast of all future values is equal to the mean of the historical data $\{y_1, \dots, y_T\}$:
$$\hat{y}_{T+h|T} = \bar{y} = (y_1 + \dots + y_T)/T$$
MEAN(y)
# y contains the time series
bricks <- aus_production %>%
  filter_index("1970 Q1" ~ "2004 Q4")
bricks %>% model(MEAN(Bricks))
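As a rough illustration of the formula, the sketch below computes the average-method forecast in base R. The series `y` and horizon `h` are made-up values, and this is only a hand calculation, not how MEAN() is implemented in fable.

```r
# Hypothetical toy series (not the bricks data)
y <- c(12, 15, 14, 16, 13)
h <- 3  # number of future periods to forecast

# Average method: every future value is forecast as the historical mean
mean_forecast <- rep(mean(y), h)
print(mean_forecast)
```

The forecast is flat: the same value is repeated for every horizon.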
Naïve Method
• Forecasts equal to last observed value.
• Simple to use and understand, very low cost and low accuracy.
$$\hat{y}_{T+h|T} = y_T$$
NAIVE(y)
bricks %>% model(NAIVE(Bricks))
Seasonal Naïve Method
• Forecasts equal to last value from same season.
$$\hat{y}_{T+h|T} = y_{T+h-m(k+1)}$$
($m$ = seasonal period and $k$ is the integer part of $(h-1)/m$)
SNAIVE(y ~ lag(m))
bricks %>% model(SNAIVE(Bricks ~ lag("year")))
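The index arithmetic in the seasonal naïve formula is easier to follow on a small example. The base R sketch below applies it to a made-up quarterly series; the vector `y`, the period `m = 4` and the horizons are assumptions for illustration, and `n` plays the role of $T$.

```r
# Hypothetical quarterly series covering two years, so m = 4
y <- c(10, 20, 30, 40, 11, 21, 31, 41)
m <- 4
n <- length(y)  # T in the slide's notation
h <- 1:6        # forecast horizons

# k is the integer part of (h - 1) / m
k <- (h - 1) %/% m

# Seasonal naive: take the last observed value from the same season
snaive_forecast <- y[n + h - m * (k + 1)]
print(snaive_forecast)
```

Horizons 1 to 4 reuse the last observed year, and horizons 5 and 6 wrap around to the same quarters again.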
Drift Method
• A variation on the naïve method is to allow the forecasts to increase or decrease over time,
where the amount of change over time (called the drift) is set to be the average change seen
in the historical data.
• So the forecast for time $T + h$ is given by:
$$\hat{y}_{T+h|T} = y_T + \frac{h}{T-1} \sum_{t=2}^{T} (y_t - y_{t-1}) = y_T + h \left( \frac{y_T - y_1}{T-1} \right)$$
• This is equivalent to drawing a line between the first and last observation, and extrapolating it
into the future.
RW(y ~ drift())
Drift Method
bricks %>% model(RW(Bricks ~ drift()))
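To see why the drift forecast is the same as extending the line through the first and last observations, the base R sketch below computes it both ways. The series `y` is made up, and this is a hand calculation rather than fable's RW() implementation.

```r
# Hypothetical toy series
y <- c(100, 104, 103, 109, 112)
n <- length(y)  # T in the slide's notation
h <- 1:3

# Form 1: last value plus h times the average historical change
drift <- mean(diff(y))
fc_average_change <- y[n] + h * drift

# Form 2: line through the first and last observations, extrapolated
fc_line <- y[n] + h * (y[n] - y[1]) / (n - 1)

print(fc_average_change)
print(fc_line)
```

Both forms agree because the sum of successive differences telescopes to the last value minus the first.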
Example 2 – Australian Quarterly Beer Production
# Set training data from 1992 to 2006
train <- aus_production %>%
  filter_index("1992 Q1" ~ "2006 Q4")
# Fit the models
beer_fit <- train %>%
  model(
    Mean = MEAN(Beer),
    Drift = RW(Beer ~ drift()),
    `Naïve` = NAIVE(Beer),
    `Seasonal naïve` = SNAIVE(Beer)
  )
# Generate forecasts for 14 quarters
beer_fc <- beer_fit %>% forecast(h = 14)
# Plot forecasts against actual values
beer_fc %>%
  autoplot(train, level = NULL) +
  autolayer(
    filter_index(aus_production, "2007 Q1" ~ .),
    colour = "black"
  ) +
  labs(y = "Megalitres", title = "Forecasts for quarterly beer production") +
  guides(colour = guide_legend(title = "Forecast"))
Fitted Values and Residuals
• A residual in forecasting is the difference between an observed value and its
forecast based on other observations:
$$e_i = y_i - \hat{y}_i$$
• For time series forecasting, a residual is based on one-step forecasts; that is,
$\hat{y}_{t|t-1}$ is the forecast of $y_t$ based on observations $y_1, \dots, y_{t-1}$.
• The one-step forecasts $\hat{y}_{t|t-1}$ are also called fitted values.
Fitted Values and Residuals
The fitted values and residuals from a model can be obtained using the augment() function.
augment(beer_fit)
> augment(beer_fit)
# A tsibble: 240 x 6 [1Q]
# Key: .model [4]
.model Quarter Beer .fitted .resid .innov
1 Mean 1992 Q1 443 436. 6.55 6.55
2 Mean 1992 Q2 410 436. -26.4 -26.4
3 Mean 1992 Q3 420 436. -16.4 -16.4
4 Mean 1992 Q4 532 436. 95.6 95.6
5 Mean 1993 Q1 433 436. -3.45 -3.45
6 Mean 1993 Q2 421 436. -15.4 -15.4
7 Mean 1993 Q3 410 436. -26.4 -26.4
8 Mean 1993 Q4 512 436. 75.6 75.6
9 Mean 1994 Q1 449 436. 12.6 12.6
10 Mean 1994 Q2 381 436. -55.4 -55.4
# … with 230 more rows
Residual Diagnostics
• A good forecasting method will yield residuals with the following properties:
• The residuals are uncorrelated. If there are correlations between residuals, then there is
information left in the residuals which should be used in computing forecasts.
• The residuals have zero mean. If the residuals have a mean other than zero, then the
forecasts are biased.
• Any forecasting method that does not satisfy these properties can be
improved. That does not mean that forecasting methods that satisfy these
properties cannot be improved.
Residual Diagnostics
• It is possible to have several forecasting methods for the same data set, all of
which satisfy these properties. Checking these properties is important to see if
a method is using all available information well, but it is not a good way for
selecting a forecasting method.
• If either of these two properties is not satisfied, then the forecasting method
can be modified to give better forecasts
• Adjusting for bias is easy: if the residuals have mean 𝑚, then simply add 𝑚 to
all forecasts and the bias problem is solved
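The bias adjustment can be sketched in a few lines of base R. The residuals and forecasts below are made-up numbers, and `m` is the residual mean as in the bullet above (not the seasonal period).

```r
# Hypothetical residuals (observed minus fitted) from some method
resid <- c(2.5, 1.5, 3.0, 1.0)
# Hypothetical point forecasts from the same method
forecasts <- c(100, 101, 102)

# A positive mean residual means the method forecasts too low on average
m <- mean(resid)

# Bias adjustment: add the mean residual to every forecast
adjusted_forecasts <- forecasts + m
print(adjusted_forecasts)

# Shifting the fitted values by m leaves residuals with zero mean
print(mean(resid - m))
```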
Residual Diagnostics
• In addition to these essential properties, it is useful (but not necessary) for the
residuals to also have the following two properties:
• The residuals have constant variance.
• The residuals are normally distributed.
• These two properties make the calculation of prediction intervals easier.
However, a forecasting method that does not satisfy these properties cannot
necessarily be improved.
Example: Google Stock Price
Naïve forecast:
$$\hat{y}_{t|t-1} = y_{t-1}$$
$$e_t = y_t - y_{t-1}$$
Note: $e_t$ are one-step-forecast residuals
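For the naïve method, the fitted values are the series shifted by one period, so the residuals are the first differences. A minimal base R sketch on made-up prices (not the Google data):

```r
# Hypothetical daily closing prices
y <- c(50, 52, 51, 55, 54)

# Naive fitted values: the previous observation (none exists for t = 1)
fitted_values <- c(NA, y[-length(y)])

# One-step residuals e_t = y_t - y_{t-1}
resid <- y - fitted_values
print(resid)
```

Apart from the missing first value, `resid` matches `diff(y)`.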
Example: Google Stock Price
# google_2015 is assumed to hold the 2015 daily closing prices for GOOG
autoplot(google_2015, Close) +
  labs(y = "$US", title = "Google daily closing stock prices in 2015")
Example: Google Stock Price
google_2015 %>%
model(NAIVE(Close)) %>%
gg_tsresiduals()
ACF of Residuals
• We assume that the residuals are white noise (uncorrelated, mean zero,
constant variance). If they aren’t, then there is information left in the residuals
that should be used in computing forecasts.
• So, a standard residual diagnostic is to check the ACF of the residuals of a
forecasting method.
• We expect these to look like white noise.
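One way to get a feel for what a white-noise ACF looks like is to simulate one. The base R sketch below uses simulated normal noise in place of real residuals. The bounds of plus or minus 1.96 divided by the square root of the series length are the usual approximate 95% limits for white noise; they are an addition here and are not stated on the slides.

```r
set.seed(1)

# Simulated white noise standing in for residuals (an assumption)
e <- rnorm(200)

# Sample autocorrelations at lags 1 to 10 (drop lag 0, which is always 1)
r <- acf(e, lag.max = 10, plot = FALSE)$acf[-1]

# Approximate 95% bounds for the autocorrelations of white noise
bound <- 1.96 / sqrt(length(e))

print(round(r, 3))
print(bound)
# Number of lags outside the bounds; for white noise this should be small
print(sum(abs(r) > bound))
```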
Forecast Errors
• Forecast “error”: the difference between an observed value and its forecast
$$e_i = y_i - \hat{y}_i$$
• Unlike residuals, forecast errors on the test set involve multi-step forecasts.
• These are true forecast errors as the test data is not used in computing the
forecast $\hat{y}_i$.
Measures of Forecast Accuracy
• Key measures to evaluate the accuracy:
Mean absolute error: $\text{MAE} = \text{mean}(|e_i|)$
Mean squared error: $\text{MSE} = \text{mean}(e_i^2)$
Mean absolute percentage error: $\text{MAPE} = 100\,\text{mean}(|e_i|/|y_i|)$
Root mean squared error: $\text{RMSE} = \sqrt{\text{mean}(e_i^2)}$
• MAE, MSE, RMSE are all scale dependent.
• MAPE is scale independent but is only sensible if $y_i \gg 0$ for all $i$, and $y$ has a
natural zero.
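The four measures can be computed directly from their definitions. The base R sketch below does this on made-up actual and forecast values; in practice the accuracy() function would be used instead.

```r
# Hypothetical test-set observations and their forecasts
actual <- c(100, 110, 120, 130)
fc     <- c( 90, 115, 120, 140)

# Forecast errors e_i = y_i - yhat_i
e <- actual - fc

mae  <- mean(abs(e))                       # mean absolute error
mse  <- mean(e^2)                          # mean squared error
rmse <- sqrt(mse)                          # root mean squared error
mape <- 100 * mean(abs(e) / abs(actual))   # mean absolute percentage error

print(c(MAE = mae, MSE = mse, RMSE = rmse, MAPE = mape))
```

MAE and RMSE are in the units of the data, MSE is in squared units, and MAPE is a percentage.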
Example – Australian Quarterly Beer Production
recent_production <- aus_production %>%
  filter(year(Quarter) >= 1992)
beer_train <- recent_production %>%
  filter(year(Quarter) <= 2007)
beer_fit <- beer_train %>%
  model(
    Mean = MEAN(Beer),
    `Naïve` = NAIVE(Beer),
    `Seasonal naïve` = SNAIVE(Beer),
    Drift = RW(Beer ~ drift())
  )
beer_fc <- beer_fit %>% forecast(h = 10)
beer_fc %>%
  autoplot(recent_production, level = NULL) +
  labs(y = "Megalitres", title = "Forecasts for quarterly beer production") +
  guides(colour = guide_legend(title = "Forecast"))
Example – Australian Quarterly Beer Production
accuracy(beer_fc, recent_production)
Method                  RMSE    MAE   MAPE  MASE
Drift method           64.90  58.88  14.58  4.12
Mean method            38.45  34.83   8.28  2.44
Naïve method           62.69  57.40  14.18  4.01
Seasonal naïve method  14.31  13.40   3.17  0.94
• It is obvious from the graph that the seasonal naïve method is best for these data, although it
can still be improved.
• Sometimes, different accuracy measures will lead to different results as to which forecast
method is best.
Training and Test Sets
• It is important to evaluate forecast accuracy using genuine forecasts.
• It is not enough to look at how well a model fits the historical data. The
accuracy of forecasts can only be determined by considering how well a model
performs on new data that were not used when fitting the model.
• When choosing models, it is common to use a portion of the available data for
fitting and use the rest of the data for testing the model.
Training and Test Sets
• The following points should be noted:
• A model which fits the data well does not necessarily forecast well
• A perfect fit can always be obtained by using a model with enough
parameters
• Over-fitting a model to data is as bad as failing to identify the systematic
pattern in the data
• The test set must not be used for any aspect of model development or
calculation of forecasts.
• Forecast accuracy is based only on the test set
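The split itself is simple to sketch in base R. The example below holds out the last three observations of a made-up series, forecasts them with the naïve method using only the training part, and measures accuracy on the held-out part. The series and the test-set size are assumptions for illustration.

```r
# Hypothetical series of ten observations
y <- c(10, 12, 11, 13, 12, 14, 13, 15, 14, 16)
n_test <- 3

# Training set: everything except the last n_test observations
train <- head(y, -n_test)
# Test set: the last n_test observations, never used for fitting
test <- tail(y, n_test)

# Naive forecasts computed from the training set only
fc <- rep(train[length(train)], n_test)

# Accuracy is measured on the test set only
test_rmse <- sqrt(mean((test - fc)^2))
print(test_rmse)
```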
Prediction Intervals
• A prediction interval gives an interval within which we expect $y_t$ to lie with a
specified probability.
• Assuming the forecast errors are uncorrelated and normally distributed, a
simple 95% prediction interval for the next observation in a time series is
$$\hat{y}_t \pm 1.96\,\hat{\sigma}$$
where $\hat{\sigma}$ is an estimate of the standard deviation of the forecast distribution.
• When forecasting one step ahead, the standard deviation of the forecast
distribution is almost the same as the standard deviation of the residuals.
Prediction Intervals
• Naive forecast with prediction interval
google_2015 %>%
model(NAIVE(Close)) %>%
forecast(h = 10) %>% hilo()
• The hilo() function converts the forecast distributions into intervals.
• By default, 80% and 95% prediction intervals are returned, although other options are possible
via the level argument.
# A tsibble: 10 x 7 [1]
# Key: Symbol, .model [1]
Symbol .model day Close .mean `80%` `95%`
1 GOOG NAIVE(Close) 253 N(759, 125) 759. [745, 773]80 [737, 781]95
2 GOOG NAIVE(Close) 254 N(759, 250) 759. [739, 779]80 [728, 790]95
3 GOOG NAIVE(Close) 255 N(759, 376) 759. [734, 784]80 [721, 797]95
4 GOOG NAIVE(Close) 256 N(759, 501) 759. [730, 788]80 [715, 803]95
5 GOOG NAIVE(Close) 257 N(759, 626) 759. [727, 791]80 [710, 808]95
6 GOOG NAIVE(Close) 258 N(759, 751) 759. [724, 794]80 [705, 813]95
7 GOOG NAIVE(Close) 259 N(759, 876) 759. [721, 797]80 [701, 817]95
8 GOOG NAIVE(Close) 260 N(759, 1002) 759. [718, 799]80 [697, 821]95
9 GOOG NAIVE(Close) 261 N(759, 1127) 759. [716, 802]80 [693, 825]95
10 GOOG NAIVE(Close) 262 N(759, 1252) 759. [714, 804]80 [690, 828]95
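The intervals in the first row of this output can be roughly reproduced by hand from the forecast distribution N(759, 125). The base R sketch below uses the rounded mean and variance as printed, so it only matches the table up to rounding.

```r
# Rounded values read from the first row of the hilo() output
point_forecast <- 759
variance <- 125
sigma_hat <- sqrt(variance)

# Normal quantiles for 80% and 95% intervals (about 1.28 and 1.96)
level <- c(0.80, 0.95)
z <- qnorm(1 - (1 - level) / 2)

lower <- point_forecast - z * sigma_hat
upper <- point_forecast + z * sigma_hat
print(data.frame(level = level, lower = round(lower), upper = round(upper)))
```

The 95% row uses the same 1.96 multiplier as the formula given earlier; the 80% row uses a smaller multiplier and is therefore narrower.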
Prediction Intervals
google_2015 %>%
  model(NAIVE(Close)) %>%
  forecast(h = 10) %>%
  autoplot(google_2015) +
  labs(title = "Google daily closing stock price", y = "$US")
Prediction Intervals
• Point forecasts are often useless without prediction intervals.
• The value of prediction intervals is that they express the uncertainty in the
forecasts.
• If we only produce point forecasts, there is no way of telling how accurate
the forecasts are.
• But if we also produce prediction intervals, then it is clear how much
uncertainty is associated with each forecast.
Questions (True or False)
• Good forecast methods should have normally distributed residuals.
• A model with small residuals will give good forecasts.
• The best measure of forecast accuracy is MAPE.
• If your model doesn’t forecast well, you should make it more complicated.
• Always choose the model with the best forecast accuracy as measured on the
test set.