Week 5 – ARIMA Models
Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman
Business Forecasting Analytics
ADM 4307 – Fall 2021
ARIMA Models
Ahmet Kandakoglu, PhD
18 October, 2021
Outline
• Review of last lecture
• Stationarity
• Differencing
• Backshift Notation
• Autoregressive Models
• Moving Average Models
• ARIMA models
• ARIMA modelling in R
ADM 4307 Business Forecasting Analytics – Fall 2021
Review of Last Lecture
• Simple exponential smoothing
• Trend methods
• Seasonal methods
• Taxonomy of exponential smoothing methods
• Innovations state space models
ARIMA Models
• ARIMA models provide another approach to time series forecasting
• ARIMA
• AR: autoregressive (lagged observations as inputs)
• I: integrated (differencing to make series stationary)
• MA: moving average (lagged errors as inputs)
Exponential Smoothing vs. ARIMA
• While exponential smoothing models are based on a description of the trend and
seasonality in the data, ARIMA models aim to describe the autocorrelations in
the data.
• Exponential smoothing and ARIMA models are the two most widely used
approaches to time series forecasting.
Box-Jenkins Approach
[Slide figure: flowchart of the iterative Box-Jenkins modelling cycle – model
identification, parameter estimation, diagnostic checking, then forecasting]
Stationarity
• A stationary time series is one whose properties do not depend on the time at
which the series is observed.
• So time series with trends, or with seasonality, are not stationary.
• Some cases can be confusing: a time series with cyclic behavior (but no
trend or seasonality) is stationary. Because the cycles are not of fixed
length, we cannot be sure, before observing the series, where the peaks
and troughs of the cycles will be.
• In general, a stationary time series will have no obvious observable patterns in
the long-term.
Stationary?
[Slide figure: nine time series panels labelled (a) to (i)]
• Obvious seasonality rules out series (d),
(h) and (i).
• Trends and changing levels rule out
series (a), (c), (e), (f) and (i).
• Increasing variance also rules out (i).
• That leaves only (b) and (g) as stationary
series.
• At first glance, the strong cycles in series
(g) might appear to make it non-stationary.
But these cycles are aperiodic: in the
long term, the timing of the cycles is not
predictable. Hence the series is stationary.
Stationarity
• A stationary series:
• is roughly horizontal
• has constant variance
• has no patterns predictable in the long term
• Transformations help to stabilize the variance.
• For ARIMA modelling, we also need to stabilize the mean.
Non-stationarity in the Mean
• Identifying non-stationary series
• time plot
• The ACF of stationary data drops to zero relatively quickly
• The ACF of non-stationary data decreases slowly
• Besides looking at the time plot of the data, the ACF plot is also useful for
identifying non-stationary time series
Example: Dow-Jones index
[Slide figure: time plot and ACF of the Dow-Jones index; the slowly decaying
ACF indicates a non-stationary series]
Differencing
• One way to make a time series stationary is to compute the differences
between consecutive observations. This is known as differencing
• Transformations such as logarithms can help to stabilize the variance of a time
series
• Differencing can help stabilize the mean of a time series by removing changes
in the level of a time series, and so eliminating trend and seasonality
Differencing
• The differenced series is the change between consecutive observations in the
original series, and can be written as
𝑦𝑡′ = 𝑦𝑡 − 𝑦𝑡−1
• The differenced series will have only 𝑇 − 1 values, since it is not possible to
calculate a difference 𝑦𝑡′ for the first observation
• When the differenced series is white noise, the model for the original series
can be written as
𝑦𝑡 − 𝑦𝑡−1 = 𝑒𝑡, that is, 𝑦𝑡 = 𝑦𝑡−1 + 𝑒𝑡, where 𝑒𝑡 is white noise
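As a concrete sketch of this definition (base R shown here; in the fpp3 framework the same operation is difference()):

```r
# First difference y'_t = y_t - y_{t-1} of a short example series
y <- c(10, 12, 15, 14, 18)

dy <- diff(y)   # base R; the fpp3/tsibble equivalent is difference(y)
dy              # 2  3 -1  4
length(dy)      # 4, i.e. T - 1 values
```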
Random Walk Model
• A random walk model is very widely used for non-stationary data, particularly
financial and economic data:
𝑦𝑡 = 𝑦𝑡−1 + 𝑒𝑡
• Random walks typically have:
• long periods of apparent trends up or down
• sudden and unpredictable changes in direction
• The forecasts from a random walk model are equal to the last observation, as
future movements are unpredictable and equally likely to be up or down.
Thus, the random walk model underpins naïve forecasts
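A minimal base-R sketch of these two points, using simulated data:

```r
# Random walk: y_t = y_{t-1} + e_t, built by accumulating white noise
set.seed(123)
e <- rnorm(200)     # white noise errors
y <- cumsum(e)      # random walk

# Naive forecasts: all h future values equal the last observation
h  <- 10
fc <- rep(y[length(y)], h)
```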
Random Walk Model
• A closely related model allows the differences to have a non-zero mean. Then
𝑦𝑡 = 𝑐 + 𝑦𝑡−1 + 𝑒𝑡
• The value of 𝑐 is the average of the changes between consecutive
observations. If 𝑐 is positive, the average change is an increase in the value
of 𝑦𝑡, so 𝑦𝑡 will tend to drift upwards. But if 𝑐 is negative, 𝑦𝑡 will tend to
drift downwards
• This is the model behind the drift method
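A base-R sketch of the drift idea, using a simulated series (the drift value 0.5 is arbitrary):

```r
# Random walk with drift: y_t = c + y_{t-1} + e_t, with c estimated by the
# average observed change (y_T - y_1) / (T - 1)
set.seed(1)
y <- cumsum(0.5 + rnorm(100))   # simulated series with drift c = 0.5
n <- length(y)

c_hat <- (y[n] - y[1]) / (n - 1)   # estimated drift
fc    <- y[n] + (1:12) * c_hat     # drift-method forecasts for h = 1..12
```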
Second-Order Differencing
• Occasionally the differenced data will not appear stationary, and it may be
necessary to difference the data a second time to obtain a stationary series:
𝑦𝑡′′ = 𝑦𝑡′ − 𝑦𝑡−1′ = 𝑦𝑡 − 2𝑦𝑡−1 + 𝑦𝑡−2
• In this case, 𝑦𝑡′′ will have 𝑇 − 2 values.
• We would then model the change in the changes of the original data.
• In practice, it is almost never necessary to go beyond second-order
differences
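Continuing the small base-R sketch, differencing twice:

```r
# Second-order difference: y''_t = y'_t - y'_{t-1} = y_t - 2*y_{t-1} + y_{t-2}
y <- c(3, 7, 4, 9, 12, 8)

d2 <- diff(y, differences = 2)   # difference the already-differenced series
d2                               # -7  8 -2 -7
length(d2)                       # 4, i.e. T - 2 values
```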
Seasonal Differencing
• A seasonal difference is the difference between an observation and the
corresponding observation from the previous season:
𝑦𝑡′ = 𝑦𝑡 − 𝑦𝑡−𝑚
where 𝑚 is the number of seasons.
• For monthly data, 𝑚 = 12
• For quarterly data, 𝑚 = 4
Seasonal Differencing
• These are also called lag-𝑚 differences, as we subtract the observation after a
lag of 𝑚 periods
• If seasonally differenced data appear to be white noise, then an appropriate
model for the original data is
𝑦𝑡 = 𝑦𝑡−𝑚 + 𝑒𝑡
• Forecasts from this model are equal to the last observation from the relevant
season. That is, this model gives seasonal naïve forecasts
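A base-R sketch with a toy monthly series (the values are made up for illustration; fpp3 would use difference(y, 12)):

```r
# Lag-12 (seasonal) difference of a monthly series: y'_t = y_t - y_{t-12}
y <- as.numeric(1:36) + rep(c(0, 3, 1, 4, 2, 5, 1, 3, 0, 2, 4, 6), 3)

sd12 <- diff(y, lag = 12)   # base R; fpp3 would use difference(y, 12)
length(sd12)                # 24, i.e. T - 12 values

# Seasonal naive forecast for the next period: the observation from the
# same season, 12 months before the forecast origin
fc1 <- y[length(y) - 12 + 1]
```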
Example 1: Antidiabetic Drug Sales
Monthly anti-diabetic drug sales in Australia
a10 <- PBS %>% filter(ATC2 == "A10") %>%
summarise(Cost = sum(Cost)/1e6)
a10 %>% autoplot(Cost)
a10 %>% autoplot(log(Cost))
a10 %>% autoplot(log(Cost) %>% difference(12))
Example 2: US Electricity Generation
• The seasonally differenced series is closer to being
stationary.
• Remaining non-stationarity can be removed with a
further first difference
Example 3: Corticosteroid Drug Sales
h02 <- PBS %>% filter(ATC2 == "H02") %>%
summarise(Cost = sum(Cost)/1e6)
h02 %>% autoplot(Cost)
h02 %>% autoplot(log(Cost))
h02 %>% autoplot(log(Cost) %>% difference(12))
h02 %>% autoplot(log(Cost) %>% difference(12) %>%
difference(1))
Seasonal Differencing
• When both seasonal and first differences are applied…
• it makes no difference which is done first – the result will be the same
• if the data have a strong seasonal pattern, we recommend that seasonal
differencing be done first because sometimes the resulting series will be
stationary and there will be no need for a further first difference. If first
differencing is done first, there will still be seasonality present.
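This commutativity is easy to check numerically in base R:

```r
# Applying a seasonal (lag-12) and a first difference in either order
# gives exactly the same result
set.seed(42)
y <- round(10 * rnorm(36))   # integer-valued, so the comparison is exact

a <- diff(diff(y, lag = 12), lag = 1)   # seasonal difference first
b <- diff(diff(y, lag = 1), lag = 12)   # first difference first
identical(a, b)                         # TRUE
```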
Interpretation of Differencing
• It is important that if differencing is used, the differences are interpretable:
• First differences are the change between one observation and the next
• Seasonal differences are the change from one year to the next
• Other lags are unlikely to make much interpretable sense and should be
avoided (For example, taking lag 3 differences for yearly data results in a
model which cannot be sensibly interpreted)
Unit Root Tests
• One way to determine more objectively if differencing is required is to use a
unit root test
• These are statistical hypothesis tests of stationarity that are designed for
determining whether differencing is required
• A number of unit root tests are available. They are based on different
assumptions and may lead to conflicting answers
• In this course, we use the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
KPSS Test
• KPSS test:
• null hypothesis is that the data are stationary and non-seasonal and
• we look for evidence that the null hypothesis is false.
• Consequently, small p-values (e.g., less than 0.05) suggest that differencing is
required.
• The test can be computed using the unitroot_kpss() function.
Example: KPSS Test
google_2018 <- gafa_stock %>%
filter(Symbol == "GOOG", year(Date) == 2018) %>%
mutate(trading_day = row_number()) %>%
update_tsibble(index = trading_day, regular = TRUE)
google_2018 %>% features(Close, unitroot_kpss)
# A tibble: 1 x 3
Symbol kpss_stat kpss_pvalue
1 GOOG 0.573 0.0252
The p-value is less than 0.05, indicating that the null hypothesis is rejected.
That is, the data are not stationary.
Example: KPSS Test
Difference the data, and apply the test again.
google_2018 %>% mutate(diff_close = difference(Close)) %>%
features(diff_close, unitroot_kpss)
# A tibble: 1 x 3
Symbol kpss_stat kpss_pvalue
1 GOOG 0.0955 0.1
This time, the p-value is greater than 0.05.
We can conclude that the differenced data appear stationary.
Automatically Selecting Differences
• This process of using a sequence of KPSS tests to determine the appropriate number of first
differences is carried out using the unitroot_ndiffs() feature.
• A similar feature for determining whether seasonal differencing is required is unitroot_nsdiffs(),
which uses the measure of seasonal strength to determine the appropriate number of
seasonal differences required.
google_2018 %>% features(Close, unitroot_ndiffs)
# A tibble: 1 x 2
Symbol ndiffs
1 GOOG 1
• As we saw from the KPSS tests above, one difference is required to make the google_2018
data stationary.
Automatically Selecting Differences
h02 %>% mutate(log_sales = log(Cost)) %>% features(log_sales, unitroot_nsdiffs)
# A tibble: 1 x 1
nsdiffs
1 1
• Because unitroot_nsdiffs() returns 1 (indicating one seasonal difference is required), we
apply unitroot_ndiffs() to the seasonally differenced data.
h02 %>% mutate(d_log_sales = difference(log(Cost), 12)) %>%
features(d_log_sales, unitroot_ndiffs)
# A tibble: 1 x 1
ndiffs
1 1
• Since unitroot_ndiffs() returns 1, one difference is required.
Backshift Notation
• The backward shift operator 𝐵 is a useful notational device when working with
time series lags:
𝐵𝑦𝑡 = 𝑦𝑡−1
• 𝐵 has the effect of shifting the data back one period. Two applications of 𝐵 to
𝑦𝑡 shift the data back two periods:
𝐵(𝐵𝑦𝑡) = 𝐵²𝑦𝑡 = 𝑦𝑡−2
• For monthly data, if we wish to consider the same month last year, the notation
is:
𝐵¹²𝑦𝑡 = 𝑦𝑡−12
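These rules can be mimicked with a small helper function (hypothetical, written only to illustrate the notation):

```r
# The backshift operator as an R function: B^k y_t = y_{t-k}
# (a made-up helper for illustration, not part of any package)
backshift <- function(y, k = 1) c(rep(NA, k), y[seq_len(length(y) - k)])

y <- c(5, 8, 6, 9, 11, 7)
backshift(y, 1)   # NA  5  8  6  9 11   (B y_t   = y_{t-1})
backshift(y, 2)   # NA NA  5  8  6  9   (B^2 y_t = y_{t-2})
```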
Backshift Notation
• The backward shift operator is convenient for describing the process of
differencing. A first difference can be written as:
𝑦𝑡′ = 𝑦𝑡 − 𝑦𝑡−1 = (1 − 𝐵)𝑦𝑡
• Similarly, if second-order differences have to be computed, then:
𝑦𝑡′′ = (1 − 𝐵)²𝑦𝑡 = 𝑦𝑡 − 2𝑦𝑡−1 + 𝑦𝑡−2
• In general, a 𝑑th-order difference can be written as:
(1 − 𝐵)ᵈ𝑦𝑡
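The expansion of (1 − B)² can be checked numerically:

```r
# Expanding (1 - B)^2 y_t = y_t - 2*y_{t-1} + y_{t-2} agrees with
# differencing twice
y <- c(2, 5, 3, 8, 6, 9, 4)
n <- length(y)

via_diff   <- diff(y, differences = 2)
via_expand <- y[3:n] - 2 * y[2:(n - 1)] + y[1:(n - 2)]
identical(via_diff, via_expand)   # TRUE
```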
Backshift Notation
• Backshift notation is very useful when combining differences, as the operator
can be treated using ordinary algebraic rules. In particular, terms involving
𝐵 can be multiplied together
• For example, a seasonal difference followed by a first difference can be written
as:
(1 − 𝐵)(1 − 𝐵ᵐ)𝑦𝑡 = 𝑦𝑡 − 𝑦𝑡−1 − 𝑦𝑡−𝑚 + 𝑦𝑡−𝑚−1
Autoregressive Models
• In multiple regression, we forecast the variable of interest using a linear
combination of predictors. In an autoregression model, we forecast the
variable of interest using a linear combination of past values of the variable (a
regression of the variable against itself)
• An autoregressive model of order 𝑝, denoted AR(𝑝), can be written as:
𝑦𝑡 = 𝑐 + 𝜙1𝑦𝑡−1 + 𝜙2𝑦𝑡−2 + ⋯ + 𝜙𝑝𝑦𝑡−𝑝 + 𝑒𝑡
• 𝑐 is a constant and 𝑒𝑡 is white noise. This is like a multiple regression but with
lagged values of 𝑦𝑡 as predictors
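A short simulation sketch using base R's arima.sim() (the coefficient values below are illustrative, not taken from the slides):

```r
# Simulate 200 observations from an AR(2) model
# y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t
# (the phi values are made up, chosen to satisfy the stationarity conditions)
set.seed(1)
y <- arima.sim(model = list(ar = c(0.8, -0.3)), n = 200)

length(y)   # 200
```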
Autoregressive Models
[Slide figure: a simulated AR(1) series (left) and a simulated AR(2) series (right),
where 𝑒𝑡 is normally distributed white noise with mean zero and variance one]
Autoregressive Models
• For an AR(1) model:
• When 𝜙1 = 0 and 𝑐 = 0, 𝑦𝑡 is equivalent to white noise
• When 𝜙1 = 1 and 𝑐 = 0, 𝑦𝑡 is equivalent to a random walk
• When 𝜙1 = 1 and 𝑐 ≠ 0, 𝑦𝑡 is equivalent to a random walk with drift
• When 𝜙1 < 0, 𝑦𝑡 tends to oscillate between positive and negative values
(tends to oscillate around the mean)
Moving Average Models
• A moving average model of order 𝑞, denoted MA(𝑞), uses the past 𝑞 forecast
errors in a regression-like model:
𝑦𝑡 = 𝑐 + 𝑒𝑡 + 𝜃1𝑒𝑡−1 + 𝜃2𝑒𝑡−2 + ⋯ + 𝜃𝑞𝑒𝑡−𝑞
where 𝑒𝑡 is white noise (this is like a multiple regression, but with past errors
as predictors)
• Moving average models should not be confused with moving average smoothing
• A moving average model is used for forecasting future values, while moving
average smoothing is used for estimating the trend-cycle of past values
Moving Average Models
[Slide figure: a simulated MA(1) series (left) and a simulated MA(2) series (right),
where 𝑒𝑡 is normally distributed white noise with mean zero and variance one]
ARIMA Models
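As with the AR examples, an MA process can be simulated with arima.sim() (the theta values are illustrative, not taken from the slides):

```r
# Simulate 200 observations from an MA(2) model
# y_t = e_t + theta1*e_{t-1} + theta2*e_{t-2}
# (theta values are made up; an MA model is stationary for any theta)
set.seed(2)
y <- arima.sim(model = list(ma = c(0.6, 0.4)), n = 200)

length(y)   # 200
```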