ETW3420: Principles of Forecasting and Applications
Topic 6: ARIMA Models
1 Introduction
2 Stationarity and differencing
3 Non-seasonal ARIMA models
4 Model Identification
5 Estimation and order selection
6 ARIMA modelling in R
7 Forecasting
8 Seasonal ARIMA models
Introduction
ARIMA models provide another approach to time series
forecasting.
Exponential smoothing and ARIMA models are the two most
widely used approaches to time series forecasting, and
provide complementary approaches to the problem.
While exponential smoothing models are based on a
description of the trend and seasonality in the data, ARIMA
models aim to describe the autocorrelations in the data.
Before we introduce ARIMA models, we must first discuss the
concept of stationarity and the technique of differencing time
series.
Stationarity
Definition
If {yt} is a stationary time series, then for all s, the distribution of
(yt, . . . , yt+s) does not depend on t.
A stationary series is:
roughly horizontal
constant variance
no patterns predictable in the long-term
Stationarity
Statistical Properties of Stationary Series
Statistically, a time series {yt} is stationary if:
E(yt) = µ < ∞ for all t
Var(yt) = γ0 < ∞ for all t
Cov(yt, yt−j) = γj < ∞ for all t (depending only on the lag j, not on t)
[Figure: “Stationary?” time plots of candidate series: the Dow-Jones index; sales of new one-family houses, USA; price of a dozen eggs in 1993 dollars; annual Canadian lynx trappings; Australian quarterly beer production]
Stationarity
Definition
If {yt} is a stationary time series, then for all s, the distribution of
(yt, . . . , yt+s) does not depend on t.
Transformations help to stabilize the variance.
For ARIMA modelling, we also need to stabilize the mean.
Non-stationarity in the mean
Identifying non-stationary series
Time plot.
The ACF of stationary data drops to zero relatively quickly.
The ACF of non-stationary data decreases slowly.
For non-stationary data, the value of r1 is often large and positive.
Example: Dow-Jones index
[Figure: time plot and ACF of the dj series, and of diff(dj)]
Differencing
Differencing helps to stabilize the mean. The differenced series is
the change between each observation in the original series:
y′t = yt − yt−1.
The differenced series will have only T − 1 values, since it is not
possible to calculate a difference y′1 for the first observation.
Second-order differencing
Occasionally the differenced data will not appear stationary and it
may be necessary to difference the data a second time:
y′′t = y′t − y′t−1
= (yt − yt−1) − (yt−1 − yt−2)
= yt − 2yt−1 + yt−2.
y′′t will have T − 2 values.
In practice, it is almost never necessary to go beyond
second-order differences.
Seasonal differencing
A seasonal difference is the difference between an observation
and the corresponding observation from the previous year:
y′t = yt − yt−m, where m = number of seasons.
For monthly data m = 12. For quarterly data m = 4.
Electricity production
usmelec %>% autoplot()
Electricity production
usmelec %>% log() %>% autoplot()
Electricity production
usmelec %>% log() %>% diff(lag=12) %>%
autoplot()
Electricity production
usmelec %>% log() %>% diff(lag=12) %>%
diff(lag=1) %>% autoplot()
Electricity production
Seasonally differenced series is closer to being stationary.
Remaining non-stationarity can be removed with a further first
difference.
If y′t = yt − yt−12 denotes the seasonally differenced series, then
the twice-differenced series is given by:
y′′t = y′t − y′t−1
= (yt − yt−12) − (yt−1 − yt−13)
= yt − yt−1 − yt−12 + yt−13 .
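These differencing identities are easy to verify numerically. The course works in R; the following is a small Python sketch (the `difference` helper is an invented stand-in for R's `diff()`) checking that differencing twice matches yt − 2yt−1 + yt−2, and that a seasonal difference followed by a first difference matches the expanded form above.

```python
# Sketch (not course code): a stand-in for R's diff(), used to verify
# the differencing identities from the slides on a toy integer series.
def difference(y, lag=1):
    """Return the lag-differenced series y'_t = y_t - y_{t-lag}."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

y = [3, 5, 4, 8, 7, 11, 10, 15, 14, 20]

# First difference has T - 1 values.
d1 = difference(y)
assert len(d1) == len(y) - 1

# Second-order difference: diff of diff equals y_t - 2 y_{t-1} + y_{t-2}.
d2 = difference(d1)
assert d2 == [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]

# Seasonal (m = 4) then first difference equals the expanded form
# y_t - y_{t-1} - y_{t-m} + y_{t-m-1}.
m = 4
sd_then_first = difference(difference(y, lag=m))
assert sd_then_first == [y[t] - y[t - 1] - y[t - m] + y[t - m - 1]
                         for t in range(m + 1, len(y))]
```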
Seasonal differencing
When both seasonal and first differences are applied. . .
it makes no difference which is done first; the result will be
the same.
If seasonality is strong, we recommend that seasonal
differencing be done first, because sometimes the resulting
series will be stationary and there will be no need for a further
first difference.
It is important that if differencing is used, the differences are
interpretable.
Interpretation of differencing
first differences are the change between one observation
and the next;
seasonal differences are the change from one year to the
next.
But taking lag-3 differences for yearly data, for example, results in
a model which cannot be sensibly interpreted.
Unit root tests
Statistical tests to determine the required order of differencing.
1 Augmented Dickey-Fuller (ADF) test: null hypothesis is that the
data are non-stationary.
2 Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test: null hypothesis
is that the data are stationary. (Will use this for our analysis)
3 Other tests available for seasonal data.
library(urca)
summary(ur.kpss(goog))
## #######################
## # KPSS Unit Root Test #
## #######################
## Test is of type: mu with 7 lags.
## Value of test-statistic is: 10.7223
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The test statistic is much larger than the 1% critical value, indicating that the
null hypothesis is rejected, i.e. the data are not stationary.
goog %>% diff() %>% ur.kpss() %>% summary()
## #######################
## # KPSS Unit Root Test #
## #######################
## Test is of type: mu with 7 lags.
## Value of test-statistic is: 0.0324
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
Conclude that the differenced data are stationary.
Automatic selection of first difference
This process of using a sequence of KPSS tests to determine the
appropriate number of first differences is carried out by the
function ndiffs().
ndiffs(goog)
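The sequential-testing idea behind ndiffs() can be sketched in a few lines. This is not the forecast package's implementation: the Python sketch below (the names `ndiffs_sketch` and `trend_corr` are made up) replaces the KPSS test with a crude trend-correlation proxy purely to illustrate the "difference until stationary, up to a small maximum" loop.

```python
# Sketch only: mirrors the sequence-of-tests idea behind forecast::ndiffs(),
# with a stand-in stationarity check (trend correlation) instead of KPSS.
import math

def trend_corr(y):
    """Pearson correlation between y and its time index (0 for constant y)."""
    n = len(y)
    t = list(range(n))
    my, mt = sum(y) / n, sum(t) / n
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    st = math.sqrt(sum((v - mt) ** 2 for v in t))
    if sy == 0 or st == 0:
        return 0.0
    return sum((y[i] - my) * (t[i] - mt) for i in range(n)) / (sy * st)

def ndiffs_sketch(y, max_d=2, threshold=0.3):
    """First-difference until the trend proxy looks stationary, up to max_d."""
    d = 0
    while d < max_d and abs(trend_corr(y)) >= threshold:
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
        d += 1
    return d

# A linear trend plus a bounded oscillation needs exactly one difference.
trending = [0.5 * t + math.sin(t) for t in range(50)]
assert ndiffs_sketch(trending) == 1
```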
Automatic selection of seasonal difference
STL decomposition: yt = Tt + St + Rt
Seasonal strength Fs = max(0, 1 − Var(Rt)/Var(St + Rt))
If Fs > 0.64, do one seasonal difference. Otherwise, no
seasonal difference required.
Use the nsdiffs() function to test.
usmelec %>% log() %>% nsdiffs()
usmelec %>% log() %>% diff(lag=12) %>% ndiffs()
These functions suggest performing both a seasonal and first
difference.
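The seasonal-strength rule is simple to compute once STL components are available. Below is a Python sketch (the helper names are invented; in practice St and Rt would come from an STL fit, and nsdiffs() handles all of this) applying Fs = max(0, 1 − Var(Rt)/Var(St + Rt)) to toy components.

```python
# Sketch: seasonal strength F_s = max(0, 1 - Var(R_t)/Var(S_t + R_t)),
# computed from toy STL-style components rather than a real STL fit.
import math

def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def seasonal_strength(seasonal, remainder):
    sr = [s + r for s, r in zip(seasonal, remainder)]
    return max(0.0, 1.0 - variance(remainder) / variance(sr))

# Strong seasonal pattern with a small remainder -> F_s near 1,
# so by the 0.64 rule of thumb one seasonal difference is taken.
S = [math.sin(2 * math.pi * t / 12) for t in range(120)]
R = [0.05 * math.cos(t) for t in range(120)]
fs = seasonal_strength(S, R)
assert fs > 0.64
```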
Self-practice
For the visitors series, find an appropriate differencing (after
transformation if necessary) to obtain stationary data.
Backshift notation
A very useful notational device is the backward shift operator, B,
which is used as follows:
Byt = yt−1 .
In other words, B, operating on yt, has the effect of shifting the
data back one period. Two applications of B to yt shifts the data
back two periods:
B(Byt) = B^2 yt = yt−2 .
For monthly data, if we wish to shift attention to “the same month
last year,” then B^12 is used, and the notation is B^12 yt = yt−12.
Backshift notation
The backward shift operator is convenient for describing the
process of differencing.
A first difference can be written as
y′t = yt − yt−1 = yt − Byt = (1− B)yt .
Note that a first difference is represented by (1− B).
Similarly, if second-order differences (i.e., first differences of first
differences) have to be computed, then:
y′′t = yt − 2yt−1 + yt−2 = (1 − B)^2 yt .
Backshift notation
A second-order difference is denoted (1 − B)^2.
A second-order difference is not the same as a second
difference, which would be denoted 1 − B^2.
In general, a dth-order difference can be written as
(1 − B)^d yt.
A seasonal difference followed by a first difference can be
written as
(1 − B)(1 − B^m)yt .
Backshift notation
The “backshift” notation is convenient because the terms can be
multiplied together to see the combined effect.
(1 − B)(1 − B^m)yt = (1 − B − B^m + B^(m+1))yt
= yt − yt−1 − yt−m + yt−m−1.
For monthly data, m = 12 and we obtain the same result as earlier.
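The combined effect can be checked mechanically by treating B as an ordinary function on series. This Python sketch (`backshift` and `apply_diff` are illustrative names, not library functions) pads the undefined start-up values with None, mirroring the lost observations at the start of a differenced series.

```python
# Sketch: the backshift operator B as a function on series, used to check
# that (1 - B)(1 - B^m) y expands to y_t - y_{t-1} - y_{t-m} + y_{t-m-1}.
def backshift(y, k=1):
    """Apply B^k: shift the series back k periods (undefined values -> None)."""
    return [None] * k + y[:-k]

def apply_diff(y, lag):
    """(1 - B^lag) y, keeping None wherever a term is undefined."""
    by = backshift(y, lag)
    return [None if (v is None or b is None) else v - b
            for v, b in zip(y, by)]

y = list(range(1, 20))   # toy series; m = 4 for "quarterly" data
m = 4
lhs = apply_diff(apply_diff(y, m), 1)   # (1 - B)(1 - B^m) y
rhs = [y[t] - y[t - 1] - y[t - m] + y[t - m - 1]
       for t in range(m + 1, len(y))]
assert [v for v in lhs if v is not None] == rhs
```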
Autoregressive models
Autoregressive (AR) models:
yt = c + φ1yt−1 + φ2yt−2 + · · · + φpyt−p + εt,
where εt is white noise. This is a multiple regression with lagged
values of yt as predictors.
AR(1) model
yt = 2− 0.8yt−1 + εt
εt ∼ N(0, 1), T = 100.
AR(1) model
yt = c + φ1yt−1 + εt
When φ1 = 0, yt is equivalent to white noise (WN)
When φ1 = 1 and c = 0, yt is equivalent to a random walk (RW)
When φ1 = 1 and c ≠ 0, yt is equivalent to a RW with drift
When φ1 < 0, yt tends to oscillate between positive and
negative values.
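These special cases can be explored by simulation. The sketch below (`simulate_ar1` is a made-up helper; the seed and sample size are arbitrary) simulates the earlier AR(1) example with φ1 = −0.8 and checks that the sample mean settles near the stationary mean c/(1 − φ1) = 2/1.8 ≈ 1.11.

```python
# Sketch: simulating y_t = 2 - 0.8 y_{t-1} + e_t to see the stationary
# mean c / (1 - phi1) emerge. Seed and length are arbitrary choices.
import random

def simulate_ar1(c, phi1, T, seed=1):
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(T):
        prev = c + phi1 * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y

y = simulate_ar1(c=2.0, phi1=-0.8, T=20000)
sample_mean = sum(y) / len(y)
assert abs(sample_mean - 2.0 / 1.8) < 0.1   # mean ~ c / (1 - phi1)
```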
AR(2) model
yt = 8 + 1.3yt−1 − 0.7yt−2 + εt
εt ∼ N(0, 1), T = 100.
0 20 40 60 80 100
Stationarity conditions
We normally restrict autoregressive models to stationary data, and
then some constraints on the values of the parameters are
required.
General condition for stationarity
The complex roots of 1 − φ1z − φ2z² − · · · − φpz^p lie outside the
unit circle on the complex plane.
For p = 1: −1 < φ1 < 1.
For p = 2:
−1 < φ2 < 1, φ2 + φ1 < 1, φ2 − φ1 < 1.
More complicated conditions hold for p ≥ 3.
Estimation software takes care of this.
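For p = 2 the conditions are easy to code directly, and can be cross-checked against the general root condition. A Python sketch (the helper names are invented), using the AR(2) example above with φ1 = 1.3, φ2 = −0.7:

```python
# Sketch: the p = 2 stationarity conditions from the slide as a predicate,
# cross-checked against the root condition via the quadratic formula.
import cmath

def ar2_is_stationary(phi1, phi2):
    return (-1 < phi2 < 1) and (phi2 + phi1 < 1) and (phi2 - phi1 < 1)

def ar2_roots_outside_unit_circle(phi1, phi2):
    # Roots of 1 - phi1 z - phi2 z^2 = 0 (assumes phi2 != 0).
    a, b, c = -phi2, -phi1, 1.0
    disc = cmath.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    return all(abs(z) > 1 for z in roots)

assert ar2_is_stationary(1.3, -0.7)       # the slide's AR(2) example
assert not ar2_is_stationary(0.5, 0.6)    # violates phi1 + phi2 < 1
assert ar2_roots_outside_unit_circle(1.3, -0.7) == ar2_is_stationary(1.3, -0.7)
```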
Moving Average (MA) models
Moving Average (MA) models:
yt = c + εt + θ1εt−1 + θ2εt−2 + · · · + θqεt−q,
where εt is white noise. This is a multiple regression with past
errors as predictors. Don’t confuse this with moving average
smoothing!
MA(1) model
yt = 20 + εt + 0.8εt−1
εt ∼ N(0, 1), T = 100.
MA(2) model
yt = εt − εt−1 + 0.8εt−2
εt ∼ N(0, 1), T = 100.
MA(∞) models
It is possible to write any stationary AR(p) process as an MA(∞)
process.
Example: AR(1)
yt = φ1yt−1 + εt
= φ1(φ1yt−2 + εt−1) + εt
= φ1²yt−2 + φ1εt−1 + εt
= φ1³yt−3 + φ1²εt−2 + φ1εt−1 + εt
· · ·
Provided −1 < φ1 < 1:
yt = εt + φ1εt−1 + φ1²εt−2 + φ1³εt−3 + · · ·
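The expansion can be confirmed numerically: iterating the AR(1) recursion from a zero start gives exactly the truncated MA(∞) sum. A Python sketch (the values and seed are arbitrary choices):

```python
# Sketch: the AR(1) recursion y_t = phi1 y_{t-1} + e_t (zero start, no
# constant) matches the truncated MA expansion sum_j phi1^j e_{t-j}.
import random

rng = random.Random(42)
phi1 = 0.6
eps = [rng.gauss(0.0, 1.0) for _ in range(500)]

# AR(1) recursion with y_{-1} = 0.
y, prev = [], 0.0
for e in eps:
    prev = phi1 * prev + e
    y.append(prev)

# MA expansion at the last time point, truncated at the sample start.
T = len(eps) - 1
ma_value = sum(phi1 ** j * eps[T - j] for j in range(T + 1))
assert abs(y[-1] - ma_value) < 1e-9
```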
Invertibility
Any MA(q) process can be written as an AR(∞) process if we
impose some constraints on the MA parameters.
Then the MA model is called “invertible”.
Invertible models have some mathematical properties that
make them easier to use in practice.
Invertibility of an ARIMA model is equivalent to
forecastability of an ETS model.
Invertibility
General condition for invertibility
The complex roots of 1 + θ1z + θ2z² + · · · + θqz^q lie outside the
unit circle on the complex plane.
For q = 1: −1 < θ1 < 1.
For q = 2:
−1 < θ2 < 1, θ2 + θ1 > −1, θ1 − θ2 < 1.
More complicated conditions hold for q ≥ 3.
Estimation software takes care of this.
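The root condition for q = 2 can be checked with the quadratic formula. This Python sketch (`ma2_is_invertible` is an invented helper, assuming θ2 ≠ 0) uses the earlier MA(2) example, where θ1 = −1 and θ2 = 0.8:

```python
# Sketch: invertibility of an MA(2) via the roots of 1 + theta1 z + theta2 z^2,
# which must lie outside the unit circle (assumes theta2 != 0).
import cmath

def ma2_is_invertible(theta1, theta2):
    disc = cmath.sqrt(theta1 * theta1 - 4 * theta2)
    roots = [(-theta1 + disc) / (2 * theta2),
             (-theta1 - disc) / (2 * theta2)]
    return all(abs(z) > 1 for z in roots)

assert ma2_is_invertible(-1.0, 0.8)     # complex roots, |z| = sqrt(1/0.8) > 1
assert not ma2_is_invertible(0.0, 2.0)  # |z| = sqrt(1/2) < 1: not invertible
```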
ARMA models
Autoregressive Moving Average models:
yt = c + φ1yt−1 + · · · + φpyt−p
+ θ1εt−1 + · · · + θqεt−q + εt.
Predictors include both lagged values of yt and lagged errors.
Conditions on AR coefficients ensure stationarity.
Conditions on MA coefficients ensure invertibility.
ARIMA Models
Autoregressive Integrated Moving Average models:
y′t = c + φ1y′t−1 + · · · + φpy′t−p
+ θ1εt−1 + · · · + θqεt−q + εt,
where y′t is the differenced series.