Models for Nonstationary Time Series
MAS 640 – Time Series Analysis and Forecasting
1/29/2018
Determining Model Order
[Figure: sample ACF (top) and PACF (bottom) of series sim1, lags 0 to 40]
ACF PACF
[1,] -0.87 -0.87
[2,] 0.76 0.00
[3,] -0.67 -0.01
[4,] 0.59 0.04
[5,] -0.53 0.01
[6,] 0.45 -0.06
[7,] -0.40 -0.05
[8,] 0.35 -0.03
[9,] -0.30 0.00
[10,] 0.26 -0.03
[11,] -0.22 -0.01
[12,] 0.19 0.01
[13,] -0.16 0.02
[14,] 0.13 -0.03
[15,] -0.10 0.00
[16,] 0.08 -0.03
[17,] -0.06 -0.04
[18,] 0.05 -0.01
[19,] -0.03 0.05
[20,] 0.01 -0.02
[21,] 0.01 0.01
[22,] -0.02 0.01
[23,] 0.04 0.01
[24,] -0.05 -0.02
[25,] 0.08 0.09
[26,] -0.11 -0.06
[27,] 0.14 0.03
[28,] -0.16 0.01
[29,] 0.16 -0.05
[30,] -0.16 0.02
[31,] 0.16 0.02
[32,] -0.15 -0.01
[33,] 0.15 -0.01
[34,] -0.14 0.00
[35,] 0.13 0.00
[36,] -0.12 0.01
[37,] 0.11 0.00
[38,] -0.10 -0.01
[39,] 0.10 0.03
[40,] -0.10 -0.02
[41,] 0.10 0.02
[42,] -0.12 -0.08
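The pattern in the sim1 table (an ACF that decays geometrically with alternating sign, and a PACF that is essentially zero after lag 1) is what an AR(1) process produces. As a rough check, the sketch below computes the theoretical values with base R's ARMAacf(). The coefficient -0.87 is an assumption read off the lag-1 sample value; the slides do not state how sim1 was generated.

```r
# Theoretical ACF and PACF of an AR(1) process.
# phi = -0.87 is an assumption taken from the lag-1 sample value in the table.
phi  <- -0.87
lags <- 1:5

acf_theory  <- ARMAacf(ar = phi, lag.max = max(lags))[-1]          # drop lag 0
pacf_theory <- ARMAacf(ar = phi, lag.max = max(lags), pacf = TRUE)  # lags 1..5

# ACF follows phi^k; PACF is phi at lag 1 and zero afterwards
print(round(cbind(lag = lags, ACF = acf_theory, PACF = pacf_theory), 2))
```

The first few theoretical ACF values can be compared with the sample values in the table; sample estimates will differ somewhat from theory.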
Determining Model Order
[Figure: sample ACF (top) and PACF (bottom) of series sim3, lags 0 to 100]
ACF PACF
[1,] 0.50 0.50
[2,] 0.01 -0.32
[3,] 0.00 0.24
[4,] 0.00 -0.19
[5,] -0.01 0.15
[6,] -0.01 -0.13
[7,] 0.01 0.12
[8,] 0.01 -0.09
[9,] 0.00 0.08
[10,] 0.00 -0.07
[11,] 0.00 0.07
[12,] 0.01 -0.05
[13,] 0.01 0.05
[14,] 0.00 -0.06
[15,] -0.02 0.04
[16,] -0.01 -0.04
[17,] 0.01 0.05
[18,] 0.01 -0.03
[19,] 0.00 0.02
[20,] 0.00 -0.02
[21,] -0.01 0.00
[22,] -0.03 -0.03
[23,] -0.03 0.01
[24,] -0.01 -0.01
[25,] 0.01 0.02
[26,] 0.02 -0.01
[27,] 0.01 0.00
[28,] -0.01 -0.01
[29,] 0.00 0.02
[30,] 0.00 -0.02
[31,] 0.00 0.01
[32,] 0.00 0.00
[33,] 0.00 0.00
[34,] -0.01 -0.01
[35,] -0.01 0.00
[36,] 0.00 -0.01
[37,] -0.01 0.00
[38,] 0.00 0.00
[39,] 0.00 0.01
[40,] 0.00 -0.01
[41,] -0.01 0.00
[42,] 0.00 0.01
[43,] 0.01 0.00
[44,] 0.01 0.01
[45,] 0.02 0.01
[46,] 0.02 0.01
[47,] 0.02 0.02
[48,] 0.02 0.00
[49,] 0.01 0.00
[50,] 0.00 0.00
[51,] -0.01 -0.01
[52,] -0.01 0.00
[53,] 0.00 0.02
[54,] 0.02 0.01
[55,] 0.01 0.00
[56,] 0.00 0.00
[57,] 0.00 0.01
[58,] 0.00 -0.01
[59,] -0.01 -0.01
[60,] -0.02 -0.02
[61,] -0.02 0.01
[62,] -0.01 -0.01
[63,] -0.01 0.00
[64,] -0.01 0.00
[65,] 0.00 -0.01
[66,] 0.01 0.02
[67,] 0.03 0.01
[68,] 0.02 0.01
[69,] 0.01 0.01
[70,] 0.01 0.00
[71,] 0.00 0.00
[72,] -0.01 -0.01
[73,] -0.01 -0.01
[74,] -0.02 -0.01
[75,] -0.01 0.00
[76,] -0.01 -0.01
[77,] -0.01 0.00
[78,] 0.00 -0.01
[79,] 0.00 0.01
[80,] 0.00 -0.02
[81,] -0.01 0.01
[82,] 0.00 0.00
[83,] 0.00 0.00
[84,] -0.01 -0.01
[85,] 0.00 0.01
[86,] 0.01 0.01
[87,] 0.01 0.00
[88,] 0.01 0.02
[89,] 0.01 0.00
[90,] 0.00 0.00
[91,] -0.01 -0.01
[92,] -0.01 0.00
[93,] 0.00 0.00
[94,] 0.00 0.00
[95,] 0.00 -0.01
[96,] 0.00 0.02
[97,] 0.03 0.02
[98,] 0.04 0.01
[99,] 0.02 0.00
[100,] 0.00 0.00
[101,] 0.01 0.02
[102,] 0.02 0.00
[103,] 0.01 0.00
[104,] -0.01 -0.01
[105,] 0.00 0.02
[106,] 0.02 0.00
[107,] 0.02 0.01
[108,] 0.01 -0.01
[109,] -0.01 -0.01
[110,] -0.02 -0.02
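For sim3 the roles are reversed: the ACF is essentially zero after lag 1 while the PACF decays slowly with alternating sign, which is the signature of an MA(1) process. The sketch below computes the theoretical values for an MA(1) written in R's sign convention, Y_t = e_t + theta * e_(t-1). The value theta = 1 is an assumption, chosen only because it gives a lag-1 autocorrelation of theta / (1 + theta^2) = 0.5; the slides do not state the true parameter.

```r
# Theoretical ACF and PACF of an MA(1) process, Y_t = e_t + theta * e_{t-1}.
# theta = 1 is an assumption chosen to match a lag-1 autocorrelation of 0.5.
theta <- 1

acf_theory  <- ARMAacf(ma = theta, lag.max = 5)[-1]           # drop lag 0
pacf_theory <- ARMAacf(ma = theta, lag.max = 5, pacf = TRUE)  # lags 1..5

# ACF cuts off after lag 1; PACF tails off with alternating sign
print(round(rbind(ACF = acf_theory, PACF = pacf_theory), 2))
```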
Determining Model Order
[Figure: sample ACF (top) and PACF (bottom) of series sim2, lags 0 to 40]
ACF PACF
[1,] 0.82 0.82
[2,] 0.75 0.24
[3,] 0.67 0.02
[4,] 0.62 0.03
[5,] 0.54 -0.05
[6,] 0.48 -0.01
[7,] 0.42 -0.03
[8,] 0.37 -0.01
[9,] 0.33 0.01
[10,] 0.28 -0.04
[11,] 0.24 -0.01
[12,] 0.19 -0.05
[13,] 0.14 -0.03
[14,] 0.10 -0.03
[15,] 0.06 -0.01
[16,] 0.05 0.06
[17,] 0.04 0.04
[18,] 0.03 -0.01
[19,] 0.01 -0.03
[20,] -0.01 -0.03
[21,] -0.01 0.03
[22,] -0.02 0.01
[23,] -0.04 -0.06
[24,] -0.05 -0.02
[25,] -0.07 -0.03
[26,] -0.08 -0.01
[27,] -0.09 -0.02
[28,] -0.10 -0.03
[29,] -0.11 0.00
[30,] -0.13 -0.03
[31,] -0.14 -0.02
[32,] -0.14 0.03
[33,] -0.13 0.02
[34,] -0.13 0.00
[35,] -0.11 0.04
[36,] -0.10 0.02
[37,] -0.09 0.00
[38,] -0.06 0.07
[39,] -0.05 -0.02
[40,] -0.04 -0.02
[41,] -0.03 0.01
[42,] 0.01 0.08
Determining Model Order
[Figure: sample ACF (top) and PACF (bottom) of series sim4, lags 0 to 100]
ACF PACF
[1,] -0.11 -0.11
[2,] -0.30 -0.32
[3,] 0.00 -0.09
[4,] 0.00 -0.12
[5,] -0.01 -0.06
[6,] 0.00 -0.06
[7,] 0.00 -0.03
[8,] 0.01 -0.02
[9,] -0.01 -0.02
[10,] -0.01 -0.02
[11,] 0.02 0.00
[12,] 0.02 0.02
[13,] -0.01 0.01
[14,] -0.02 -0.01
[15,] 0.00 0.00
[16,] 0.00 -0.01
[17,] 0.01 0.00
[18,] 0.01 0.00
[19,] -0.01 0.00
[20,] 0.02 0.02
[21,] -0.01 0.00
[22,] 0.00 0.01
[23,] -0.01 -0.01
[24,] 0.00 0.00
[25,] 0.00 -0.01
[26,] 0.00 0.00
[27,] 0.00 0.00
[28,] 0.00 0.00
[29,] 0.01 0.01
[30,] -0.01 -0.01
[31,] 0.00 0.01
[32,] 0.01 0.00
[33,] -0.01 0.00
[34,] 0.01 0.01
[35,] 0.00 0.00
[36,] -0.02 -0.02
[37,] 0.00 -0.01
[38,] 0.02 0.01
[39,] 0.00 0.00
[40,] 0.00 0.01
[41,] 0.00 0.00
[42,] 0.00 0.01
[43,] 0.01 0.02
[44,] -0.01 0.00
[45,] -0.01 -0.01
[46,] 0.00 -0.01
[47,] 0.01 0.00
[48,] 0.01 0.00
[49,] 0.00 0.00
[50,] 0.00 0.01
[51,] 0.00 0.00
[52,] -0.01 -0.01
[53,] 0.00 -0.01
[54,] 0.00 -0.01
[55,] 0.00 0.00
[56,] 0.01 0.00
[57,] 0.01 0.01
[58,] 0.00 0.00
[59,] 0.00 0.00
[60,] 0.02 0.02
[61,] -0.01 -0.01
[62,] -0.02 -0.01
[63,] 0.01 0.00
[64,] 0.00 -0.01
[65,] 0.00 0.00
[66,] 0.01 0.01
[67,] 0.00 0.00
[68,] -0.02 -0.01
[69,] 0.00 0.00
[70,] 0.01 0.00
[71,] 0.01 0.01
[72,] 0.00 0.00
[73,] -0.01 0.00
[74,] 0.00 0.00
[75,] 0.00 -0.01
[76,] 0.00 -0.01
[77,] 0.01 0.01
[78,] 0.00 -0.01
[79,] -0.01 -0.01
[80,] -0.01 -0.02
[81,] 0.01 0.00
[82,] 0.00 0.00
[83,] 0.00 0.00
[84,] 0.00 0.00
[85,] 0.00 0.01
[86,] -0.02 -0.02
[87,] 0.01 0.01
[88,] 0.01 0.00
[89,] -0.01 0.00
[90,] 0.00 0.00
[91,] 0.00 0.00
[92,] 0.00 0.00
[93,] 0.01 0.02
[94,] -0.01 -0.01
[95,] -0.02 -0.01
[96,] 0.02 0.02
[97,] 0.01 0.01
[98,] -0.02 -0.01
[99,] 0.00 0.00
[100,] 0.01 0.00
[101,] -0.01 0.00
[102,] 0.01 0.01
[103,] 0.00 0.00
[104,] -0.01 -0.01
[105,] 0.00 0.00
[106,] 0.00 0.00
[107,] 0.01 0.01
[108,] -0.02 -0.02
[109,] 0.00 -0.01
[110,] 0.01 -0.01
Outline
- Models for Nonstationary Time Series
- Dealing with nonconstant mean functions
- Dealing with nonconstant variance
- Chapter 5 from text
Detrending and Stationarity
Any time series without a constant mean over time is nonstationary.
$Y_t = \mu_t + e_t$
If $\mu_t$ varies with $t$, the series is nonstationary.
Detrending and Stationarity
- We fixed this issue by building a model for $\mu_t$ and studying the residuals
- “Detrending”
- We called this modeling “deterministic” trends
- Only reasonable if we assume this trend is an intrinsic property of the time series
- Implicitly assumes that the trend is “forever”
- Which is often difficult to believe
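As a reminder of what detrending looks like in practice, here is a minimal sketch on simulated data. The linear trend, its coefficients, and the variable names are all made up for illustration; they do not come from the slides.

```r
# Detrending sketch: fit a deterministic linear trend, then study the residuals.
set.seed(1)
time_index <- 1:100
y <- 2 + 0.5 * time_index + rnorm(100)   # hypothetical trend plus white noise

fit       <- lm(y ~ time_index)          # model for the mean function mu_t
detrended <- residuals(fit)              # series with the fitted trend removed

print(coef(fit))                         # estimated intercept and slope
```

The residuals are only a sensible stationary series if the fitted trend really is a permanent feature of the process, which is the assumption questioned above.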
Random Walk
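- Consider the random walk process
  $Y_t = Y_{t-1} + e_t$
- Started at zero with mean-zero $e_t$, it has constant mean $\mu_t = 0$, but its variance grows with $t$, so it is still nonstationary
- A random walk can appear to trend over short stretches; treating such a trend as deterministic (“forever”) would not be appropriate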
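A quick simulation can make this concrete. The sketch below generates many independent random walks (the number of paths, the length, and standard normal steps are arbitrary choices for illustration) and looks at the mean and variance across paths at each time point.

```r
# Simulate many random walks and look at mean and variance across paths.
set.seed(42)
n_paths <- 2000
n_steps <- 100

# Each column is one walk: the cumulative sum of N(0, 1) steps
steps <- matrix(rnorm(n_steps * n_paths), nrow = n_steps)
walks <- apply(steps, 2, cumsum)

mean_by_time <- rowMeans(walks)         # stays near 0 at every time point
var_by_time  <- apply(walks, 1, var)    # theory: Var(Y_t) = t * Var(e_t)

print(round(var_by_time[c(1, 25, 50, 100)], 1))
```

Any single path can still wander far from zero for long stretches, which is why a short window of a random walk can look like a trend.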
Random Walk – 30 Days of Data
[Figure: 30 realizations from a random walk process; x-axis: Index, y-axis: y[1:30]]
Random Walk – 180 Days of Data
[Figure: 180 realizations from a random walk process; x-axis: Time, y-axis: y]
Price of Gold
[Figure: price of gold over time, with the current price and a forecast marked; x-axis: Time, y-axis: gold]
Oil Price
[Figure: oil price, 1990 to 2005; x-axis: Time, y-axis: oil.price]
CO2 Levels
[Figure: CO2 levels, 1995 to 2015, with the current value and a forecast marked; x-axis: Time, y-axis: co2]
Differencing
- An alternative approach is to study the differenced time series
  $\nabla Y_t = Y_t - Y_{t-1}$
- No assumptions about the trend through time
- No model to fit or parameters to estimate
- Tends to work well in practice
- Good, quick, simple for forecasts
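In R, the difference operator is the base function diff(). A minimal sketch on simulated data (the series below are made up for illustration): differencing a random walk recovers its white-noise steps, and differencing a linear trend leaves a constant.

```r
# First differences with base R's diff().
set.seed(7)
e  <- rnorm(200)
y  <- cumsum(e)        # random walk: Y_t = Y_{t-1} + e_t
dy <- diff(y)          # Y_t - Y_{t-1}; one observation shorter than y

trend   <- 2 + 0.5 * (1:200)   # hypothetical deterministic linear trend
d_trend <- diff(trend)         # constant 0.5: the trend is gone

print(length(dy))
print(range(d_trend))
```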
CREF Time Series
[Figure: CREF time series; x-axis: Time (0 to 500), y-axis: CREF]
CREF Differenced
[Figure: differenced CREF series; x-axis: Time (0 to 500), y-axis: diff(CREF)]
Autocorrelations for Differenced CREF Data
[Figure: sample ACF (top) and PACF (bottom) of diff(CREF), lags 0 to 30]
ACF PACF
[1,] 0.06 0.06
[2,] -0.07 -0.07
[3,] 0.04 0.05
[4,] -0.02 -0.03
[5,] 0.00 0.01
[6,] -0.06 -0.07
[7,] -0.05 -0.04
[8,] -0.07 -0.07
[9,] -0.09 -0.08
[10,] 0.01 0.01
[11,] 0.02 0.01
[12,] -0.07 -0.07
[13,] 0.03 0.03
[14,] 0.10 0.07
[15,] -0.03 -0.05
[16,] 0.07 0.08
[17,] 0.05 0.02
[18,] -0.02 -0.02
[19,] -0.02 -0.02
[20,] 0.01 0.02
[21,] 0.03 0.02
[22,] -0.02 0.00
[23,] 0.02 0.05
[24,] -0.06 -0.07
[25,] 0.00 0.03
[26,] 0.01 0.00
[27,] -0.02 -0.02
[28,] 0.01 0.01
[29,] -0.09 -0.08
[30,] -0.02 -0.03
[31,] 0.02 0.01
[32,] 0.01 0.02
[33,] 0.02 0.01
Air Passenger Miles
[Figure: air passenger miles, 1996 to 2004 (top), and its first difference (bottom); x-axis: Time, y-axis: airmiles]
Autocorrelations for Differenced airmiles Data
[Figure: sample ACF (top) and PACF (bottom) of diff(airmiles), lag axis 0 to 4]
ACF PACF
[1,] -0.31 -0.31
[2,] -0.08 -0.20
[3,] -0.05 -0.16
[4,] 0.11 0.02
[5,] 0.21 0.28
[6,] -0.71 -0.65
[7,] 0.21 -0.25
[8,] 0.09 -0.05
[9,] -0.02 -0.25
[10,] -0.10 -0.16
[11,] -0.23 -0.34
[12,] 0.78 0.39
[13,] -0.24 0.14
[14,] -0.07 0.06
[15,] -0.05 -0.02
[16,] 0.10 -0.10
[17,] 0.20 -0.04
[18,] -0.63 0.00
[19,] 0.18 -0.05
[20,] 0.09 0.02
[21,] -0.02 -0.11
[22,] -0.07 -0.05
[23,] -0.20 -0.07
[24,] 0.68 0.11
[25,] -0.21 0.03
[26,] -0.07 -0.01
[27,] -0.03 0.01
[28,] 0.11 0.09
[29,] 0.16 0.00
[30,] -0.58 -0.05
[31,] 0.17 -0.02
[32,] 0.07 -0.04
[33,] -0.02 -0.04
[34,] -0.08 -0.01
[35,] -0.16 -0.01
[36,] 0.59 -0.04
[37,] -0.20 -0.08
[38,] -0.04 0.03
[39,] -0.03 -0.02
[40,] 0.10 -0.05
[41,] 0.13 -0.03
[42,] -0.51 0.00
[43,] 0.16 0.01
[44,] 0.04 -0.03
[45,] 0.00 -0.02
[46,] -0.04 0.09
[47,] -0.13 0.09
[48,] 0.45 -0.14
Electricity
[Figure: electricity series, 1975 to 2005 (top), and its first difference (bottom); x-axis: Time, y-axis: electricity]
Autocorrelations for Differenced electricity Data
[Figure: sample ACF (top) and PACF (bottom) of diff(electricity), lag axis 0 to 4]
ACF PACF
[1,] 0.09 0.09
[2,] -0.20 -0.20
[3,] -0.49 -0.47
[4,] -0.31 -0.40
[5,] 0.30 0.12
[6,] 0.21 -0.17
[7,] 0.31 0.13
[8,] -0.30 -0.29
[9,] -0.47 -0.46
[10,] -0.17 -0.42
[11,] 0.10 -0.61
[12,] 0.85 0.35
[13,] 0.12 -0.06
[14,] -0.17 0.02
[15,] -0.49 -0.10
[16,] -0.28 0.09
[17,] 0.30 0.15
[18,] 0.18 0.01
[19,] 0.30 0.04
[20,] -0.28 0.07
[21,] -0.45 -0.01
[22,] -0.18 -0.17
[23,] 0.11 -0.19
[24,] 0.83 0.18
[25,] 0.10 -0.10
[26,] -0.15 -0.06
[27,] -0.47 0.00
[28,] -0.28 0.06
[29,] 0.29 0.05
[30,] 0.18 0.01
[31,] 0.29 0.05
[32,] -0.28 0.04
[33,] -0.43 0.02
[34,] -0.18 -0.06
[35,] 0.11 -0.08
[36,] 0.80 0.05
[37,] 0.11 -0.06
[38,] -0.15 -0.08
[39,] -0.44 0.08
[40,] -0.28 -0.05
[41,] 0.28 0.01
[42,] 0.17 0.02
[43,] 0.27 -0.05
[44,] -0.27 0.00
[45,] -0.41 0.01
[46,] -0.17 -0.01
[47,] 0.10 -0.08
[48,] 0.78 0.11
Differencing
- Moral of the story: differencing can effectively remove a nonconstant mean from a time series
- Rather than build a model for the trend and study the residuals, study the differenced data
- Note: for strongly curved trends (e.g., quadratic), you may need to take two differences! For truly exponential growth, take logs before differencing.
- diff(diff(DATA))
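A small check of the double-differencing call on a made-up quadratic trend (the coefficients are arbitrary): one difference leaves a linear trend, two differences leave a constant. The `differences` argument of diff() gives the same result as nesting the calls.

```r
# Second differences on a hypothetical quadratic trend.
tt   <- 1:20
quad <- 3 + 2 * tt + 0.5 * tt^2

d1 <- diff(quad)                     # still trending (linear in tt)
d2 <- diff(diff(quad))               # constant: the curvature is removed
d2_alt <- diff(quad, differences = 2)  # same thing in a single call

print(d1[1:5])
print(unique(d2))
```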
Implementation of Differencing
- Since differencing is so standard in time series analysis and forecasting, most software lets you specify it directly.
- So rather than building a model and studying the residuals, we can simply pass the original data and say “difference it”
  sarima(x, p, d, q)
- p = AR order
- d = number of differences needed for stationarity
- q = MA order
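To illustrate "pass the original data and say difference it", the sketch below uses base R's arima() rather than sarima(), so that it runs without extra packages; the (p, d, q) orders play the same role. The simulated series and the AR coefficient 0.6 are assumptions made up for this example.

```r
# Let the fitting function do the differencing via the d argument.
set.seed(123)

# Hypothetical ARIMA(1, 1, 0) series with AR coefficient 0.6
x <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.6), n = 500)

# Option 1: pass the original data and ask for one difference (d = 1)
fit_d1 <- arima(x, order = c(1, 1, 0))

# Option 2: difference by hand, then fit an AR(1) with no mean term
fit_manual <- arima(diff(x), order = c(1, 0, 0), include.mean = FALSE)

print(coef(fit_d1))
print(coef(fit_manual))
```

Both routes should give nearly the same AR estimate; the first is more convenient because forecasts are then produced on the original scale of the data.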
ARIMA(p, d, q)
Models for series that must be differenced before an ARMA model fits are referred to as ARIMA models
- Autoregressive Integrated Moving Average
- Denoted ARIMA(p, d, q)
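Written out, the definition says that the d-th difference of the series follows an ARMA(p, q) model. The formulation below is the standard one; the symbols $W_t$, $B$ (backshift operator), $\phi_i$ and $\theta_j$ are not introduced in these slides and are assumptions here, and the sign on the MA terms follows R's convention (some textbooks use minus signs instead).

```latex
\nabla^d Y_t = (1 - B)^d Y_t = W_t,
\qquad
W_t = \phi_1 W_{t-1} + \dots + \phi_p W_{t-p}
      + e_t + \theta_1 e_{t-1} + \dots + \theta_q e_{t-q}
```

With d = 1 this reduces to fitting an ARMA(p, q) model to $\nabla Y_t = Y_t - Y_{t-1}$.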
ARIMA(p, d, q)
ARIMA(p, d, q) models encompass every class of models we have encountered up to this point
- ARIMA(p, 0, q) = ARMA(p, q)
- ARIMA(p, 0, 0) = AR(p)
- ARIMA(0, 0, q) = MA(q)
- ARIMA(0, 0, 0) = white noise
Stationarity
- Stationarity is a very important concept in time series, and one that you will hear often. Broadly speaking, a time series is called stationary if there is:
1. No systematic change in the mean (no trend),
2. No systematic change in the variance,
3. No notable seasonal pattern.
In other words, the properties of one section of the data are the same as those of any other section.
Transformations
- If we have clear evidence of nonconstant variance over time, a suitable transformation might fix (or lessen the impact of) the nonconstant variance pattern.
- Any transformation we apply to the data should be a first step.
- Transform the data before looking at differences or modeling the trend.
Monthly Electricity Usage in the US
[Figure: monthly electricity usage in the US, 1975 to 2005; x-axis: Time, y-axis: electricity]
What do we learn from this time series plot?
Monthly Electricity Usage in the US
- Variance is increasing over time
- Time series that exhibit a “fanning-out” shape are not stationary because the variance changes over time.
- Before modeling, we should transform the data to stabilize the variance.
log(Electricity) in the US
[Figure: log(electricity), 1975 to 2005; x-axis: Time, y-axis: electricity (log scale, about 12.0 to 12.8)]
Variance looks ok! Let’s plot the differences next.
diff(log(Electricity)) in the US
[Figure: diff(log(electricity)), 1975 to 2005; x-axis: Time, y-axis: electricity (about -0.2 to 0.1)]
Remember: transform first, then difference
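The order matters because differencing the raw data does not remove a fanning-out variance. A minimal sketch on a simulated series (exponential growth with multiplicative noise; all numbers and the helper `sd_ratio` are made up for illustration) compares the spread in the second half of the differenced series with the spread in the first half.

```r
# Compare diff(x) with diff(log(x)) on a hypothetical fanning-out series.
set.seed(2018)
n <- 200
time_index <- 1:n
x <- exp(0.02 * time_index + rnorm(n, sd = 0.05))

# Hypothetical helper: sd of the second half divided by sd of the first half
sd_ratio <- function(z) {
  half <- length(z) %/% 2
  sd(z[(half + 1):length(z)]) / sd(z[1:half])
}

ratio_raw <- sd_ratio(diff(x))        # differences of raw data: spread keeps growing
ratio_log <- sd_ratio(diff(log(x)))   # transform first, then difference

print(c(raw = ratio_raw, log = ratio_log))
```

A ratio near 1 indicates roughly constant variance; a ratio well above 1 indicates that the variance is still increasing over time.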
Box-Cox for Power Transformations
[Figure: Box-Cox log-likelihood as a function of λ (from -2 to 2), with the 95% confidence interval marked; x-axis: λ, y-axis: Log Likelihood]
Power Transformations
λ       T(Y)      Description
-2.0    1/Y^2     Inverse square
-1.0    1/Y       Inverse (reciprocal)
-0.5    1/√Y      Inverse square root
 0.0    ln(Y)     Logarithm
 0.5    √Y        Square root
 1.0    Y         Identity (no transformation)
 2.0    Y^2       Square
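The table can be expressed as a small function. The helpers below are hypothetical names written for this illustration: `power_transform` follows the table directly, and `box_cox` is the usual scaled form (y^λ - 1)/λ, which shows why λ = 0 corresponds to the logarithm.

```r
# Hypothetical helper following the table: y^lambda, with log(y) at lambda = 0.
power_transform <- function(y, lambda) {
  stopifnot(all(y > 0))                 # only defined for positive series
  if (lambda == 0) log(y) else y^lambda
}

# Usual scaled Box-Cox form; it approaches log(y) as lambda goes to 0.
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

y <- c(1, 4, 9, 16)
print(power_transform(y, 0.5))   # square root
print(power_transform(y, -1))    # reciprocal
print(power_transform(y, 0))     # natural log
print(box_cox(y, 1e-6) - log(y)) # differences are tiny
```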
Comments on using BoxCox procedure
- In the electricity example, the optimal λ was found to be about -0.1. However, a power of -0.1 is hard to interpret.
- The 95% interval suggested values between about -0.4 and 0.2.
- That interval contains λ = 0, so a log transformation would be appropriate.
We want to find a reasonable transformation, not necessarily the optimal one.
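The same reasoning (find the best λ, then pick an interpretable value inside the interval) can be sketched in R. The example below uses boxcox() from the MASS package on an intercept-only model, which ignores autocorrelation; it is a stand-in for illustration and not necessarily the procedure used to produce the plot in these slides. The data are simulated, so that the log scale is the "right" one by construction.

```r
library(MASS)   # recommended package shipped with standard R installations

set.seed(99)
y <- exp(rnorm(500, mean = 5, sd = 1))   # hypothetical positive data

# Profile log-likelihood over a grid of lambda values
bc <- boxcox(y ~ 1, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)
best_lambda <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
in_interval <- bc$y > max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[in_interval])

print(c(best = best_lambda, lower = ci[1], upper = ci[2]))
```

If a familiar value such as 0 (log) or 0.5 (square root) falls inside the interval, it is usually preferred over the exact maximizer.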
Variance Stabilizing Transformations
- Variance-stabilizing transformations can only be applied to positive time series
- All values > 0
- However, if some or all $Y_t$ are negative, we can simply add the same positive constant c to every observation so that every value becomes positive.
- The shift changes only the level of the series; its shape is unchanged
Adding Constant to Obtain a Positive Time Series
Consider the hypothetical time series below. Note that observations 2, 4, and 8 are negative, reaching as low as -2.
[Figure: hypothetical series of 12 observations with values between -2 and 5; x-axis: Index, y-axis: y]
Adding Constant to Obtain a Positive Time Series
Simply add 3 to every observation. The shape is maintained, but the series is now entirely positive.
[Figure: the shifted series, with values between 1 and 8; x-axis: Index, y-axis: y.new]
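In code the shift is a single addition. The values below are made up: only the positions of the negative observations (2, 4, and 8), the minimum of -2, and the constant 3 follow the slide.

```r
# Shift a series so that every value is positive (made-up data).
y <- c(1, -1, 2, -2, 3, 4, 2, -1, 3, 5, 4, 2)

y_new <- y + 3                      # add the same constant to every observation

print(which(y < 0))                 # positions of the negative values
print(range(y_new))                 # now entirely positive
print(all(diff(y_new) == diff(y)))  # period-to-period changes are unchanged
```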
Relationship with Returns
- Suppose that $Y_t$ has relatively stable percent changes from period to period. That is, suppose that
  $Y_t = (1 + r_t) Y_{t-1}$
- where $100 r_t$ represents the percent change from $Y_{t-1}$ to $Y_t$.
Relationship with Returns
- Suppose now that we take the log of this time series, and then the difference.
  $\log(Y_t) - \log(Y_{t-1}) = \log\left(\frac{Y_t}{Y_{t-1}}\right) = \log(1 + r_t)$
- If $r_t$ is relatively low (returns under 20%), then $\log(1 + r_t) \approx r_t$
- Consequently, $\nabla \log(Y_t) \approx r_t$
- Common in time series studies of financial data, where returns are important and meaningful
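The quality of the approximation is easy to check numerically. In the sketch below the returns and the starting value of 100 are made up; the series is built from the returns and then recovered with diff(log()).

```r
# Differenced logs versus simple returns (hypothetical numbers).
r <- c(0.01, 0.05, 0.10, 0.20)      # period-to-period returns r_t
Y <- 100 * cumprod(c(1, 1 + r))     # Y_t = (1 + r_t) * Y_{t-1}, starting at 100

log_ret <- diff(log(Y))             # equals log(1 + r_t)

print(round(cbind(r = r, log_return = log_ret, error = r - log_ret), 4))
```

The gap between the two columns widens as the return grows, which is why the approximation is only recommended for fairly small returns.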