CS计算机代考程序代写 algorithm flex Bayesian ETC3231/5231 Business forecasting

ETC3231/5231 Business forecasting
Ch8. Exponential smoothing OTexts.org/fpp3/

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
2

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
3

Historical perspective
Proposed in the late 1950s (Brown 1959, Holt 1957 and Winters 1960 are key pioneering works) as methods (algorithms) to produce point forecasts. Combine a “level”, “trend” (slope) and “seasonal” component to describe a time series.
The rate of change of the components are controlled by “smoothing parameters”: α, β and γ respectively. Need to choose best values for the smoothing parameters (and initial states).
Equivalent ETS state space models developed in the 1990s and 2000s.
4

Combine components
yt =f(lt,bt,st,εt)
Combine components level lt , trend (slope) bt and seasonal st and error εt to describe a time series.
5

Big idea: control the rate of change (smoothing)
α controls the flexibility of the level lt If α = 0, the level never updates (mean)
If α = 1, the level updates completely (naive) β controls the flexibility of the trend bt
If β = 0, the trend is linear (regression trend) If β = 1, the trend updates every observation
γ controls the flexibility of the seasonality st
If γ = 0, the seasonality is fixed (seasonal means)
If γ = 1, the seasonality updates completely (seasonal naive)
6

Models for levels, trends, and seasonalities
We want a model that captures lt, bt and st. How do we combine these elements?
7

Models for levels, trends, and seasonalities
We want a model that captures lt, bt and st.
How do we combine these elements?
Additively?
yt =lt−1 +bt−1 +st−m +εt
7

Models for levels, trends, and seasonalities
We want a model that captures lt, bt and st.
How do we combine these elements?
Additively?
yt =lt−1 +bt−1 +st−m +εt Multiplicatively?
yt = lt−1bt−1st−m(1 + εt)
Perhaps a mix of both?
yt = (lt−1 + bt−1)st−m + εt
7

Models for levels, trends, and seasonalities
We want a model that captures lt, bt and st.
How do we combine these elements?
Additively?
yt =lt−1 +bt−1 +st−m +εt Multiplicatively?
yt = lt−1bt−1st−m(1 + εt) Perhaps a mix of both?
yt = (lt−1 + bt−1)st−m + εt
How do the level, trend and seasonal components evolve over time?
7

ETS models
General notation E T S : ExponenTial Smoothing ↗↑↖
Error Trend Season
8

ETS models
General notation E T S : ExponenTial Smoothing ↗↑↖
Error Trend Season
model(ETS(y ~ error() + trend() + season()))
8

ETS models
General notation E T S : ExponenTial Smoothing ↗↑↖
Error Trend Season
model(ETS(y ~ error() + trend() + season()))
error() = {“A”, “M”}
trend() = {“N”, “A”, “Ad”,”M”,”Md”} season() = {“N”, “A”, “M”}.
“N” = none
“A” = additive “Ad” = additive damped
“M” = multiplicative “Md” = multiplicative damped
8

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
9

Simple exponential smoothing
algeria_economy <- global_economy %>%
filter(Country == “Algeria”)
algeria_economy %>% autoplot(Exports) +
labs(y=”% of GDP”,
title=”Algerian exports of goods and services”)
10
50
40
30
20
Algerian exports of goods and services
1960 1980
2000
Year [1Y]
% of GDP

Simple methods
Timeseriesy1,y2,…,yT.
Random walk forecasts
yˆ T + h | T = y T
11

Simple methods
Timeseriesy1,y2,…,yT. Random walk forecasts
Average forecasts
yˆ T + h | T = y T
1 􏰆T
yˆ T + h | T = T
t=1
y t
11

Simple methods
Timeseriesy1,y2,…,yT. Random walk forecasts
Average forecasts
yˆ T + h | T = y T 1 􏰆T
yˆ T + h | T = T
Want something in between these methods. Most recent data should have more weight.
t=1
y t
11

Simple Exponential Smoothing – SES
Forecast equation
yˆT+1|T =αyT +α(1−α)yT−1 +α(1−α)2yT−2 +··· where 0 ≤ α ≤ 1.
12

Simple Exponential Smoothing – SES
Forecast equation
yˆT+1|T =αyT +α(1−α)yT−1 +α(1−α)2yT−2 +···
where 0 ≤ α ≤ 1.
Weights assigned to observations for:
Observation α = 0.2 yT 0.2
yT−1 0.16
yT−2 0.128
yT−3 0.1024
yT−4 (0.2)(0.8)4
yT−5 (0.2)(0.8)5
α = 0.4 0.4
0.24 0.144 0.0864 (0.4)(0.6)4 (0.4)(0.6)5
α = 0.6 0.6
0.24 0.096 0.0384 (0.6)(0.4)4 (0.6)(0.4)5
α = 0.8 0.8
0.16
0.032 0.0064 (0.8)(0.2)4 (0.8)(0.2)5 12

Simple Exponential Smoothing – SES
13

Simple Exponential Smoothing – SES
Component form
Forecast equation yˆt+1|t = lt
Smoothing equation lt = αyt + (1 − α)lt−1
lt is the level (or the smoothed value) of the series at time t.
yˆ T + 1 | T = l T
yˆt+1|t = αyt + (1 − α)yˆt|t−1
Iterate to get exponentially weighted moving average form.
Weighted average form
T−1
yˆT+1|T = 􏰆 α(1 − α)jyT−j + (1 − α)Tl0
j=0
14

Initialisation
Any exponential smoothing method requires initialisation; SES needs initial level l0.
Last term in the weighted average form
(1 − α)Tl0.
So initial level l0 plays a role in all subsequent forecasts.
Weight is small unless α is close to zero or T is small.
15

Optimising smoothing parameters
Need to choose best values for α and l0.
Similarly to regression, choose optimal
parameters by minimising SSE:
TT
SSE = 􏰆(yt − yˆt|t−1)2 = 􏰆 e2t .
t=1 t=1 Unlike regression there is no closed form
solution — use numerical optimization.
16

Multi-step forecasts
yˆT+1|T = … = yˆT+h|T = lT
17

Simple Exponential Smoothing
Algerian exports of goods and services: level
50
40
30
20
1960 1980
2000
Year
alpha 0.00
0.84 1.00
18
% of GDP

Models and methods
Methods
Algorithms that return point forecasts.
19

Models and methods
Methods
Algorithms that return point forecasts.
Models
Generate same point forecasts but can also generate forecast distributions.
A stochastic (or random) data generating process that can generate an entire forecast distribution. Allow for “proper” model selection.
19

ETS(A,N,N): A model for SES Component form
Forecast equation yˆt+h|t = lt
Smoothing equation lt = αyt + (1 − α)lt−1
20

ETS(A,N,N): A model for SES Component form
Forecast equation yˆt+h|t = lt
Smoothing equation lt = αyt + (1 − α)lt−1
Forecast error: et = yt − yˆt|t−1 = yt − lt−1.
20

ETS(A,N,N): A model for SES
Component form
Forecast equation yˆt+h|t = lt
Smoothing equation lt = αyt + (1 − α)lt−1
Forecast error: et = yt − yˆt|t−1 = yt − lt−1. Error correction form
yt = lt−1 + et
lt = lt−1 + α(yt − lt−1)
= lt−1 + αet
20

ETS(A,N,N)
Measurement equation yt = lt−1 + εt State equation lt = lt−1 + αεt
where εt ∼ NID(0, σ2).
innovations or single source of error because equations have the same error process, εt. Measurement equation: relationship between observations and states.
State equation(s): evolution of the state(s) through time.
21

ETS(M,N,N)
SES with multiplicative errors.
Specify relative errors εt = yt−yˆt|t−1
∼ NID(0, σ2)
yˆt|t−1 Substituting yˆt|t−1 = lt−1 gives:
􏰌 yt=lt−1+lt−1εt
􏰌 et = yt − yˆt|t−1 = lt−1εt
Measurement equation yt = lt−1(1 + εt) State equation lt = lt−1(1 + αεt)
Models with additive and multiplicative errors
with the same parameters generate the same pointforecastsbutdifferentpredictionintervals. 22

ETS(A,N,N): Specifying the model
ETS(y ~ error(“A”) + trend(“N”) + season(“N”))
By default, an optimal value for α and l0 is used. α can be chosen manually in trend().
trend(“N”, alpha = 0.5)
trend(“N”, alpha_range = c(0.2, 0.8))
23

Example: Algerian Exports
algeria_economy <- global_economy %>%
filter(Country == “Algeria”)
fit <- algeria_economy %>%
model(ANN = ETS(Exports ~ error(“A”) + trend(“N”) + season(“N”)))
report(fit)
## Series: Exports
## Model: ETS(A,N,N)
##
##
##
##
## l
## 39.54
##
Smoothing parameters:
alpha = 0.84
Initial states:
## sigma^2: 35.63
##
## AIC AICc BIC
## 446.7 447.2 452.9
24

Example: Algerian Exports
components(fit) %>% autoplot()
ETS(A,N,N) decomposition Exports = lag(level, 1) + remainder
50 40 30 20 50 40 30 20
10 0 −10 −20
1960 1980
2000 2020
Year
25
Exports level remainder

Example: Algerian Exports
components(fit) %>%
left_join(fitted(fit), by = c(“Country”, “.model”, “Year”))
## # A dable: 59 x 7 [1Y]
## # Key: Country, .model [1]
## #:
##
##
## 1 Algeria ANN ## 2 Algeria ANN ## 3 Algeria ANN ## 4 Algeria ANN ## 5 Algeria ANN ## 6 Algeria ANN ## 7 Algeria ANN ## 8 Algeria ANN ## 9 Algeria ANN ## 10 Algeria ANN
Exports = lag(level, 1) + remainder
Country .model Year Exports level remainder .fitted

1959
1960
1961
1962
1963
1964
1965
1966
1967
1968

NA 39.5
39.0 39.1
46.2 45.1
19.8 23.8
24.7 24.6
25.1 25.0
22.6 23.0
26.0 25.5
23.4 23.8
23.1 23.2

NA NA
-0.496 39.5
7.12 39.1
-25.3 45.1
0.841 23.8
0.534 24.6
-2.39 25.0
3.00 23.0
-2.07 25.5
-0.630 23.8
26

Example: Algerian Exports
fit %>%
forecast(h = 5) %>%
autoplot(algeria_economy) +
labs(y = “% of GDP”, title = “Exports: Algeria”)
27
Exports: Algeria
50
40
30
20
10
0
l
1960 1980
2000
2020
Year
evel 80
95
% of GDP

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
28

Holt’s linear trend
Component form
Forecast Level Trend
yˆt+h|t = lt + hbt
lt = αyt + (1 − α)(lt−1 + bt−1)
bt = β∗(lt − lt−1) + (1 − β∗)bt−1,
29

Holt’s linear trend
Component form
Forecast Level Trend
yˆt+h|t = lt + hbt
lt = αyt + (1 − α)(lt−1 + bt−1)
bt = β∗(lt − lt−1) + (1 − β∗)bt−1,
Two smoothing parameters α and β∗ (0≤α,β∗ ≤1).
lt level: weighted average between yt and one-step ahead forecast for time t,
(lt−1 + bt−1 = yˆt|t−1)
bt slope: weighted average of (lt − lt−1) and
bt−1, current and previous estimate of slope.
29

ETS(A,A,N)
Holt’s linear method with additive errors.
Assume εt = yt − lt−1 − bt−1 ∼ NID(0, σ2).
Substituting into the error correction equations
for Holt’s linear method
yt =lt−1 +bt−1 +εt
lt =lt−1 +bt−1 +αεt bt = bt−1 + αβ∗εt
For simplicity, set β = αβ∗.
30

Exponential smoothing: trend/slope
Australian population. beta = 0
25
20
15
10
1960 1980
2000
Year
31
Millions

Exponential smoothing: trend/slope
25
Australian population. beta = 0.15
20
15
10
1960 1980
2000
Year
32
Millions

Exponential smoothing: trend/slope
Australian population. beta = 0.8
25
20
15
10
1960 1980
2000
Year
33
Millions

ETS(M,A,N)
Holt’s linear method with multiplicative errors.
Assume εt = yt−(lt−1+bt−1) (lt−1 +bt−1 )
Following a similar approach as above, the innovations state space model underlying Holt’s linear method with multiplicative errors is
specified asyt = (lt−1 + bt−1)(1 + εt) lt = (lt−1 + bt−1)(1 + αεt)
bt = bt−1 + β(lt−1 + bt−1)εt where again β = αβ∗ and εt ∼ NID(0, σ2).
34

ETS(A,A,N): Specifying the model
ETS(y ~ error(“A”) + trend(“A”) + season(“N”))
By default, optimal values for β and b0 are used. β can be chosen manually in trend().
trend(“A”, beta = 0.004)
trend(“A”, beta_range = c(0, 0.1))
35

Example: Australian population
aus_economy <- global_economy %>% filter(Code == “AUS”) %>%
mutate(Pop = Population/1e6)
fit <- aus_economy %>%
model(AAN = ETS(Pop ~ error(“A”) + trend(“A”) + season(“N”)))
report(fit)
## Series: Pop
## Model: ETS(A,A,N)
##
##
##
##
##
## l b
## 10.05 0.2225
##
Smoothing parameters:
alpha = 0.9999
beta = 0.3266
Initial states:
## sigma^2: 0.0041
##
## AIC AICc BIC
36
## -76.99 -75.83 -66.68

Example: Australian population
components(fit) %>% autoplot()
25
20
15
10 25
20 15 10
0.35 0.30 0.25 0.20
0.2 0.1 0.0
−0.1
ETS(A,A,N) decomposition
Pop = lag(level, 1) + lag(slope, 1) + remainder
1960 1980
2000 2020
Year
37
Pop level slope remainder

Example: Australian population
components(fit) %>%
left_join(fitted(fit), by = c(“Country”, “.model”, “Year”))
## # A dable: 59 x 8 [1Y]
## # Key: Country, .model [1]
## # : Pop = lag(level, 1) + lag(slope, 1) + remainder
## Country .model Year Pop level slope remainder .fitted
##
## 1 Australia AAN
## 2 Australia AAN
## 3 Australia AAN
## 4 Australia AAN
## 5 Australia AAN
## 6 Australia AAN
## 7 Australia AAN
## 8 Australia AAN
## 9 Australia AAN
## 10 Australia AAN
## # … with 49 more rows
NA
1959 NA 10.1 0.222 NA
1960 10.3 10.3 0.222 -0.000145 10.3
1961 10.5 10.5 0.217 -0.0159 10.5
1962 10.7 10.7 0.231 0.0418 10.7
1963 11.0 11.0 0.223 -0.0229 11.0
1964 11.2 11.2 0.221 -0.00641 11.2
1965 11.4 11.4 0.221 -0.000314 11.4
1966 11.7 11.7 0.235 0.0418 11.6
1967 11.8 11.8 0.206 -0.0869 11.9
1968 12.0 12.0 0.208 0.00350 12.0
38

Example: Australian population
fit %>%
forecast(h = 10) %>%
autoplot(aus_economy) +
labs(y = “Millions”, title= “Population: Australia”)
39
Population: Australia
30
25
20
15
l
10
1960 1980
2000 2020
Year
evel 80
95
Millions

Damped trend method
Component form
yˆt+h|t =lt +(φ+φ2 +···+φh)bt
lt = αyt + (1 − α)(lt−1 + φbt−1)
bt = β∗(lt − lt−1) + (1 − β∗)φbt−1.
40

Damped trend method
Component form
yˆt+h|t =lt +(φ+φ2 +···+φh)bt
lt = αyt + (1 − α)(lt−1 + φbt−1)
bt = β∗(lt − lt−1) + (1 − β∗)φbt−1.
Damping parameter 0 < φ < 1. If φ = 1, identical to Holt’s linear trend. Ash→∞,yˆT+h|T →lT +φbT/(1−φ). Short-run forecasts trended, long-run forecasts constant. 40 Your turn Write down the model for ETS(A,Ad,N) 41 Example: Australian population aus_economy %>%
model(holt = ETS(Pop ~ error(“A”) + trend(“Ad”) + season(“N”))) %>%
forecast(h = 20) %>%
autoplot(aus_economy)
30
25
20
15
10
l
1960 1980 2000 2020 2040
Year
evel 80
95
42
Pop

Example: Australian population
fit <- aus_economy %>%
filter(Year <= 2010) %>%
model(
ses = ETS(Pop ~ error(“A”) + trend(“N”) + season(“N”)),
holt = ETS(Pop ~ error(“A”) + trend(“A”) + season(“N”)),
damped = ETS(Pop ~ error(“A”) + trend(“Ad”) + season(“N”)
)
)
tidy(fit)
accuracy(fit)
43

Example: Australian population
l0 b0 Training RMSE Test RMSE Test MASE Test MAPE Test MAE
term
α β∗
SES Linear trend Damped trend
1.00 1.00 1.00 0.30 0.40 φ 0.98 10.28 10.05 10.04 0.22 0.25 0.24 0.06 0.07 1.63 0.15 0.21 6.18 0.55 0.75 6.09 0.55 0.74 1.45 0.13 0.18
44

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
45

Holt-Winters additive method
Holt and Winters extended Holt’s method to capture seasonality.
Component form
yˆt+h|t = lt + hbt + st+h−m(k+1)
lt = α(yt − st−m) + (1 − α)(lt−1 + bt−1)
bt = β∗(lt − lt−1) + (1 − β∗)bt−1
st =γ(yt −lt−1 −bt−1)+(1−γ)st−m
k = integer part of (h − 1)/m. Ensures estimates from the final year are used for forecasting.
Parameters: 0≤α≤1, 0≤β∗ ≤1, 0≤γ≤1−α and m = period of seasonality (e.g. m = 4 for quarterly data).
46

Holt-Winters additive method
Seasonal component is usually expressed as
st = γ∗(yt − lt) + (1 − γ∗)st−m.
Substitute in for lt:
st =γ∗(1−α)(yt−lt−1−bt−1)+[1−γ∗(1−α)]st−m We set γ = γ∗(1 − α).
The usual parameter restriction is 0 ≤ γ∗ ≤ 1, which translates to 0 ≤ γ ≤ (1 − α).
47

Exponential smoothing: seasonality
48

Exponential smoothing: seasonality
Medicare Australia cost of vaccine scripts: seasonality
1.0e+07
7.5e+06
5.0e+06
2.5e+06
0.0e+00
1995 Jan
2000 Jan
Month
2005 Jan
gamma 0.0000
0.7185 1.0000
49
Cost of scripts ($AUD)

ETS(A,A,A)
Holt-Winters additive method with additive errors.
Forecast equation Observation equation State equations
yˆt+h|t = lt + hbt + st+h−m(k+1)
yt = lt−1 + bt−1 + st−m + εt
lt = lt−1 + bt−1 + αεt bt = bt−1 + βεt
st = st−m + γεt
Forecast errors: εt = yt − yˆt|t−1 k is integer part of (h − 1)/m.
50

Your turn
Write down the model for ETS(A,N,A)
51

Holt-Winters multiplicative method
For when seasonal variations are changing proportional to the level of the series.
Component form
yˆt+h|t = (lt + hbt)st+h−m(k+1)
lt =α yt +(1−α)(lt−1 +bt−1)
st−m
bt = β∗(lt − lt−1) + (1 − β∗)bt−1
st =γ yt +(1−γ)st−m (lt−1 + bt−1)
k is integer part of (h − 1)/m.
With additive method st is in absolute terms:
within each year 􏰈i si ≈ 0.
With multiplicative method st is in relative terms:
within each year 􏰈i si ≈ m. 52

ETS(M,A,M)
Holt-Winters multiplicative method with multiplicative errors.
Forecast equation Observation equation State equations
yˆt+h|t = (lt + hbt)st+h−m(k+1)
yt = (lt−1 + bt−1)st−m(1 + εt) lt = (lt−1 + bt−1)(1 + αεt)
bt = bt−1 + β(lt−1 + bt−1)εt st = st−m(1 + γεt)
Forecast errors: εt = (yt − yˆt|t−1)/yˆt|t−1 k is integer part of (h − 1)/m.
53

Example: Australian holiday tourism
aus_holidays <- tourism %>%
filter(Purpose == “Holiday”) %>%
summarise(Trips = sum(Trips))
fit <- aus_holidays %>%
model(
additive = ETS(Trips ~ error(“A”) + trend(“A”) + season(“A”)),
multiplicative = ETS(Trips ~ error(“M”) + trend(“A”) + season(“M”))
)
fc <- fit %>% forecast()
54

Example: Australian holiday tourism
fc %>%
autoplot(aus_holidays, level = NULL) +
labs(y = “Thousands”, title = “Overnight trips”)
Overnight trips
14000
12000
10000
8000
.model additive
multiplicative
2000 Q1
2005 Q1
2010 Q1
Quarter
2015 Q1
2020 Q1
55
Thousands

Estimated components
components(fit)
## # A dable: 168 x 7 [1Q]
## # Key: .model [2]
## # : Trips = lag(level, 1) + lag(slope, 1) + lag(season,
## # 4) + remainder
## .model Quarter Trips level slope season remainder
##
## 1 additive 1997 Q1
## 2 additive 1997 Q2
## 3 additive 1997 Q3
## 4 additive 1997 Q4
## 5 additive 1998 Q1 11806. 9964. -24.5 1512. 433.
## 6 additive 1998 Q2 9276. 9851. -35.6 -290. -374.
## 7 additive 1998 Q3 8642. 9700. -50.2 -684. -489.
## 8 additive 1998 Q4 9300. 9694. -44.6 -538. 188.
## 9 additive 1999 Q1 11172. 9652. -44.3 1512. 10.7
## 10 additive 1999 Q2 9608. 9676. -35.6 -290. 290.
NA NA NA
NA NA NA
NA NA NA
NA 9899. -37.4 -538. NA
1512. NA
-290. NA
-684. NA
56

Estimated components
Additive states
12000 11000 10000
9000 8000
11000 10500 10000
9500 9000 120 80 40 0 −40
1500 1000 500 0 −500
1000 500 0 −500 −1000
2000 Q1
2005 Q1
Year
2010 Q1
2015 Q1
Multiplicative states
12000 11000 10000
9000 8000
11000 10500 10000
9500 9000
100 50 0 −50
1.1
1.0
0.10 0.05 0.00
−0.05 −0.10
2000 Q1
2005 Q1
Year
2010 Q1
2015 Q1
57
Trips level slope season remainder
Trips level slope season remainder

Holt-Winters damped method
Often the single most accurate forecasting method for seasonal data:
yˆt+h|t = [lt + (φ + φ2 + · · · + φh)bt]st+h−m(k+1) lt = α(yt/st−m) + (1 − α)(lt−1 + φbt−1)
bt = β∗(lt − lt−1) + (1 − β∗)φbt−1
st =γ yt +(1−γ)st−m (lt−1 + φbt−1)
58

Holt-Winters with daily data
sth_cross_ped <- pedestrian %>%
filter(Date >= “2016-07-01”,
Sensor == “Southern Cross Station”) %>%
index_by(Date) %>%
summarise(Count = sum(Count)/1000)
sth_cross_ped %>%
filter(Date <= "2016-07-31") %>%
model(
hw = ETS(Count ~ error(“M”) + trend(“Ad”) + season(“M”))
) %>%
forecast(h = “2 weeks”) %>%
autoplot(sth_cross_ped %>% filter(Date <= "2016-08-14")) + labs(title = "Daily traffic: Southern Cross", y="Pedestrians ('000)") 59 Holt-Winters with daily data Daily traffic: Southern Cross 20 15 10 5 l Jul 01 Jul 15 Aug 01 Aug 15 Date evel 80 95 60 Pedestrians ('000) Outline 1 Exponential smoothing 2 Simple exponential smoothing 3 Models with trend 4 Models with seasonality 5 Innovations state space models 6 Forecasting with exponential smoothing 61 Exponential smoothing methods Seasonal Component Trend N A M Component N (None) A (Additive) Ad (Additive damped) (None) (Additive) (Multiplicative) (N,N) (A,N) (Ad,N) (N,A) (A,A) (Ad,A) (N,M) (A,M) (Ad,M) (N,N): Simple exponential smoothing (A,N): Holt’s linear method (Ad,N): Additive damped trend method (A,A): Additive Holt-Winters’ method (A,M): Multiplicative Holt-Winters’ method (Ad,M): Damped multiplicative Holt-Winters’ method 62 ETS models Additive Error Seasonal Component Trend N A M Component N (None) A (Additive) Ad (Additive damped) (None) A,N,N A,A,N A,Ad ,N (Additive) (Multiplicative) A,N,A A,A,A A,Ad ,A A,N,M A,A,M A,Ad ,M Seasonal Component Trend N A M Multiplicative Error Component N (None) A (Additive) Ad (Additive damped) (None) M,N,N M,A,N M,Ad,N (Additive) (Multiplicative) M,N,A M,N,M M,A,A M,A,M M,Ad,A M,Ad,M 63 Additive error models 7/ exponential smoothing 149 ADDITIVE ERROR MODELS Trend N A Ad M M d Seasonal NAM yt =lt−1+εt lt =lt−1+αεt yt =lt−1+bt−1+εt lt =lt−1+bt−1+αεt bt =bt−1+βεt yt =lt−1+φbt−1+εt lt =lt−1+φbt−1+αεt bt =φbt−1+βεt yt = lt−1bt−1 + εt lt =lt−1bt−1+αεt bt =bt−1+βεt/lt−1 y=lbφ+ε t t−1 t−1 t yt =lt−1+st−m+εt lt =lt−1+αεt st =st−m+γεt yt =lt−1+bt−1+st−m+εt lt =lt−1+bt−1+αεt bt =bt−1+βεt st =st−m+γεt yt =lt−1+φbt−1+st−m+εt lt =lt−1+φbt−1+αεt bt =φbt−1+βεt st =st−m+γεt yt = lt−1bt−1 + st−m + εt lt =lt−1bt−1+αεt bt =bt−1+βεt/lt−1 st =st−m+γεt yt =lt−1st−m+εt lt =lt−1+αεt/st−m st =st−m+γεt/lt−1 yt =(lt−1+bt−1)st−m+εt lt =lt−1+bt−1+αεt/st−m bt =bt−1+βεt/st−m st =st−m+γεt/(lt−1+bt−1) yt =(lt−1+φbt−1)st−m+εt lt =lt−1+φbt−1+αεt/st−m bt =φbt−1+βεt/st−m st =st−m+γεt/(lt−1+φbt−1) yt = lt−1bt−1st−m + εt l =l bφ +αε t t−1 t−1 t t−1 t−1 t−m l =l bφ +αε t t t t−1 t t−1 t t−1 t−1 b =bφ +βε/l b =bφ +βε/l t t t−1 t t−1 y=lbφ+s +ε lt =lt−1bt−1+αεt/st−m bt =bt−1+βεt/(st−mlt−1) st =st−m+γεt/(lt−1bt−1) y=lbφs +ε t t−1 t−1 t−m t l =l bφ +αε/s t t−1 t−1 t t−m b =bφ +βε/(s l ) t t−1 t t−m t−1 s=s +γε/(l bφ) t t−m t t−1t−1 s=s +γε t t−m t 64 MULTIPLICATIVE ERROR MODELS d tt1t1 t tt1t1 t tt1t1 ttm b =bφ +βε/l b =bφ +βε/l t t−1 t t−1 t t−1 t t−1 st =st−m+γεt Multiplicative error models b =bφ +βε/(s l ) t t−1 t t−m t−1 φ st =st−m+γεt/(lt−1bt−1) MULTIPLICATIVE ERROR MODELS Trend N A Ad M M d Seasonal NAM yt =lt−1(1+εt) lt =lt−1(1+αεt) yt =(lt−1+bt−1)(1+εt) lt =(lt−1+bt−1)(1+αεt) bt = bt−1 + β(lt−1 + bt−1)εt yt =(lt−1+φbt−1)(1+εt) lt =(lt−1+φbt−1)(1+αεt) bt = φbt−1 + β(lt−1 + φbt−1)εt yt =lt−1bt−1(1+εt) lt =lt−1bt−1(1+αεt) bt =bt−1(1+βεt) y=l bφ (1+ε) t t−1 t−1 t yt =(lt−1+st−m)(1+εt) lt =lt−1+α(lt−1+st−m)εt st =st−m+γ(lt−1+st−m)εt yt =(lt−1+bt−1+st−m)(1+εt) lt =lt−1+bt−1+α(lt−1+bt−1+st−m)εt bt = bt−1 + β(lt−1 + bt−1 + st−m)εt st =st−m+γ(lt−1+bt−1+st−m)εt yt =(lt−1+φbt−1+st−m)(1+εt) lt =lt−1+φbt−1+α(lt−1+φbt−1+st−m)εt bt = φbt−1 + β(lt−1 + φbt−1 + st−m)εt st =st−m+γ(lt−1+φbt−1+st−m)εt yt =(lt−1bt−1+st−m)(1+εt) lt =lt−1bt−1+α(lt−1bt−1+st−m)εt bt =bt−1+β(lt−1bt−1+st−m)εt/lt−1 st =st−m+γ(lt−1bt−1+st−m)εt y=(l bφ +s )(1+ε) t t−1 t−1 t−m t l=l bφ +α(l bφ +s )ε t t−1 t−1 t−1 t−1 t−m t b =bφ +β(l bφ +s )ε/l t t−1 t−1 t−1 t−m t t−1 s=s +γ(l bφ +s )ε tt−m t−1t−1t−mt yt =lt−1st−m(1+εt) lt =lt−1(1+αεt) st =st−m(1+γεt) yt =(lt−1+bt−1)st−m(1+εt) lt =(lt−1+bt−1)(1+αεt) bt = bt−1 + β(lt−1 + bt−1)εt st =st−m(1+γεt) yt =(lt−1+φbt−1)st−m(1+εt) lt =(lt−1+φbt−1)(1+αεt) bt = φbt−1 + β(lt−1 + φbt−1)εt st =st−m(1+γεt) l=l bφ (1+αε) yt =lt−1bt−1st−m(1+εt) lt =lt−1bt−1(1+αεt) bt =bt−1(1+βεt) st =st−m(1+γεt) y=l bφ s (1+ε) t t−1 t−1 t−m t l=l bφ (1+αε) t t−1 t−1 t b =bφ (1+βε) t t−1 t s=s (1+γε) tt−m t t t−1 t−1 b =bφ (1+βε) t t t−1 t Table 7.10: State space equations 65 for each of the models in the ETS Estimating ETS models Smoothing parameters α, β, γ and φ, and the initial states l0, b0, s0, s−1, . . . , s−m+1 are estimated by maximising the “likelihood” = the probability of the data arising from the specified model. For models with additive errors equivalent to minimising SSE. For models with multiplicative errors, not equivalent to minimising SSE. 66 Innovations state space models iid 2 Letxt =(lt,bt,st,st−1,...,st−m+1)andεt ∼N(0,σ ). yt = h(xt−1) + k(xt−1)εt 􏰕 􏰔􏰓 􏰖 􏰕 􏰔􏰓 􏰖 μt et xt = f(xt−1) + g(xt−1)εt Additive errors k(xt−1)=1. Multiplicative errors k(xt−1) = μt. εt = (yt − μt)/μt is relative error. yt =μt +εt. yt = μt(1 + εt). 67 Innovations state space models Estimation 􏰂T􏰃T L∗(θ,x0)=Tlog 􏰆ε2t +2􏰆log|k(xt−1)| t=1 t=1 = −2 log(Likelihood) + constant Estimate parameters θ = (α, β, γ, φ) and initial statesx0 =(l0,b0,s0,s−1,...,s−m+1)by minimizing L∗. 68 Parameter restrictions Usual region Traditional restrictions in the methods 0 < α,β∗,γ∗,φ < 1 (equations interpreted as weighted averages). Inmodelswesetβ=αβ∗ andγ=(1−α)γ∗. Therefore0<α<1, 0<β<α and0<γ<1−α. 0.8 < φ < 0.98 — to prevent numerical difficulties. 69 Parameter restrictions Usual region Traditional restrictions in the methods 0 < α,β∗,γ∗,φ < 1 (equations interpreted as weighted averages). Inmodelswesetβ=αβ∗ andγ=(1−α)γ∗. Therefore0<α<1, 0<β<α and0<γ<1−α. 0.8 < φ < 0.98 — to prevent numerical difficulties. Admissible region To prevent observations in the distant past having a continuing effect on current forecasts. Usually (but not always) less restrictive than the traditional region. For example for ETS(A,N,N): traditional 0 < α < 1 — admissible is 0 < α < 2. 69 Model selection Akaike’s Information Criterion AIC = −2 log(L) + 2k where L is the likelihood and k is the number of parameters initial states estimated in the model. 70 Model selection Akaike’s Information Criterion AIC = −2 log(L) + 2k where L is the likelihood and k is the number of parameters initial states estimated in the model. Corrected AIC AICc =AIC+2(k+1)(k+2) T−k which is the AIC corrected (for small sample bias). 70 Model selection Akaike’s Information Criterion AIC = −2 log(L) + 2k where L is the likelihood and k is the number of parameters initial states estimated in the model. Corrected AIC AICc =AIC+2(k+1)(k+2) T−k which is the AIC corrected (for small sample bias). Bayesian Information Criterion BIC = AIC + k(log(T) − 2). 70 AIC and cross-validation Minimizing the AIC assuming Gaussian residuals is asymptotically equivalent to minimizing one-step time series cross validation MSE. 71 Automatic forecasting From Hyndman et al. (IJF, 2002): Apply each model that is appropriate to the data. Optimize parameters and initial values using MLE (or some other criterion). Select best method using AICc: Produce forecasts using best method. Obtain forecast intervals using underlying state space model. Method performed very well in M3 competition. 72 Example: National populations fit <- global_economy %>%
mutate(Pop = Population / 1e6) %>%
model(ets = ETS(Pop))
fit
## # A mable: 263 x 2
## # Key: Country [263]
## Country ets
##
## 1 Afghanistan
## 2 Albania
## 3 Algeria
## 4 American Samoa
## 5 Andorra

## 6 Angola
## 7 Antigua and Barbuda
## 8 Arab World
## 9 Argentina
## 10 Armenia
## # … with 253 more rows

73

Example: National populations
fit %>%
forecast(h = 5)
## # A fable: 1,315 x 5 [1Y]
## # Key:
## Country
##
## 1 Afghanistan ets
## 2 Afghanistan ets
## 3 Afghanistan ets
## 4 Afghanistan ets
## 5 Afghanistan ets
Country, .model [263]
.model Year
## 6 Albania
## 7 Albania
## 8 Albania
## 9 Albania
## 10 Albania
## # … with 1,305 more rows

2018
2019
2020
2021
2022
ets
ets
ets
ets
ets
74
Pop .mean

N(36, 0.012) 36.4
N(37, 0.059) 37.3
N(38, 0.16) 38.2
N(39, 0.35) 39.0
N(40, 0.64) 39.9
2018 N(2.9, 0.00012) 2.87
2019 N(2.9, 6e-04) 2.87
2020 N(2.9, 0.0017) 2.87
2021 N(2.9, 0.0036) 2.86
2022 N(2.9, 0.0066) 2.86

Example: Australian holiday tourism
holidays <- tourism %>%
filter(Purpose == “Holiday”)
fit <- holidays %>% model(ets = ETS(Trips))
fit
## # A mable: 76 x 4
## # Key:
## Region
##
## 1 Adelaide
## 2 Adelaide Hills
## 3 Alice Springs
## 4 Australia’s Coral Coast Western Austral~ Holiday
## 5 Australia’s Golden Outb~ Western Austral~ Holiday
Region, State, Purpose [76]
State Purpose ets
## 6 Australia’s North West
## 7 Australia’s South West
## 8 Ballarat
## 9 Barkly
## 10 Barossa
Western Austral~ Holiday Western Austral~ Holiday Victoria Holiday Northern Territ~ Holiday South Australia Holiday 75

South Australia Holiday
South Australia Holiday
Northern Territ~ Holiday
## # … with 66 more rows

Example: Australian holiday tourism
fit %>%
filter(Region == “Snowy Mountains”) %>%
report()
## Series: Trips
## Model: ETS(M,N,A)
##
##
##
##
##
## l s1 s2 s3 s4
## 141.7 -60.96 130.9 -42.24 -27.66
##
## sigma^2: 0.0388
##
## AIC AICc BIC
## 852.0 853.6 868.7
Smoothing parameters:
alpha = 0.1571
gamma = 0.0001001
Initial states:
76

Example: Australian holiday tourism
fit %>%
filter(Region == “Snowy Mountains”) %>%
components(fit)
## # A dable: 84 x 9 [1Q]
## # Key: Region, State, Purpose, .model [1]
## # :
## #
##
##
## 1 Snowy~ New ~ Holiday ets 1997 Q1 NA NA -27.7
## 2 Snowy~ New ~ Holiday ets 1997 Q2 NA NA -42.2
## 3 Snowy~ New ~ Holiday ets 1997 Q3 NA NA 131.
## 4 Snowy~ New ~ Holiday ets 1997 Q4 NA 142. -61.0
## 5 Snowy~ New ~ Holiday ets 1998 Q1 101. 140. -27.7
## 6 Snowy~ New ~ Holiday ets 1998 Q2 112. 142. -42.2
## 7 Snowy~ New ~ Holiday ets 1998 Q3 310. 148. 131.
## 8 Snowy~ New ~ Holiday ets 1998 Q4 89.8 148. -61.0
## 9 Snowy~ New ~ Holiday ets 1999 Q1 112. 147. -27.7
Trips = (lag(level, 1) + lag(season, 4)) * (1 +
remainder)
Region State Purpose .model Quarter Trips level season
77
## 10 Snowy~ New ~ Holiday ets 1999 Q2 103. 147. -42.2

Example: Australian holiday tourism
fit %>%
filter(Region == “Snowy Mountains”) %>%
components(fit) %>%
autoplot()
78
ETS(M,N,A) decomposition
Trips = (lag(level, 1) + lag(season, 4)) * (1 + remainder)
300 200 100
160 150 140 130 120 110
100 50 0 −50
0.25
0.00 −0.25
2000 Q1
2005 Q1
2010 Q1
2015 Q1
Quarter
Trips level season remainder

Example: Australian holiday tourism
fit %>% forecast()
## # A fable: 608 x 7 [1Q]
## # Key:
## Region
##
## 1 Adelaide
## 2 Adelaide
## 3 Adelaide
## 4 Adelaide
## 5 Adelaide
## 6 Adelaide
## 7 Adelaide
## 8 Adelaide
## 9 Adelaide ~ South Aus~ Holiday ets 2018 Q1 N(19, 36) 19.4
## 10 Adelaide ~ South Aus~ Holiday ets 2018 Q2 N(20, 36) 19.6
## # … with 598 more rows
Region, State, Purpose, .model [76]
State Purpose .model Quarter Trips .mean

South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets
South Aus~ Holiday ets

2018 Q1 N(210, 457) 210.
2018 Q2 N(173, 473) 173.
2018 Q3 N(169, 489) 169.
2018 Q4 N(186, 505) 186.
2019 Q1 N(210, 521) 210.
2019 Q2 N(173, 537) 173.
2019 Q3 N(169, 553) 169.
2019 Q4 N(186, 569) 186.
79

Example: Australian holiday tourism
fit %>%
forecast() %>%
filter(Region == “Snowy Mountains”) %>%
autoplot(holidays) +
labs(y = “Thousands”, title = “Overnight trips”)
80
Overnight trips
400
300
200
100
l
2000 Q1
2005 Q1
2010 Q1
2015 Q1
2020 Q1
Quarter
evel 80
95
Thousands

Residuals
Response residuals
e t = y t − yˆ t | t − 1 Innovation residuals
Additive error model:
εˆ t = y t − yˆ t | t − 1
Multiplicative error model:
εˆ t = y t − yˆ t | t − 1
yˆt|t−1
81

Example: Australian holiday tourism
aus_holidays <- tourism %>%
filter(Purpose == “Holiday”) %>%
summarise(Trips = sum(Trips))
fit <- aus_holidays %>%
model(ets = ETS(Trips)) %>%
report()
## Series: Trips
## Model: ETS(M,N,M)
Smoothing parameters:
alpha = 0.3578
gamma = 0.0009686
##
##
##
##
##
##
## 9667 0.943 0.9268 0.9684 1.162
Initial states:
l s1 s2 s3 s4
82

Example: Australian holiday tourism
residuals(fit)
residuals(fit, type = “response”)
0.10 0.05 0.00
−0.05 −0.10
1000 500 0 −500 −1000
2000 Q1
2005 Q1
2010 Q1
2015 Q1
Quarter
83
Innovation residuals Response residuals

Example: Australian holiday tourism
fit %>%
augment()
## # A tsibble: 80 x 6 [1Q]
## # Key: .model [1]
## .model Quarter Trips .fitted .resid .innov
##
## 1 ets 1998 Q1 11806. 11230. 576. 0.0513
## 2 ets 1998 Q2 9276. 9532. -257. -0.0269
## 3 ets 1998 Q3 8642. 9036. -393. -0.0435
## 4 ets 1998 Q4 9300. 9050. 249. 0.0275
## 5 ets 1999 Q1 11172. 11260. -88.0 -0.00781
## 6 ets 1999 Q2 9608. 9358. 249. 0.0266
## 7 ets 1999 Q3 8914. 9042. -129. -0.0142
## 8 ets 1999 Q4 9026. 9154. -129. -0.0140
## 9 ets 2000 Q1 11071. 11221. -150. -0.0134
## 10 ets 2000 Q2 9196. 9308. -111. -0.0120
## # … with 70 more rows
84

Some unstable models
Some of the combinations of (Error, Trend, Seasonal) can lead to numerical difficulties; see equations with division by a state.
These are: ETS(A,N,M), ETS(A,A,M), ETS(A,Ad,M). Models with multiplicative errors are useful for strictly positive data, but are not numerically stable with data containing zeros or negative values. In that case only the six fully additive models will be applied.
85

Exponential smoothing models
Additive Error Seasonal Component Trend N A M
Component
N (None)
A (Additive)
Ad (Additive damped)
(None) A,N,N A,A,N A,Ad ,N
(Additive) (Multiplicative)
A,N,A
A,A,A A,Ad ,A
A,N,M A,A,M A,Ad ,M
Seasonal Component Trend N A M
Multiplicative Error
Component
N (None)
A (Additive)
Ad (Additive damped)
(None) M,N,N M,A,N M,Ad,N
(Additive) (Multiplicative) M,N,A M,N,M M,A,A M,A,M
M,Ad,A M,Ad,M 86

Outline
1 Exponential smoothing
2 Simple exponential smoothing
3 Models with trend
4 Models with seasonality
5 Innovations state space models
6 Forecasting with exponential smoothing
87

Forecasting with ETS models
Traditional point forecasts: iterate the equations for t=T+1,T+2,…,T+handsetallεt =0fort>T.
88

Forecasting with ETS models
Traditional point forecasts: iterate the equations for t=T+1,T+2,…,T+handsetallεt =0fort>T.
Not the same as E(yt+h|xt) unless trend and seasonality are both additive.
fable uses E(yt+h|xt).
Point forecasts for ETS(A,*,*) are identical to ETS(M,*,*) if the parameters are the same.
88

Example: ETS(A,A,N)
yT+1 =lT +bT +εT+1 yˆ T + 1 | T = l T + b T
yT+2 = lT+1 + bT+1 + εT+2
= (lT +bT +αεT+1)+(bT +βεT+1)+εT+2
yˆ T + 2 | T = l T + 2 b T etc.
89

Example: ETS(M,A,N)
yT+1 = (lT + bT)(1 + εT+1) yˆ T + 1 | T = l T + b T .
yT+2 = (lT+1 + bT+1)(1 + εT+2)
= {(lT + bT)(1 + αεT+1) + [bT + β(lT + bT)εT+1]} (1 + εT+2)
yˆ T + 2 | T = l T + 2 b T etc.
90

Forecasting with ETS models
Prediction intervals: can only be generated using the models.
The prediction intervals will differ between models with additive and multiplicative errors. Exact formulae for some models.
More general to simulate future sample paths, conditional on the last estimate of the states, and to obtain prediction intervals from the percentiles of these simulated future paths.
91

Prediction intervals
PI for most ETS models: yˆT+h|T ± cσh, where c depends on coverage probability and σh is forecast standard deviation.
(A,N,N) σh (A,A,N) σh
(A,A ,N) σ d h
(A,N,A) σh (A,A,A) σh
(A,A ,A) σ d h
=σ2􏰍1+α2(h−1)􏰎 =σ2􏰉1+(h−1){α2+αβh+1β2h(2h−1)}􏰊
6
=σ2 1+α2(h−1)+ βφh {2α(1−φ)+βφ}
􏰑
(1−φ)2
− βφ(1−φh) 􏰏2α(1 − φ2) + βφ(1 + 2φ − φh)􏰐􏰒
(1−φ)2 (1−φ2 ) =σ2􏰉1+α2(h−1)+γk(2α+γ)􏰊
=σ2􏰉1+(h−1){α2 +αβh+ 1β2h(2h−1)}+γk{2α+γ+βm(k+1)}􏰊 6
􏰑
=σ2 1+α2(h−1)+ βφh {2α(1−φ)+βφ} (1−φ)2
− βφ(1−φh) 􏰏2α(1 − φ2) + βφ(1 + 2φ − φh)􏰐 (1−φ)2 (1−φ2 )
+γk(2α+γ)+ 2βγφ 􏰏k(1−φm)−φm(1−φmk)􏰐 (1−φ)(1−φm )
92
􏰒

Example: Corticosteroid drug sales
h02 <- PBS %>%
filter(ATC2 == “H02”) %>%
summarise(Cost = sum(Cost))
h02 %>%
autoplot(Cost)
93
1250000
1000000
750000
500000
1995 Jan
2000 Jan
2005 Jan
Month [1M]
Cost

Example: Corticosteroid drug sales
h02 %>%
model(ETS(Cost)) %>%
report()
## Series: Cost
## Model: ETS(M,Ad,M)
## Smoothing parameters:
## alpha = 0.3071
## beta = 0.0001007
## gamma = 0.0001007
## phi = 0.9775
##
## Initial states:
## l b s1 s2 s3 s4
## 417269 8206 0.8717 0.826 0.7563 0.7733 0.6872 1.284 1.325 1.18
## s9 s10 s11 s12
## 1.164 1.105 1.048 0.9806
##
## sigma^2: 0.0046
##
## AIC AICc BIC
## 5515 5519 5575
s5 s6 s7 s8
94

Example: Corticosteroid drug sales
h02 %>%
model(ETS(Cost ~ error(“A”) + trend(“A”) + season(“A”))) %>%
report()
## Series: Cost
## Model: ETS(A,A,A)
##
##
##
##
##
##
##
## 409706 9097 -99075 -136602 -191496 -174531 -241437 210644
## s7 s8 s9 s10 s11 s12
## 244644 145368 130570 84458 39132 -11674
##
## sigma^2: 3.499e+09
##
## AIC AICc BIC
## 5585 5589 5642
Smoothing parameters:
alpha = 0.1702
beta = 0.006311
gamma = 0.4546
Initial states:
l b s1
s2 s3 s4
s5 s6
95

Example: Corticosteroid drug sales
h02 %>% model(ETS(Cost)) %>% forecast() %>% autoplot(h02)
1500000
1200000
900000
600000
300000
l
evel 80
95
1995 Jan
2000 Jan
Month
2005 Jan
2010 Jan
96
Cost

Example: Corticosteroid drug sales
h02 %>%
model(
auto = ETS(Cost),
AAA = ETS(Cost ~ error(“A”) + trend(“A”) + season(“A”))
) %>%
accuracy()
Model
auto AAA
ME MAE
2461 38649 -5780 43378
RMSE
51102 56784
MAPE
4.989 6.048
MASE
0.6376 0.7156
97

Related Posts