Predictive Analytics – Week 12: Exponential Smoothing

Semester 2, 2018

Discipline of Business Analytics, The University of Sydney Business School

Week 12: Exponential Smoothing

1. Simple exponential smoothing

2. Trend corrected exponential smoothing

3. Holt-Winters smoothing

4. Damped trend exponential smoothing

Exponential smoothing methods

Exponential smoothing forecasts are weighted averages of past
observations, where the weights decay exponentially as we go
further into the past.

Exponential smoothing can be useful when the time series
components are changing over time.

Simple exponential smoothing

Simple exponential smoothing (key concept)

The simple exponential smoothing method specifies the
forecasting rule

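$$
\begin{aligned}
\hat{y}_{t+1} &= \ell_t && \text{(forecast equation)} \\
\ell_t &= \alpha y_t + (1-\alpha)\ell_{t-1} && \text{(smoothing equation)}
\end{aligned}
$$

for an initial value $\ell_0$ and $0 \le \alpha \le 1$.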

Exponentially weighted moving average

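$$
\begin{aligned}
\ell_1 &= \alpha y_1 + (1-\alpha)\ell_0 \\
\ell_2 &= \alpha y_2 + (1-\alpha)\ell_1 \\
&= \alpha y_2 + (1-\alpha)\alpha y_1 + (1-\alpha)^2\ell_0 \\
\ell_3 &= \alpha y_3 + (1-\alpha)\ell_2 \\
&= \alpha y_3 + (1-\alpha)\alpha y_2 + (1-\alpha)^2\alpha y_1 + (1-\alpha)^3\ell_0 \\
\ell_4 &= \alpha y_4 + (1-\alpha)\ell_3 \\
&= \alpha y_4 + (1-\alpha)\alpha y_3 + (1-\alpha)^2\alpha y_2 + (1-\alpha)^3\alpha y_1 + (1-\alpha)^4\ell_0 \\
&\;\;\vdots
\end{aligned}
$$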

Exponentially weighted moving average

It follows that

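$$
\begin{aligned}
\ell_t &= \alpha y_t + (1-\alpha)\ell_{t-1} \\
&= \alpha y_t + (1-\alpha)\alpha y_{t-1} + (1-\alpha)^2\alpha y_{t-2} + \dots + (1-\alpha)^{t-1}\alpha y_1 + (1-\alpha)^t\ell_0.
\end{aligned}
$$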

Simple exponential smoothing is also known as the exponentially
weighted moving average (EWMA) method.
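
The expansion above can be checked numerically. The following is a
minimal sketch, assuming a made-up series and an arbitrary α = 0.3
(neither comes from the lecture): it compares the level from the
recursion with the explicit weighted average, and confirms that the
weights, including the one on the initial value, sum to one.

```python
def ses_level(y, alpha, level0):
    """Run the smoothing recursion and return the final level l_t."""
    level = level0
    for obs in y:
        level = alpha * obs + (1 - alpha) * level
    return level


def ewma_weights(t, alpha):
    """Weights on y_t, y_{t-1}, ..., y_1, followed by the weight on l_0."""
    weights = [alpha * (1 - alpha) ** j for j in range(t)]
    weights.append((1 - alpha) ** t)
    return weights


y = [0.71, 0.73, 0.72, 0.75, 0.74]   # made-up values, not the lecture data
alpha, level0 = 0.3, y[0]            # illustrative choices

w = ewma_weights(len(y), alpha)
# Pair the newest observation with the first weight and l_0 with the last one.
weighted_avg = sum(wi * yi for wi, yi in zip(w, list(reversed(y)) + [level0]))

print(ses_level(y, alpha, level0), weighted_avg)   # equal up to rounding error
print(sum(w))                                      # equal to 1 up to rounding error
```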

Simple exponential smoothing

• Useful for forecasting time series with changing levels.

• A higher α gives larger weight to recent observations, making
the forecasts more adaptive to recent changes in the series.

• A lower α gives larger weights to past observations,
making the forecasts smoother.

• Initialisation: we typically set $\ell_0 = y_1$ for simplicity.
Alternatively, we can treat it as a parameter.
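
As an illustration of the method, here is a minimal sketch of the
smoothing recursion in plain Python. The function name, the toy series
and α = 0.5 are assumptions made for the example; the default
initialisation follows the simple choice of setting the initial level
to the first observation.

```python
def simple_exponential_smoothing(y, alpha, level0=None):
    """Return the levels l_1..l_N and the one-step forecasts of y_1..y_N."""
    level = y[0] if level0 is None else level0   # l_0 = y_1 by default
    levels, forecasts = [], []
    for obs in y:
        forecasts.append(level)                  # forecast of y_t is l_{t-1}
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels, forecasts


y = [10.0, 12.0, 11.0, 13.0]                     # made-up series for illustration
levels, forecasts = simple_exponential_smoothing(y, alpha=0.5)
print(levels)      # [10.0, 11.0, 11.0, 12.0]
print(forecasts)   # [10.0, 10.0, 11.0, 11.0]
```

The last level, 12.0 here, is the forecast of the next observation.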

Example: AUD/USD exchange rate

Estimation

We estimate α by least squares (empirical risk minimisation).

$$
\hat{\alpha} = \underset{\alpha}{\operatorname{argmin}} \sum_{t=1}^{N} (y_t - \ell_{t-1})^2
$$

Each $\ell_t$ is a nonlinear function of $\alpha$, so there is no closed-form
formula for $\hat{\alpha}$. We use numerical optimisation methods to obtain the
solution.
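
One simple way to carry out the numerical minimisation is a grid
search over [0, 1]. This is only a sketch under stated assumptions:
the grid search stands in for whichever optimiser is actually used,
the initial level is set to the first observation, and the series is
made up.

```python
def sse(y, alpha):
    """Sum of squared one-step forecast errors, with l_0 = y_1."""
    level, total = y[0], 0.0
    for obs in y:
        total += (obs - level) ** 2              # (y_t - l_{t-1})^2
        level = alpha * obs + (1 - alpha) * level
    return total


def estimate_alpha(y, grid_size=1001):
    """Pick the alpha on an evenly spaced grid over [0, 1] with the smallest SSE."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda a: sse(y, a))


y = [0.71, 0.73, 0.72, 0.75, 0.74, 0.78, 0.77, 0.80]   # made-up values
alpha_hat = estimate_alpha(y)
print(alpha_hat, sse(y, alpha_hat))
```

A finer grid, or a bounded scalar optimiser from a numerical library,
would refine the estimate; the objective function stays the same.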

Statistical model

In order to say more about the simple exponential smoothing
method, we need to formulate it as a statistical model. We assume
that

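$$
\begin{aligned}
Y_t &= \ell_{t-1} + \varepsilon_t, \\
\ell_t &= \alpha Y_t + (1-\alpha)\ell_{t-1},
\end{aligned}
$$

where the errors $\varepsilon_t$ are i.i.d. with mean zero and constant
variance $\sigma^2$.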

Statistical model

In forecasting, we want to:

1. Compute point forecasts for multiple forecasting horizons h.

2. Compute interval forecasts for multiple forecasting horizons
h.

In order to do this for the exponential smoothing method, we rewrite
the model in error correction form.

Error correction form

We obtain the error correction form as

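$$
\begin{aligned}
\ell_t &= \alpha Y_t + (1-\alpha)\ell_{t-1} \\
&= \ell_{t-1} + \alpha(Y_t - \ell_{t-1}) \\
&= \ell_{t-1} + \alpha\varepsilon_t.
\end{aligned}
$$

Hence, we can rewrite the model as:

$$
\begin{aligned}
Y_{t+1} &= \ell_t + \varepsilon_{t+1}, \\
\ell_t &= \ell_{t-1} + \alpha\varepsilon_t.
\end{aligned}
$$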

Error correction form

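Using $\ell_t = \ell_{t-1} + \alpha\varepsilon_t$,

$$
\begin{aligned}
\ell_{t+1} &= \ell_t + \alpha\varepsilon_{t+1} \\
\ell_{t+2} &= \ell_{t+1} + \alpha\varepsilon_{t+2} \\
&= \ell_t + \alpha\varepsilon_{t+1} + \alpha\varepsilon_{t+2} \\
\ell_{t+3} &= \ell_{t+2} + \alpha\varepsilon_{t+3} \\
&= \ell_t + \alpha\varepsilon_{t+1} + \alpha\varepsilon_{t+2} + \alpha\varepsilon_{t+3} \\
&\;\;\vdots \\
\ell_{t+h} &= \ell_t + \sum_{i=1}^{h} \alpha\varepsilon_{t+i}
\end{aligned}
$$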

Constant plus noise representation

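Using $Y_t = \ell_{t-1} + \varepsilon_t$ and the previous slide,

$$
\begin{aligned}
Y_{t+1} &= \ell_t + \varepsilon_{t+1} \\
Y_{t+2} &= \ell_{t+1} + \varepsilon_{t+2} \\
&= \ell_t + \alpha\varepsilon_{t+1} + \varepsilon_{t+2} \\
Y_{t+3} &= \ell_{t+2} + \varepsilon_{t+3} \\
&= \ell_t + \alpha\varepsilon_{t+1} + \alpha\varepsilon_{t+2} + \varepsilon_{t+3} \\
&\;\;\vdots \\
Y_{t+h} &= \ell_{t+h-1} + \varepsilon_{t+h} \\
&= \ell_t + \sum_{i=1}^{h-1} \alpha\varepsilon_{t+i} + \varepsilon_{t+h}
\end{aligned}
$$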

Point forecast

Constant plus noise representation of future observations:

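$$
Y_{t+h} = \ell_t + \sum_{i=1}^{h-1} \alpha\varepsilon_{t+i} + \varepsilon_{t+h}
$$

From the linearity of expectations, the point forecast for any
horizon h is

$$
\begin{aligned}
\hat{y}_{t+h} &= E(Y_{t+h} \mid y_{1:t}) \\
&= E\left(\ell_t + \sum_{i=1}^{h-1} \alpha\varepsilon_{t+i} + \varepsilon_{t+h} \,\middle|\, y_{1:t}\right) \\
&= \ell_t
\end{aligned}
$$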

Forecast variance

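$$
\begin{aligned}
\operatorname{Var}(Y_{t+1} \mid y_{1:t}) &= \operatorname{Var}(\ell_t + \varepsilon_{t+1} \mid y_{1:t}) \\
&= \sigma^2 \\
\operatorname{Var}(Y_{t+2} \mid y_{1:t}) &= \operatorname{Var}(\ell_t + \alpha\varepsilon_{t+1} + \varepsilon_{t+2} \mid y_{1:t}) \\
&= \sigma^2(1 + \alpha^2) \\
&\;\;\vdots \\
\operatorname{Var}(Y_{t+h} \mid y_{1:t}) &= \operatorname{Var}\left(\ell_t + \sum_{i=1}^{h-1} \alpha\varepsilon_{t+i} + \varepsilon_{t+h} \,\middle|\, y_{1:t}\right) \\
&= \sigma^2\left(1 + (h-1)\alpha^2\right)
\end{aligned}
$$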

Forecast equations for simple exponential smoothing

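$$
\begin{aligned}
\hat{y}_{t+h} &= \ell_t \\
\operatorname{Var}(Y_{t+h} \mid y_{1:t}) &= \sigma^2\left(1 + (h-1)\alpha^2\right)
\end{aligned}
$$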

Interval forecast

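If we assume that $\varepsilon_t \sim N(0, \sigma^2)$,

$$
Y_{t+h} \mid y_{1:t} \sim N\left(\ell_t,\; \sigma^2\left[1 + (h-1)\alpha^2\right]\right).
$$

To compute an interval forecast, we use the estimated values of $\alpha$
and $\sigma^2$:

$$
\hat{\ell}_t \pm z_{\text{crit}} \times \sqrt{\hat{\sigma}^2\left[1 + (h-1)\hat{\alpha}^2\right]},
$$

where

$$
\hat{\sigma}^2 = \frac{\sum_{t=1}^{N} (y_t - \ell_{t-1})^2}{N-1}.
$$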

If the errors are not normal, you should use the Bootstrap method
or other distributional assumptions.
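
The interval formula under normal errors can be sketched as follows.
The function name, the made-up series and the value α = 0.6 (treated
here as if it were the estimate) are assumptions for illustration; the
critical value comes from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def ses_interval(y, alpha, h, coverage=0.95):
    """Normal-theory interval forecast for Y_{N+h}, with l_0 = y_1."""
    level, sse = y[0], 0.0
    for obs in y:
        sse += (obs - level) ** 2                # squared one-step error
        level = alpha * obs + (1 - alpha) * level
    sigma2 = sse / (len(y) - 1)                  # estimate of sigma^2
    z_crit = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z_crit * sqrt(sigma2 * (1 + (h - 1) * alpha ** 2))
    return level - half_width, level, level + half_width


y = [0.71, 0.73, 0.72, 0.75, 0.74, 0.78, 0.77, 0.80]   # made-up values
for h in (1, 4, 8):
    print(h, ses_interval(y, alpha=0.6, h=h))
```

The point forecast is the same at every horizon, while the interval
widens as h grows.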

Example: AUD/USD exchange rate

Trend corrected exponential smoothing

Trend corrected exponential smoothing

The trend corrected or Holt exponential smoothing method
allows for a time-varying trend:

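$$
\begin{aligned}
\hat{y}_{t+1} &= \ell_t + b_t && \text{(forecast equation)} \\
\ell_t &= \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}) && \text{(smoothing equation)} \\
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1} && \text{(trend equation)}
\end{aligned}
$$

for initial values $\ell_0$ and $b_0$, $0 \le \alpha \le 1$, and $0 \le \beta \le 1$.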
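
A minimal sketch of these three equations in plain Python follows. The
function name, the toy series, the parameter values and the initial
values are all assumptions chosen so that the arithmetic is easy to
follow; the lecture does not prescribe an initialisation here.

```python
def holt_smoothing(y, alpha, beta, level0, trend0):
    """Return the final level, final trend and one-step forecasts of y_1..y_N."""
    level, trend = level0, trend0
    forecasts = []
    for obs in y:
        forecasts.append(level + trend)          # forecast of y_t is l_{t-1} + b_{t-1}
        new_level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return level, trend, forecasts


y = [10.0, 12.0, 14.0, 16.0]                     # made-up series rising by 2 each period
level, trend, forecasts = holt_smoothing(y, alpha=0.5, beta=0.5,
                                         level0=8.0, trend0=2.0)
print(level, trend)    # 16.0 2.0
print(forecasts)       # [10.0, 12.0, 14.0, 16.0]
print(level + trend)   # one-step forecast of the next observation: 18.0
```

Because the toy series follows an exact linear trend that matches the
initial values, every one-step forecast is exact and the trend estimate
stays at 2.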

Trend corrected exponential smoothing

Consider the simple time series trend model

$$
\begin{aligned}
\ell_t &= a + b \times t, \\
Y_t &= \ell_t + \varepsilon_t.
\end{aligned}
$$

What is $\ell_t - \ell_{t-1}$ here?
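
Working through the question for the linear trend model above gives
the following short derivation. It suggests one way to read the trend
equation: the change in the level, $\ell_t - \ell_{t-1}$, plays the role
of the slope, which the method then smooths over time.

```latex
\ell_t - \ell_{t-1} = (a + b\,t) - \bigl(a + b\,(t-1)\bigr) = b
```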

Trend corrected exponential smoothing model

The statistical model is

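$$
\begin{aligned}
Y_{t+1} &= \ell_t + b_t + \varepsilon_{t+1}, \\
\ell_t &= \alpha Y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}), \\
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1},
\end{aligned}
$$

where the errors $\varepsilon_t$ are i.i.d. with mean zero and constant
variance $\sigma^2$.

The least squares estimates of $\alpha$ and $\beta$ are

$$
\hat{\alpha}, \hat{\beta} = \underset{\alpha,\beta}{\operatorname{argmin}} \sum_{t=1}^{N} (y_t - \ell_{t-1} - b_{t-1})^2
$$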

Error correction form

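$$
\begin{aligned}
\ell_t &= \alpha Y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}) \\
&= \ell_{t-1} + b_{t-1} + \alpha(Y_t - \ell_{t-1} - b_{t-1}) \\
&= \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t \\[4pt]
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1} \\
&= b_{t-1} + \beta(\ell_t - \ell_{t-1} - b_{t-1}) \\
&= b_{t-1} + \beta(\ell_{t-1} + b_{t-1} + \alpha\varepsilon_t - \ell_{t-1} - b_{t-1}) \\
&= b_{t-1} + \beta\alpha\varepsilon_t
\end{aligned}
$$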

Error correction form

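$$
\begin{aligned}
Y_{t+1} &= \ell_t + b_t + \varepsilon_{t+1} \\
\ell_t &= \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t \\
b_t &= b_{t-1} + \beta\alpha\varepsilon_t
\end{aligned}
$$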

Constant plus noise representation

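$$
\begin{aligned}
Y_{t+1} &= \ell_t + b_t + \varepsilon_{t+1} \\
Y_{t+2} &= \ell_{t+1} + b_{t+1} + \varepsilon_{t+2} \\
&= \ell_t + 2b_t + \alpha(1+\beta)\varepsilon_{t+1} + \varepsilon_{t+2} \\
Y_{t+3} &= \ell_{t+2} + b_{t+2} + \varepsilon_{t+3} \\
&= \ell_{t+1} + 2b_{t+1} + \alpha(1+\beta)\varepsilon_{t+2} + \varepsilon_{t+3} \\
&= \ell_t + 3b_t + \alpha(1+2\beta)\varepsilon_{t+1} + \alpha(1+\beta)\varepsilon_{t+2} + \varepsilon_{t+3} \\
&\;\;\vdots \\
Y_{t+h} &= \ell_t + hb_t + \alpha\sum_{i=1}^{h-1} (1+i\beta)\varepsilon_{t+h-i} + \varepsilon_{t+h}
\end{aligned}
$$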

Point forecast

Constant plus noise representation of future observations:

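$$
Y_{t+h} = \ell_t + hb_t + \alpha\sum_{i=1}^{h-1} (1+i\beta)\varepsilon_{t+h-i} + \varepsilon_{t+h}
$$

From the linearity of expectations, the point forecast for any
horizon h is

$$
\begin{aligned}
\hat{y}_{t+h} &= E(Y_{t+h} \mid y_{1:t}) \\
&= E\left(\ell_t + hb_t + \alpha\sum_{i=1}^{h-1} (1+i\beta)\varepsilon_{t+h-i} + \varepsilon_{t+h} \,\middle|\, y_{1:t}\right) \\
&= \ell_t + hb_t.
\end{aligned}
$$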

Forecast variance

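$$
\begin{aligned}
\operatorname{Var}(Y_{t+1} \mid y_{1:t}) &= \operatorname{Var}(\ell_t + b_t + \varepsilon_{t+1} \mid y_{1:t}) \\
&= \sigma^2 \\
\operatorname{Var}(Y_{t+2} \mid y_{1:t}) &= \operatorname{Var}(\ell_t + 2b_t + \alpha(1+\beta)\varepsilon_{t+1} + \varepsilon_{t+2} \mid y_{1:t}) \\
&= \sigma^2\left(1 + \alpha^2(1+\beta)^2\right) \\
&\;\;\vdots \\
\operatorname{Var}(Y_{t+h} \mid y_{1:t}) &= \operatorname{Var}\left(\ell_t + hb_t + \alpha\sum_{i=1}^{h-1} (1+i\beta)\varepsilon_{t+h-i} + \varepsilon_{t+h} \,\middle|\, y_{1:t}\right) \\
&= \sigma^2\left(1 + \alpha^2\sum_{i=1}^{h-1} (1+i\beta)^2\right)
\end{aligned}
$$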

Forecast equations for the trend corrected smoothing method

Point forecast:

$$
\hat{y}_{t+h} = \hat{\ell}_t + h\hat{b}_t
$$

Variance:

$$
\operatorname{Var}(Y_{t+h} \mid y_{1:t}) = \sigma^2\left(1 + \alpha^2\sum_{i=1}^{h-1} (1+i\beta)^2\right)
$$

We compute interval forecasts as before.
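
The two forecast equations can be combined into an interval forecast
as sketched below, assuming normal errors as in the simple exponential
smoothing case. The function names and all numerical inputs are
hypothetical values chosen only to illustrate the formulas.

```python
from math import sqrt
from statistics import NormalDist


def holt_forecast_variance(sigma2, alpha, beta, h):
    """sigma^2 * (1 + alpha^2 * sum_{i=1}^{h-1} (1 + i*beta)^2)."""
    return sigma2 * (1 + alpha ** 2 * sum((1 + i * beta) ** 2 for i in range(1, h)))


def holt_interval(level, trend, sigma2, alpha, beta, h, coverage=0.95):
    """Point forecast l_t + h*b_t with a normal-theory interval around it."""
    point = level + h * trend
    z_crit = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z_crit * sqrt(holt_forecast_variance(sigma2, alpha, beta, h))
    return point - half_width, point, point + half_width


# Hypothetical fitted values, not estimates from the lecture examples.
print(holt_forecast_variance(sigma2=1.0, alpha=0.5, beta=0.5, h=1))   # 1.0
print(holt_forecast_variance(sigma2=1.0, alpha=0.5, beta=0.5, h=2))   # 1.5625
print(holt_interval(level=16.0, trend=2.0, sigma2=1.0, alpha=0.5, beta=0.5, h=3))
```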

Example: assaults in Sydney

Example: visitor arrivals

Holt-Winters smoothing

Holt-Winters exponential smoothing

The Holt-Winters exponential smoothing method extends the
trend corrected method to seasonal data. It allows for additive or
multiplicative seasonality.

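Additive Holt-Winters Smoothing (key concept)

$$
\begin{aligned}
\hat{y}_{t+1} &= \ell_t + b_t + S_{t+1-L} && \text{(forecast equation)} \\
\ell_t &= \alpha(y_t - S_{t-L}) + (1-\alpha)(\ell_{t-1} + b_{t-1}) && \text{(level)} \\
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1} && \text{(trend)} \\
S_t &= \delta(y_t - \ell_t) + (1-\delta)S_{t-L} && \text{(seasonal indices)}
\end{aligned}
$$

for a seasonal frequency $L$, initial values $\ell_0$, $b_0$, and $S_{i-L}$ for
$i = 1, \dots, L$, and parameters $0 \le \alpha \le 1$, $0 \le \beta \le 1$, $0 \le \delta \le 1$.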
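
The additive recursions can be sketched in plain Python as follows.
The function name, the toy series with seasonal frequency L = 2, the
parameter values and the initial values are all assumptions made for
illustration.

```python
def holt_winters_additive(y, alpha, beta, delta, L, level0, trend0, seasonal0):
    """Run the additive recursions; seasonal0 holds S_{1-L}, ..., S_0."""
    level, trend = level0, trend0
    seasonal = list(seasonal0)                    # seasonal[-L] is always S_{t-L}
    forecasts = []
    for obs in y:
        s_prev = seasonal[-L]
        forecasts.append(level + trend + s_prev)  # one-step forecast of y_t
        new_level = alpha * (obs - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        seasonal.append(delta * (obs - level) + (1 - delta) * s_prev)
    return level, trend, seasonal[-L:], forecasts


# Made-up series: level rises by 1 each period, seasonal pattern (+1, -1).
y = [12.0, 11.0, 14.0, 13.0]
level, trend, seasonal, forecasts = holt_winters_additive(
    y, alpha=0.5, beta=0.25, delta=0.5, L=2,
    level0=10.0, trend0=1.0, seasonal0=[1.0, -1.0])
print(level, trend, seasonal)   # 14.0 1.0 [1.0, -1.0]
print(forecasts)                # [12.0, 11.0, 14.0, 13.0]
```

Since the toy series matches the initial level, trend and seasonal
indices exactly, all one-step errors are zero and the components do not
change.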

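Multiplicative Holt-Winters Smoothing (key concept)

$$
\begin{aligned}
\hat{y}_{t+1} &= (\ell_t + b_t) \times S_{t+1-L} && \text{(forecast equation)} \\
\ell_t &= \alpha(y_t / S_{t-L}) + (1-\alpha)(\ell_{t-1} + b_{t-1}) && \text{(level)} \\
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1} && \text{(trend)} \\
S_t &= \delta(y_t / \ell_t) + (1-\delta)S_{t-L} && \text{(seasonal indices)}
\end{aligned}
$$

for a seasonal frequency $L$, initial values $\ell_0$, $b_0$, and $S_{i-L}$ for
$i = 1, \dots, L$, and parameters $0 \le \alpha \le 1$, $0 \le \beta \le 1$, $0 \le \delta \le 1$.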

Statistical model

As before, we formulate a statistical model by specifying an
observation equation.

Additive:

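$$
Y_{t+1} = \ell_t + b_t + S_{t+1-L} + \varepsilon_{t+1},
$$

where the errors $\varepsilon_t$ are i.i.d. with variance $\sigma^2$.

Multiplicative:

$$
Y_{t+1} = (\ell_t + b_t) \times S_{t+1-L} + \varepsilon_{t+1},
$$

where the errors $\varepsilon_t$ are i.i.d. with variance $\sigma^2$.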

Estimation

We estimate α, β and δ by least squares.

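Additive:

$$
\hat{\alpha}, \hat{\beta}, \hat{\delta} = \underset{\alpha,\beta,\delta}{\operatorname{argmin}} \sum_{t=1}^{N} (y_t - \ell_{t-1} - b_{t-1} - S_{t-L})^2
$$

Multiplicative:

$$
\hat{\alpha}, \hat{\beta}, \hat{\delta} = \underset{\alpha,\beta,\delta}{\operatorname{argmin}} \sum_{t=1}^{N} \left(y_t - (\ell_{t-1} + b_{t-1}) \times S_{t-L}\right)^2
$$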

Forecast equations

Additive:

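$$
\begin{aligned}
\hat{y}_{t+h} &= \hat{\ell}_t + h\hat{b}_t + S_{t-L+1+((h-1) \bmod L)} \\
\operatorname{Var}(Y_{t+h} \mid y_{1:t}) &= \sigma^2\left(1 + \sum_{i=1}^{h-1} \left[\alpha(1+i\beta) + I_{i,L}\,\delta(1-\alpha)\right]^2\right),
\end{aligned}
$$

where mod is the modulo operator, $I_{i,L} = 1$ if $i \bmod L = 0$ and
$I_{i,L} = 0$ otherwise. The seasonal index in the point forecast is the
most recent estimate for the season of period $t+h$.

Multiplicative:

$$
\hat{y}_{t+h} = (\hat{\ell}_t + h\hat{b}_t) \times S_{t-L+1+((h-1) \bmod L)}
$$
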
No simple expression exists for the variance in the multiplicative
model.

Example: assaults in Sydney

The estimated parameters are α̂ = 0.117, β̂ = 0.023, and
δ̂ = 0.370.

Example: visitor arrivals

The estimated parameters are α̂ = 0.154, β̂ = 0.088, and
δ̂ = 0.271.

Damped trend exponential smoothing

Damped trend exponential smoothing

Damped trend exponential smoothing addresses the problem
that extrapolating trends indefinitely into the future can lead to
implausible forecasts.

Model and forecast

Model:

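$$
\begin{aligned}
Y_{t+1} &= \ell_t + \phi b_t + \varepsilon_{t+1}, \\
\ell_t &= \alpha Y_t + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}), \\
b_t &= \beta(\ell_t - \ell_{t-1}) + (1-\beta)\phi b_{t-1},
\end{aligned}
$$

where $\phi$ is the damping parameter, with $0 \le \phi \le 1$.

Forecast equation:

$$
\hat{y}_{t+h} = \ell_t + \phi b_t + \phi^2 b_t + \phi^3 b_t + \dots + \phi^h b_t
$$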

We can extend it to allow for additive or multiplicative seasonality.
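
To see the effect of damping, the sketch below evaluates the forecast
equation for two values of φ. The function name and the final level
and trend are hypothetical. With φ = 1 the forecast is the usual
linear extrapolation; with φ < 1 the geometric sum converges to
φ/(1 − φ), so the forecasts level off at the level plus φ/(1 − φ)
times the trend.

```python
def damped_trend_forecast(level, trend, phi, h):
    """l_t + (phi + phi^2 + ... + phi^h) * b_t."""
    return level + sum(phi ** i for i in range(1, h + 1)) * trend


level, trend = 100.0, 2.0                        # hypothetical final level and trend
for phi in (1.0, 0.8):
    print(phi, [round(damped_trend_forecast(level, trend, phi, h), 2)
                for h in (1, 5, 20, 100)])

# Long-run limit for phi = 0.8: 100 + (0.8 / 0.2) * 2 = 108
print(100.0 + 0.8 / (1 - 0.8) * 2.0)
```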

Illustration: visitor arrivals

Review questions

• What is exponential smoothing?

• What is the difference between simple, trend corrected, and
Holt-Winters exponential smoothing methods?

• Derive the point forecasts and forecast variances for the SES
and trend corrected methods, starting from the model
equations.

• Explain how to compute forecast intervals based on the SES
and trend corrected methods.
