Smoothing with Levels
Australian National University
(James Taylor) 1 / 14
Exponential Smoothing
Overview: Construct forecasts as a weighted average of past
observations – smoothing the observed time series
Heavier weight is given to more recent observations
Weights decrease exponentially with time (thus exponential
smoothing)
Exponential Smoothing
Pro – Intuitive
Pro – Easy to implement
Pro – Forecast performance is often surprisingly good
Con – Only point forecasts are possible, so we cannot analyse statistical properties
A Basic Framework
Suppose a series μ_1, μ_2, … follows a random walk:
μ_t = μ_{t-1} + η_t
where the η_t are iid N(0, σ_η²)
The level μ_t wanders randomly up and down
The optimal h-step-ahead forecast is
E(μ_{T+h} | I_T, θ) = μ_T
The best forecast of a future value is the current value
A Basic Framework
Suppose we don't observe μ_t, but instead observe y_t:
y_t = μ_t + ε_t
where the ε_t are iid N(0, σ_ε²)
What is a good point forecast for y_{T+h}?
If we observed μ_T, then
E(y_{T+h} | μ_T) = E(μ_{T+h} + ε_{T+h} | μ_T) = μ_T
A Basic Framework
y_t = μ_t + ε_t ,  ε_t ∼ N(0, σ_ε²)
μ_t = μ_{t-1} + η_t ,  η_t ∼ N(0, σ_η²)
We don't observe μ_t
We need to estimate μ_T
There are two sources of information: y_T and μ_{T-1}
But we don't observe μ_{T-1} either, so we use y_{T-1} and μ_{T-2},…
and so on
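To make this concrete, here is a short Python sketch (illustrative, not from the slides) that simulates the model above, with assumed values σ_ε = σ_η = 0.5. Plotting μ_t and y_t gives pictures like the figures that follow.

```python
import random

random.seed(0)
T = 200
sigma_eta, sigma_eps = 0.5, 0.5      # assumed noise standard deviations

# hidden level: random walk  mu_t = mu_{t-1} + eta_t
mu, level = [], 0.0
for _ in range(T):
    level += random.gauss(0, sigma_eta)
    mu.append(level)

# observed series: y_t = mu_t + eps_t
y = [m + random.gauss(0, sigma_eps) for m in mu]
```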
Figure: Example random walk data
Figure: Example random walk data seen with noise
Figure: Noisy Data
A Basic Framework
The model
y_t = μ_t + ε_t ,  ε_t ∼ N(0, σ_ε²)
μ_t = μ_{t-1} + η_t ,  η_t ∼ N(0, σ_η²)
is a simple state-space model
Estimation of such models can be quite involved, and usually uses the Kalman filter
State-space models (and the Kalman filter) are the final topic of the course
For the moment we will use an easy smoothing method to produce
reasonable forecasts
Additive Smoothing
Suppose at time t−1 we have a reasonable forecast for y_t
Call this forecast the level L_{t-1}, so that
ŷ_{t|t-1} = L_{t-1}
Then at time t we observe y_t, and we update the level by taking a
weighted average of the old level and the new observation:
L_t = α y_t + (1−α) L_{t-1}, or equivalently
L_t = L_{t-1} + α (y_t − L_{t-1})
Additive Smoothing
L_t = L_{t-1} + α (y_t − L_{t-1})
We adjust the level to correct for the forecast error:
If we underestimated y_t, we make the next forecast L_t larger
If we overestimated y_t, we make the next forecast L_t smaller
The value of α affects the size of the adjustment: for small α the level
changes slowly, for large α it changes quickly
The parameter α is called the smoothing parameter
Simple Additive Smoothing Algorithm
Initialize with L_1 = y_1
For t = 2, …, T update the level L_t via
L_t = α y_t + (1−α) L_{t-1}
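As an illustration, the algorithm above translates directly into Python (an informal sketch; the function name is mine):

```python
def simple_smooth(y, alpha):
    """Simple additive smoothing: return the levels L_1, ..., L_T."""
    L = [y[0]]                            # initialise with L_1 = y_1
    for t in range(1, len(y)):
        L.append(alpha * y[t] + (1 - alpha) * L[-1])
    return L

levels = simple_smooth([10.0, 12.0, 11.0, 13.0], alpha=0.5)
# levels[-1] is the forecast of every future observation
```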
Figure: Example random walk data seen with noise
Why Level Smoothing is Exponential
Exponential Weights
This clearly gives smoothing; but what makes it exponential?
Let's write out the first few levels:
L_1 = y_1
L_2 = α y_2 + (1−α) L_1 = α y_2 + (1−α) y_1
L_3 = α y_3 + (1−α) L_2 = α y_3 + α(1−α) y_2 + (1−α)² y_1
L_4 = α y_4 + (1−α) L_3 = α y_4 + α(1−α) y_3 + α(1−α)² y_2 + (1−α)³ y_1
Exponential Weights
We can show that
L_t = α y_t + α(1−α) y_{t-1} + α(1−α)² y_{t-2} + ⋯ + (1−α)^{t-1} y_1
The weight declines exponentially (by a constant factor of 1−α) with each
step back in time.
Aside – There is a small issue at t = 1: the first observation gets weight (1−α)^{t-1} rather than α(1−α)^{t-1}, but for moderate t the difference is vanishingly small.
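This identity is easy to check numerically; the following Python snippet (mine, not part of the slides) confirms that the recursive and weighted-sum forms of L_t agree:

```python
alpha = 0.3
y = [2.0, 5.0, 3.0, 7.0, 4.0]

# recursive form: L_t = alpha*y_t + (1 - alpha)*L_{t-1}
L = y[0]
for obs in y[1:]:
    L = alpha * obs + (1 - alpha) * L

# weighted-sum form: weights decline by a constant factor (1 - alpha)
t = len(y)
weights = [alpha * (1 - alpha) ** k for k in range(t - 1)] + [(1 - alpha) ** (t - 1)]
L_sum = sum(w * obs for w, obs in zip(weights, reversed(y)))

assert abs(L - L_sum) < 1e-12
```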
h-step-ahead Forecast
One-step-ahead forecast: ŷ_{T+1|T} = L_T
Two-step-ahead forecast: we need to use ŷ_{T+1|T} in place of y_{T+1}:
ŷ_{T+2|T} = α ŷ_{T+1|T} + (1−α) L_T
= α L_T + (1−α) L_T
= L_T
In general, the h-step-ahead forecast is
ŷ_{T+h|T} = L_T
Holt-Winters Smoothing
Data with Trend
The additive smoothing discussed previously only applies to data
without trend
Consider a model with an evolving local level, but also a trend with
an evolving local slope:
y_t = μ_t + λ_t t + ε_t ,  ε_t ∼ N(0, σ_ε²)
μ_t = μ_{t-1} + η_t ,  η_t ∼ N(0, σ_η²)
λ_t = λ_{t-1} + ν_t ,  ν_t ∼ N(0, σ_ν²)
Figure: Example data with evolving trend with σ_ε = 0.5, σ_ν = 0.1, and σ_η = 0.5
Holt-Winters Smoothing
The Holt-Winters smoothing method is a two-component approach
The forecast ŷ_{t|t-1} has two components, L_{t-1} and b_{t-1}
These are a 'level' and a 'slope':
ŷ_{t|t-1} = L_{t-1} + b_{t-1}
Holt-Winters Smoothing – Updating
The level and slope are updated according to
L_t = α y_t + (1−α) ŷ_{t|t-1} = α y_t + (1−α)(L_{t-1} + b_{t-1})
b_t = β ΔL_t + (1−β) b_{t-1} = β (L_t − L_{t-1}) + (1−β) b_{t-1}
There are now two smoothing parameters, α and β
The level update is the same as before, except the old forecast is now ŷ_{t|t-1} = L_{t-1} + b_{t-1}
The slope b_t is a weighted average of the previous slope and the change in level
Holt-Winters Smoothing Algorithm
Initialize with L_1 = y_1 and b_1 = y_2 − y_1
For t = 2, …, T update the level L_t and slope b_t via
L_t = α y_t + (1−α)(L_{t-1} + b_{t-1})
b_t = β (L_t − L_{t-1}) + (1−β) b_{t-1}
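The algorithm above as a Python sketch (illustrative; names are mine):

```python
def holt_winters(y, alpha, beta):
    """Holt-Winters smoothing with trend: return the final level and slope."""
    L, b = y[0], y[1] - y[0]              # initialise: L_1 = y_1, b_1 = y_2 - y_1
    for t in range(1, len(y)):
        L_new = alpha * y[t] + (1 - alpha) * (L + b)
        b = beta * (L_new - L) + (1 - beta) * b
        L = L_new
    return L, b

L_T, b_T = holt_winters([1.0, 2.0, 3.0, 4.0], alpha=0.5, beta=0.5)
# on a perfectly linear series the level tracks the data and the slope stays at 1
```

The h-step-ahead forecast is then L_T + h·b_T.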
h-step-ahead Forecast
One-step-ahead: ŷ_{T+1|T} = L_T + b_T
Two-step-ahead: use ŷ_{T+1|T} instead of y_{T+1}, so
L_{T+1|T} = α ŷ_{T+1|T} + (1−α)(L_T + b_T)
= α (L_T + b_T) + (1−α)(L_T + b_T) = L_T + b_T
b_{T+1|T} = β (L_{T+1|T} − L_T) + (1−β) b_T
= β (L_T + b_T − L_T) + (1−β) b_T = b_T
ŷ_{T+2|T} = L_{T+1|T} + b_{T+1|T} = L_T + 2 b_T
In general:
ŷ_{T+h|T} = L_T + h b_T
Holt-Winters and US GDP
Forecasting U.S. GDP with Holt-Winters Smoothing
We've looked at US GDP data using linear, quadratic and cubic trend
models.
We found the quadratic performed best in terms of both AIC/BIC and
MSFE
i.e. by both in-sample and pseudo-out-of-sample forecasting measures
Forecasting U.S. GDP with Holt-Winters Smoothing
Now we will use Holt-Winters smoothing to produce forecasts for
h = 4 (one year ahead, as the data is quarterly) and h = 20 (five years ahead)
We will use three sets of smoothing parameters: α = β = 0.8,
α = β = 0.5 and α = β = 0.2
T0 = 40; h = 4;                 % evaluation start and forecast horizon
syhat = zeros(T-h-T0+1,1);      % storage for forecasts (T = length(y))
ytph = y(T0+h:end);             % observed y_{t+h}
alpha = .5; beta = .5;          % smoothing parameters
Lt = y(1); bt = y(2) - y(1);    % initialise level and slope
for t = 2:T-h
    newLt = alpha*y(t) + (1-alpha)*(Lt+bt);
    newbt = beta*(newLt-Lt) + (1-beta)*bt;
    yhat = newLt + h*newbt;     % h-step-ahead forecast made at time t
    Lt = newLt; bt = newbt;     % update Lt and bt
    if t >= T0                  % store the forecasts for t >= T0
        syhat(t-T0+1,:) = yhat;
    end
end
MSFE = mean((ytph-syhat).^2);
Out-of-sample Forecast Results

         quadratic trend   α, β = 0.2   α, β = 0.5   α, β = 0.8
h = 4         170400          82515        61051        61295
h = 20        413280         814090      1253500      1561500

Table: MSFE for various models of US GDP

For short-horizon forecasts the Holt-Winters method performs much
better
But for long-horizon forecasts it is much worse; it is probably worth taking
logs of the data
Figure: Holt-Winters forecast for h = 4 and α, β = 0.5, and real US GDP data
Holt-Winters with Seasonality
Holt-Winters Smoothing with Seasonality
If our data has seasonality as well, we simply add an extra component
to the point forecast:
ŷ_{t|t-1} = L_{t-1} + b_{t-1} + S_{t-s}
where s is the periodicity of the seasonality
E.g. usually s = 4 for quarterly data and s = 12 for monthly data
Updating Formulae
Level, slope and seasonality are updated as:
L_t = α (y_t − S_{t-s}) + (1−α)(L_{t-1} + b_{t-1})
b_t = β (L_t − L_{t-1}) + (1−β) b_{t-1}
S_t = γ (y_t − L_t) + (1−γ) S_{t-s}
Three smoothing parameters: α for the level, β for the slope, and γ for the
seasonal change
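One update step of these three formulae in Python (an illustrative sketch; S_old stands for the seasonal term from s periods ago, S_{t−s}):

```python
def hw_seasonal_update(y_t, L_prev, b_prev, S_old, alpha, beta, gamma):
    """One Holt-Winters update with seasonality; returns (L_t, b_t, S_t)."""
    L = alpha * (y_t - S_old) + (1 - alpha) * (L_prev + b_prev)
    b = beta * (L - L_prev) + (1 - beta) * b_prev
    S = gamma * (y_t - L) + (1 - gamma) * S_old
    return L, b, S

# example: the observation comes in exactly as the components predict
L, b, S = hw_seasonal_update(y_t=12.0, L_prev=10.0, b_prev=1.0, S_old=1.0,
                             alpha=0.5, beta=0.5, gamma=0.5)
```

When the observation equals the forecast L_{t-1} + b_{t-1} + S_{t-s}, the level simply advances by the slope and the slope and seasonal terms are unchanged.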
What does it mean when the data is high?
Suppose we get Quarter 1 data back and it's higher than expected.
The model suggests it's some combination of:
random upward noise (transient; ignore)
a random upward movement of the underlying level (permanent
once-off; increase the level)
an increase in the 'slope' (permanent slope increase; increase the slope)
Quarter 1's being a bit higher from now on (permanent once-off for
future Quarter 1's; increase S_t)
h-step-ahead Forecast
One-step-ahead: ŷ_{T+1|T} = L_T + b_T + S_{T+1-s}
Two-step-ahead: use ŷ_{T+1|T} instead of y_{T+1} for updating, to get
ŷ_{T+2|T} = L_T + 2 b_T + S_{T+2-s}
The h-step-ahead forecast (for h ≤ s) is
ŷ_{T+h|T} = L_T + h b_T + S_{T+h-s}
Australian Retail Sales
Holt-Winters with Seasonality Example
Forecasting Australian Retail Sales with Holt-Winters
Smoothing
We've looked at Australian Retail sales previously, with various
seasonal trend models.
We found the specifications all performed very similarly, and usually
not very well
Forecasting Australian Retail Sales with Holt-Winters
Smoothing
Now we produce forecasts with Holt-Winters smoothing with seasonality
Forecast horizons: 1- and 2-step-ahead
Three sets of smoothing parameters: α = β = γ = 0.8,
α = β = γ = 0.5, and α = β = γ = 0.2
Matlab Code
T0 = 15;
h = 1;
s = 4;                              % periodicity of seasonality
syhat = zeros(T-h-T0+1,1);
ytph = y(T0+h:end);                 % observed y_{t+h}
alpha = .5; beta = .5; gamma = .5;  % smoothing parameters
St = zeros(T-h,1);
% initialise. Many options exist; this is an easy one
Lt = mean(y(1:s)); bt = 0; St(1:s) = y(1:s) - Lt;
Matlab Code
for t = s+1:T-h
    newLt = alpha*(y(t) - St(t-s)) + (1-alpha)*(Lt+bt);
    newbt = beta*(newLt-Lt) + (1-beta)*bt;
    St(t) = gamma*(y(t)-newLt) + (1-gamma)*St(t-s);
    yhat = newLt + h*newbt + St(t+h-s);
    Lt = newLt; bt = newbt;     % update Lt and bt
    if t >= T0                  % store the forecasts for t >= T0
        syhat(t-T0+1,:) = yhat;
    end
end
MSFE = mean((ytph-syhat).^2);
Out-of-sample Forecast Results
Table: MSFE comparing Seasonal Dummies to Holt-Winters

        Seasonal Dummies   α, β, γ = 0.2   α, β, γ = 0.5   α, β, γ = 0.8
h = 1        1.3405            0.7204          0.4573          0.7466
h = 2        1.6084            0.8934          0.4648          1.0113

Holt-Winters generally works better than just using seasonal dummies
Smoothing parameters around 0.5 work well
For large h and large smoothing parameters, the model does not work
very well
Holt-Winters Forecasts
Figure: Holt-Winters forecast for h = 1 and α, β, γ = 0.5, and Australian Retail
data
Optimising Holt-Winters Parameters
Being very tricky – Optimising Holt-Winters
If we want to be very tricky, we can optimise our Holt-Winters process:
Make a function file which takes in the smoothing parameter values (α,
β, and γ if there is seasonality) and calculates the MSFE
Make an m-file which minimises the preceding function with respect to
those parameters
Make even better forecasts.
There is a small amount of statistical justification for this approach,
but not much.
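The slides do this with a Matlab function file and a numerical optimiser; the sketch below (mine, not from the slides) uses Python and a plain grid search over α and β instead, which is enough to show the idea:

```python
import itertools

def msfe(y, alpha, beta, h=1, T0=5):
    """Pseudo-out-of-sample MSFE of Holt-Winters forecasts for given parameters."""
    L, b = y[0], y[1] - y[0]
    errs = []
    for t in range(1, len(y) - h):
        L_new = alpha * y[t] + (1 - alpha) * (L + b)
        b = beta * (L_new - L) + (1 - beta) * b
        L = L_new
        if t + 1 >= T0:                      # start evaluating after a burn-in
            errs.append((y[t + h] - (L + h * b)) ** 2)
    return sum(errs) / len(errs)

y = [0.1 * t + 0.05 * (-1) ** t for t in range(40)]   # toy trending series
grid = [i / 10 for i in range(1, 10)]                 # candidate alpha, beta values
best = min(itertools.product(grid, grid), key=lambda p: msfe(y, *p))
```

A proper optimiser (e.g. fminsearch in Matlab) does the same minimisation more efficiently and without restricting the parameters to a grid.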
Optimising Holt-Winters for Aus Retail
If you do this for Aus Retail, you get nice results
The best parameters are α = 0.4556, β = 0.2459, and γ = 0.9173.
This suggests the seasonality changes quickly, the slope changes slowly,
and the level changes at a moderate pace.
The MSFE is 0.3667; the best we had previously was 0.4573 – about a
20% reduction.
Holt-Winters Optimisation
Figure: Holt-Winters MSFE for Aus Retail data with γ = 0.9173 and various α, β
Optimising Holt-Winters for US GDP
If you do this for US GDP, you get very strange results
The 'best' parameters are α = 1.42 and β = 0.035.
The outcome β ≈ 0 means the slope barely changes.
The outcome α > 1 means that if we observe a value above the
forecast, we should increase the level to above even the observation
(very strange).
Possible reason – Very strong feedback loops (unlikely)
Possible reason – The data is exponential, and should have been
logged first