
Some Simple Statistical Models

Australian National University

(James Taylor) 1 / 24


Introduction

Forecast: A statement about a future observable, given some set of

current information

Framework:

yt is the quantity of interest at time t

could be any data: inflation rate, exchange rate, GDP, annual number

of tourists, course grades, etc.

we usually observe y1, y2, . . . , yT

and want to describe yT+h.

ŷT+h will denote the point forecast, our “best guess”

(James Taylor) 2 / 24

Main Approaches

Two main approaches: ad-hoc methods and model-based methods

Ad-hoc Methods:

Use rules of thumb, often reasonable

Example 1: Let ŷT+h = yT . That is, just use the last observed value

Example 2: Let ŷT+h be some weighted average of past observations

(James Taylor) 3 / 24


Ad-hoc Methods – Pros and Cons

Pros:

Appear reasonable (mostly)

Does not require us to specify a model

Modelling is a lot of work

We may not have a good model in mind

Easy to implement

Cons:

Doesn’t make the best use of the data

Not statistically justified

Difficult to analyse statistical properties of the forecast

(James Taylor) 4 / 24

Model-Based Approach

Build a statistical model, thinking carefully about the key features of

the data

Specify how the data are generated

Estimate the model parameters based on past observations

Use this to produce forecasts

(James Taylor) 5 / 24

Model-Based Approach – Pros and Cons

Pros:

Unified approach – can attack different problems using the same

approach

Can model relevant features of the data to produce good forecasts

Can analyse statistical properties of the forecasts

Cons:

Modelling is hard

Requires a background in statistical inference

Requires more involved programming

(James Taylor) 6 / 24

Data Generating Process

A model is a stylized description of the object of interest

We will work with statistical models, which describe how the data are

generated

We’ll introduce a range of models which can generate

Autocorrelation

Trends

Cycles

(James Taylor) 7 / 24

The AR(1) Process

Fix y1 = 0

For t = 2 onward, generate yt according to

yt = ρyt−1 + et , et ∼ N(0, σ²)

This is called a first-order autoregressive process or AR(1)

For ρ = 1, it is called a random walk. Why?

For |ρ| < 1, the process is stationary. Why?

(James Taylor) 8 / 24

Example AR(1) Process

(a) Stationary Process (b) Random Walk

Figure: AR(1) Processes

(James Taylor) 9 / 24

Code for AR(1)

T = 100;
y = [1:T]';
b = .8; a = 1;
y(1) = 0;
for t = 2:T
    y(t) = b*y(t-1) + a*randn;
end
x = 1:T;
plot(x,y)

(James Taylor) 10 / 24

The MA(1) Process

Set e0 = 0 and draw e1, e2, . . . independently from N(0, σ²)

For t = 1 onward, generate yt according to

yt = θet−1 + et

This is called a first-order moving average process or MA(1).

(James Taylor) 11 / 24

Example MA(1) Processes

(a) Small θ (b) Large θ

Figure: MA(1) Processes

(James Taylor) 12 / 24

Code for MA(1)

T = 100;
e = [1:T]';
a = 1; b = 0.8;
for t = 1:T
    e(t) = a*randn;
end
y = [1:T]';
y(1) = e(1);
for t = 2:T
    y(t) = e(t) + b*e(t-1);
end
x = [1:T]';
plot(x,y)

(James Taylor) 13 / 24

Models with Trend and Cycle

Want a time series y1, y2, . . . where

yt = mt + ct + et , et ∼ N(0, σ²)

with

Trend component mt

Cycle component ct

Error term et

(James Taylor) 14 / 24

Example - Model with trend and cycle

Let trend be linear: mt = a0 + a1t

so a0 is the level and a1 is the slope of the trend

Let cycle be sinusoidal: ct = b1 sin(ωt) + b2 cos(ωt)

amplitude and position are determined by b1, b2

frequency is determined by ω

(James Taylor) 15 / 24

Example Cyclical Processes

(a) With drift (b) Without drift

Figure: Trend + Cycle Models

(James Taylor) 16 / 24

Code of model

T = 50;
y = [1:T]';
a0 = 0; a1 = .5; c = .8;
b1 = 1; b2 = 2; w = 1;
for t = 1:T
    y(t) = a0 + a1*t + b1*sin(w*t) + b2*cos(w*t) + c*randn;
end
x = 1:T;
plot(x,y)

(James Taylor) 17 / 24

Regression Model

Often we observe data in addition to the variable of interest

E.g. if forecasting international visitor arrivals, we also know

Exchange rates

Fuel costs

Major events in other tourist markets

We may be able to use this to generate a better forecast

(James Taylor) 18 / 24

Let xt−1 = (x1,t−1, . . . , xk,t−1) be a (row) vector of data for predicting yt .

The linear regression model specifies the linear relationship between yt and the regressors as

yt = x1,t−1 β1 + · · · + xk,t−1 βk + et , et ∼ N(0, σ²)

where (β1, . . . , βk)′ is a (column) vector of regression coefficients

(James Taylor) 19 / 24

Simple Example

Let T = 3, k = 2. Then the regression model is

y1 = x1,0 β1 + x2,0 β2 + e1

y2 = x1,1 β1 + x2,1 β2 + e2

y3 = x1,2 β1 + x2,2 β2 + e3

In applications T will (must) be much bigger than k.

(James Taylor) 20 / 24

Matrix notation

Usually we want to write this more concisely in matrix form. In general

(y1, y2, . . . , yT)′ = X (β1, . . . , βk)′ + (e1, e2, . . . , eT)′

where X is the T × k matrix whose t-th row is (x1,t−1, x2,t−1, . . . , xk,t−1), or more conveniently

y = Xβ + e

(James Taylor) 21 / 24

Simple Example redux

(y1, y2, y3)′ = X (β1, β2)′ + (e1, e2, e3)′

with X the 3 × 2 matrix with rows (x1,0, x2,0), (x1,1, x2,1), (x1,2, x2,2), and

(e1, e2, e3)′ ∼ N(0, σ² I3)

that is, the errors have mean zero and diagonal covariance matrix with σ² on the diagonal.

(James Taylor) 22 / 24

Regression Model Finale

By making an appropriate change of variable (as we will see in a future lecture), the regression model gives that y follows the multivariate normal distribution:

y ∼ N(Xβ, σ² IT)

(James Taylor) 23 / 24

Combining these models

Very very often we will want to combine models, to make an ARX, or ARMA, or ARMAX model (for example).

An ARX model would look like:

yt = ρyt−1 + x1,t−1 β1 + · · · + xk,t−1 βk + et , et ∼ N(0, σ²)

(James Taylor) 24 / 24

Point Forecasts

Iterated and Direct Forecasts

Australian National University

(James Taylor) 1 / 14

Forecast Horizon

The forecast horizon is the number of periods between the current period and the period which we forecast

Example: Annual GDP data, forecast GDP one year from now; forecast horizon is one

Example: Quarterly inflation data, forecast inflation one year from now; forecast horizon is four

Example: Monthly sales data, forecast sales one year from now; forecast horizon is twelve

Models for the forecast horizon when h > 1

Iterated Forecasts

Direct Forecasts

(James Taylor) 2 / 14



Forecast Horizon – AR(1)

Usual AR(1) process:

yt = ρyt−1 + et , et ∼ N(0, σ²)

We observe y1, . . . , yT , and want to forecast yT+1

Assume we know the parameters ρ and σ.

What is a reasonable ŷT+1?

(James Taylor) 3 / 14


As yT+1 is a random variable, a reasonable estimate might be the

conditional expected value

Let It denote the information set at time t; then

ŷT+1 = E(yT+1 | IT , (ρ, σ))

= E(ρyT + eT+1 | IT , (ρ, σ))

= E(ρyT | IT , (ρ, σ)) + E(eT+1 | IT , (ρ, σ))

= ρyT

(James Taylor) 4 / 14
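The result ŷT+1 = ρyT can also be checked by simulation: averaging ρyT + eT+1 over many draws of eT+1 recovers the conditional expectation, since the error has mean zero given IT. A minimal Python sketch (the slides' own code is MATLAB; the ρ, σ and yT values here are illustrative, not from the slides):

```python
import random

# Monte Carlo check that E(y_{T+1} | I_T) = rho * y_T for the AR(1) process:
# average rho*y_T + e over many draws of e ~ N(0, sigma^2).
random.seed(0)
rho, sigma, y_T = 0.8, 1.0, 2.0

draws = [rho * y_T + random.gauss(0.0, sigma) for _ in range(100_000)]
mc_mean = sum(draws) / len(draws)

print(abs(mc_mean - rho * y_T) < 0.05)  # True: the error term averages out
```

The simulated mean settles near ρyT = 1.6, which is exactly the point forecast derived above.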


Two-step-ahead Forecast

What about ŷT+2 ?

Can’t just use E(yT+2 | IT+1, (ρ, σ)) because we don’t know IT+1.

One option – iterate the AR(1) process

(James Taylor) 5 / 14

Iterated Forecasts

So we have

yT+2 = ρyT+1 + eT+2

= ρ(ρyT + eT+1) + eT+2

= ρ²yT + ρeT+1 + eT+2

Then taking conditional expectation we find

E(yT+2 | IT , (ρ, σ)) = ρ²yT

This is an iterated forecast.

(James Taylor) 6 / 14



More Iterations

If y is an AR(1) process it is straightforward to show that

E(yT+h | IT , (ρ, σ)) = ρ^h yT

From this we see that for random walks

E(yT+h | IT , (ρ, σ)) = yT

and for stationary processes

lim h→∞ E(yT+h | IT , (ρ, σ)) = 0

(James Taylor) 7 / 14
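The three cases above can be illustrated in a few lines of Python (the slides' own code is MATLAB; ρ and yT values are illustrative): iterating the AR(1) recursion h times gives ρ^h yT, which stays at yT for a random walk and decays towards 0 for a stationary process.

```python
# Iterated h-step-ahead AR(1) point forecast: E(y_{T+h} | I_T) = rho^h * y_T,
# obtained by applying y-hat_{T+j} = rho * y-hat_{T+j-1} forward h times.

def iterated_forecast(rho, y_T, h):
    y_hat = y_T
    for _ in range(h):
        y_hat = rho * y_hat
    return y_hat

y_T = 2.0
print(iterated_forecast(0.8, y_T, 2))    # rho^2 * y_T
print(iterated_forecast(1.0, y_T, 50))   # random walk: forecast stays at y_T
print(iterated_forecast(0.8, y_T, 200))  # stationary: decays towards 0
```

The loop makes the "iterated" idea concrete: each step reuses the previous step's forecast in place of the unobserved value.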


Direct Forecast

Instead of producing iterated forecasts, we could instead re-specify the

model to

yt+h = ρ̃yt + et+h

then find that E(yT+h | IT , (ρ̃, σ)) = ρ̃yT .

This is a direct h-step-ahead forecast

This is no longer an AR(1) model

It behaves very differently to our original model

(James Taylor) 8 / 14


Iterated vs Direct

May give quite different forecasts

Iterated forecasts often perform better than direct forecasts

Especially for large forecast horizons

But, we can’t always do an iterated forecast

(James Taylor) 9 / 14

Linear Regression Forecasts

Recall the linear model

yt = xt−1 β + et , et ∼ N(0, σ²)

Iterated forecasts are not possible for h > 1, because

yT+h = xT+h−1 β + eT+h

and we have no idea what xT+h−1 could be.

No recursive relationship between xt and xt−1

(James Taylor) 10 / 14


Linear Regression Forecasts

So instead we will use a direct forecast model

yt = xt−h β̃ + et , et ∼ N(0, σ²)

So that

yT+h = xT β̃ + eT+h, eT+h ∼ N(0, σ²)

Taking conditional expectation we find

E(yT+h | IT , (β̃, σ)) = xT β̃

(James Taylor) 11 / 14
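The direct forecast is then just the inner product of the current regressor row xT with the direct-model coefficients β̃. A minimal Python sketch; the regressor and coefficient values below are made up for illustration:

```python
# Direct h-step-ahead forecast from the re-specified regression
# y_t = x_{t-h} * beta-tilde + e_t, so E(y_{T+h} | I_T) = x_T * beta-tilde.

def direct_forecast(x_T, beta_tilde):
    # Dot product of the current regressor row with the coefficient vector.
    return sum(x * b for x, b in zip(x_T, beta_tilde))

x_T = [1.0, 0.75, -0.2]       # hypothetical regressors, e.g. intercept, exchange rate, fuel cost
beta_tilde = [0.5, 2.0, 1.0]  # hypothetical estimated coefficients

print(direct_forecast(x_T, beta_tilde))  # 0.5 + 1.5 - 0.2 = 1.8
```

No iteration is needed: because the model links yT+h directly to data observed at time T, the forecast only ever uses quantities in IT.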


Short vs Long Term

The forecast horizon will affect the choice of forecasting model.

For example, for GDP forecasts:

for near future forecast the short-term business cycle fluctuations will

drive almost all changes

for long horizon forecast, the business cycle matters very little, and

the trend component becomes important

(James Taylor) 12 / 14


Short vs Long Term – Trend-cycle model

yt = mt + ct + et , et ∼ N(0, σ²)

with cycle component

ct = b1 sin(ωt) + b2 cos(ωt)

The ct is bounded in time:

|ct| = |b1 sin(ωt) + b2 cos(ωt)|

≤ |b1 sin(ωt)| + |b2 cos(ωt)|

≤ |b1| + |b2|

(James Taylor) 13 / 14
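The bound can be verified numerically. A small Python sketch using the same illustrative b1, b2, ω values as the earlier simulation code:

```python
import math

# The cycle c_t = b1*sin(w*t) + b2*cos(w*t) never exceeds |b1| + |b2| in
# absolute value, by the triangle inequality on the slide. Check many periods:
b1, b2, w = 1.0, 2.0, 1.0
bound = abs(b1) + abs(b2)

cycle = [b1 * math.sin(w * t) + b2 * math.cos(w * t) for t in range(1, 1001)]
print(max(abs(c) for c in cycle) <= bound)  # True
```

In fact the tight bound is sqrt(b1² + b2²), but |b1| + |b2| is enough for the argument: the cycle cannot grow with t.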


Short vs Long Term – Trend-cycle model

While the trend term mt is typically unbounded in time.

For example, if we specify a linear trend mt = a0 + a1t, then |mt | is
unbounded.

As |ct | is bounded, and |mt | is not, we find

lim t→∞ |yt/mt| = 1

That is, for large t, the variable yt is determined almost entirely by

the trend component.

(James Taylor) 14 / 14


Interval and Density Forecasts

Australian National University

(James Taylor) 1 / 8


More Informative Forecasts

In previous lecture modules we have discussed only point forecasts.

This gives our best guess for ŷT+h.

But a single value might not be enough information to aid decision

making

So Interval Forecasts, and

Density Forecasts

(James Taylor) 2 / 8

Interval Forecast

Forecast GDP growth rate next quarter

Point forecast: maybe −10%

How confident are we of this forecast? Are we nearly certain? Is it

just a guess?

What is the variability of the forecast?

(James Taylor) 3 / 8


Interval Forecast

Better Forecast: with probability 0.95 the growth rate will fall in

(−30%, −5%).

This is an interval forecast

While a point forecast gives a very succinct summary, the interval

forecast tells us something about the forecast uncertainty

(James Taylor) 4 / 8

Density Forecasts

Can we do even better than an interval forecast?

The future observable yT+h is a random variable

So all the information about yT+h is summarized in its probability

density function

What is a suitable pdf for yT+h ?

A good estimate would be the conditional density f (yT+h | IT , θ).

This is a density forecast.

(James Taylor) 5 / 8

Concerns

There are some issues with using f (yT+h | IT , θ)

It assumes θ is known.

It implicitly assumes a particular data generation process

So ignores both parameter and model uncertainty

We can deal with the first problem; the latter is trickier

(James Taylor) 6 / 8

AR(1) Density Forecast

yt = ρyt−1 + et , et ∼ N(0, σ²)

We want a density forecast for yT+1.

yT+1 = ρyT + eT+1, eT+1 ∼ N(0, σ²)

ρ and yT are known, just eT+1 is unknown.

But we know the distribution of eT+1.

So yT+1 ∼ N(ρyT , σ²)

(James Taylor) 7 / 8

AR(1) Density Forecast

yT+1 ∼ N(ρyT , σ²)

That is, yT+1 is distributed according to a normal distribution with mean ρyT and variance σ².

So a 95% interval forecast is (ρyT − 1.96σ, ρyT + 1.96σ).

(James Taylor) 8 / 8
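Putting the pieces together, the one-step point, interval and density forecasts for the AR(1) are easy to compute. A Python sketch with illustrative parameter values (the slides' own code is MATLAB):

```python
# One-step AR(1) forecasts: density y_{T+1} ~ N(rho*y_T, sigma^2),
# point forecast rho*y_T, and 95% interval (rho*y_T -/+ 1.96*sigma).
rho, sigma, y_T = 0.8, 1.0, 2.0

mean = rho * y_T               # point forecast, also the mean of the density forecast
lower = mean - 1.96 * sigma
upper = mean + 1.96 * sigma

print(mean)            # point forecast
print((lower, upper))  # 95% interval forecast, roughly (-0.36, 3.56)
```

The interval is centred on the point forecast, and its width depends only on σ: a noisier process gives a wider, less informative interval.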