Models for Stationary Time Series

MAS 640

1/22/2018

Outline

I Residual Analysis and The Autocorrelation Function (ACF)
I Moving Average Models
I Autoregressive Models

Bigger Picture

I In Lecture 2 our focus was on detrending non-stationary time
series

I Procedure:
I Plot time series
I Determine appropriate regression model
I Fit that model
I Diagnose it’s goodness

I Now we have residuals
I In a typical regression course, we stop here
I In this course, we try to model those residuals

Comments

Yt = µt + εt

I We’re breaking this into pieces, slowly building up to a
comprehensive class of Time Series models.

I Model µt with a regression model
I Model εt with a time series model

Residual Analysis

I Previously we eyeballed the residual scatterplot and said it "looks ok" (not perfect, but common practice)

I Two alternatives:
I Runs test
I Estimate and plot the autocorrelation of the standardized residuals for a number of time lags

I Independence is particularly important to us now
I If correlations remain in the residuals, we will model that correlation with a time series model
I A plot of the estimated autocorrelation is particularly useful, because it will help determine which model to fit

Sample Autocorrelation Function (ACF)

I Estimating and plotting the ACF is essential
I When we detrend the data, we are hoping to get a stationary time series
I It's hard to see if remaining correlations exist, but if they do, we want to model them
I The ACF will help determine how we model them

Sample Autocorrelation

I Under independence, the sample ACF values of the standardized residuals are approximately N(0, 1/n)
I Estimate and plot the sample ACF using the standardized residuals
I If certain lags fall outside the ±2/√n bounds, we suspect remaining correlation in the residuals
I Plot using acf(rstudent(FITTED MODEL))
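A minimal R sketch of this check, assuming a ts object y with a roughly linear trend (the name y and the straight-line model are illustrative assumptions, not from the slides):

fit <- lm(y ~ time(y))            # detrend with a simple linear regression
plot(rstudent(fit), type = "o")   # the residual scatterplot we used to eyeball
acf(rstudent(fit))                # sample ACF; dashed bounds are about ±2/sqrt(n)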

Autocorrelation Plot Example – gtemp2 Data

[Figure: ACF of rstudent(fit) for the gtemp2 model; x-axis: Lag (1–15), y-axis: ACF]

Autocorrelation Plot Example – gold Data

[Figure: ACF of rstudent(fit) for the gold model; x-axis: Lag (1–20), y-axis: ACF]

Autocorrelation Plot Example – beersales Data

[Figure: ACF of rstudent(fit) for the beersales model; x-axis: Lag (1–20), y-axis: ACF]

Moving Average and Autoregressive Processes

I Today we cover the two basic models for stationary time series
I Moving averages
I Autoregressive

Moving Average Processes

I Suppose et is a mean-0 white noise process with var(et) = σe².
I The process:

Yt = et + θ1et−1 + θ2et−2 + · · ·+ θqet−q

I is called a moving average process of order q, and is
denoted MA(q)

I Today's value is a weighted combination of the current and previous q error terms

Comment on coefficient signs. . .

I For technical reasons, this model is often written as:

Yt = et − θ1et−1 − θ2et−2 − · · · − θqet−q

I R uses positive signs, so we’ll stick with that.
I Might need to check on this if using other software.

MA(1) Process

The 1st order moving average process, denoted MA(1) is

Yt = et + θ1et−1

Properties of an MA(1) Process
1. Mean: E (Yt) = 0
2. Variance:

var(Yt) = var(et + θ1et−1)
        = var(et) + θ1² var(et−1) + 2θ1 cov(et, et−1)
        = σe² + θ1² σe²
        = σe² (1 + θ1²)

3. Autocorrelation function (ACF):

ρk = 1                for k = 0
   = θ1 / (1 + θ1²)   for k = 1
   = 0                for k > 1

Properties of an MA(1) Process

I This process has no correlation beyond lag 1!
I Observations 1 time unit apart are correlated, but observations more than 1 time unit apart are not
I Important to keep in mind when we consider models for real data using empirical evidence
I i.e., when we look at ACF plots and see high correlation at lag 1 but little to no correlation at higher lags

Properties of an MA(1) Process

The following theoretical properties apply to an MA(1) process

I When θ1 = 0, the MA(1) process reduces to white noise

Yt = et + 0et−1 = et

I θ1 is restricted to have absolute value less than 1 (for invertibility)

Properties of an MA(1) Process

I As θ1 ranges from -1 to 1, the lag 1 autocorrelation ρ1 ranges
from -0.5 to 0.5

I For θ1 = −1:

ρ1 = −1 / (1 + (−1)²) = −0.5

I For θ1 = 1:

ρ1 = 1 / (1 + 1²) = 0.5

I Observing lag 1 autocorrelation well outside of this range is
inconsistent with the MA(1) model
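These endpoints can be verified numerically with stats::ARMAacf, which uses the same positive-sign MA convention as these slides and as R's fitting functions:

ARMAacf(ma = -1, lag.max = 2)    # lag-1 value: -0.5
ARMAacf(ma = 1, lag.max = 2)     # lag-1 value: 0.5
ARMAacf(ma = 0.4, lag.max = 2)   # lag-1 value: 0.4/(1 + 0.4^2), about 0.345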

Simulated MA(1) Processes

[Figure: four simulated MA(1) series of length 100, with θ = −0.9, −0.3, 0.5, and 0.9]

ACF for Simulated MA(1) Processes

[Figure: sample ACFs (lags 1–20) of the four simulated MA(1) series sim1–sim4]

Example

I Try a few on your own (see the sketch below)
I Vary the ma parameter between −1 and 1 and look at the resulting time series
I Run each through the acf() function and look at the results. What do you notice?
I We'll model these at the end of today, but what would you expect your estimated coefficient to be in each case?
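One way to carry out this exercise, as a sketch (the seed and these particular θ values are arbitrary choices):

set.seed(640)                     # arbitrary seed, for reproducibility
thetas <- c(-0.9, -0.3, 0.5, 0.9)
par(mfrow = c(2, 2))              # 2x2 grid of ACF plots
for (th in thetas) {
  sim <- arima.sim(model = list(ma = th), n = 100)   # simulate an MA(1)
  acf(sim, main = paste("theta =", th))              # expect a lone spike at lag 1
}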

MA(2) Process

The 2nd order moving average process, denoted by MA(2) is

Yt = et + θ1et−1 + θ2et−2

Properties of an MA(2) Process

I Mean: E (Yt) = 0
I Variance:

var(Yt) = σe² (1 + θ1² + θ2²)

I Autocorrelation function (ACF):

ρk = 1                                for k = 0
   = (θ1 + θ1θ2) / (1 + θ1² + θ2²)    for k = 1
   = θ2 / (1 + θ1² + θ2²)             for k = 2
   = 0                                for k > 2

Properties of an MA(2) Process

I The MA(2) process has zero correlation beyond lag 2
I So an ACF plot with spikes at lags 1 and 2 only would indicate a possible MA(2) model (see the check below)
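A quick numerical check of this cutoff with stats::ARMAacf, using θ1 = 0.8 and θ2 = 0.7 (the same pair as the first simulation below):

ARMAacf(ma = c(0.8, 0.7), lag.max = 5)
# lags 0 through 5: 1.000, 0.638, 0.329, 0.000, 0.000, 0.000 (zero beyond lag 2)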

Simulated MA(2) Processes

[Figure: four simulated MA(2) series of length 250, with (θ1, θ2) = (0.8, 0.7), (0.9, 0.4), (−0.9, −0.5), and (−0.5, −0.5)]

ACF for Simulated MA(2) Processes

[Figure: sample ACFs (lags 1–20) of the four simulated MA(2) series sim1–sim4]

General MA(q) Process

The qth order moving average process, denoted by MA(q) is

Yt = et + θ1et−1 + · · · + θqet−q

Properties of an MA(q) Process

I Mean: E (Yt) = 0
I Variance:

var(Yt) = σe² (1 + θ1² + · · · + θq²)

I Autocorrelation function (ACF):

ρk = 1                                                          for k = 0
   = (θk + θ1θk+1 + · · · + θq−kθq) / (1 + θ1² + · · · + θq²)   for k = 1, . . . , q − 1
   = θq / (1 + θ1² + · · · + θq²)                               for k = q
   = 0                                                          for k > q

MA(q) Process

I Key feature of MA(q) models:
I Nonzero autocorrelations for the first q lags
I Autocorrelations = 0 for all lags > q

Comment

I q is always used to denote the order of an MA process
I All the literature
I All the software
I When you see q, it’s referring to the MA order

I We'll use functions that require us to specify p, d, q, P, D, Q, and S, so it's important to keep track

Autoregressive (AR) Process

Suppose {et} is a zero-mean white noise process with var(et) = σe².
The process

Yt = φ1Yt−1 + φ2Yt−2 + . . .+ φpYt−p + et

is called an autoregressive process of order p, denoted by AR(p).

I Today's value is a linear function of the previous p values, plus some error.

AR(1) Process

The AR(1) process is

Yt = φ1Yt−1 + et

I Note that if φ1 = 1, then the process reduces to a random
walk.

I If φ1 = 0, the process reduces to white noise.
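A short simulation sketch contrasting positive and negative φ1 (the values are arbitrary; stationarity requires |φ1| < 1, as shown next):

set.seed(640)
sim_pos <- arima.sim(model = list(ar = 0.9), n = 100)    # strong positive dependence
sim_neg <- arima.sim(model = list(ar = -0.8), n = 100)   # alternating behavior
par(mfrow = c(2, 2))
plot(sim_pos, main = "phi = 0.9"); acf(sim_pos)    # ACF decays slowly, all positive
plot(sim_neg, main = "phi = -0.8"); acf(sim_neg)   # ACF alternates in sign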

Properties of an AR(1) Process

I Variance:

var(Yt) = σe² / (1 − φ1²)

Because var(Yt) > 0, this implies that −1 < φ1 < 1

I The correlation between observations k time periods apart is

ρk = φ1^k

Properties of an AR(1) Process

Because −1 < φ1 < 1, the ACF ρk decays exponentially as k increases

I If φ1 is close to ±1, the ACF decays slowly
I If φ1 is closer to 0, the ACF decays rapidly
I If φ1 > 0, all of the autocorrelations are positive
I If φ1 < 0, the ACF alternates between positive and negative
I Remember these theoretical patterns, so that when we see sample ACFs (from real data) we can make sensible decisions about potential model selection

Simulated AR(1) Processes

[Figure: four simulated AR(1) series of length 100, with φ1 = 0.9, 0.7, 0.1, and −0.8]

ACF for Simulated AR(1) Processes

[Figure: sample ACFs (lags 1–20) of the four simulated AR(1) series sim1–sim4]

AR(2) Process

The AR(2) process is

Yt = φ1Yt−1 + φ2Yt−2 + et

I Today's value is a linear function of the previous two values, plus some error
I The ACF gets quite involved, so we leave it out here
I But the ACF continues to trail off, similar to the AR(1)

Simulated AR(2) Processes

[Figure: four simulated AR(2) series of length 100, with (φ1, φ2) = (0.5, 0.5), (0.7, 0.2), (−0.5, 0.4), and (−0.4, −0.4)]

ACF for Simulated AR(2) Processes

[Figure: sample ACFs (lags 1–20) of the four simulated AR(2) series sim1–sim4]

AR(p) Process

The general AR(p) process is given by

Yt = φ1Yt−1 + φ2Yt−2 + · · · + φpYt−p + et

I Note that p is always used to denote the order of an AR process, just as q is always used to denote the order of an MA process

ACF for AR Processes

I Look at the ACF plots for the AR(1) and AR(2) processes...
I Nothing in the plots indicates the order
I They both trail off to 0
I We need something else to help determine the order p
I The Partial Autocorrelation Function (PACF)
I We discuss this next class

Fitting MA(q) or AR(p) Models in R

I Many functions exist for fitting and forecasting time series
I Google "fitting an MA or AR model in R"
I Some examples:
I ar(x) - determines the order p via AIC; can set order.max
I arma(x, order=c(p,q)) - no quick function for forecasts; pretty worthless IMO
I arima(x, order=c(p,d,q)) - use predict(..., n.ahead=) for quick forecasts
I sarima(x, p, d, q, P, D, Q, S) - must specify at least p, d, q; the others default to 0
I sarima.for(x, n.ahead=..., p, d, q, P, D, Q, S) - gives forecasts, outputs several useful plots
I sarima appears to be the latest and greatest... (see the fitting sketch at the end of this section)

Combining today with the last class

[Figure: time series plot of the chicken data, 2010–2017, values roughly 80–130]

Example 2

[Figure: time series plot of the CREF data, time index 0–500, values roughly 170–220]
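To close the loop with the earlier exercise, a minimal fitting sketch (the seed, the true θ = 0.5, and the availability of the astsa package for sarima/sarima.for are assumptions):

set.seed(640)
y <- arima.sim(model = list(ma = 0.5), n = 200)   # simulate an MA(1)

fit <- arima(y, order = c(0, 0, 1))   # fit an MA(1) via stats::arima
fit                                   # the ma1 estimate should be near 0.5
predict(fit, n.ahead = 10)            # quick forecasts

library(astsa)
sarima(y, p = 0, d = 0, q = 1)                     # same fit, plus diagnostic plots
sarima.for(y, n.ahead = 10, p = 0, d = 0, q = 1)   # forecasts with a plot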