INTRODUCTION TO TIME SERIES ANALYSIS
1. Introduction
We now cover the basic concepts of time series analysis. As we will see, understanding
these concepts is crucial to an understanding of the time series models that capture time-
varying volatility in financial data that will be developed in topic 4.
2. Covariance Stationary Time Series
Let y_t be the value of an economic or financial variable at time t. In practice, the data on
the variable we actually observe is the sample {y_1, y_2, …, y_T}, where it is assumed that
values of the variable are recorded from t = 1 onward.
In time series analysis, we say that y_t is a covariance stationary process if the following
conditions hold:
(a) E(y_t) = μ for all t. (The mean of the series is constant over time.)
(b) cov(y_t, y_{t−j}) = γ(j). (The covariance between y_t and y_{t−j} depends only on the
displacement in time between the two y's, which is j periods in this case, and not
on the time period t.) Note that the autocovariance function is symmetric; that is,
γ(j) = γ(−j). Symmetry reflects the fact that the autocovariances of a covariance
stationary series depend only on displacement (i.e. on j).
(c) var(y_t) = γ(0) must be finite. Note that γ(0) is the variance of y_t since
cov(y_t, y_t) = var(y_t). It can be shown that no autocovariance can be larger in
absolute value than γ(0), so if γ(0) < ∞ then all the other autocovariances are finite as well.
The (population) autocorrelation function is

ρ(j) = cov(y_t, y_{t−j}) / √(var(y_t) var(y_{t−j})) = γ(j)/γ(0) by covariance stationarity.
Note: Brooks (book) uses τ
to denote autocorrelation. This is very uncommon notation
in the literature and ρ is typically used.
By contrast, the partial autocorrelation at lag j measures the association between
y_t and y_{t−j} after controlling for the effects of the intervening values y_{t−1} through y_{t−(j−1)}.
The partial autocorrelation at lag j, denoted p(j), is just the coefficient on y_{t−j} in a
(population) regression of y_t on a constant and y_{t−1}, y_{t−2}, …, y_{t−j}.
For a covariance stationary process the autocorrelations and partial
autocorrelations approach zero as the displacement (j) becomes large. The estimator of
the autocorrelation function from the sample is found by replacing expected values by
sample averages in the formula for the autocorrelation function. Thus,
ρ̂(j) = Σ_{t=j+1}^{T} (y_t − ȳ)(y_{t−j} − ȳ) / Σ_{t=1}^{T} (y_t − ȳ)².
The sample partial autocorrelation at displacement j is
p̂(j) = β̂_j

where the fitted regression is

ŷ_t = β̂_0 + β̂_1 y_{t−1} + ⋯ + β̂_j y_{t−j}.
The sampling distribution of both the autocorrelation and partial autocorrelation
coefficients is approximately N(0, 1/T), so that under the null hypothesis of zero correlation
an approximate 95% confidence interval is ±2/√T for both the sample autocorrelation and
partial autocorrelation coefficients.
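To make this concrete, here is a minimal sketch (assuming numpy and statsmodels are available; the simulated AR(1) series with coefficient 0.6 is purely illustrative) that computes the sample autocorrelations and partial autocorrelations and flags the lags whose coefficients fall outside the approximate ±2/√T band.

```python
# Sketch: sample ACF/PACF with the +/- 2/sqrt(T) band implied by the
# approximate N(0, 1/T) sampling distribution under zero correlation.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(0)
T = 500
y = np.zeros(T)
for t in range(1, T):                       # illustrative AR(1): y_t = 0.6*y_{t-1} + eps_t
    y[t] = 0.6 * y[t - 1] + rng.normal()

nlags = 10
rho_hat = acf(y, nlags=nlags)               # sample autocorrelations, lags 0..nlags
p_hat = pacf(y, nlags=nlags, method="ols")  # sample partial autocorrelations (OLS regressions)
band = 2.0 / np.sqrt(T)                     # approximate 95% band under H0: zero correlation

for j in range(1, nlags + 1):
    sig_acf = "*" if abs(rho_hat[j]) > band else " "
    sig_pacf = "*" if abs(p_hat[j]) > band else " "
    print(f"lag {j:2d}: acf {rho_hat[j]:+.3f}{sig_acf}   pacf {p_hat[j]:+.3f}{sig_pacf}")
```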
3. White Noise Process
White noise processes are the building blocks of time series analysis. Suppose ε_t
is distributed with mean zero and constant variance and is serially uncorrelated. That is,

ε_t ~ (0, σ²)

and cov(ε_t, ε_{t−j}) = 0 for all t and all j ≠ 0. In addition, assume that the variance is finite, that is,
σ² < ∞. Such a process is called a white noise process (with zero mean) and is denoted as

ε_t ~ WN(0, σ²).
Although ε_t is serially uncorrelated, it is not necessarily independent.
Independence is a property that pertains to a conditional distribution. Let
Ω_{t−1} = {ε_{t−1}, ε_{t−2}, …} be the information set comprising the past history of the process at
time t−1. If the random variable ε_t, conditional on the information set Ω_{t−1}, has the
same distribution as the unconditional random variable ε_t, then ε_t is independently and
identically distributed. Moreover, if the random variable ε_t is continuous with probability
density f, then this requires

f(ε_t | Ω_{t−1}) = f(ε_t) for all possible (realizations of) Ω_{t−1}.
When ε_t is a white noise process and the ε_t's are independently and identically
distributed, then the process for ε_t is said to be independent white noise or strong white
noise and is denoted as

ε_t ~ iid WN(0, σ²).
Conditional and unconditional means and variances for an independent white
noise process are identical since the conditional and unconditional distributions are the
same. As before, the information set upon which we condition contains the past history of
the series so that Ω_{t−1} = {ε_{t−1}, ε_{t−2}, …}. Then the conditional mean is

E(ε_t | Ω_{t−1}) = E(ε_t) = 0 by independence.

Note: when we condition on Ω_{t−1} = {ε_{t−1}, ε_{t−2}, …}, we reveal the realizations
ε_{t−1}, ε_{t−2}, …. That is why

E(ε_{t−j} | Ω_{t−1}) = ε_{t−j} for all j ≥ 1,
and, similarly, the conditional variance is
var(ε_t | Ω_{t−1}) = E[(ε_t − E(ε_t | Ω_{t−1}))² | Ω_{t−1}] = E(ε_t²) = σ² by independence,

and

var(ε_{t−j} | Ω_{t−1}) = 0 for all j ≥ 1,

since in this case we know the realization of ε_{t−j} exactly and the variance of a known
(deterministic) quantity is 0.
For a white noise process, the autocovariance function is

γ(0) = σ² and γ(j) = 0 for j ≥ 1,

the autocorrelation function is

ρ(0) = 1 and ρ(j) = 0 for j ≥ 1,

and the partial autocorrelation function is

p(j) = 0 for j ≥ 1,

since in a population regression of ε_t on its lagged values, the regression coefficients are all zero.
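As a quick check of these properties, the sketch below (assuming numpy and statsmodels) simulates strong white noise; the sample autocorrelations at lags j ≥ 1 should be close to zero and lie inside the ±2/√T band.

```python
# Sketch: simulated iid white noise has sample autocorrelations near zero.
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
T = 2000
eps = rng.normal(loc=0.0, scale=1.0, size=T)   # eps_t ~ iid WN(0, 1)

rho_hat = acf(eps, nlags=5)
band = 2.0 / np.sqrt(T)
print("band: +/-", round(band, 3))
for j in range(1, 6):
    print(f"rho_hat({j}) = {rho_hat[j]:+.3f}")
```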
4. General Linear Processes
The Wold representation theorem is a very important result in time series
analysis. It says that if y_t is a covariance stationary process, it can be represented as

y_t = μ + Σ_{i=0}^{∞} b_i ε_{t−i},  ε_t ~ WN(0, σ²),

where b_0 = 1 and Σ_{i=0}^{∞} b_i² < ∞. The latter condition ensures that var(y_t) is finite.
This says that any covariance stationary series can be represented by some infinite
distributed lag of white noise. The ε_t's are often called innovations or shocks. This
representation for y_t is known as the general linear process. It is general since any
covariance stationary process can be represented this way, and linear because y_t is a
linear function of the innovations. Although Wold's theorem only says that the innovations are
serially uncorrelated, we will also make the assumption that the innovations are
independent. Thus, we will assume that the innovations are strong or independent white noise.
The unconditional mean and variance of y_t are, respectively,

E(y_t) = μ

and

var(y_t) = var(Σ_{i=0}^{∞} b_i ε_{t−i}) = Σ_{i=0}^{∞} b_i² var(ε_{t−i}) = σ² Σ_{i=0}^{∞} b_i², since the ε_t's are uncorrelated.
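The variance formula can be verified numerically. The sketch below truncates the infinite sum at 50 terms; the coefficients b_i = 0.7^i, σ = 1.5 and μ = 0 are assumptions chosen purely for illustration.

```python
# Sketch: compare the sample variance of a (truncated) general linear process
# with the theoretical value sigma^2 * sum(b_i^2).
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.5
b = 0.7 ** np.arange(50)        # b_0 = 1, b_i = 0.7^i (square-summable, truncated at 50 terms)
T = 100_000

eps = rng.normal(0.0, sigma, size=T + len(b))
# y_t = mu + sum_i b_i * eps_{t-i}, with mu = 0 here for simplicity
y = np.convolve(eps, b, mode="valid")[:T]

print("sample variance    :", round(y.var(), 3))
print("sigma^2 * sum b_i^2:", round(sigma**2 * np.sum(b**2), 3))
```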
We will now calculate the conditional moments of the process under the
assumption that the innovations are strong white noise. Define the information set as
follows: Ω_t = {y_t, y_{t−1}, y_{t−2}, …}. Now

E(y_{t+1} | Ω_t) = μ + E(Σ_{i=0}^{∞} b_i ε_{t+1−i} | Ω_t)
= μ + E(ε_{t+1} | Ω_t) + Σ_{i=1}^{∞} b_i E(ε_{t+1−i} | Ω_t)
= μ + b_1 ε_t + b_2 ε_{t−1} + ⋯
The expression on the last line above is a conditional mean but it is also the optimal one-
step ahead forecast of y, conditional on information available at time t. Similarly, we can compute

E(y_{t+2} | Ω_t) = μ + E(Σ_{i=0}^{∞} b_i ε_{t+2−i} | Ω_t)
= μ + E(ε_{t+2} | Ω_t) + b_1 E(ε_{t+1} | Ω_t) + Σ_{i=2}^{∞} b_i E(ε_{t+2−i} | Ω_t)
= μ + b_2 ε_t + b_3 ε_{t−1} + ⋯
and this conditional mean gives the optimal two-step ahead forecast of y on the basis of
information available at time t. The key point is that the conditional mean moves over
time – it is time-varying as it depends on t. Another way of thinking about this is to
consider the one-step ahead forecast of y conditional on information available at time
t+1. It is
E(y_{t+2} | Ω_{t+1}) = μ + b_1 ε_{t+1} + b_2 ε_t + b_3 ε_{t−1} + ⋯

which incorporates the latest information to arrive, namely, by way of the innovation
ε_{t+1}. In other words, the conditional mean moves over time in response to an evolving
information set – the conditional mean depends on the conditioning information set.
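A small numerical illustration of this updating (all coefficient and innovation values below are assumed, not taken from the notes): the forecast of y_{t+2} made at time t changes once ε_{t+1} is observed.

```python
# Sketch: the conditional mean is revised as the information set evolves.
import numpy as np

mu = 0.0
b = np.array([1.0, 0.9, 0.4, 0.1])       # b_0, b_1, b_2, b_3; higher-order b_i taken as 0
eps_t, eps_tm1 = 0.7, -0.3               # assumed realizations of eps_t and eps_{t-1}

# E(y_{t+2} | Omega_t) = mu + b_2*eps_t + b_3*eps_{t-1}
f_time_t = mu + b[2] * eps_t + b[3] * eps_tm1

eps_tp1 = -0.8                           # new innovation arriving at time t+1
# E(y_{t+2} | Omega_{t+1}) = mu + b_1*eps_{t+1} + b_2*eps_t + b_3*eps_{t-1}
f_time_tp1 = mu + b[1] * eps_tp1 + b[2] * eps_t + b[3] * eps_tm1

print("forecast of y_{t+2} using Omega_t    :", round(f_time_t, 3))
print("forecast of y_{t+2} using Omega_{t+1}:", round(f_time_tp1, 3))
```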
Now let us calculate the corresponding conditional variances.
var(y_{t+1} | Ω_t) = E[(y_{t+1} − E(y_{t+1} | Ω_t))² | Ω_t]
= E(ε_{t+1}²) by independence
= σ²

and

var(y_{t+2} | Ω_t) = E[(y_{t+2} − E(y_{t+2} | Ω_t))² | Ω_t]
= E[(ε_{t+2} + b_1 ε_{t+1})²] by independence
= σ²(1 + b_1²) since the innovations are uncorrelated.
Note that the conditional variance is not time varying – it does not depend on t. This is the
important observation. It does depend on the forecast horizon, however. In the one-step
ahead case, the conditional variance is given by one term (σ²) and in the two-step ahead
case by the sum of two terms, σ² + b_1²σ². Note also that the conditional variance is
never larger than the unconditional variance. Nevertheless, the important point is that
the conditional variance does not evolve over time – it does not depend on the
conditioning information set. To see this, note that
var(y_{t+2} | Ω_{t+1}) = σ²

var(y_{t+3} | Ω_{t+1}) = (1 + b_1²)σ²
so that the arrival of new information by way of y_{t+1} (or equivalently ε_{t+1}) does not affect
the conditional variance. This is a very undesirable feature for the purposes of modeling
financial data. In financial data, the conditional variance appears to change over time in
response to the arrival of new information, and the model as it stands cannot capture this
feature. The reason why the conditional variances above are not time-varying is that
the innovations are assumed to be independent white noise. Later, we will relax the
assumption of independence and assume that the innovations are white noise with a
particular dependence structure. This gives rise to the ARCH/GARCH class of models.
Finally, the two-standard-error confidence intervals for the one- and two-step ahead
forecasts are, respectively,

E(y_{t+1} | Ω_t) ± 2σ and E(y_{t+2} | Ω_t) ± 2σ√(1 + b_1²).
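The sketch below evaluates these intervals for assumed parameter values (μ, σ, b_1, b_2 and the latest innovations are illustrative, with b_i = 0 for i ≥ 3).

```python
# Sketch: two-standard-error forecast intervals
#   1-step: E(y_{t+1}|Omega_t) +/- 2*sigma
#   2-step: E(y_{t+2}|Omega_t) +/- 2*sigma*sqrt(1 + b_1^2)
import numpy as np

mu, sigma, b1, b2 = 0.0, 1.0, 0.8, 0.3        # assumed values
eps_t, eps_tm1 = 0.5, -1.2                    # assumed latest innovations

mean_1 = mu + b1 * eps_t + b2 * eps_tm1       # conditional mean, 1 step ahead
mean_2 = mu + b2 * eps_t                      # conditional mean, 2 steps ahead
half_1 = 2 * sigma
half_2 = 2 * sigma * np.sqrt(1 + b1**2)

print(f"1-step interval: [{mean_1 - half_1:.2f}, {mean_1 + half_1:.2f}]")
print(f"2-step interval: [{mean_2 - half_2:.2f}, {mean_2 + half_2:.2f}]")
```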
5. Parsimonious Models
It is impractical to estimate models of the form

y_t = μ + Σ_{i=0}^{∞} b_i ε_{t−i}
from a finite sample of data because there is an infinite number of parameters in the sum
to estimate. In many applications, however, simpler specifications involving far fewer
parameters (parsimonious specifications) provide good approximations to the
representation of a covariance stationary series given above. We will consider two such
representations.
(a) The MA(1) Model.
The moving average model of order one (the MA(1) model) is

y_t = α + ε_t + b_1 ε_{t−1},  ε_t ~ iid WN(0, σ²).

In terms of the Wold representation, b_i = 0 for i = 2, 3, …. The MA(1) is a “short
memory” process because it depends only on the innovation last period and not on
innovations earlier than that. From the results above, the unconditional mean and
variance of the MA(1) are

E(y_t) = α and var(y_t) = σ²(1 + b_1²).

The conditional means and variances of the MA(1) process are

E(y_{t+j} | Ω_t) = α + b_1 ε_t for j = 1 and E(y_{t+j} | Ω_t) = α for j > 1,

var(y_{t+j} | Ω_t) = σ² for j = 1 and var(y_{t+j} | Ω_t) = σ²(1 + b_1²) for j > 1.
As before, the conditional variance is not time varying. Notice that for j>1, the
conditional mean is the same as the unconditional mean and the conditional variance is
the same as the unconditional variance.
The two-standard error confidence interval for the one-step ahead forecast from
the MA(1) model is
α + b_1 ε_t ± 2σ.

The two-standard error confidence interval for the j-step ahead forecast where j > 1 is

α ± 2σ√(1 + b_1²).
Estimation of an MA(1) model requires nonlinear methods (e.g. maximum likelihood).
Once estimates of the parameters (α, b_1, σ) are found, they can be substituted into the
above two equations.
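As a hedged sketch of how this might be done in practice (assuming statsmodels is installed; the simulated data and the values α = 0.2, b_1 = 0.6, σ = 1 are illustrative assumptions), the ARIMA class with order (0, 0, 1) fits the MA(1) by Gaussian maximum likelihood and get_forecast returns the one- and two-step ahead forecasts with their confidence intervals.

```python
# Sketch: MA(1) estimation by maximum likelihood and forecast intervals.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
alpha, b1, sigma, T = 0.2, 0.6, 1.0, 1000
eps = rng.normal(0.0, sigma, size=T + 1)
y = alpha + eps[1:] + b1 * eps[:-1]            # y_t = alpha + eps_t + b1*eps_{t-1}

res = ARIMA(y, order=(0, 0, 1)).fit()          # nonlinear (maximum likelihood) estimation
print(res.params)                              # estimates of the constant, b1 and sigma^2

fc = res.get_forecast(steps=2)                 # 1- and 2-step ahead forecasts
print(fc.predicted_mean)
print(fc.conf_int(alpha=0.05))                 # approximately forecast +/- 2 standard errors
```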
The MA model can easily be extended to higher-order processes, MA(q), where q is the
maximum lag:

y_t = α + ε_t + b_1 ε_{t−1} + ⋯ + b_q ε_{t−q}.

The unconditional mean and variance of the MA(q) are

E(y_t) = α and var(y_t) = σ²(1 + b_1² + ⋯ + b_q²).
The conditional means and variances of the MA(q) process are

E(y_{t+j} | Ω_t) = α + Σ_{l=j}^{q} b_l ε_{t+j−l} for 1 ≤ j ≤ q and E(y_{t+j} | Ω_t) = α for j > q,

var(y_{t+j} | Ω_t) = σ²(1 + b_1² + ⋯ + b_{j−1}²) for 1 ≤ j ≤ q and var(y_{t+j} | Ω_t) = σ²(1 + b_1² + ⋯ + b_q²) for j > q.
*A lag polynomial such as 1 + b_1 z + ⋯ + b_q z^q can be inverted only if its roots z_i* satisfy
|z_i*| > 1 for all i. Here z* is a complex number and |z*| = √(Re(z*)² + Im(z*)²), so the
requirement |z*| > 1 says that every root lies outside the unit circle. That is where the “unit circle” comes in.
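A short sketch of the root check (the MA(2) coefficients below are assumed): compute the roots of the lag polynomial with numpy and test whether each modulus exceeds one.

```python
# Sketch: do the roots of 1 + b1*z + ... + bq*z^q lie outside the unit circle?
import numpy as np

b = [0.5, -0.3]                                # assumed b1, b2 of an MA(2)
coeffs = [1.0] + list(b)                       # polynomial 1 + b1*z + b2*z^2 (ascending powers)
roots = np.roots(coeffs[::-1])                 # np.roots expects the highest power first
for z in roots:
    print(z, "modulus:", round(abs(z), 3), "outside unit circle:", abs(z) > 1)
```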
*Additional use: MA representation of an AR(p) process
The characterization above is extremely important if you want to represent an AR(p)
process as an MA(∞) process.
For the AR(1) it is relatively simple. Writing the AR(1)

y_t = α + b_1 y_{t−1} + ε_t

in lag-operator form as

(1 − b_1 L) y_t = α + ε_t,

we simply divide both sides by (1 − b_1 L). Unfortunately we do not know how the lag operator works when it is in the denominator,
so instead we expand

1/(1 − b_1 L) = 1 + b_1 L + b_1² L² + ⋯,

which is valid provided |b_1| < 1 (equivalently, the root of 1 − b_1 z = 0 lies outside the unit circle).
Applying this expansion gives the MA(∞) representation

y_t = α/(1 − b_1) + Σ_{i=0}^{∞} b_1^i ε_{t−i}.
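A minimal sketch of this inversion (assuming a recent statsmodels for its arma2ma utility; the value b_1 = 0.7 is illustrative): the MA(∞) weights of a stationary AR(1) are simply the geometric sequence b_1^i.

```python
# Sketch: MA(infinity) weights of an AR(1) obtained by inverting (1 - b1*L).
import numpy as np
from statsmodels.tsa.arima_process import arma2ma

b1 = 0.7
n = 8
geometric = b1 ** np.arange(n)                 # weights from 1/(1 - b1*L) = 1 + b1*L + b1^2*L^2 + ...
psi = arma2ma(ar=[1, -b1], ma=[1], lags=n)     # statsmodels convention: AR polynomial 1 - b1*L is passed as [1, -b1]
print(np.allclose(geometric, psi))             # True: the two sets of weights agree
```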