Set 5 – Time Series Regression
Richard 2 2021
©Fidelio Statistical Services Set 5 – Time Series Regression Semester 2 2021 1 / 38

Regression Model
A regression model is formulated as follows:
yt =f(z1t,…,zqt)+εt
yt is the response (dependent) variable at time t
zjt , j = 1, . . . , q are explanatory variables; usually z1t = 1 (intercept,
captures a constant effect).
f (z1t , . . . , zqt ) is the regression function
εt is (hopefully) a random variable capturing the effect of non measurable factors that are not attributable to the zj ’s in some sense, and affect the response variable.

Model specification
1. Specification of f(zt). Linear regression model:
f(z1t, …, zqt) = β1z1t + β2z2t + … + βqzqt
βj, j = 1, …, q are q parameters. We can account for the intercept by assuming z1t = 1 ∀t, which is nice and mathematically pure. If we don’t have an intercept – the line goes through (0,0) – then β1 = 0.
However, for the rest of these notes I have written the intercept as β0. When we create models without an intercept the formulation then still looks sensible: the zjt have j starting from 1, not 2, and the client doesn’t wonder where the j = 1 term disappeared to.

Model specification
2. Nature of random disturbance ε:
Mean independence and homoscedasticity:
E(εt | zt) = 0, Var(εt | zt) = σ².
This implies that εt is not correlated with the values zt. It is random, thus not modellable.
Since we are building regression models on time series data we also need no serial correlation. Let t and s be two time units, s ≠ t. Then E(εt εs | zt, zs) = 0.
We will relax some of the assumptions later on (e.g. uncorrelatedness).

Implications:
E(yt | zt) = β1z1t + β2z2t + … + βqzqt,
Var(yt | zt) = σ².
In other words the β̂j we estimate should be equal to the true βj.

In some cases zt can be considered as deterministic.
Example: time series regression model. Linear trend plus irregular.
yt = β0 + β1t + εt
The random variable yt has expectation E(yt) = β0 + β1t, and variance
Var(yt) = σ2

Estimation – Least squares
We aim to estimate the unknown parameters β0, β1, …, βq and σ². The former give the estimate of the regression function; the latter lets us quantify the uncertainty of that estimate. Assume that a random sample of T observations is available:
{(zt, yt), t = 1, 2, …, T}
such that
E(yt | zt) = β0 + β1z1t + … + βqzqt
with Var(yt | zt) = σ². Equivalently
yt = β0 + β1z1t + … + βqzqt + εt
with E(εt | zt) = 0 and E(εt² | zt) = σ².

OLS Estimation of the regression parameters (review)
The Ordinary Least Squares (OLS) estimates of the regression parameters, denoted β̂j, are the values that minimise the sum of squares criterion
$$ S(\hat\beta) = \sum_{t=1}^{T} \left( y_t - \hat\beta_0 - \hat\beta_1 z_{1t} - \dots - \hat\beta_q z_{qt} \right)^2 $$
Note: yt,zjt are observed values, not random variables.

There is a closed form solution to the minimisation problem on the previous slide. The first-order conditions (F.O.C.s) for a minimum set the partial derivatives to zero:
$$ \frac{\partial S}{\partial \hat\beta_0} = -2 \sum_t \left( y_t - \hat\beta_0 - \hat\beta_1 z_{1t} - \dots - \hat\beta_q z_{qt} \right) = 0 $$
$$ \frac{\partial S}{\partial \hat\beta_j} = -2 \sum_t \left( y_t - \hat\beta_0 - \hat\beta_1 z_{1t} - \dots - \hat\beta_q z_{qt} \right) z_{jt} = 0, \quad j = 1, \dots, q $$
This yields a linear system of q + 1 equations in the q + 1 unknowns β̂0, β̂1, …, β̂q. The solution is unique provided that the z’s are not collinear.
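A minimal sketch of the closed-form solution: in matrix form the first-order conditions are the normal equations (Z'Z)β̂ = Z'y, which can be solved directly. The data are simulated and the names (`Z`, `beta_true`, `beta_hat`) are illustrative assumptions, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is reproducible
T = 200
t = np.arange(1, T + 1)

# Design matrix: a column of ones (intercept) plus two illustrative regressors.
Z = np.column_stack([np.ones(T), t / T, rng.normal(size=T)])
beta_true = np.array([1.0, 2.0, -0.5])  # made-up coefficients for the simulation
y = Z @ beta_true + rng.normal(scale=0.3, size=T)

# Normal equations (Z'Z) beta = Z'y: the first-order conditions in matrix form.
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)

# Cross-check against NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(beta_hat)
```

Solving the normal equations directly is fine for a small, well-conditioned problem like this one; a least-squares routine is usually the safer choice when regressors are close to collinear.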

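Example: q = 1, that is, simple linear regression with n observations. The normal equations are
$$ n \hat\beta_0 + \hat\beta_1 \sum_t z_t = \sum_t y_t $$
$$ \hat\beta_0 \sum_t z_t + \hat\beta_1 \sum_t z_t^2 = \sum_t z_t y_t $$
Solving w.r.t. β̂0 in the first equation gives:
$$ \hat\beta_0 = \bar y - \hat\beta_1 \bar z $$
Replacing into the 2nd equation and solving with respect to β̂1 gives:
$$ \hat\beta_1 = \frac{\frac{1}{n}\sum_t z_t y_t - \bar z \bar y}{\frac{1}{n}\sum_t z_t^2 - \bar z^2} = \frac{Cov(z, y)}{Var(z)} = \frac{\sum_t (z_t - \bar z)(y_t - \bar y)}{\sum_t (z_t - \bar z)^2} $$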
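The simple regression formulas can be checked by hand on a tiny data set. This sketch uses only the standard library; the five data points are made up for illustration.

```python
# Small made-up data set (illustrative only).
z = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(z)

z_bar = sum(z) / n
y_bar = sum(y) / n

# Slope: sample covariance over sample variance (the 1/n factors cancel).
s_zy = sum((zt - z_bar) * (yt - y_bar) for zt, yt in zip(z, y))
s_zz = sum((zt - z_bar) ** 2 for zt in z)
beta1_hat = s_zy / s_zz

# Intercept from the first normal equation.
beta0_hat = y_bar - beta1_hat * z_bar

print(beta0_hat, beta1_hat)
```

For these numbers the slope works out to 19.9 / 10 = 1.99 and the intercept to 6.02 − 1.99 × 3 = 0.05, up to floating-point rounding.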

Fitted values
$$ \hat y_t = \hat\beta_0 + \hat\beta_1 z_{1t} + \dots + \hat\beta_q z_{qt} $$
Residuals
$$ e_t = y_t - \hat y_t $$
We assume throughout these slides that the model includes the intercept.
Properties (review)
$\sum_t e_t = 0$
$\sum_t z_t e_t = 0$
$\sum_t \hat y_t e_t = 0$
$\frac{1}{n} \sum_t \hat y_t = \bar y$
The regression line goes through the point (z̄, ȳ)
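These properties are easy to verify numerically. A small standard-library sketch on made-up data (the variable names are illustrative):

```python
z = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(z)
z_bar, y_bar = sum(z) / n, sum(y) / n

# OLS estimates for the simple regression with an intercept.
b1 = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y)) / sum((a - z_bar) ** 2 for a in z)
b0 = y_bar - b1 * z_bar

y_hat = [b0 + b1 * a for a in z]             # fitted values
e = [b - f for b, f in zip(y, y_hat)]        # residuals

sum_e = sum(e)                                        # residuals sum to zero
sum_ze = sum(a * r for a, r in zip(z, e))             # orthogonal to the regressor
sum_yhat_e = sum(f * r for f, r in zip(y_hat, e))     # orthogonal to the fitted values
mean_y_hat = sum(y_hat) / n                           # mean of fitted values equals y_bar
print(sum_e, sum_ze, sum_yhat_e, mean_y_hat - y_bar)
```

All four printed quantities are zero up to floating-point rounding. The first and last rely on the model containing an intercept.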

Estimation of σ2 (review)
$$ s^2 = \frac{\sum_t e_t^2}{n - q} $$
n − q is known as the number of degrees of freedom; here q counts every estimated coefficient, the intercept included.
s is the standard error of regression (SER).

Goodness of fit (review)
Define
TSS = $\sum_t (y_t - \bar y)^2$ (Total sum of squares – deviance of yt – what we measure)
ESS = $\sum_t (\hat y_t - \bar y)^2$ (Explained sum of squares – deviance of ŷt – what we estimated (“explained”))
RSS = $\sum_t e_t^2$ (Residual sum of squares – deviance of et – what we haven’t modelled (“unexplained”))
It is easy to prove that
$$ TSS = ESS + RSS $$
and it makes sense as well.
A relative measure of goodness of fit – how much of the variation in the data the model accounts for – is
$$ R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} $$
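As a quick numerical check of the decomposition, the sketch below fits a linear trend to a simulated series and computes the three sums of squares. The series and the names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
t = np.arange(1, n + 1)
y = 5.0 + 0.2 * t + rng.normal(scale=1.0, size=n)   # simulated linear trend plus noise

Z = np.column_stack([np.ones(n), t])                # intercept and time
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
y_hat = Z @ beta_hat
e = y - y_hat

TSS = np.sum((y - y.mean()) ** 2)      # total sum of squares
ESS = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
RSS = np.sum(e ** 2)                   # residual sum of squares
R2 = ESS / TSS
print(TSS, ESS + RSS, R2)
```

The identity TSS = ESS + RSS holds here because the model includes an intercept; without one the two expressions for R2 need not agree.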

The measure on the previous slide suffers from a serious drawback when used for model selection: the inclusion of additional (possibly irrelevant) regressors never decreases R2 and almost always increases it. In linear regression you use the adjusted R2
$$ \bar R^2 = 1 - \frac{RSS/(n - q)}{TSS/(n - 1)} = 1 - \frac{n - 1}{n - q}\left(1 - R^2\right) $$
Let $\hat\sigma^2 = RSS/n$. Model selection can be based on the following information criteria: Akaike (AIC) and Schwarz (SIC or BIC)
$$ AIC = \ln \hat\sigma^2 + \frac{2q}{n} $$
$$ SIC = \ln \hat\sigma^2 + \frac{q}{n} \ln n $$
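To see the drawback in action, the sketch below adds a pure-noise regressor to a linear trend model and compares the criteria. The helper `fit_criteria` is hypothetical, and q is taken to be the number of columns of the design matrix (all coefficients, intercept included), which is an assumption about the notation.

```python
import numpy as np

def fit_criteria(Z, y):
    """Return R2, adjusted R2, AIC and SIC for an OLS fit of y on Z.

    q is the number of columns of Z, i.e. every estimated coefficient
    including the intercept (an assumption about the slides' notation).
    """
    n, q = Z.shape
    beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = np.sum((y - Z @ beta_hat) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (rss / (n - q)) / (tss / (n - 1))
    sigma2_hat = rss / n
    aic = np.log(sigma2_hat) + 2.0 * q / n
    sic = np.log(sigma2_hat) + q * np.log(n) / n
    return {"R2": r2, "R2_adj": r2_adj, "AIC": aic, "SIC": sic}

rng = np.random.default_rng(2)
n = 80
t = np.arange(1, n + 1, dtype=float)
y = 3.0 + 0.1 * t + rng.normal(size=n)       # simulated series

linear = fit_criteria(np.column_stack([np.ones(n), t]), y)
# Add an irrelevant regressor: pure noise unrelated to y.
bigger = fit_criteria(np.column_stack([np.ones(n), t, rng.normal(size=n)]), y)
print(linear)
print(bigger)
```

R2 cannot fall when the extra column is added, whereas the adjusted R2, AIC and SIC each charge a penalty for it. SIC penalises an extra parameter more heavily than AIC whenever ln n > 2.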

Properties of the OLS estimators (review)
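Are we doing the right thing? Under the above assumptions, the OLS estimators are unbiased,
$$ E[\hat\beta_j] = \beta_j \quad \forall j, $$
and, by the Gauss-Markov theorem, $Var(\hat\beta_j) = MSE(\hat\beta_j)$ is the smallest among the class of all linear unbiased estimators. Moreover, $E[s^2] = \sigma^2$.
Example: q = 1 (simple regression model)
$$ Var(\hat\beta_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar z^2}{\sum_t (z_t - \bar z)^2} \right) $$
$$ Var(\hat\beta_1) = \frac{\sigma^2}{\sum_t (z_t - \bar z)^2} $$
$$ Cov(\hat\beta_0, \hat\beta_1) = -\sigma^2 \frac{\bar z}{\sum_t (z_t - \bar z)^2} $$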

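If we further assume normality for εt, i.e. εt ∼ N(0, σ²), then:
$$ y_t | z_t \sim N(\beta_0 + \beta_1 z_t, \sigma^2) $$
The OLS estimators have a Gaussian distribution. For instance, in the simple regression model,
$$ \hat\beta_0 \sim N\left( \beta_0, \sigma^2 \left( \frac{1}{n} + \frac{\bar z^2}{\sum_t (z_t - \bar z)^2} \right) \right), \quad \hat\beta_1 \sim N\left( \beta_1, \frac{\sigma^2}{\sum_t (z_t - \bar z)^2} \right) $$
The β̂j are the maximum likelihood estimators of the regression coefficients.
Letting s.e.(β̂j) denote the estimated standard error of the OLS estimator, e.g. in the simple regression model
$$ s.e.(\hat\beta_1) = \sqrt{\frac{s^2}{\sum_t (z_t - \bar z)^2}}, $$
we have
$$ \frac{\hat\beta_j - \beta_j}{s.e.(\hat\beta_j)} \sim t_{n - q} $$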
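A sketch of how the estimated standard errors and a t-ratio might be computed for the simple regression model, with a matrix cross-check. The data are simulated; `se_b0`, `se_b1` and `t_b1` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
z = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * z + rng.normal(scale=1.0, size=n)   # simulated data

z_bar, y_bar = z.mean(), y.mean()
szz = np.sum((z - z_bar) ** 2)
b1 = np.sum((z - z_bar) * (y - y_bar)) / szz
b0 = y_bar - b1 * z_bar

e = y - (b0 + b1 * z)
s2 = np.sum(e ** 2) / (n - 2)            # two estimated coefficients

se_b0 = np.sqrt(s2 * (1.0 / n + z_bar ** 2 / szz))
se_b1 = np.sqrt(s2 / szz)
t_b1 = b1 / se_b1                        # t-ratio for H0: beta_1 = 0

# Matrix cross-check: s^2 (Z'Z)^{-1} is the estimated covariance matrix.
Z = np.column_stack([np.ones(n), z])
cov_hat = s2 * np.linalg.inv(Z.T @ Z)
print(se_b0, se_b1, t_b1)
```

The diagonal of `cov_hat` reproduces the two squared standard errors, and its off-diagonal element reproduces the covariance formula with σ² replaced by s².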

F-statistic
The test statistic for the null hypothesis H0: all q − 1 slope coefficients are zero (every coefficient except the intercept), under which R2 = 0, is
$$ F = \frac{ESS/(q - 1)}{RSS/(n - q)} = \frac{R^2/(q - 1)}{(1 - R^2)/(n - q)} $$
Under the Gaussianity (Normality) assumption we have relied on heavily up to now
$$ F \sim F(q - 1, n - q) $$
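The two forms of the F-statistic can be checked against each other numerically. In this sketch the data are simulated and q is the number of columns of the design matrix, intercept included (an assumption about the notation).

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
Z = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = Z @ np.array([1.0, 0.8, 0.0]) + rng.normal(size=n)   # made-up coefficients

q = Z.shape[1]                       # all coefficients, intercept included
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
y_hat = Z @ beta_hat
RSS = np.sum((y - y_hat) ** 2)
ESS = np.sum((y_hat - y.mean()) ** 2)
R2 = ESS / (ESS + RSS)

F_from_ss = (ESS / (q - 1)) / (RSS / (n - q))        # sums-of-squares form
F_from_r2 = (R2 / (q - 1)) / ((1 - R2) / (n - q))    # R-squared form
print(F_from_ss, F_from_r2)
```

To turn the statistic into a p-value you would compare it with the F(q − 1, n − q) distribution, for example via a statistics library; that step is left out to keep the sketch dependency-light.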
So what does the null hypothesis represent? And how does it reflect the
time series DGP?

Trends Again
What is a trend?
… that part of the series which when extrapolated gives the clearest indication of the future long-term movements in the series.
A.C. Harvey, Forecasting, Structural Time Series Models and the Kalman filter, 1989, p. 284.
Let us consider again the following trend plus irregular decomposition
yt = μt + εt where μt denotes the trend component – the level of the series at time t – and εt is the noise – unmodellable, or rather what we haven’t modelled – component.

Assume for the moment that the trend is a deterministic function (e.g. a polynomial) of time t. Even with this assumption there are many trend models we can use (try to fit):
Constant trend: μt = β0 (intercept – polynomial of degree zero)
Linear trend: μt = β0 + β1t
Quadratic trend: μt = β0 + β1t + β2t²
Logistic trend: μt = β0 / (1 + β1 exp(−β2t))
Exponential trend: μt = exp(β0 + β1t)
These are explicit models for the trend rather than the implicit modelling we did using MA. The main advantage is you can provide confidence intervals based on the analysis of the residuals, plus an indication of how well the trend model actually models the time series.
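A minimal sketch of fitting some of these trend models by least squares. The series is simulated with a quadratic trend, and the helper `rss_poly` is hypothetical. The exponential trend is fitted on the log scale, which is a common approximation rather than an exact fit of the additive-error model.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 120
t = np.arange(1, n + 1, dtype=float)
# Simulated series with a quadratic trend plus noise.
y = 10.0 + 0.3 * t + 0.01 * t ** 2 + rng.normal(scale=2.0, size=n)

def rss_poly(degree):
    """Residual sum of squares of a polynomial trend of the given degree."""
    coefs = np.polyfit(t, y, deg=degree)      # coefficients, highest power first
    fitted = np.polyval(coefs, t)
    return float(np.sum((y - fitted) ** 2))

rss = {deg: rss_poly(deg) for deg in (0, 1, 2)}   # constant, linear, quadratic
print(rss)

# Exponential trend: log(y) is linear in t, so OLS on the log scale applies
# whenever y > 0 (true for this simulated series).
b1_log, b0_log = np.polyfit(t, np.log(y), deg=1)
print(b0_log, b1_log)
```

Because the polynomial models are nested, the residual sum of squares cannot increase with the degree, so a criterion with a penalty (adjusted R2, AIC, SIC) is needed to choose between them.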

Modelling Seasonally Adjusted QGDP

Seasonality
Consider the representation yt = μt + St + εt – you’ve seen this before, but not the same form – where St is the seasonal component. I have discussed seasonality in the time series decomposition section of the course, but here are some more formal definitions
Seasonality is the systematic, although not necessarily regular, intra-year movement caused by the changes of the weather – why it’s called seasonality – the calendar, and timing of decisions, directly or indirectly through the production and consumption decisions made by the agents of the economy. These decisions are influenced by endowments, the expectations and preferences of the agents, and the production techniques available in the economy. [Hylleberg (1995)]
… that part of the series which, when extrapolated, repeats itself over any one-year time period and averages to zero over such a time period. [Harvey (1989)]

Let s denote the number of seasons in a year, e.g. s = 4 for quarterly data – resembles seasons! – s = 12 for monthly data, and s ≈ 52 for weekly data.
A deterministic seasonal component repeats itself exactly every year and is defined by the deviation from the mean (trend) of the series: if St denotes the seasonal component, then, ∀t,
St +St−1 +…+St−s+1 =0.
Let us define a set of seasonal dummy variables, denoted Djt, taking value Djt = 1 in season j and Djt = 0 otherwise. You should be able to work out how many values j can take.

Suppose that we aim at estimating the trend + seasonal + irregular linear regression model, given deterministic seasonal factors as defined on the previous slide:
$$ y_t = \beta_0 + \beta_1 t + \sum_{j=1}^{s} \delta_j D_{jt} + \varepsilon_t, $$
where the δj measure the seasonal effect associated with season j. There is a fundamental difficulty with this representation: one parameter is not identified. This arises because Σj Djt = 1 for every t, which is exactly the intercept regressor, so that one explanatory variable can be expressed as a linear combination of the others. That is, we have collinearity in our regression model.
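The collinearity shows up as a rank-deficient design matrix. A small sketch for quarterly data (s = 4), with illustrative names:

```python
import numpy as np

s, years = 4, 5
n = s * years
t = np.arange(1, n + 1)
season = (t - 1) % s                       # 0, 1, 2, 3, 0, 1, ...

D = np.zeros((n, s))
D[np.arange(n), season] = 1.0              # D[t, j] = 1 if observation t falls in season j

# Intercept, time trend and all s dummies: s + 2 columns.
Z_full = np.column_stack([np.ones(n), t, D])
rank_full = np.linalg.matrix_rank(Z_full)
print(Z_full.shape[1], rank_full)
```

The matrix has s + 2 = 6 columns but rank s + 1 = 5, because the dummy columns add up to the intercept column. Z'Z is then singular and the normal equations have no unique solution.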

The seasonal component is identified in deviation from the trend: i.e. if $S_t = \sum_{j=1}^{s} \delta_j D_{jt}$, the condition $S_t + S_{t-1} + \dots + S_{t-s+1} = 0$ implies $\sum_j \delta_j = 0$. There are many possible modifications to our linear model to account for this collinearity.
What appears obvious is to estimate the model by constrained least squares, which forces 􏰊j δj = 0. Constrained least squares is messy so the good news is we can avoid constrained LS!

An alternative, and better, approach is to reparameterise the model by defining a new set of s − 1 seasonal terms. They are defined as
$$ \tilde D_{jt} = D_{jt} - D_{st}, \quad j = 1, \dots, s - 1, $$
and we estimate the model
$$ y_t = \beta_0 + \beta_1 t + \sum_{j=1}^{s-1} \delta_j \tilde D_{jt} + \varepsilon_t $$
This gives us the seasonal effect for s − 1 of the s seasons. The seasonal effect associated with season s is obtained as:
$$ \delta_s = -\sum_{j=1}^{s-1} \delta_j $$
In other words, as we assume the s seasonal effects add to zero, if I have estimates for s − 1 of the seasonal effects I can by simple arithmetic estimate the remaining one.
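A sketch of the reparameterised regression on simulated quarterly data. The true seasonal effects in `delta_true` are made up and sum to zero; the other names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
s, years = 4, 10
n = s * years
t = np.arange(1, n + 1, dtype=float)
season = np.arange(n) % s

delta_true = np.array([2.0, -1.0, 0.5, -1.5])     # made-up effects that sum to zero
y = 10.0 + 0.2 * t + delta_true[season] + rng.normal(scale=0.3, size=n)

D = np.zeros((n, s))
D[np.arange(n), season] = 1.0
D_tilde = D[:, : s - 1] - D[:, [s - 1]]           # D~_jt = D_jt - D_st

Z = np.column_stack([np.ones(n), t, D_tilde])     # full column rank: s + 1 columns
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
delta_hat = coef[2:]                              # effects for seasons 1 .. s-1
delta_s_hat = -delta_hat.sum()                    # effect of the last season
delta_all = np.append(delta_hat, delta_s_hat)
print(delta_all)
```

By construction the s estimated effects sum to zero, so each one reads as a deviation from the trend line.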

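Depending on what we want our estimates of the seasonal effects to represent we may want to drop the intercept from the model. Our new model is
$$ y_t = \beta_1 t + \sum_{j=1}^{s} \delta_j^{*} D_{jt} + \varepsilon_t $$
where
$$ \delta_j^{*} = \delta_j + \beta_0 $$
These modified parameters have the following interpretation:
$$ \beta_0 = \frac{1}{s} \sum_{j=1}^{s} \delta_j^{*}; \quad \delta_j = \delta_j^{*} - \beta_0. $$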

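We could just not faff about and drop one of the seasonal dummies, e.g. the last:
$$ y_t = \beta_0^{**} + \beta_1 t + \sum_{j=1}^{s-1} \delta_j^{**} D_{jt} + \varepsilon_t $$
Interpretation:
$$ \beta_0^{**} + \delta_j^{**} = \beta_0 + \delta_j, \quad j = 1, \dots, s - 1 $$
$$ \beta_0^{**} = \beta_0 + \delta_s. $$
Summing w.r.t. j:
$$ \beta_0 = \beta_0^{**} + \frac{1}{s} \sum_{j=1}^{s-1} \delta_j^{**} $$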

Modelling and Forecasting Visitor Arrivals

Using regression models can be extraordinarily helpful to model other effects. For example, let’s say we want to compare output in July 2020 with output in August 2020. Both are 31 days long, so we can directly compare them, can’t we? Maybe.
Consider that July 2020 started on a Wednesday whereas August 2020 started on a Saturday.
D.o.W.       Number in July   Number in August
Monday             4                 5
Tuesday            4                 4
Wednesday          5                 4
Thursday           5                 4
Friday             5                 4
Saturday           4                 5
Sunday             4                 5
Table: Number of Days of the Week in July and August 2020
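The table can be reproduced with the standard library. The helper `weekday_counts` is a hypothetical name; it returns counts ordered Monday to Sunday.

```python
import calendar

def weekday_counts(year, month):
    """Count how often each day of the week occurs in a month (Monday first)."""
    first_weekday, n_days = calendar.monthrange(year, month)  # Monday = 0
    counts = [0] * 7
    for d in range(n_days):
        counts[(first_weekday + d) % 7] += 1
    return counts

july = weekday_counts(2020, 7)
august = weekday_counts(2020, 8)
for name, nj, na in zip(calendar.day_name, july, august):
    print(f"{name:<10} {nj:>2} {na:>2}")
```

A 31-day month has three days of the week that occur five times, starting with the weekday of the 1st: Wednesday to Friday for July 2020, Saturday to Monday for August 2020.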

Calendar effects
For many series the daily outputs on the weekend vary significantly from the outputs during the working week, so a significant part of the difference between what we measure in July and August will just be an artefact of how the calendar falls. We term this trading day variation, and it can affect monthly or quarterly data.
This is an example of how our measurements for adjacent time points can be affected by the calendar, so the trading day effect is an example of a calendar effect.

Trading Day Effect Regression Model
Our regression model for the trading day component is
$$ TD_t = \sum_{j=1}^{6} \gamma_j \left( n_{jt} - n_{7t} \right) + \gamma_t $$
where njt is the number of days of type j in the month, and nt = Σj njt. And why am I summing to 6 when there are 7 days in the week? Hint: it is very similar to what we did in the regression model for seasonal factors.
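A sketch of how the six contrast regressors might be built for a given month. The day numbering (Monday = 1, …, Sunday = 7, so Sunday is the reference day) is an assumption, and `trading_day_regressors` is a hypothetical helper.

```python
import calendar

def trading_day_regressors(year, month):
    """Return the six contrasts n_jt - n_7t for one month.

    Assumption (not stated in the slides): day types are numbered
    Monday = 1, ..., Sunday = 7, so Sunday is the reference day.
    """
    first_weekday, n_days = calendar.monthrange(year, month)  # Monday = 0
    n = [0] * 7
    for d in range(n_days):
        n[(first_weekday + d) % 7] += 1
    return [n[j] - n[6] for j in range(6)]

td_july = trading_day_regressors(2020, 7)
td_august = trading_day_regressors(2020, 8)
print(td_july, td_august)
```

A 28-day February contains every day of the week exactly four times, so all six contrasts are zero in that month.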

Calendar Effects – general
Often we can assume that weekdays are similar and weekend days are similar. Then a single TD regressor (number of weekdays in excess of weekend days) will do:
$$ TD_t = \gamma \left[ \sum_{j=1}^{5} n_{jt} - \frac{5}{2} \left( n_{6t} + n_{7t} \right) \right] $$
This has the advantage of a smaller number of regressors, and thus we aren’t trying to estimate a lot of parameters from a small number of values in the time series (and remember we also rely on DGP being the same). We can create a similar style of regressor for moving festivals e.g. Easter, Chinese New Year, school holidays.
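The single weekday/weekend regressor is just as easy to compute. This sketch assumes day types 1 to 5 are Monday to Friday and 6 and 7 are Saturday and Sunday; `weekday_weekend_regressor` is a hypothetical helper returning the term that multiplies γ.

```python
import calendar

def weekday_weekend_regressor(year, month):
    """Number of weekdays minus 5/2 times the number of weekend days."""
    first_weekday, n_days = calendar.monthrange(year, month)  # Monday = 0
    n = [0] * 7
    for d in range(n_days):
        n[(first_weekday + d) % 7] += 1
    weekdays = sum(n[:5])           # Monday to Friday
    weekend = n[5] + n[6]           # Saturday and Sunday
    return weekdays - 2.5 * weekend

td_july = weekday_weekend_regressor(2020, 7)
td_august = weekday_weekend_regressor(2020, 8)
print(td_july, td_august)
```

July 2020 has 23 weekdays and 8 weekend days, giving 23 − 20 = 3; August 2020 has 21 and 10, giving 21 − 25 = −4. The factor 5/2 makes the regressor zero in any month made up of whole weeks.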

Other Regression Models
Interventions for special features: outliers, structural breaks
This is a series of regression models that are very important at present in the time series I work with at StatsNZ.
Outliers are a common problem when trying to model time series. As I have said previously, missing data is very hard to handle in time series, so you can’t merely drop outliers from your data like you can in most data analysis. However, leaving them in can materially affect your parameter estimates, which then leads to misleading forecasts.

Modelling for Outliers
Let’s assume that we suspect an outlier at time τ and we want to estimate the effect λ of the outlier on the time series at time τ. We can use the regression model
yt = Trend + Seas. + λ It(t = τ) + εt
where It(t = τ) = 1 if t = τ and zero at all other times.
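A minimal sketch of estimating λ by OLS on a simulated series with a linear trend and one contaminated observation. The seasonal term is left out to keep it short, and all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n, tau, lam_true = 60, 30, 8.0
t = np.arange(1, n + 1, dtype=float)
y = 2.0 + 0.1 * t + rng.normal(scale=0.5, size=n)
y[tau - 1] += lam_true                       # contaminate the observation at time tau

I_tau = (t == tau).astype(float)             # indicator I_t(t = tau)
Z = np.column_stack([np.ones(n), t, I_tau])  # trend terms plus the outlier dummy
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
lam_hat = coef[2]
resid = y - Z @ coef
print(lam_hat, resid[tau - 1])
```

The dummy absorbs the observation at τ completely, so its residual is zero and the trend coefficients are estimated as if that point were missing, without actually removing it from the series.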

We can extend this model to account for other possible effects on the data, say a structural change in the DGP.
Additive outlier at time τ
yt = Trend + Seas. + λ It(t = τ) + εt
Level shift at time τ
yt = Trend + Seas. + λ It(t ≥ τ) + εt
Slope change at time τ
yt = Trend + Seas. + λ [It(t ≥ τ) × t] + εt
It(expression) = 1 if the expression is true, zero if not.
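The three intervention regressors are simple to construct. A standard-library sketch, with `intervention_regressors` as a hypothetical helper and time indexed t = 1, …, n:

```python
def intervention_regressors(n, tau):
    """Build the three indicator regressors for t = 1, ..., n."""
    times = range(1, n + 1)
    additive_outlier = [1 if t == tau else 0 for t in times]   # I(t = tau)
    level_shift = [1 if t >= tau else 0 for t in times]        # I(t >= tau)
    slope_change = [t if t >= tau else 0 for t in times]       # I(t >= tau) * t
    return additive_outlier, level_shift, slope_change

ao, ls, sc = intervention_regressors(n=8, tau=5)
print(ao)
print(ls)
print(sc)
```

Each list is then added as one extra column of the design matrix, and its coefficient is the estimate of λ for that type of intervention.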

Conditional Prediction E(yT+h | zT+h)
The estimator of the regression function is:
$$ \hat y_t = \sum_j \hat\beta_j z_{jt} $$
Properties:
Unbiased: $E(\hat y_t | z_t) = \sum_j \beta_j z_{jt} = E(y_t | z_t)$
$Var(\hat y_t | z_t)$ is readily available (we will not bother with this). If we ignore parameter uncertainty (i.e. consider the βj’s as fixed and known), then the variance of the prediction error is $Var(y_t - \hat y_t | z_t) = \sigma^2$.
The variance can be estimated by replacing σ² with s².

Assuming no parameter uncertainty, interval forecasts can be constructed from the approximate (large sample) result
$$ y_{n+m} \sim N\left( \hat y_{n+m|T}, s^2 \right) $$
This assumption makes the forecast uncertainty invariant to the forecast horizon, but this is just an approximation. For many series the DGP is less likely to stay the same the farther out you forecast, even with physics.
If we consider parameter uncertainty, interval forecasts can be constructed from the approximate (large sample) result
$$ y_{n+m} \sim N\left( \hat y_{n+m|T}, s^2 \left( 1 + z_{n+m}' (Z Z')^{-1} z_{n+m} \right) \right) $$
where $Z = [z_1, \dots, z_n]$ collects the regressor vectors as columns and $z_{n+m}$ is the regressor vector at the forecast date.
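A sketch of both interval forecasts for a linear trend model, under the assumption that the forecast variance with parameter uncertainty is s² times one plus the quadratic form in the regressor vector at the forecast date. The data are simulated and the 1.96 quantile gives an approximate 95% interval.

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 40, 4
t = np.arange(1, n + 1, dtype=float)
y = 5.0 + 0.3 * t + rng.normal(scale=1.0, size=n)   # simulated linear trend plus noise

# Here the rows of Z are the regressor vectors, so Z.T @ Z below is the
# sum of z_t z_t' over the sample.
Z = np.column_stack([np.ones(n), t])
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
q = Z.shape[1]
s2 = np.sum((y - Z @ beta_hat) ** 2) / (n - q)

z_new = np.array([1.0, n + m])                   # regressors at the forecast date
y_fore = z_new @ beta_hat

var_fixed = s2                                   # ignoring parameter uncertainty
var_full = s2 * (1.0 + z_new @ np.linalg.solve(Z.T @ Z, z_new))

z95 = 1.96                                       # approximate 95% normal quantile
interval_fixed = (y_fore - z95 * np.sqrt(var_fixed), y_fore + z95 * np.sqrt(var_fixed))
interval_full = (y_fore - z95 * np.sqrt(var_full), y_fore + z95 * np.sqrt(var_full))
print(interval_fixed, interval_full)
```

The second interval is always the wider one, and for a trend model it widens as the forecast date moves away from the centre of the sample.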