Collinearity
Australian National University
(James Taylor) 1 / 5
Perfect Multicollinearity
Perfect multicollinearity occurs in OLS problems when there is an
exact linear relation among the regressors (explanatory variables).
Mathematically, this is a serious problem, as the data matrix $X$ does not
have full column rank,
which means $X'X$ is not invertible,
which means $\hat{b} = (X'X)^{-1}X'y$ cannot be computed.
The problem is really one of uniqueness rather than existence: many
different choices of $\hat{b}$ minimise the squared error.
Perfect Multicollinearity – Example
Consider the model
$$
y = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & 0.5 & 0.5 \\
0 & 1 & -1
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} + e
$$
Suppose $(b_1, b_2, b_3) = (1, 2, 3)$ is a least-squares estimate for the
previous problem.
Then $(a_1, a_2, a_3) = (0, 3, 4)$ is also a least-squares estimate,
because the first column of $X$ is the sum of the other two, so in both cases $X\hat{b}$ is the same.
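The claim can be checked numerically. The following is a minimal MATLAB/Octave sketch, not part of the original slides; the variable names (`fit_b`, `fit_a`, `gap`) are illustrative. It confirms that the design matrix has rank 2 and that both coefficient vectors produce the same fitted values.

```matlab
% Design matrix from the example
X = [1 1   0;
     1 0   1;
     1 0.5 0.5;
     0 1  -1];

b = [1; 2; 3];   % first candidate coefficient vector
a = [0; 3; 4];   % second candidate coefficient vector

r = rank(X);                 % number of linearly independent columns
fit_b = X*b;                 % fitted values under b
fit_a = X*a;                 % fitted values under a
gap = norm(fit_b - fit_a);   % zero when the fitted values coincide

disp(r); disp([fit_b fit_a]); disp(gap);
```

Both products equal $(3, 4, 3.5, -1)'$, so the squared error is identical for the two candidates.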
Perfect Multicollinearity – Solution
The solution is very easy:
Identify a column of $X$ which is a linear combination of the other
columns (usually there are many choices; any choice is fine).
Remove that column from $X$, and its associated $b$ term.
Repeat until the columns are independent.
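One way to automate this procedure is sketched below in MATLAB/Octave. This is an assumption about how the steps might be coded, not the author's implementation: it scans the columns left to right and keeps a column only if it raises the rank, which is equivalent to dropping dependent columns one at a time. The names `keep`, `candidate` and `Xr` are hypothetical.

```matlab
X = [1 1   0;
     1 0   1;
     1 0.5 0.5;
     0 1  -1];

keep = [];                          % indices of columns retained so far
for j = 1:size(X, 2)
    candidate = [keep j];
    % Keep column j only if it adds a new independent direction
    if rank(X(:, candidate)) == numel(candidate)
        keep = candidate;
    end
end
Xr = X(:, keep);                    % reduced design matrix with independent columns
disp(keep);
```

Because the scan goes left to right, this sketch always drops the later of the dependent columns. Any other choice would give the same fitted values. Note that `rank` relies on a numerical tolerance, so the result can be sensitive when columns are only nearly dependent.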
(Near) Multicollinearity
Near multicollinearity occurs when the true values of the
regressors are collinear, but we observe them with error, so they appear
to be (just) not collinear.
This means that $X'X$ is nearly singular, so numerical inversion will be
inaccurate.
This is sometimes a problem, but not always.
The model can still give acceptable forecasts, but the regression coefficients
cannot be interpreted.
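A small numerical illustration, assuming a fixed and artificial "measurement error" rather than real data: in the MATLAB/Octave sketch below the second regressor is almost exactly twice the first. The matrix still has full column rank, but the condition number of $X'X$ is enormous, which is what "nearly singular" means in practice.

```matlab
x1 = (1:10)';
noise = 1e-6 * [1 -1 1 -1 1 -1 1 -1 1 -1]';   % small, fixed perturbation (illustrative)
x2 = 2*x1 + noise;                             % almost exactly 2*x1
X = [ones(10,1) x1 x2];

r = rank(X);            % still 3: the columns are (just) independent
c = cond(X'*X);         % very large: X'X is close to singular
disp(r); disp(c);
```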
Seasonality
Using Seasonal Dummies
Seasonality
A seasonal pattern is a regular intra-year pattern which repeats every
year
It can arise from many sources: preferences, technology, social
institutions, etc.
For example:
any consumption or product which involves the weather (energy,
agriculture, medicine)
retail sales (peaks around Christmas)
personal tax advice (peaks around June/July)
Modelling Seasonality: An Example
Easiest way is to introduce seasonal dummy variables
Suppose $y_t$ is observed quarterly and has no trend.
We could specify $y_t$ as fluctuating around $\mu$:
$$y_t = \mu + e_t, \qquad E[e_t \mid \mu] = 0$$
Modelling Seasonality: An Example
Under this specification, $E[y_t \mid \mu] = \mu$ regardless of which quarter
we're in.
There is no seasonal variation.
Suppose our historical data shows peaks regularly in the fourth
quarter (Christmas).
How could we model that?
Seasonal Dummies
Ceteris paribus, we expect a larger $y_t$ if it's the fourth quarter.
Consider the specification
$$y_t = \mu + a D_t + e_t$$
where $D_t$ is a dummy variable which is 1 in the fourth quarter and 0
otherwise.
What does this dummy variable do?
Seasonal Dummies
The dummy variable allows the expectation of $y_t$ to differ between
the fourth quarter and other quarters.
$$E[y_t \mid \mu, a, D_t = 0] = \mu$$
$$E[y_t \mid \mu, a, D_t = 1] = \mu + a$$
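To see the two conditional means in action, here is a minimal MATLAB/Octave sketch on made-up quarterly data (the values 9, 10, 11, 15 are purely illustrative). With an intercept and a fourth-quarter dummy, OLS returns the average of the non-fourth-quarter observations as the estimate of $\mu$ and the fourth-quarter premium as the estimate of $a$.

```matlab
% Synthetic quarterly data (illustrative values, 3 years)
Q = repmat((1:4)', 3, 1);            % quarter indicator 1,2,3,4,1,2,...
y = repmat([9; 10; 11; 15], 3, 1);   % Q4 is higher than the other quarters

D = double(Q == 4);                  % fourth-quarter dummy
X = [ones(size(y)) D];               % intercept and dummy
coef = (X'*X)\(X'*y);                % OLS estimates of (mu, a)

mu_hat = coef(1);   % mean of y outside the fourth quarter
a_hat  = coef(2);   % extra fourth-quarter effect
disp([mu_hat a_hat]);
```

Here the estimate of $\mu$ is 10 (the mean of 9, 10 and 11) and the estimate of $a$ is 5, so the fitted fourth-quarter mean is 15.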
More Seasonal Dummies
Similarly, we could include more dummy variables to allow for a more
complex seasonal pattern.
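$$y_t = \mu + a_1 D_{1t} + a_2 D_{2t} + a_3 D_{3t} + a_4 D_{4t} + e_t$$
In matrix form this is
$$
y = \begin{pmatrix}
1 & D_{11} & D_{21} & D_{31} & D_{41} \\
1 & D_{12} & D_{22} & D_{32} & D_{42} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & D_{1T} & D_{2T} & D_{3T} & D_{4T}
\end{pmatrix}
\begin{pmatrix} \mu \\ a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} + e
$$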
More Seasonal Dummies
That is, X is given by
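$$
X = \begin{pmatrix}
1 & D_{11} & D_{21} & D_{31} & D_{41} \\
1 & D_{12} & D_{22} & D_{32} & D_{42} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & D_{1T} & D_{2T} & D_{3T} & D_{4T}
\end{pmatrix}
$$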
As $D_{1t} + D_{2t} + D_{3t} + D_{4t} = 1$ for every $t$, the four dummy columns sum to the intercept column, so the rank of $X$ is less
than 5.
Therefore $X'X$ is not invertible.
Therefore we cannot uniquely solve OLS.
Dealing with Collinearity
Solving this problem is very easy: just drop the intercept (or, more
often, any one of the dummies).
Consider instead
$$y_t = a_1 D_{1t} + a_2 D_{2t} + a_3 D_{3t} + a_4 D_{4t} + e_t$$
Then $X'X$ is invertible, as the four dummy columns of $X$ are linearly independent (provided every quarter appears in the sample).
Additionally, $a_i$ gives the mean of $y$ in the $i$-th quarter.
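Both points can be illustrated with a short MATLAB/Octave sketch on made-up data (the series and the names `X_trap`, `X_ok`, `q_means` are assumptions for illustration only). With an intercept and all four dummies the design matrix has five columns but only rank 4. Without the intercept it has full column rank, and the OLS coefficients coincide with the quarterly sample means.

```matlab
Q = repmat((1:4)', 3, 1);               % quarter indicator for 3 illustrative years
y = repmat([9; 10; 11; 15], 3, 1);      % synthetic data
D = double([Q==1, Q==2, Q==3, Q==4]);   % one dummy column per quarter

X_trap = [ones(size(y)) D];   % intercept plus all four dummies
X_ok   = D;                   % intercept dropped

r_trap = rank(X_trap);        % 4, not 5: perfectly collinear
r_ok   = rank(X_ok);          % 4: full column rank
a_hat  = (X_ok'*X_ok)\(X_ok'*y);
q_means = [mean(y(Q==1)); mean(y(Q==2)); mean(y(Q==3)); mean(y(Q==4))];
disp([a_hat q_means]);
```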
Australian Retail Sales
Seasonality Example
Forecasting Australian Retail Sales
Figure: AUS non-seasonally adjusted retail sales from 2009 Q1 to 2019 Q1, from
FRED
A Brief Look at the Data
Two prominent features: trend and seasonality
There is a clear trend: overall, sales follow a roughly linear
upward path.
There is a clear seasonal pattern – sales figures jump in the fourth
quarter
So our specification should allow for trending and seasonal variation
Allowing for Seasonal Variation
Focus on modelling seasonality, assume a linear trend for all
specifications
We will consider three different specifications for seasonal variation.
Let $D_{it}$ be a dummy variable for the $i$-th quarter, i.e.
$$
D_{it} = \begin{cases}
1 & \text{if } t \text{ is in the } i\text{-th quarter} \\
0 & \text{if } t \text{ is not in the } i\text{-th quarter}
\end{cases}
$$
Competing Specifications
Consider the following three specifications
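S1: $y_t = a_0 + a_1 t + a_4 D_{4t} + e_t$
S2: $y_t = a_0 + a_1 t + a_1 D_{1t} + a_4 D_{4t} + e_t$
S3: $y_t = a_1 t + a_1 D_{1t} + a_2 D_{2t} + a_3 D_{3t} + a_4 D_{4t} + e_t$
(The symbol $a_1$ is used both for the trend coefficient and for the coefficient on $D_{1t}$; these are separate parameters.)
In S1 we include a dummy variable only for the fourth quarter.
In S3 we allow all quarters to differ (and remove the intercept).
Specification S2 is a compromise: it adds a first-quarter dummy to S1.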
Data
Australian Retail from 2009Q1 to 2019Q1, not seasonally adjusted, in
real terms
The dataset AUSRetail.csv has two columns
The first column contains the retail sales values, the second is a
quarter indicator
Setting up the problem
Consider the third specification S3:
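S3: $y_t = a_1 t + a_1 D_{1t} + a_2 D_{2t} + a_3 D_{3t} + a_4 D_{4t} + e_t$
We need this in the form $y = Xb + e$:
$$
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_T \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
2 & 0 & 1 & 0 & 0 \\
3 & 0 & 0 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
T & 1 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
+
\begin{pmatrix} e_1 \\ e_2 \\ e_3 \\ \vdots \\ e_T \end{pmatrix}
$$
The first entry of the parameter vector is the coefficient on $t$; the remaining four are the dummy coefficients.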
MATLAB Code for third specification
% column 1: retail sales, column 2: quarter indicator
load 'AUSRetail.csv';
y = AUSRetail(:,1); Q = AUSRetail(:,2);
T = length(y); t = (1:T)';
%% construct 4 dummy variables
D1 = (Q == 1); D2 = (Q == 2);
D3 = (Q == 3); D4 = (Q == 4);
%% 3rd spec: linear trend + all dummies
X = [t D1 D2 D3 D4];
betahat = (X'*X)\(X'*y); % OLS estimates
yhat = X*betahat; % in-sample fitted values
MSE3 = mean((y-yhat).^2);
AIC = T*MSE3 + 5*2; % 5 = number of parameters in S3
BIC = T*MSE3 + 5*log(T);
In-sample Fitting Measures
Table: MSE, AIC and BIC under the three seasonality specifications
                        S1        S2        S3
MSE                     1.0358    0.8310    0.7194
AIC                     52.467    44.072    39.497
BIC                     61.035    52.640    48.065
Number of parameters    3         4         5
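As the specifications are nested, the in-sample MSE falls as regressors are added.
Both AIC and BIC also prefer the most complex specification, S3.
The AIC and BIC values above all use the five-parameter penalty from the S3 code ($5 \times 2$ and $5 \log T$); using penalties based on 3 and 4 parameters for S1 and S2 lowers their criteria but does not change the ranking.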
Fitted Lines
Figure: Fitted Values for AUS non-seasonally adjusted retail sales from 2009 Q1
to 2019 Q1, from FRED
All fitted values look basically identical
To fit the data better we would need seasonal changes that increase
in magnitude
Out-of-Sample Forecasting
Table: MSFE under various specifications
         S1        S2        S3
h = 1    1.5458    1.4374    1.3405
h = 2    1.8277    1.7048    1.6084
The recursive forecasting exercises start from T0 = 15
The 3rd specification still gives the best forecasts
MATLAB Code for MSFE for Second Specification
T0 = 15;
h = 2; % h-step-ahead forecast
syhat = zeros(T-h-T0+1,1); % storage for the recursive forecasts
ytph = y(T0+h:end); % observed y_{t+h}
for t = T0:T-h
    yt = y(1:t); % data available at time t
    D1t = D1(1:t); D4t = D4(1:t);
    Xt = [ones(t,1) (1:t)' D1t D4t];
    beta2 = (Xt'*Xt)\(Xt'*yt);
    yhat2 = [1 t+h D1(t+h) D4(t+h)]*beta2; % forecast of y_{t+h}
    syhat(t-T0+1) = yhat2;
end
MSFE2 = mean((ytph-syhat).^2);
Exponential Change
Case Study - Australian Retail Data
Exponential Change
Previous models have had linear and polynomial growth.
These are nice, as they have simple OLS implementations.
What if we think the data has an exponential growth pattern?
Any data with a (roughly) fixed percentage growth will exhibit
exponential growth.
Exponential Change
Consider the model
$$y_t = \exp(a_0 + a_1 t)\exp(x_{1,t} b_1 + x_{2,t} b_2 + e_t)$$
This model cannot be written as $y = Xb + e$, so we cannot use OLS
(directly).
However, by taking the log of both sides, we have
$$\ln(y_t) = a_0 + a_1 t + x_{1,t} b_1 + x_{2,t} b_2 + e_t$$
We can estimate this using OLS methods.
Sometimes we also want to take logs of the x terms, but it depends.
Use your judgement.
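As a quick check of the log transformation, the MATLAB/Octave sketch below fits the logged model to a noise-free exponential series. The series and its parameter values (0.5 and 0.02) are assumptions chosen for illustration, and the extra regressors $x_{1,t}$, $x_{2,t}$ are omitted to keep the example short.

```matlab
t = (1:20)';
y = exp(0.5 + 0.02*t);        % noise-free exponential growth, illustrative values

X = [ones(size(t)) t];        % intercept and linear trend
coef = (X'*X)\(X'*log(y));    % OLS on the logged data
disp(coef');
```

Because the data contain no noise, OLS on $\ln(y_t)$ recovers the intercept 0.5 and growth rate 0.02 up to rounding error.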
Comparing Logged Models to Raw Models
We need to be very careful when comparing logged models to
non-logged models.
We cannot compare MSFEs computed on different scales, because $\ln(y_t)$ is much smaller than $y_t$; the forecasts must first be converted back to the original scale.
We need to:
1. Log the $y$ values (maybe also some $x$ values)
2. Estimate the $b$ parameters
3. Forecast $\ln(\hat{y}_{t+h})$
4. Exponentiate to get $\hat{y}_{t+h}$
5. Use $\hat{y}_{t+h}$ to calculate the MSFE
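The five steps can be sketched in MATLAB/Octave as follows. This is an illustration on a synthetic, noise-free exponential series, not the retail data; the series, the trend-only specification and the names `f_raw`, `f_log` are assumptions. The key point is that both MSFEs are computed against the same unlogged observations.

```matlab
T = 40; T0 = 15; h = 1;
y = exp(1 + 0.05*(1:T)');           % illustrative series with exponential growth

n = T - h - T0 + 1;                 % number of recursive forecasts
f_raw = zeros(n,1); f_log = zeros(n,1);
ytph = y(T0+h:end);                 % observed y_{t+h}, on the original scale

for t = T0:T-h
    Xt = [ones(t,1) (1:t)'];        % intercept and linear trend up to time t
    xf = [1 t+h];                   % regressors at the forecast date
    b_raw = (Xt'*Xt)\(Xt'*y(1:t));          % raw (non-logged) model
    b_log = (Xt'*Xt)\(Xt'*log(y(1:t)));     % steps 1-2: log y, then estimate
    f_raw(t-T0+1) = xf*b_raw;
    f_log(t-T0+1) = exp(xf*b_log);          % steps 3-4: forecast, then exponentiate
end

% Step 5: both MSFEs are computed on the original scale of y
MSFE_raw = mean((ytph - f_raw).^2);
MSFE_log = mean((ytph - f_log).^2);
disp([MSFE_raw MSFE_log]);
```

On this artificial series the logged model forecasts almost perfectly, since the data are exactly exponential, while the linear model systematically under-predicts.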
Australian Retail Data
Australian Retail Data seems like a strong candidate for exponential
growth
Growth in retail sales is driven by population growth, real per capita
economic growth, and inflation
All of these are percentage changes, so retail sales probably increase
exponentially.
We will compare MSFE for logged and non-logged models
Australian Retail Data - Exponential Model
We don’t want to take logs of the dummy variables
So really all we do is follow the list:
1. Log the $y$ values
2. Estimate the $b$ parameters
3. Forecast $\ln(\hat{y}_{t+h})$
4. Exponentiate to get $\hat{y}_{t+h}$
5. Use $\hat{y}_{t+h}$ to calculate the MSFE
Competing Specifications
Consider the following three specifications
lnS1: $\ln(y_t) = a_0 + a_1 t + a_4 D_{4t} + e_t$
lnS2: $\ln(y_t) = a_0 + a_1 t + a_1 D_{1t} + a_4 D_{4t} + e_t$
lnS3: $\ln(y_t) = a_1 t + a_1 D_{1t} + a_2 D_{2t} + a_3 D_{3t} + a_4 D_{4t} + e_t$
These are our previous specifications, just with ln(y)
MATLAB Code for MSFE for Logged Third Specification
y = log(y); % take the log of the data
T0 = 15; h = 1; % h-step-ahead forecast
syhat = zeros(T-h-T0+1,1);
ytph = y(T0+h:end); % observed ln(y_{t+h})
for t = T0:T-h
    yt = y(1:t);
    D1t = D1(1:t); D2t = D2(1:t); D3t = D3(1:t); D4t = D4(1:t);
    Xt = [(1:t)' D1t D2t D3t D4t];
    beta3 = (Xt'*Xt)\(Xt'*yt);
    yhat3 = [t+h D1(t+h) D2(t+h) D3(t+h) D4(t+h)]*beta3; % forecast of ln(y_{t+h})
    syhat(t-T0+1) = yhat3;
end
syhat = exp(syhat); ytph = exp(ytph); % un-log the forecasts and the data
MSFE_lnS3 = mean((ytph-syhat).^2)
Out-of-Sample Forecasting
Table: MSFE under logged and non-logged specifications
         S1        lnS1      S2        lnS2      S3        lnS3
h = 1    1.5458    1.2077    1.4374    1.0807    1.3405    0.9839
h = 2    1.8277    1.4387    1.7048    1.2676    1.6084    1.2041
In every case, the logged model performed better than its non-logged
version
Density Forecasting with OLS
Density Forecast
Point forecasts are great, but we can do more.
We want to determine the conditional density $f(y_{T+1} \mid I_T, q)$.
Finding this for OLS has some (fixable) problems:
We don't know the distribution of $e_{T+1}$
We don't know $q = (b, s^2)'$
We don't (yet) have even an estimate of $s^2$
Density Forecast: Problem 1
OLS technically only requires $E(e) = 0$.
Suppose we are willing to assume $e_t \sim N(0, s^2)$.
The model specification is
$$y_{T+1} = x_{T+1} b + e_{T+1}$$
so we have $y_{T+1} \sim N(x_{T+1} b, s^2)$.
Density Forecast: Problem 2
We still don't know $q$.
A reasonable approximation might be the OLS estimate $\hat{q} = (\hat{b}, \hat{s}^2)'$.
That is, we calculate $f(y_{T+1} \mid I_T, \hat{q})$.
We are ignoring parameter uncertainty here. This may be ill-advised.
Density Forecast: Problem 3
While we know $\hat{b}$, we still don't know $\hat{s}^2$.
For now, we can use the unbiased estimator
$$\hat{s}^2 = \frac{1}{T-k} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2$$
where $k$ is the number of $b$ parameters.
We will use a (potentially) different estimator when we get to
'Maximising Log-Likelihood'.
Density Forecast - Australian Retail Data
Our previous sample ends at 2019Q1.
Suppose we wish to produce a density forecast for sales in 2019Q2
under S3.
Find the OLS parameter estimates, find $\hat{s}^2$, and compute $x_{T+1}\hat{b}$.
As $y_{T+1}$ is normal, the 95% forecast interval is $x_{T+1}\hat{b} \pm 1.96\hat{s}$.
Density Forecast - Australian Retail Data
The OLS estimates are
$$
\hat{b} = \begin{pmatrix} 0.4822 \\ 57.5498 \\ 58.1587 \\ 59.1157 \\ 69.1578 \end{pmatrix}
$$
We also have $\hat{s}^2 = 0.8194$.
Lastly, $x_{T+1} = \begin{pmatrix} T+1 & 0 & 1 & 0 & 0 \end{pmatrix}$.
Density Forecast
Putting it all together:
Compute $x_{T+1}\hat{b} = 78.4104$.
Hence $y_{T+1} \sim N(78.4104, 0.8194)$.
In particular, a 95% forecast interval is $(76.6362, 80.1845)$.
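The arithmetic can be reproduced from the reported estimates with a few lines of MATLAB/Octave. This is a sketch under one assumption not stated on the slides: that the sample has $T = 41$ quarterly observations (2009Q1 to 2019Q1 inclusive). The variable names are illustrative.

```matlab
% Values reported on the slides (rounded to four decimals)
bhat  = [0.4822; 57.5498; 58.1587; 59.1157; 69.1578];
s2hat = 0.8194;
T = 41;                       % assumption: 2009Q1 to 2019Q1 is 41 quarters

xf = [T+1 0 1 0 0];           % 2019Q2: trend value T+1, second-quarter dummy on
m  = xf*bhat;                 % mean of the forecast density
half = 1.96*sqrt(s2hat);      % half-width of the 95% interval
ci = [m - half, m + half];
disp(m); disp(ci);
```

Using the rounded coefficients gives a mean of about 78.411 rather than 78.4104; the small difference is consistent with the slide values having been computed from unrounded estimates.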