ETW3420: Principles of Forecasting and Applications
Topic 7: Regression Models
1 The linear model with time series
2 Residual diagnostics
3 Some useful predictors for linear models
4 Selecting predictors and forecast evaluation
5 Forecasting with regression
6 Matrix formulation
7 Correlation, causation and forecasting
Multiple regression and forecasting
yt = β0 + β1x1,t + β2x2,t + · · · + βkxk,t + εt
yt is the variable we want to predict: the “response”
Each xj,t is numerical and is called a “predictor”. They
are usually assumed to be known for all past and future times.
The coefficients β1, . . . , βk measure the effect of each
predictor after taking account of the effect of all other
predictors in the model.
That is, the coefficients measure the marginal effects.
εt is a white noise error term
Example: US consumption expenditure
autoplot(uschange[,c("Consumption","Income")]) +
  ylab("% change") + xlab("Year")
(Figure: time plots of the quarterly % change in US consumption and income, 1970–2010.)
Example: US consumption expenditure
(Figure: scatterplot of quarterly % change in consumption against income.)
Example: US consumption expenditure
tslm(Consumption ~ Income, data=uschange) %>% summary
## tslm(formula = Consumption ~ Income, data = uschange)
## Residuals:
## Min 1Q Median 3Q Max
## -2.40845 -0.31816 0.02558 0.29978 1.45157
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.54510 0.05569 9.789 < 2e-16 ***
## Income 0.28060 0.04744 5.915 1.58e-08 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.6026 on 185 degrees of freedom
## Multiple R-squared: 0.159, Adjusted R-squared: 0.1545
## F-statistic: 34.98 on 1 and 185 DF, p-value: 1.577e-08
Example: US consumption expenditure
(Figure: scatterplot matrix of quarterly % changes in Consumption, Income, Production, Savings and Unemployment.)
Example: US consumption expenditure
fit.consMR <- tslm(
Consumption ~ Income + Production + Unemployment + Savings,
data=uschange)
summary(fit.consMR)
## tslm(formula = Consumption ~ Income + Production + Unemployment +
## Savings, data = uschange)
## Residuals:
## Min 1Q Median 3Q Max
## -0.88296 -0.17638 -0.03679 0.15251 1.20553
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.26729 0.03721 7.184 1.68e-11 ***
## Income 0.71449 0.04219 16.934 < 2e-16 ***
## Production 0.04589 0.02588 1.773 0.0778 .
## Unemployment -0.20477 0.10550 -1.941 0.0538 .
## Savings -0.04527 0.00278 -16.287 < 2e-16 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.3286 on 182 degrees of freedom
## Multiple R-squared: 0.754, Adjusted R-squared: 0.7486
## F-statistic: 139.5 on 4 and 182 DF, p-value: < 2.2e-16
Example: US consumption expenditure
(Figure: time plot of actual and fitted percentage change in US consumption expenditure, 1970–2010.)
Example: US consumption expenditure
(Figure: actual percentage change in US consumption expenditure plotted against the fitted (predicted) values.)
Example: US consumption expenditure
(Figure: residuals from the linear regression model — time plot, ACF, and histogram.)
1 The linear model with time series
2 Residual diagnostics
3 Some useful predictors for linear models
4 Selecting predictors and forecast evaluation
5 Forecasting with regression
6 Matrix formulation
7 Correlation, causation and forecasting
Multiple regression and forecasting
For forecasting purposes, we require the following assumptions:
εt have zero mean and are serially uncorrelated
εt are uncorrelated with each xj,t.
It is useful to also have εt ∼ N(0, σ2) when producing prediction
intervals or doing statistical tests.
Residual plots
Useful for spotting outliers and whether the linear model was
appropriate.
Scatterplot of residuals εt against each predictor xj,t.
Scatterplot of residuals against the fitted values ŷt.
Expect to see scatterplots resembling a horizontal band with
no values too far from the band and no patterns such as
curvature or increasing spread.
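A minimal sketch of these plots for the consumption model fitted earlier (fit.consMR), using base R graphics:

# Residuals vs fitted values: look for a patternless horizontal band
res <- residuals(fit.consMR)
plot(as.numeric(fitted(fit.consMR)), as.numeric(res),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# Residuals vs one predictor (repeat for each predictor in turn)
plot(as.numeric(uschange[, "Income"]), as.numeric(res),
     xlab = "Income", ylab = "Residuals")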
Residual patterns
If a plot of the residuals vs any predictor in the model shows
a pattern, then the relationship is nonlinear.
If a plot of the residuals vs any predictor not in the model
shows a pattern, then the predictor should be added to the model.
If a plot of the residuals vs fitted values shows a pattern, then
there is heteroscedasticity in the errors. (Could try a
transformation.)
Breusch-Godfrey test
OLS regression:
yt = β0 + β1xt,1 + · · · + βkxt,k + ut
Auxiliary regression:
ût = β0 + β1xt,1 + · · · + βkxt,k + ρ1ût−1 + · · · + ρpût−p + εt
If the R2 statistic is calculated for this model, then
(T − p)R2 ∼ χ2p,
when there is no serial correlation up to lag p, where T = the length of the regression.
The Breusch-Godfrey test is better than the Ljung-Box test for regression models.
US consumption again
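The Breusch-Godfrey output below can be reproduced with checkresiduals() from the forecast package, which applies this test to tslm/lm fits (a sketch using the model fitted earlier):

checkresiduals(fit.consMR, plot = FALSE)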
## Breusch-Godfrey test for serial correlation of order up to 8
## data: Residuals from Linear regression model
## LM test = 14.874, df = 8, p-value = 0.06163
If the model fails the Breusch-Godfrey test . . .
The forecasts are not wrong, but they have higher variance than necessary.
There is information in the residuals that we should exploit.
This is done with a regression model with ARMA errors.
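Such a model can be fitted with auto.arima() from the forecast package; a minimal sketch (X is a hypothetical matrix of predictors):

library(forecast)
fit <- auto.arima(y, xreg = X)   # regression on X with ARMA errors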
1 The linear model with time series
2 Residual diagnostics
3 Some useful predictors for linear models
4 Selecting predictors and forecast evaluation
5 Forecasting with regression
6 Matrix formulation
7 Correlation, causation and forecasting
Linear trend
Use time itself as the predictor: xt = t, for t = 1, 2, . . . , T.
Strong assumption that trend will continue.
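In the forecast package, tslm() provides this predictor via the special trend variable; a minimal sketch (assuming y is a ts object):

library(forecast)
fit <- tslm(y ~ trend)   # fits yt = β0 + β1 t + εt
forecast(fit, h = 8)     # extrapolates the straight line 8 periods ahead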
Dummy variables
If a categorical variable takes only
two values (e.g., ‘Yes’ or ‘No’), then
an equivalent numerical variable
can be constructed taking value 1 if
yes and 0 if no. This is called a
dummy variable.
Dummy variables
If there are more than two
categories, then the variable
can be coded using several
dummy variables (one fewer
than the total number of
categories).
Beware of the dummy variable trap!
Using one dummy for each category gives too many dummy
variables!
The regression will then be singular and inestimable.
Either omit the constant, or omit the dummy for one category.
The coefficients of the dummies are relative to the omitted category.
Uses of dummy variables
Seasonal dummies
For quarterly data: use 3 dummies
For monthly data: use 11 dummies
For daily data: use 6 dummies
What to do with weekly data?
Outliers
If there is an outlier, you can use a dummy variable
(taking value 1 for that observation and 0 elsewhere) to
remove its effect; see the sketch after this list.
Public holidays
For daily data: if it is a public holiday, dummy=1,
otherwise dummy=0.
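A sketch of the outlier dummy described above (the outlier position, observation 30, is hypothetical):

library(forecast)
outlier <- rep(0, length(y))   # y: a quarterly ts with one known outlier
outlier[30] <- 1               # dummy = 1 only at the outlying observation
fit <- tslm(y ~ trend + season + outlier)
# The dummy's coefficient absorbs the outlier, so the trend and
# seasonal coefficients are estimated as if it were not there.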
Beer production revisited
(Figure: Australian quarterly beer production.)
Beer production revisited
Regression model
yt = β0 + β1t + β2d2,t + β3d3,t + β4d4,t + εt
where di,t = 1 if t is in quarter i and 0 otherwise.
Beer production revisited
fit.beer <- tslm(beer ~ trend + season)
summary(fit.beer)
## tslm(formula = beer ~ trend + season)
## Residuals:
## Min 1Q Median 3Q Max
## -42.903 -7.599 -0.459 7.991 21.789
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 441.80044 3.73353 118.333 < 2e-16 ***
## trend -0.34027 0.06657 -5.111 2.73e-06 ***
## season2 -34.65973 3.96832 -8.734 9.10e-13 ***
## season3 -17.82164 4.02249 -4.430 3.45e-05 ***
## season4 72.79641 4.02305 18.095 < 2e-16 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared: 0.9243, Adjusted R-squared: 0.9199
## F-statistic: 210.7 on 4 and 69 DF, p-value: < 2.2e-16
Beer production revisited
(Figure: quarterly beer production — actual data and fitted values from the regression model.)
Beer production revisited
(Figure: quarterly beer production — actual values plotted against fitted values.)
Beer production revisited
(Figure: residuals from the linear regression model.)
Beer production revisited
(Figure: forecasts from the linear regression model.)
Fourier series
Periodic seasonality can be handled using pairs of Fourier terms:
sk(t) = sin(2πkt/m), ck(t) = cos(2πkt/m)
yt = a + bt + ∑k=1,...,K [αksk(t) + βkck(t)] + εt
where m is the seasonal period.
Every periodic function can be approximated by sums of sin
and cos terms for large enough K.
Choose K by minimizing AICc.
Called “harmonic regression”
fit <- tslm(y ~ trend + fourier(y, K))
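A sketch of choosing K by minimising the AICc, using CV() from the forecast package (assuming y is a ts; K cannot exceed half the seasonal period):

library(forecast)
best.aicc <- Inf
for (K in seq_len(frequency(y) %/% 2)) {
  fit <- tslm(y ~ trend + fourier(y, K = K))
  aicc <- CV(fit)["AICc"]   # CV() returns CV, AIC, AICc, BIC, AdjR2
  if (aicc < best.aicc) {
    best.aicc <- aicc
    best.K <- K
  }
}
best.K   # number of Fourier pairs with the smallest AICc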
Harmonic regression: beer production
fourier.beer <- tslm(beer ~ trend + fourier(beer, K=2))
summary(fourier.beer)
## tslm(formula = beer ~ trend + fourier(beer, K = 2))
## Residuals:
## Min 1Q Median 3Q Max
## -42.903 -7.599 -0.459 7.991 21.789
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 446.87920 2.87321 155.533 < 2e-16 ***
## trend -0.34027 0.06657 -5.111 2.73e-06 ***
## fourier(beer, K = 2)S1-4 8.91082 2.01125 4.430 3.45e-05 ***
## fourier(beer, K = 2)C1-4 53.72807 2.01125 26.714 < 2e-16 ***
## fourier(beer, K = 2)C2-4 13.98958 1.42256 9.834 9.26e-15 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared: 0.9243, Adjusted R-squared: 0.9199
## F-statistic: 210.7 on 4 and 69 DF, p-value: < 2.2e-16
Fourier series
With Fourier terms, we often need fewer predictors than
with dummy variables, especially when m is large.
This makes them useful for weekly data, for example, where m ≈ 52.
For short seasonal periods (e.g., quarterly data), there is little
advantage in using Fourier terms over seasonal dummy
variables.
Intervention variables
Spikes
Equivalent to a dummy variable for handling an outlier;
accounts for an effect lasting only one period.
Steps
Variable takes value 0 before the intervention and 1
afterwards.
Change of slope
Variables take values 0 before the intervention and values
{1, 2, 3, . . . } afterwards (see piecewise linear trend).
For monthly data
Christmas: always in December, so it is part of the monthly seasonal effect.
Easter: use a dummy variable vt = 1 if any part of Easter is in
that month, vt = 0 otherwise.
Ramadan and Chinese New Year are handled similarly.
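The forecast package provides easter() for exactly this dummy; a minimal sketch (assuming y is a monthly ts):

library(forecast)
holidays <- easter(y)   # proportion of Easter falling in each month
                        # (a 0/1 dummy when it lies wholly in one month)
fit <- tslm(y ~ trend + season + holidays)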
Trading days
With monthly data, if the observations vary depending on how
many of each type of day fall in the month, then trading-day
predictors can be useful:
z1 = # Mondays in month;
z2 = # Tuesdays in month;
. . .
z7 = # Sundays in month.
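A base-R sketch of computing these counts (the date range is hypothetical):

# One row per month, one column per weekday name
dates <- seq(as.Date("2000-01-01"), as.Date("2004-12-31"), by = "day")
counts <- table(format(dates, "%Y-%m"), weekdays(dates))
counts[1:3, ]   # number of Mondays, Tuesdays, ... in the first 3 months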
Distributed lags
Lagged values of a predictor.
Example: x is advertising, which has a delayed effect:
x1 = advertising for previous month;
x2 = advertising for two months previously;
. . .
xm = advertising for m months previously.
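A sketch with two lags of a (hypothetical) advertising series ads as predictors of sales:

# lm() drops the leading NA rows automatically (na.action = na.omit)
ads.lag1 <- c(NA, head(ads, -1))       # advertising one month earlier
ads.lag2 <- c(NA, NA, head(ads, -2))   # advertising two months earlier
fit <- lm(sales ~ ads + ads.lag1 + ads.lag2)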
Nonlinear trend
Piecewise linear trend with bend at time τ:
x1,t = t
x2,t = 0 if t < τ, and (t − τ) if t ≥ τ
Nonlinear trend
τ is the “knot” or point in time at which the line should bend.
By fitting a piecewise linear trend which bends at some point
in time, the nonlinear trend can be constructed via a series of
linear pieces.
If the associated coefficients of x1,t and x2,t are β1 and β2, then:
- β1 = slope of trend before time τ .
- β1 + β2 = slope of trend after time τ .
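A sketch of fitting this piecewise trend with tslm() (the knot position τ = 50 is hypothetical):

library(forecast)
t <- seq_along(y)
tau <- 50                 # hypothetical knot
x1 <- t                   # slope β1 before the knot
x2 <- pmax(0, t - tau)    # adds β2 to the slope after the knot
fit <- tslm(y ~ x1 + x2)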