ETW3420: Principles of Forecasting and Applications

Topic 7: Regression Models

1 The linear model with time series

2 Residual diagnostics

3 Some useful predictors for linear models

4 Selecting predictors and forecast evaluation

5 Forecasting with regression

6 Matrix formulation

7 Correlation, causation and forecasting

Multiple regression and forecasting

yt = β0 + β1x1,t + β2x2,t + · · · + βkxk,t + εt.

yt is the variable we want to predict: the “response” variable.
Each xj,t is numerical and is called a “predictor”. The predictors are usually assumed to be known for all past and future time periods.
The coefficients β1, . . . , βk measure the effect of each predictor after taking account of the effects of all other predictors in the model.

That is, the coefficients measure the marginal effects.
εt is a white noise error term.
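As a quick illustration of this marginal-effect interpretation (a simulated sketch, not part of the slides; all variable names here are made up):

# Simulated example: each coefficient estimates the effect of its
# predictor after accounting for the other predictor.
set.seed(1)
x1 <- rnorm(200)
x2 <- 0.5 * x1 + rnorm(200)   # correlated predictors
y  <- 2 + 1.5 * x1 - 0.8 * x2 + rnorm(200)
fit <- lm(y ~ x1 + x2)
coef(fit)                     # estimates should be near (2, 1.5, -0.8)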

Example: US consumption expenditure

autoplot(uschange[,c("Consumption","Income")]) +
  ylab("% change") + xlab("Year")

[Figure: quarterly percentage changes in US consumption and income, 1970–2010]

Example: US consumption expenditure

[Figure: scatterplot of quarterly % change in US consumption against quarterly % change in income]
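The scatterplot was presumably produced along these lines (a sketch based on standard fpp2/ggplot2 usage; the axis labels and the fitted line are assumptions):

library(fpp2)  # provides uschange and loads ggplot2
uschange %>%
  as.data.frame() %>%
  ggplot(aes(x = Income, y = Consumption)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # fitted regression line
  xlab("Income (quarterly % change)") +
  ylab("Consumption (quarterly % change)")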

Example: US consumption expenditure

tslm(Consumption ~ Income, data=uschange) %>% summary

## tslm(formula = Consumption ~ Income, data = uschange)
## Residuals:
## Min 1Q Median 3Q Max
## -2.40845 -0.31816 0.02558 0.29978 1.45157
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.54510    0.05569   9.789  < 2e-16 ***
## Income       0.28060    0.04744   5.915 1.58e-08 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.6026 on 185 degrees of freedom
## Multiple R-squared: 0.159, Adjusted R-squared: 0.1545
## F-statistic: 34.98 on 1 and 185 DF, p-value: 1.577e-08

Example: US consumption expenditure

[Figure: data and fitted simple regression over time, 1970–2010]

Example: US consumption expenditure

[Figure: scatterplot matrix of Consumption, Income, Production, Savings and Unemployment]

Example: US consumption expenditure

fit.consMR <- tslm(
  Consumption ~ Income + Production + Unemployment + Savings,
  data=uschange)
summary(fit.consMR)

## tslm(formula = Consumption ~ Income + Production + Unemployment +
##     Savings, data = uschange)
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.88296 -0.17638 -0.03679  0.15251  1.20553
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.26729 0.03721 7.184 1.68e-11 ***
## Income       0.71449    0.04219  16.934  < 2e-16 ***
## Production   0.04589    0.02588   1.773   0.0778 .
## Unemployment -0.20477   0.10550  -1.941   0.0538 .
## Savings      -0.04527   0.00278 -16.287  < 2e-16 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.3286 on 182 degrees of freedom
## Multiple R-squared: 0.754, Adjusted R-squared: 0.7486
## F-statistic: 139.5 on 4 and 182 DF, p-value: < 2.2e-16

Example: US consumption expenditure

[Figure: actual and fitted percentage change in US consumption expenditure, 1970–2010]

Example: US consumption expenditure

[Figure: percentage change in US consumption expenditure against fitted (predicted) values]

Example: US consumption expenditure

[Figure: residuals from the linear regression model, 1970–2010, with ACF and histogram]

1 The linear model with time series

2 Residual diagnostics

3 Some useful predictors for linear models

4 Selecting predictors and forecast evaluation

5 Forecasting with regression

6 Matrix formulation

7 Correlation, causation and forecasting

Multiple regression and forecasting

For forecasting purposes, we require the following assumptions:

εt are uncorrelated and have zero mean
εt are uncorrelated with each xj,t.

It is useful to also have εt ∼ N(0, σ²) when producing prediction intervals or doing statistical tests.

Residual plots

Useful for spotting outliers and whether the linear model was appropriate.

Scatterplot of residuals εt against each predictor xj,t.
Scatterplot of residuals against the fitted values ŷt.

Expect to see scatterplots resembling a horizontal band, with no values too far from the band and no patterns such as curvature or increasing spread.

Residual patterns

If a plot of the residuals vs any predictor in the model shows a pattern, then the relationship is nonlinear.
If a plot of the residuals vs any predictor not in the model shows a pattern, then the predictor should be added to the model.
If a plot of the residuals vs fitted values shows a pattern, then there is heteroscedasticity in the errors. (Could try a transformation.)

Breusch-Godfrey test

OLS regression: yt = β0 + β1xt,1 + · · · + βkxt,k + ut
Auxiliary regression: ût = β0 + β1xt,1 + · · · + βkxt,k + ρ1ût−1 + · · · + ρpût−p + εt

If the R² statistic is calculated for this auxiliary model, then (T − p)R² ∼ χ² with p degrees of freedom, when there is no serial correlation up to lag p, where T = the length of the series.

The Breusch-Godfrey test is better than the Ljung-Box test for regression models.

US consumption again

## Breusch-Godfrey test for serial correlation of order up to 8
## data: Residuals from Linear regression model
## LM test = 14.874, df = 8, p-value = 0.06163

If the model fails the Breusch-Godfrey test . . .

The forecasts are not wrong, but have higher variance than they need to.
There is information in the residuals that we should exploit.
This is done with a regression model with ARMA errors.

1 The linear model with time series

2 Residual diagnostics

3 Some useful predictors for linear models

4 Selecting predictors and forecast evaluation

5 Forecasting with regression

6 Matrix formulation

7 Correlation, causation and forecasting

Linear trend

xt = t, for t = 1, 2, . . . , T.

Strong assumption that the trend will continue.
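A minimal sketch of fitting a linear trend with tslm(); here y stands for any ts object (it is not defined in the slides) and the horizon h = 10 is arbitrary:

fit.trend <- tslm(y ~ trend)        # 'trend' is generated by tslm()
fc <- forecast(fit.trend, h = 10)   # forecasts extrapolate the fitted trend
autoplot(fc)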
Dummy variables

If a categorical variable takes only two values (e.g., ‘Yes’ or ‘No’), then an equivalent numerical variable can be constructed taking value 1 if yes and 0 if no. This is called a dummy variable.

Dummy variables

If there are more than two categories, then the variable can be coded using several dummy variables (one fewer than the total number of categories).

Beware of the dummy variable trap!

Using one dummy for each category gives too many dummy variables! The regression will then be singular and inestimable.
Either omit the constant, or omit the dummy for one category.
The coefficients of the dummies are relative to the omitted category.

Uses of dummy variables

Seasonal dummies
For quarterly data: use 3 dummies
For monthly data: use 11 dummies
For daily data: use 6 dummies
What to do with weekly data?

Outliers
If there is an outlier, you can use a dummy variable (taking value 1 for that observation and 0 elsewhere) to remove its effect.

Public holidays
For daily data: if it is a public holiday, dummy=1, otherwise dummy=0.

Beer production revisited

[Figure: Australian quarterly beer production, 1995–2010]

Beer production revisited

Regression model

yt = β0 + β1t + β2d2,t + β3d3,t + β4d4,t + εt

where di,t = 1 if t is quarter i and 0 otherwise.

Beer production revisited

fit.beer <- tslm(beer ~ trend + season)
summary(fit.beer)

## tslm(formula = beer ~ trend + season)
## Residuals:
##     Min      1Q  Median      3Q     Max
## -42.903  -7.599  -0.459   7.991  21.789
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 441.80044    3.73353 118.333  < 2e-16 ***
## trend        -0.34027    0.06657  -5.111 2.73e-06 ***
## season2     -34.65973    3.96832  -8.734 9.10e-13 ***
## season3     -17.82164    4.02249  -4.430 3.45e-05 ***
## season4      72.79641    4.02305  18.095  < 2e-16 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared: 0.9243, Adjusted R-squared: 0.9199
## F-statistic: 210.7 on 4 and 69 DF, p-value: < 2.2e-16

Beer production revisited

[Figure: quarterly beer production, 1995–2010]

Beer production revisited

[Figure: actual versus predicted values of quarterly beer production]

Beer production revisited

[Figure: residuals from the linear regression model, 1995–2010]

Beer production revisited

[Figure: forecasts from the linear regression model, 1995–2010]

Fourier series

Periodic seasonality can be handled using pairs of Fourier terms:

sk(t) = sin(2πkt/m),  ck(t) = cos(2πkt/m),

where m is the seasonal period.

yt = a + bt + Σ [αk sk(t) + βk ck(t)] + εt,  summing over k = 1, . . . , K.

Every periodic function can be approximated by sums of sin and cos terms for large enough K.
Choose K by minimizing the AICc.
Called “harmonic regression”:

fit <- tslm(y ~ trend + fourier(y, K))

Harmonic regression: beer production

fourier.beer <- tslm(beer ~ trend + fourier(beer, K=2))
summary(fourier.beer)

## tslm(formula = beer ~ trend + fourier(beer, K = 2))
## Residuals:
##     Min      1Q  Median      3Q     Max
## -42.903  -7.599  -0.459   7.991  21.789
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)              446.87920    2.87321 155.533  < 2e-16 ***
## trend                     -0.34027    0.06657  -5.111 2.73e-06 ***
## fourier(beer, K = 2)S1-4   8.91082    2.01125   4.430 3.45e-05 ***
## fourier(beer, K = 2)C1-4  53.72807    2.01125  26.714  < 2e-16 ***
## fourier(beer, K = 2)C2-4  13.98958    1.42256   9.834 9.26e-15 ***
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared: 0.9243, Adjusted R-squared: 0.9199
## F-statistic: 210.7 on 4 and 69 DF, p-value: < 2.2e-16

Fourier series

With Fourier terms, we often need fewer predictors than with dummy variables, especially when m is large. This makes them useful for weekly data, for example, where m ≈ 52.
For short seasonal periods (e.g., quarterly data), there is little advantage in using Fourier terms over seasonal dummy variables.

Intervention variables

Spikes
Equivalent to a dummy variable for handling an outlier; accounts for an effect which lasts for only one period.

Steps
Variable takes value 0 before the intervention and 1 afterwards.

Change of slope
Variables take values 0 before the intervention and values {1, 2, 3, . . . } afterwards. (See piecewise linear trend.)

For monthly data

Christmas: always in December, so part of the monthly seasonal effect.
Easter: use a dummy variable vt = 1 if any part of Easter is in that month, vt = 0 otherwise.
Ramadan and Chinese New Year are similar.

Trading days

With monthly data, if the observations vary depending on how many different types of days are in the month, then trading day predictors can be useful.

z1 = # Mondays in month;
z2 = # Tuesdays in month;
. . .
z7 = # Sundays in month.

Distributed lags

Lagged values of a predictor. Example: x is advertising, which has a delayed effect:

x1 = advertising for previous month;
x2 = advertising for two months previously;
. . .
xm = advertising for m months previously.

Nonlinear trend

Piecewise linear trend with bend at τ:

x1,t = t
x2,t = 0 if t < τ, and (t − τ) if t ≥ τ.

Nonlinear trend

τ is the “knot” or point in time at which the line should bend.
By fitting a piecewise linear trend which bends at some point in time, the nonlinear trend can be constructed via a series of linear pieces.
If the associated coefficients of x1,t and x2,t are β1 and β2, then:
- β1 = slope of trend before time τ.
- β1 + β2 = slope of trend after time τ.
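A sketch of fitting such a piecewise trend with tslm(); the series y and the knot tau are hypothetical, and the construction mirrors the x1,t and x2,t defined above:

t   <- time(y)              # 'y' is any ts object (assumed)
tau <- 2005                 # hypothetical knot
x2  <- pmax(0, t - tau)     # 0 before tau, (t - tau) from tau onwards
fit.pw <- tslm(y ~ t + x2)  # slope is beta1 before tau, beta1 + beta2 after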