Model Specification and Diagnostics
MAS 640 – Time Series Analysis and Forecasting
1/30/2018
Outline
- Chapter 6 – Model Specification
  - ACF/PACF/EACF for determining p, d, q
- Chapter 7 – Parameter Estimation
  - We will skip this
- Chapter 8 – Model Diagnostics
  - Does the model fit well?
  - Which model fits better?
Strategy
1. Plot the time series
2. Determine appropriate transformation for nonconstant variance
3. Difference or detrend as needed
4. Plot ACF, PACF, and EACF to help identify appropriate orders p and q
Strategy
1. Plot the time series, looking for
   - trends
   - seasonality
   - nonconstant variance
   - outliers
   - abrupt changes
Strategy
2. Determine an appropriate transformation if nonconstant variance exists
   - Box-Cox procedure (see the sketch below)
   - Or try several transformations and stick with the best
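A minimal sketch of the Box-Cox step, assuming the TSA package and a positive-valued series y (a hypothetical name):

library(TSA)
bc <- BoxCox.ar(y)   # plots the log-likelihood over a grid of lambda values
bc$ci                # 95% confidence interval for lambda
# lambda near 0 suggests a log transform; lambda near 1 suggests no transform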
Strategy
3. Difference the data as needed to remove trends
   - You will likely notice the trend in the time series plot
   - You can also plot the ACF of the original series; for a trending series it will decay very slowly
   - A first difference is usually sufficient; occasionally a second-order difference is needed (sketch below)
   - Rarely, if ever, will you need to difference more than twice
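A small sketch of the differencing check, assuming a trending series y (hypothetical):

acf(y)                # for a nonstationary series this decays very slowly
dy <- diff(y)         # first difference; diff(y, differences = 2) for a second
plot(dy, type = "l")  # the trend should be gone
acf(dy)               # should now die out quickly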
Strategy
4. Plot the sample ACF, PACF, and EACF of the stationary series to determine p and q (see the sketch below)
   - This can be the original, transformed, differenced, or transformed-and-differenced series (whichever one is stationary)
   - p and q are generally never very high, say ≤ 4 (excluding seasonal models, which we haven’t covered yet)
     - Principle of Parsimony
   - Use knowledge of the theoretical patterns in the ACF, PACF, and EACF to guide the selection of p and q
     - The ACF cuts off after lag q for an MA(q)
     - The PACF cuts off after lag p for an AR(p)
     - A wedge of zeros with its upper-left vertex at (p, q) in the EACF table identifies an ARMA(p, q) model
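A minimal sketch of step 4, assuming the TSA package and a stationary series dy (hypothetical):

library(TSA)
acf(dy)    # MA(q): cuts off after lag q
pacf(dy)   # AR(p): cuts off after lag p
eacf(dy)   # ARMA(p, q): find the upper-left vertex of the wedge of "o"s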
Identifying Model Order
- “The art of model selection is very much like the method of an FBI agent’s criminal search. Most criminals disguise themselves to avoid being recognized.”
- It is similar for the ACF, PACF, and EACF: sampling variation can disguise the theoretical patterns, making it difficult to clearly ascertain p and q
- At the end of these steps. . .
  - you will probably have multiple candidate models
  - it is rare to end up with one definitive model in practice
Next Steps (Chapter 8)
- Fit the candidate models
- Compare them
- Diagnose assumptions
- Determine which seems most appropriate
- After selecting the “best” model, forecasting becomes the central focus (Chapter 9)
Supreme Court Data
- The supreme court dataset gives the acceptance rate of cases appealed to the Supreme Court during 1926-2004.
- Convert it to a stationary series, then identify p and q (see the sketch below)
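A sketch of those steps in R, assuming the TSA package and the series stored as sc:

library(TSA)
plot(sc, type = "o")   # inspect level and spread over time
BoxCox.ar(sc)          # if the 95% CI for lambda covers 0, a log transform is reasonable
plot(log(sc))          # check that the variance has stabilized
plot(diff(log(sc)))    # check that the trend is gone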
Supreme Court Data
[Figures: the original series (cases, 1926-2004); the Box-Cox log-likelihood curve over λ with its 95% confidence interval; the log-transformed series; and the log-transformed and differenced series]
Supreme Court Data
[Figures: sample ACF and PACF of diff(log(sc))]
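These panels can be reproduced with the sample ACF and PACF functions:

acf(diff(log(sc)))   # an ACF that cuts off after lag 1 would point to an MA(1)
pacf(diff(log(sc)))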
Supreme Court Data
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 x o o o o o o x o x x o o o
1 x x o o o o o x o o x o o o
2 x x o o o o o o o o o o o o
3 x x x o o o o o o o o o o o
4 x o o o o o o o o o o o o o
5 x o x o o o o o o o o o o o
6 x x x o x x o o o o o o o o
7 o o x x o o x o o o o o o o
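The table above is the output of TSA’s eacf() applied to the stationary series:

library(TSA)
eacf(diff(log(sc)))  # the upper-left "o" at AR = 0, MA = 1 supports an MA(1)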
Supreme Court Data
- Based on everything, probably start with an MA(1) model
- Why?
arima(log(sc), order=c(0, 1, 1))$coef
ma1
-0.3556319
Supreme Court Data – MA(1) Model
After fitting an MA(1) model, we obtain the fitted equation

∇log(Y_t) = e_t − 0.356 e_{t−1}

- As with linear regression, we study the residuals to determine the appropriateness of the model
- In time series, we want the residuals to look like white noise
  - Normally distributed
  - Independent
  - Constant variance (though this should have been taken care of in previous steps. . . )
Diagnostic Tools
- Normality
  - Histogram of residuals
  - Normal QQ plot of residuals
  - Shapiro-Wilk test
- Independence
  - Runs test
  - Plot of the residual ACF
  - Ljung-Box test
Normality
scMA1 <- arima(log(sc), order=c(0, 1, 1))
hist(rstandard(scMA1))          # histogram
qqnorm(rstandard(scMA1))        # QQ plot
qqline(rstandard(scMA1))
shapiro.test(rstandard(scMA1))  # Shapiro-Wilk test

Histogram and QQ Plot
[Figures: histogram of rstandard(scMA1) and normal Q-Q plot of the standardized residuals]

Shapiro-Wilk Test
- Ho: The standardized residuals are normally distributed
- Ha: The standardized residuals are not normally distributed

shapiro.test(rstandard(scMA1))

Shapiro-Wilk normality test
data: rstandard(scMA1)
W = 0.98632, p-value = 0.5639

Independence
runs(rstandard(scMA1))   # runs test
acf(rstandard(scMA1))    # residual ACF
Box.test(residuals(scMA1), lag=k, type='Ljung-Box', fitdf=1)  # test for correlation through lag k
tsdiag(scMA1)            # produces the relevant plots

Runs Test
- The runs test
  - Ho: The standardized residuals are independent
  - Ha: The standardized residuals are not independent

runs(rstandard(scMA1))

$pvalue
[1] 0.864

$observed.runs
[1] 37

$expected.runs
[1] 38.21519

$n1
[1] 49

$n2
[1] 30

$k
[1] 0

Residual ACF
- Under the white noise assumption, the sample autocorrelations approximately follow N(0, σ² = 1/n)
- Plot them; if many fall outside ±1.96/√n, then independence appears to be violated

[Figure: ACF of residuals(scMA1)]

Ljung-Box Test
- In addition to examining the lag-k autocorrelations individually, we can assess their magnitude as a group
- For example, most of the ACFs may be moderate individually but look excessive taken as a group
- The Ljung-Box test was developed for this scenario
  - H0: The ARMA(p, q) model is appropriate
  - Ha: The ARMA(p, q) model is not appropriate

Ljung-Box Test
# tsdiag(scMA1) # runs the test for multiple lags; we want most p-values high
Box.test(residuals(scMA1), lag=10, type='Ljung-Box', fitdf=1)

Box-Ljung test
data: residuals(scMA1)
X-squared = 20.484, df = 9, p-value = 0.01515

Overfitting
- In addition to performing a thorough residual analysis, overfitting is a useful diagnostic technique to further assess the validity of an assumed model.
- Overfitting involves fitting a model more complicated than the one currently being considered and
  - examining the significance of the additional terms
  - examining the change in estimates from the assumed model

Overfitting
- When overfitting an ARMA(p, q) model, we consider the following two models:
  1. ARMA(p + 1, q)
  2. ARMA(p, q + 1)

Overfitting
Suppose the current model is an MA(1), i.e., an ARMA(0, 1)
- Fit an ARMA(0, 2)
  - If the additional MA term is insignificant → evidence that the more complicated model is not needed
  - If the additional MA term is significant → evidence that the MA(2) is worth considering
  - Rerun the residual analysis and see whether things improved or worsened
- Fit an ARMA(1, 1)
  - Same as above. . .
- You can continue this process (add q, add p, diagnose) until you reach a suitable model that fits well (see the sketch after the Comparing Models slide)

Model Diagnostics with sarima()
- Diagnostic plots are provided in the output (sarima() is from the astsa package)
- Residual analysis as before
  1. Standardized residuals should look random (white noise)
  2. ACF of residuals should remain within the boundary lines
  3. Normal Q-Q plot
  4. p-values for the Ljung-Box statistic
     - Tests the “randomness” of the residuals
     - Most points should be above the boundary line
     - If many are under it, start “overfitting” and hope for better results

Comparing Models
- Several models may look reasonable
- If you overfit the supreme court data, you will find multiple ARIMA models that fit well
- It is common to fit several and accept the model with the lowest AIC or BIC value
  - AIC: Akaike Information Criterion
  - BIC: Bayesian Information Criterion
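A hedged sketch of the overfit-and-compare loop for the supreme court series; the fit names are illustrative, and only functions already used above (plus AIC/BIC from base R) appear:

fit011 <- arima(log(sc), order = c(0, 1, 1))  # the assumed IMA(1,1) model
fit012 <- arima(log(sc), order = c(0, 1, 2))  # overfit: one extra MA term
fit111 <- arima(log(sc), order = c(1, 1, 1))  # overfit: one extra AR term

# rough significance check: estimate / standard error is roughly N(0, 1)
fit012$coef / sqrt(diag(fit012$var.coef))

# smaller AIC/BIC is better
AIC(fit011); AIC(fit012); AIC(fit111)
BIC(fit011); BIC(fit012); BIC(fit111)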
Summary
- We now know how to
  - specify a model (ACF/PACF/EACF for p, d, q)
  - diagnose how well it fits (are the residuals white noise?)
    - Normality
    - Independence
  - compare models
    - overfit and diagnose
    - compare AIC or BIC
- You are now well versed in modeling time series data in the ARIMA(p, d, q) family

Summary
- No model is perfect!
- Model specification and selection is not always clear cut
  - Lots of trial and error
  - Not a “black box” exercise
- In the end, we want a simple model that fits the data well
- With this model, we focus on forecasting