ECON7350: Applied Econometrics for Macroeconomics and Finance
Tutorial 5: Trends and Cycles
At the end of this tutorial you should be able to:
• construct an adequate set of ADF specifications for unit root testing; • carry out ADF tests for a unit root and interpret the results;
Copyright By PowCoder代写 加微信 powcoder
• construct an adequate set of general ARIMA(p, d, q) models.
Problems with Solutions
The specification for a general ARIMA(p, d, q) model is
∆dyt =δt +aj∆dyt−j +bjεt +εt,
where δt is a general deterministic term.
• If the process has no deterministic terms, then δt = 0. • If the process includes a constant only, then δt = a0. • Ifthereisaconstantandatrend,thenδt =a0 +δt.
The file usdata.csv contains 209 observations on:
• yt ≡ log real per capita GDP (GDP); and
• rt ≡ the overnight Federal Funds Rate for the US (FFR). 1. For yt:
(a) Plot the observed time series and comment on potential trends.
Solution For this tutorial, we load the following useful packages.
Next, load the data and plot log real US GDP per capita. Clearly, per capita GDP levels are increasing in the US over the sample period, so we suspect a trend exists in the data generating process.
library(forecast)
library(dplyr)
library(zoo)
library(aTSA)
mydata <- read.delim("usdata.csv", header = TRUE, sep = ",")
dates <- as.yearqtr(mydata$obs)
y <- mydata$GDP
r <- mydata$FFR
plot(dates, y, type = "l", xlab = "Time (Quarters)",
main = "Log Real US GDP per Capita")
Log Real US GDP per Capita
1960 1970 1980 1990 2000
Time (Quarters)
(b) Construct an adequate set of ADF regression models.
Solution ADF regression are specified by lag length p along with the inclusion / exclusion of a constant and trend. Note that a trend is only included if a constant is included (i.e., we do not consider specifications with a trend, but no constant). With that in mind, we implement the familiar approach.
TT <- length(y)
ADF_est <- list()
ic <- matrix( nrow = 30, ncol = 5 )
colnames(ic) <- c("cons", "trend", "p", "aic", "bic") i <- 0
for (const in 0:1)
for (p in 0:9) {
i <- i + 1
ADF_est[[i]] <- Arima(diff(y), xreg = y[-TT],
order = c(p, 0, 0),
include.mean = as.logical(const),
include.drift = F)
ic[i,] <- c(const, 0, p, ADF_est[[i]]$aic,
ADF_est[[i]]$bic)
if (const) {
# only add a specification with trend if there is a # constant (i.e., exclude no constant with trend) for (p in 0:9)
i <- i + 1
ADF_est[[i]] <- Arima(diff(y), xreg = y[-TT],
order = c(p, 0, 0),
include.mean = as.logical(const),
include.drift = T)
ic[i,] <- c(const, 1, p, ADF_est[[i]]$aic,
ADF_est[[i]]$bic)
ic_aic <- ic[order(ic[,4]),][1:10,]
ic_bic <- ic[order(ic[,5]),][1:10,]
print(ic_aic)
## [10,] 1
print(ic_bic)
cons trend p aic 12 -1384.766 11 -1383.057 13 -1382.777 14 -1380.779 15 -1380.217 01 -1378.683 02 -1378.681 16 -1378.575 03 -1377.156 17 -1376.638
cons trend p aic 11 -1383.057 01 -1378.683 12 -1384.766 02 -1378.681 13 -1382.777
## [6,] 1
## [7,] 0
## [8,] 1
## [9,] 0
## [10,] 1
0 3 -1377.156 -1357.130
0 2 -1367.494 -1354.144
1 4 -1380.779 -1354.079
0 1 -1362.530 -1352.518
0 4 -1375.519 -1352.156
The AIC prefers specifications with both a constant and trend as well as lag lengths in the range p = 1, . . . , 5. The BIC ranking includes specifications with both a constant and trend as well as lags p = 1, . . . , 3. However, it also includes specifications with a constant only and lags p = 1, . . . , 2. Putting this information together, we select the top five specifications preferred by the BIC.
Finally, we check the residuals for all specifications in the adequate set.
adq_set <- as.matrix(arrange(as.data.frame(ic_bic[1:5,]),
const, trend, p))
adq_idx <- match(data.frame(t(adq_set[, 1:3])),
data.frame(t(ic[, 1:3])))
for (i in 1:length(adq_idx)) {
checkresiduals(ADF_est[[adq_idx[i]]])
Residuals from Regression with ARIMA(1,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
## Ljung-Box test
## data: Residuals from Regression with ARIMA(1,0,0) errors
## Q* = 8.4811, df = 7, p-value = 0.2921
## Model df: 3. Total lags used: 10
Residuals from Regression with ARIMA(2,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
## Ljung-Box test
## data: Residuals from Regression with ARIMA(2,0,0) errors
## Q* = 6.822, df = 6, p-value = 0.3376
## Model df: 4. Total lags used: 10
Residuals from Regression with ARIMA(1,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(1,0,0) errors
## Q* = 7.7825, df = 6, p-value = 0.2545
## Model df: 4. Total lags used: 10
Residuals from Regression with ARIMA(2,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(2,0,0) errors
## Q* = 5.4569, df = 5, p-value = 0.3627
## Model df: 5. Total lags used: 10
Residuals from Regression with ARIMA(3,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(3,0,0) errors
## Q* = 5.466, df = 4, p-value = 0.2427
## Model df: 6. Total lags used: 10
As no obvious problems jump out from the residuals analysis, we proceed with the adequate set constructed above.
(c) Implement the ADF test for a stochastic trend and draw inference regarding the integration properties of yt.
Solution Use the adf.test function from the aTSA package. adf.test(y)
## Augmented Dickey-
## alternative: stationary
## Type 1: no drift no trend
## lag ADF p.value
## [1,] 0 -10.33 0.01
## [2,] 1 -5.31 0.01
## [3,] 2 -4.05 0.01
## [4,] 3 -3.79 0.01
## [5,] 4 -3.54 0.01
## Type 2: with drift no trend
## lag ADF p.value
## Type 3: with drift and trend
## Note: in fact, p.value = 0.01 means p.value <= 0.01
Since only “Type 2” and “Type 3” specifications are in our adequate set, we ignore the output related to “Type 1” specifications.Consequently, for all specifications in our adequate set, the null of unit root cannot be rejected. We conclude that the process is not empirically distinguishable from an integrated process with a drift
(and possibly linear growth).
(d) Repeat parts (a)-(c) for the differenced series ∆yt.
Solution The procedure is nearly identical to that used in constructing an adequate set for y, but replacing it with ∆y instead. In R, this is done by simply replacing y with diff(y).
lag ADF p.value
TT <- length(diff(y))
ADF_est_diff <- list()
ic_diff <- matrix( nrow = 30, ncol = 5 )
colnames(ic_diff) <- c("cons", "trend", "p", "aic", "bic") i <- 0
for (const in 0:1)
for (p in 0:9) {
i <- i + 1
ADF_est_diff[[i]] <- Arima(diff(diff(y)),
xreg = diff(y)[-TT],
order = c(p, 0, 0),
include.mean = as.logical(const),
include.drift = F)
ic_diff[i,] <- c(const, 0, p, ADF_est_diff[[i]]$aic,
ADF_est_diff[[i]]$bic)
if (const) {
# only add a specification with trend if there is a # constant (i.e., exclude no constant with trend) for (p in 0:9)
i <- i + 1
ADF_est_diff[[i]] <- Arima(diff(diff(y)),
xreg = diff(y)[-TT],
order = c(p, 0, 0),
include.mean = as.logical(const),
include.drift = T)
ic_diff[i,] <- c(const, 1, p, ADF_est_diff[[i]]$aic,
ADF_est_diff[[i]]$bic)
ic_aic_diff <- ic_diff[order(ic_diff[,4]),][1:10,]
ic_bic_diff <- ic_diff[order(ic_diff[,5]),][1:10,]
print(ic_aic_diff)
## [10,] 1
cons trend p aic 01 -1374.441 00 -1373.402 11 -1372.788 02 -1372.466 10 -1371.969 12 -1370.838 03 -1370.571 05 -1369.103 13 -1368.922 04 -1368.730
print(ic_bic_diff)
## [10,] 1
cons trend p
00 -1373.402
01 -1374.441 10 -1371.969 11 -1372.788 02 -1372.466 12 -1370.838 03 -1370.571 13 -1368.922 04 -1368.730 05 -1369.103
Note that in this case the top five specifications ranked by the AIC is the same as the top five ranked by the BIC. Since they agree, we chose these five to form the adequate set: p = 0, 1, 2 with constant and no trend along with p = 0, 1 with both a constant and trend.
Finally, we check the residuals for all specifications in the adequate set.
adq_set_diff <- as.matrix(arrange(as.data.frame(
ic_bic_diff[1:5,]), const, trend, p))
adq_idx_diff <- match(data.frame(t(adq_set_diff[, 1:3])),
data.frame(t(ic_diff[, 1:3])))
for (i in 1:length(adq_idx)) {
checkresiduals(ADF_est[[adq_idx[i]]])
Residuals from Regression with ARIMA(1,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
## Ljung-Box test
## data: Residuals from Regression with ARIMA(1,0,0) errors
## Q* = 8.4811, df = 7, p-value = 0.2921
## Model df: 3. Total lags used: 10
Residuals from Regression with ARIMA(2,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
## Ljung-Box test
## data: Residuals from Regression with ARIMA(2,0,0) errors
## Q* = 6.822, df = 6, p-value = 0.3376
## Model df: 4. Total lags used: 10
Residuals from Regression with ARIMA(1,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(1,0,0) errors
## Q* = 7.7825, df = 6, p-value = 0.2545
## Model df: 4. Total lags used: 10
Residuals from Regression with ARIMA(2,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(2,0,0) errors
## Q* = 5.4569, df = 5, p-value = 0.3627
## Model df: 5. Total lags used: 10
Residuals from Regression with ARIMA(3,0,0) errors
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from Regression with ARIMA(3,0,0) errors
## Q* = 5.466, df = 4, p-value = 0.2427
## Model df: 6. Total lags used: 10
As no obvious problems jump out from the residuals analysis, we proceed with ADF test using specifications in the adequate set.
adf.test(diff(y))
## Augmented Dickey-
## alternative: stationary
## Type 1: no drift no trend
## Type 2: with drift no trend
## lag ADF p.value
## [1,] 0 -10.58 0.01
lag ADF p.value
0 -7.13 0.01
1 -5.05 0.01
2 -4.30 0.01
3 -3.73 0.01
4 -3.42 0.01
## [2,] 1 -7.88 0.01
## [3,] 2 -7.21 0.01
## [4,] 3 -6.67 0.01
## [5,] 4 -6.66 0.01
## Type 3: with drift and trend
## lag ADF p.value
## [1,] 0 -10.60 0.01
## [2,] 1 -7.88 0.01
## [3,] 2 -7.21 0.01
## [4,] 3 -6.67 0.01
## [5,] 4 -6.68 0.01
## Note: in fact, p.value = 0.01 means p.value <= 0.01
A unit root is rejected at very small significance level for all specifications. Hence, the differenced process is empirically distinguishable from an integrated process.
(e) Interpret the overall findings in parts (c) and (d).
Solution Since {yt} is not empirically distinguishable from an integrated process, but {∆yt} is, we conclude that {yt} is not empirically distinguishable from an I(1) process. Remember, however, that we did not find evidence of {yt} being I(1) exactly!
(f) Construct an adequate set of ARIMA(p, d, q) models using information criteria and residuals analysis.
Solution We will consider models for p = 0,..,3; q = 0,...,3; d = 0,1 and either with or without constant and/or trend terms. There are 96 models to estimate altogether. We use the Arima function in a nested for loop to automate the estimation. There are two caveats to deal with.
• The Arima function with d = 1 will only specify an intercept when setting the include.drift = T option. Since we want to incude a linear growth term (i.e. t as a regressor) in the differenced specification, we need to pass it as an exogenous variable to Arima through the xreg option. However, with d = 1, Arima will also difference whatever data we pass this way, so we need to cummulatively sum t before passing it.
• Some specifications will be so bad that MLE will run into numerical problems and return an error. We want to ignore these specifications in the least disruptive way possible. The way to do it is to embed Arima as an argument to the try function with the silent = T option. This in general a very useful programming technique when automating a large number of steps that may potentially cause errors.
TT <- length(y)
ARIMA_est <- list()
ic_arima <- matrix( nrow = (2 * 2 + 2) * 4 ˆ 2, ncol = 7 )
colnames(ic_arima) <- c("d", "cons", "trend", "p", "q", "aic",
for (d in 0:1) {
for (const in 0:1) {
for (p in 0:3) {
for (q in 0:3) {
i <- i + 1
d1 <- as.logical(d)
c1 <- as.logical(const)
try(silent = T, expr =
ARIMA_est[[i]] <- Arima(y, order = c(p, d, q),
include.constant = c1)
ic_arima[i,] <- c(d, const, 0, p, q,
ARIMA_est[[i]]$aic,
ARIMA_est[[i]]$bic)
if (const) {
# only add a specification with trend if there is a # constant (i.e., exclude no constant with trend)
i <- i + 1
x <- c(0,cumsum(1:(TT - 1)))
try(silent = T, expr =
ARIMA_est[[i]] <- Arima(y, order = c(p, d, q),
include.constant = c1,
include.drift = T)
ic_arima[i,] <- c(d, const, 1, p, q,
ARIMA_est[[i]]$aic,
ARIMA_est[[i]]$bic)
ic_aic_arima <- ic_arima[order(ic_arima[,6]),][1:10,]
ic_bic_arima <- ic_arima[order(ic_arima[,7]),][1:10,]
print(ic_aic_arima)
## [10,] 0
120 -1384.489 020 -1379.434 130 -1385.796 002 -1379.046
d cons trend p q 1 130 1 121 1 120 1 112 1 122 1 131 1 133 1 131 1 113 1 123
print(ic_bic_arima)
## [1,] 1 1
## [2,] 0 1
## [3,] 1 1
## [4,] 0 1
## [5,] 1 1
d cons trend p q
010 -1379.386
## [6,] 1 1
## [7,] 1 1
## [8,] 0 1
## [9,] 1 1
## [10,] 0 1
0 1 1 -1378.786 -1365.436
0 0 1 -1375.166 -1365.153
1 2 1 -1385.080 -1365.026
1 1 0 -1378.280 -1364.930
1 1 2 -1383.951 -1363.897
The AIC generally prefers models with d = 0, while the BIC top 10 includes a more varied mix of integrated and non-integrated ARMAs. It also seems helpful in this case to compute the intersecting set of the top 10 AIC and top 10 BIC ranked specifications.
ic_int_arima <- intersect(as.data.frame(ic_aic_arima),
as.data.frame(ic_bic_arima))
print(ic_int_arima)
d cons trend p q 130 121 120 112
We observe that the intersection contains only specifications in levels (i.e. d = 0). However, given that a number of integrated specifications are in the top 10 ranked by the BIC as well as our inference that {yt} is not empirically distinguishable from an I(1) process, it is worth taking a closer look to see if any specifications in differences (i.e., d = 1) are worth considering.
We make note of the fact that ARIMA(1, 1, 0) and ARIMA(2, 1, 0)—both with a constant only—are in the top three of the BIC ranking. In light of this and the above considerations, we will add ARIMA(1, 1, 0) and ARIMA(2, 1, 0) to the four models in the intersecting set.
Finally, we check the residuals for every model in the adequate set.
adq_set_arima <- as.matrix(arrange(as.data.frame(
rbind(ic_int_arima,
ic_bic_arima[c(1, 3),])),
d, const, trend, p))
adq_idx_arima <- match(data.frame(t(adq_set_arima[, 1:5])),
data.frame(t(ic_arima[, 1:5])))
for (i in 1:length(adq_idx_arima)) {
checkresiduals(ARIMA_est[[adq_idx_arima[i]]])
Residuals from ARIMA(1,0,2) with drift
0 50 100 150 200
0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from ARIMA(1,0,2) with drift
## Q* = 6.4524, df = 5, p-value = 0.2647
## Model df: 5. Total lags used: 10
Residuals from ARIMA(2,0,1) with drift
0 50 100 150 200
0.15 0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from ARIMA(2,0,1) with drift
## Q* = 5.3012, df = 5, p-value = 0.3802
## Model df: 5. Total lags used: 10
Residuals from ARIMA(2,0,0) with drift
0 50 100 150 200
0.15 0.10 0.05 0.00
−0.05 −0.10
0 5 10 15 20
−0.02 0.00 0.02
## Ljung-Box test
## data: Residuals from ARIMA(2,0,0) with drift
## Q* = 7.0155, df = 6, p-value = 0.3194
## Model df: 4. Total lags used: 10
Residuals from ARIMA(3,0,0) with drift
0 50 100 150 200
0.15 0.10 0.05 0.00
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com