Journal of Economic Perspectives—Volume 15, Number 4—Fall 2001—Pages 157–168
GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics
The great workhorse of applied econometrics is the least squares model. This is a natural choice, because applied econometricians are typically called upon to determine how much one variable will change in response to a change in some other variable. Increasingly, however, econometricians are being asked to forecast and analyze the size of the errors of the model. In this case, the questions are about volatility, and the standard tools have become the ARCH/GARCH models.
The basic version of the least squares model assumes that the expected value of all error terms, when squared, is the same at any given point. This assumption is called homoskedasticity, and it is this assumption that is the focus of ARCH/GARCH models. Data in which the variances of the error terms are not equal, in which the error terms may reasonably be expected to be larger for some points or ranges of the data than for others, are said to suffer from heteroskedasticity. The standard warning is that in the presence of heteroskedasticity, the regression coefficients for an ordinary least squares regression are still unbiased, but the standard errors and confidence intervals estimated by conventional procedures will be too narrow, giving a false sense of precision. Instead of considering this as a problem to be corrected, ARCH and GARCH models treat heteroskedasticity as a variance to be modeled. As a result, not only are the deficiencies of least squares corrected, but a prediction is computed for the variance of each error term. This prediction turns out often to be of interest, particularly in applications in finance.
The warnings about heteroskedasticity have usually been applied only to cross-section models, not to time series models. For example, if one looked at the
Robert Engle is the Michael Armellino Professor of Finance, Stern School of Business, New York University, New York, New York, and Chancellor’s Associates Professor of Economics, University of California at San Diego, La Jolla, California.
cross-section relationship between income and consumption in household data, one might expect to find that the consumption of low-income households is more closely tied to income than that of high-income households, because the dollars of savings or deficit by poor households are likely to be much smaller in absolute value than those of high-income households. In a cross-section regression of household consumption on income, the error terms seem likely to be systematically larger in absolute value for high-income than for low-income households, and the assumption of homoskedasticity seems implausible. In contrast, if one looked at an aggregate time series consumption function, comparing national income to consumption, it seems more plausible to assume that the variance of the error terms doesn’t change much over time.
A recent development in estimation of standard errors, known as “robust standard errors,” has also reduced the concern over heteroskedasticity. If the sample size is large, then robust standard errors give quite a good estimate of standard errors even with heteroskedasticity. If the sample is small, the need for a heteroskedasticity correction that does not affect the coefficients, and only asymptotically corrects the standard errors, can be debated.
However, sometimes the natural question facing the applied econometrician is the accuracy of the predictions of the model. In this case, the key issue is the variance of the error terms and what makes them large. This question often arises in financial applications where the dependent variable is the return on an asset or portfolio and the variance of the return represents the risk level of those returns. These are time series applications, but it is nonetheless likely that heteroskedasticity is an issue. Even a cursory look at financial data suggests that some time periods are riskier than others; that is, the expected value of the magnitude of error terms at some times is greater than at others. Moreover, these risky times are not scattered randomly across quarterly or annual data. Instead, there is a degree of autocorrelation in the riskiness of financial returns. Financial analysts, looking at plots of daily returns such as in Figure 1, notice that the amplitude of the returns varies over time and describe this as “volatility clustering.” The ARCH and GARCH models, which stand for autoregressive conditional heteroskedasticity and generalized autoregressive conditional heteroskedasticity, are designed to deal with just this set of issues. They have become widespread tools for dealing with time series heteroskedastic models. The goal of such models is to provide a volatility measure—like a standard deviation—that can be used in financial decisions concerning risk analysis, portfolio selection and derivative pricing.
ARCH/GARCH Models
Because this paper will focus on financial applications, we will use financial notation. Let the dependent variable be labeled r_t, which could be the return on an asset or portfolio. The mean value m_t and the variance h_t will be defined relative to a past information set. Then, the return r_t in the present will be equal to the mean
Figure 1: Nasdaq, Dow Jones and Bond Returns
value of r_t (that is, the expected value of r_t based on past information) plus the standard deviation of r_t (that is, the square root of the variance) times the error term for the present period.
The econometric challenge is to specify how the information is used to forecast the mean and variance of the return, conditional on the past information. While many specifications have been considered for the mean return and have been used in efforts to forecast future returns, virtually no methods were available for the variance before the introduction of ARCH models. The primary descriptive tool was the rolling standard deviation. This is the standard deviation calculated using a fixed number of the most recent observations. For example, this could be calculated every day using the most recent month (22 business days) of data. It is convenient to think of this formulation as the first ARCH model; it assumes that the variance of tomorrow’s return is an equally weighted average of the squared residuals from the last 22 days. The assumption of equal weights seems unattractive, as one would think that the more recent events would be more relevant and therefore should have higher weights. Furthermore, the assumption of zero weights for observations more than one month old is also unattractive. The ARCH model proposed by Engle (1982) let these weights be parameters to be estimated. Thus, the model allowed the data to determine the best weights to use in forecasting the variance.
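As a rough illustration (not part of the original article), the rolling standard deviation described above can be sketched in a few lines of Python. The simulated returns and the zero-mean simplification are assumptions made purely for the example; real data would replace them.

```python
import random

random.seed(0)
# Illustrative daily returns (in percent); real data would replace these.
returns = [random.gauss(0, 1) for _ in range(500)]

# Rolling variance: an equally weighted average of the last 22 squared
# returns (treating the mean return as zero for simplicity).  The value
# at position t is the "forecast" of day t's variance from days t-22..t-1.
window = 22
rolling_var = [sum(r * r for r in returns[t - window:t]) / window
               for t in range(window, len(returns))]
```

The ARCH model replaces the equal weights 1/22 in this average with estimated parameters, and GARCH lets the weights decline smoothly rather than drop to zero after one month.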
A useful generalization of this model is the GARCH parameterization intro- duced by Bollerslev (1986). This model is also a weighted average of past squared residuals, but it has declining weights that never go completely to zero. It gives parsimonious models that are easy to estimate and, even in its simplest form, has proven surprisingly successful in predicting conditional variances. The most widely used GARCH specification asserts that the best predictor of the variance in the next period is a weighted average of the long-run average variance, the variance
predicted for this period, and the new information in this period that is captured by the most recent squared residual. Such an updating rule is a simple description of adaptive or learning behavior and can be thought of as Bayesian updating.
Consider the trader who knows that the long-run average daily standard deviation of the Standard and Poor’s 500 is 1 percent, that the forecast he made yesterday was 2 percent and the unexpected return observed today is 3 percent. Obviously, this is a high volatility period, and today is especially volatile, which suggests that the forecast for tomorrow could be even higher. However, the fact that the long-term average is only 1 percent might lead the forecaster to lower the forecast. The best strategy depends upon the dependence between days. If these three numbers are each squared and weighted equally, then the new forecast would be 2.16 = √[(1 + 4 + 9)/3]. However, rather than weighting these equally, it is generally found for daily data that weights such as those in the empirical example of (.02, .9, .08) are much more accurate. Hence the forecast is 2.08 = √(.02·1 + .9·4 + .08·9).
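The trader's arithmetic above can be checked directly; this brief Python snippet simply reproduces the two weighted averages of the squared ingredients:

```python
import math

long_run_sd = 1.0    # long-run average daily standard deviation (percent)
prev_forecast = 2.0  # yesterday's volatility forecast (percent)
surprise = 3.0       # today's unexpected return (percent)

# Equal weights on the three squared ingredients:
equal = math.sqrt((long_run_sd ** 2 + prev_forecast ** 2 + surprise ** 2) / 3)

# The unequal weights (.02, .9, .08) from the empirical example:
weighted = math.sqrt(0.02 * long_run_sd ** 2
                     + 0.90 * prev_forecast ** 2
                     + 0.08 * surprise ** 2)

print(round(equal, 2), round(weighted, 2))  # 2.16 2.08
```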
To be precise, we can use h_t to define the variance of the residuals of a regression r_t = m_t + √(h_t) ε_t. In this definition, the variance of ε_t is one. The GARCH model for variance looks like this:
h_{t+1} = ω + α(r_t − m_t)² + β h_t = ω + α h_t ε_t² + β h_t.
The econometrician must estimate the constants ω, α and β; updating simply requires knowing the previous forecast h and residual. The weights are (1 − α − β, β, α), and the long-run average variance is ω/(1 − α − β). It should be noted that this only works if α + β < 1, and it only really makes sense if the weights are positive, requiring α ≥ 0, β ≥ 0 and ω > 0.
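As a minimal sketch (with hypothetical parameter values, not estimates from the article), the updating rule and the long-run variance can be written as:

```python
import math

def garch_update(omega, alpha, beta, prev_h, prev_resid):
    """One step of the GARCH variance recursion:
    h_{t+1} = omega + alpha * (r_t - m_t)**2 + beta * h_t."""
    return omega + alpha * prev_resid ** 2 + beta * prev_h

# Hypothetical parameter values; alpha + beta < 1 gives mean reversion.
omega, alpha, beta = 1.4e-6, 0.08, 0.90
long_run_var = omega / (1 - alpha - beta)  # unconditional variance

# If the current forecast and residual both sit at the long-run level,
# the update returns the long-run variance again (a fixed point).
steady = garch_update(omega, alpha, beta, long_run_var,
                      math.sqrt(long_run_var))
```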
The GARCH model that has been described is typically called the GARCH(1,1) model. The (1,1) in parentheses is a standard notation in which the first number refers to how many autoregressive lags, or ARCH terms, appear in the equation, while the second number refers to how many moving average lags are specified, which here is often called the number of GARCH terms. Sometimes models with more than one lag are needed to find good variance forecasts.
Although this model is directly set up to forecast for just one period, it turns out that based on the one-period forecast, a two-period forecast can be made. Ultimately, by repeating this step, long-horizon forecasts can be constructed. For the GARCH(1,1), the two-step forecast is a little closer to the long-run average variance than is the one-step forecast, and, ultimately, the distant-horizon forecast is the same for all time periods as long as α + β < 1. This is just the unconditional variance. Thus, the GARCH models are mean reverting and conditionally heteroskedastic, but have a constant unconditional variance.
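A short sketch of the multi-step logic (again with hypothetical parameter values): iterating the one-step rule gives forecasts that decay geometrically toward the unconditional variance at rate α + β.

```python
def garch_forecast(omega, alpha, beta, h_next, horizon):
    """k-step-ahead GARCH(1,1) variance forecasts.

    E[h_{t+k}] = sigma2 + (alpha + beta)**(k - 1) * (h_next - sigma2),
    where sigma2 = omega / (1 - alpha - beta) is the unconditional
    variance; each forecast is pulled toward sigma2."""
    sigma2 = omega / (1 - alpha - beta)
    return [sigma2 + (alpha + beta) ** (k - 1) * (h_next - sigma2)
            for k in range(1, horizon + 1)]

# Hypothetical parameters; a current forecast well above the long-run
# level decays toward it over a trading year of 250 days.
path = garch_forecast(1.4e-6, 0.08, 0.90, h_next=4e-4, horizon=250)
```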
I turn now to the question of how the econometrician can possibly estimate an equation like the GARCH(1,1) when the only variable on which there are data is r_t. The simple answer is to use maximum likelihood by substituting h_t for σ² in the normal likelihood and then maximizing with respect to the parameters. An even
simpler answer is to use software such as EViews, SAS, GAUSS, TSP, Matlab, RATS and many others, which already include packaged programs to do this.
But the process is not really mysterious. For any set of parameters ω, α, β and a starting estimate for the variance of the first observation, which is often taken to be the observed variance of the residuals, it is easy to calculate the variance forecast for the second observation. The GARCH updating formula takes the weighted average of the unconditional variance, the squared residual for the first observation and the starting variance and estimates the variance of the second observation. This is input into the forecast of the third variance, and so forth. Eventually, an entire time series of variance forecasts is constructed. Ideally, this series is large when the residuals are large and small when they are small. The likelihood function provides a systematic way to adjust the parameters ω, α, β to give the best fit.
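The recursion plus likelihood evaluation can be sketched as follows. This is a simplified illustration that assumes a zero mean return; it is not the article's code, and in practice the function would be handed to a numerical optimizer (for example, scipy.optimize.minimize) to find the parameters.

```python
import math

def garch_neg_loglik(params, returns):
    """Negative Gaussian log-likelihood for a zero-mean GARCH(1,1).

    params = (omega, alpha, beta).  The starting variance is taken to
    be the sample variance of the returns, as described in the text."""
    omega, alpha, beta = params
    h = sum(r * r for r in returns) / len(returns)  # starting variance
    neg_ll = 0.0
    for r in returns:
        # Gaussian density contribution given the current variance h.
        neg_ll += 0.5 * (math.log(2 * math.pi * h) + r * r / h)
        # GARCH updating formula: feed this residual into the next variance.
        h = omega + alpha * r * r + beta * h
    return neg_ll

value = garch_neg_loglik((1e-6, 0.05, 0.90), [0.01, -0.02, 0.015])
```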
Of course, it is entirely possible that the true variance process is different from the one specified by the econometrician. In order to detect this, a variety of diagnostic tests are available. The simplest is to construct the series of {ε_t}, which are supposed to have constant mean and variance if the model is correctly specified. Various tests, such as tests for autocorrelation in the squares, are able to detect model failures. Often a Ljung-Box test with 15 lagged autocorrelations is used.
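For concreteness, a bare-bones version of the Ljung-Box statistic (applied in this setting to the squared standardized residuals) might look like the following sketch; the series used here is purely illustrative.

```python
def ljung_box(series, lags=15):
    """Ljung-Box Q statistic: Q = n(n+2) * sum_k rho_k**2 / (n - k).

    Under the null of no autocorrelation, Q is approximately
    chi-squared with `lags` degrees of freedom."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    total = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, lags + 1):
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / total
        q += rho * rho / (n - k)
    return n * (n + 2) * q

# A strongly (negatively) autocorrelated series produces a huge Q:
q_alternating = ljung_box([(-1) ** i for i in range(100)])
```

Large values of Q relative to the chi-squared critical value reject the hypothesis of constant conditional variance.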
A Value-at-Risk Example
Applications of the ARCH/GARCH approach are widespread in situations where the volatility of returns is a central issue. Many banks and other financial institutions use the concept of “value at risk” as a way to measure the risks faced by their portfolios. The 1 percent value at risk is defined as the number of dollars that one can be 99 percent certain exceeds any losses for the next day. Statisticians call this a 1 percent quantile, because 1 percent of the outcomes are worse and 99 percent are better. Let’s use the GARCH(1,1) tools to estimate the 1 percent value at risk of a $1,000,000 portfolio on March 23, 2000. This portfolio consists of 50 percent Nasdaq, 30 percent Dow Jones and 20 percent long bonds. The long bond is a ten-year constant maturity Treasury bond.1 This date is chosen to be just before the big market slide at the end of March and April. It is a time of high volatility and great anxiety.
First, we construct the hypothetical historical portfolio. (All calculations in this example were done with the EViews software program.) Figure 1 shows the pattern of returns of the Nasdaq, Dow Jones, bonds and the composite portfolio leading up to the terminal date. Each of these series appears to show the signs of ARCH effects in that the amplitude of the returns varies over time. In the case of the equities, it is clear that this has increased substantially in the latter part of the sample period. Visually, Nasdaq is even more extreme. In Table 1, we present some illustrative
1 The portfolio has constant proportions of wealth in each asset, which entails some rebalancing over time.
Table 1
Portfolio Data

             NQ        DJ        Rate      Portfolio
Std. Dev.    0.0115    0.0090    0.0073    0.0083
Skewness     −0.5310   −0.3593   −0.2031   −0.4738
Kurtosis     7.4936    8.3288    4.9579    7.0026

Sample: March 23, 1990 to March 23, 2000.
statistics for each of these three investments separately and for the portfolio as a whole in the final column. From the daily standard deviation, we see that the Nasdaq is the most volatile and interest rates the least volatile of the assets. The portfolio is less volatile than either of the equity series even though it is 80 percent equity—yet another illustration of the benefits of diversification. All the assets show evidence of fat tails, since the kurtosis exceeds 3, which is the normal value, and evidence of negative skewness, which means that the left tail is particularly extreme.
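The fat-tail and skewness diagnostics cited above are easy to compute; the following sketch (using simulated data as a stand-in for the portfolio returns) shows the standard sample formulas, where a normal distribution has skewness 0 and kurtosis 3.

```python
import random

def skewness_kurtosis(x):
    """Sample skewness and (non-excess) kurtosis of a series."""
    n = len(x)
    m = sum(x) / n
    dev = [v - m for v in x]
    var = sum(d * d for d in dev) / n
    skew = sum(d ** 3 for d in dev) / (n * var ** 1.5)
    kurt = sum(d ** 4 for d in dev) / (n * var ** 2)
    return skew, kurt

# For a large normal sample, the estimates sit near (0, 3); the fat-tailed
# asset returns in Table 1 instead show kurtosis of 5 to 8.
random.seed(0)
skew, kurt = skewness_kurtosis([random.gauss(0, 1) for _ in range(20000)])
```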
The portfolio shows substantial evidence of ARCH effects as judged by the autocorrelations of the squared residuals in Table 2. The first order autocorrelation is .210, and they gradually decline to .083 after 15 lags. These autocorrelations are not large, but they are very significant. They are also all positive, which is uncommon in most economic time series and yet is an implication of the GARCH(1,1) model. Standard software allows a test of the hypothesis that there is no autocorrelation (and hence no ARCH). The test p-values shown in the last column are all zero to four places, resoundingly rejecting the “no ARCH” hypothesis.
Then we forecast the standard deviation of the portfolio and its 1 percent quantile. We carry out this calculation over several different time frames: the entire ten years of the sample up to March 23, 2000; the year before March 23, 2000; and from January 1, 2000, to March 23, 2000.
Consider first the quantiles of the historical portfolio at these three different time horizons. To do this calculation, one simply sorts the returns and finds the 1 percent worst case. Over the full ten-year sample, the 1 percent quantile times $1,000,000 produces a value at risk of $22,477. Over the last year, the calculation produces a value at risk of $24,653—somewhat higher, but not enormously so. However, if the 1 percent quantile is calculated based on the data from January 1, 2000, to March 23, 2000, the value at risk is $35,159. Thus, the level of risk apparently has increased dramatically over the last quarter of the sample. Each of these numbers is the appropriate value at risk if the next day is equally likely to be the same as the days in the given sample period. This assumption is more likely to be true for the shorter period than for the long one.
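The historical-quantile calculation above amounts to sorting the returns and reading off the 1 percent worst case. A minimal sketch (the quantile-index convention and the illustrative return series are assumptions; real software may interpolate between order statistics):

```python
def historical_var(returns, portfolio_value, level=0.01):
    """Historical-simulation value at risk: sort the returns and take
    the loss at the worst `level` fraction of outcomes."""
    ordered = sorted(returns)
    idx = max(0, int(level * len(ordered)) - 1)
    return -ordered[idx] * portfolio_value

# Illustrative returns: 100 evenly spaced values from -5% to +4.9%,
# so the 1 percent worst case is the -5% day.
returns = [i / 1000 for i in range(-50, 50)]
var_1pct = historical_var(returns, 1_000_000)
```

Applied to the three sample windows in the text, the same sort-and-pick operation yields the $22,477, $24,653 and $35,159 figures.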
The basic GARCH(1,1) results are given in Table 3. Beneath the table are listed the dependent variable, PORT, and the sample period, a note that the algorithm took 16 iterations to maximize the likelihood function and a note that standard errors were computed using the robust method of Bollerslev-Wooldridge. The three coefficients in the variance equation are listed as C, the intercept; ARCH(1), the first lag of the squared return; and GARCH(1), the first lag of the conditional variance. Notice that the coefficients sum up to a number less than one, which is required to have a mean reverting variance process. Since the sum is very close to one, this process only mean reverts slowly. Standard errors, Z-statistics (which are the ratios of the coefficients to their standard errors) and p-values complete the table.

Table 2
Autocorrelations of Squared Portfolio Returns

Lag   AC      Q-Stat   Prob
 1    0.210   115.07   0.000
 2    0.183   202.64   0.000
 3    0.116   237.59   0.000
 4    0.082   255.13   0.000
 5    0.122   294.11   0.000
 6    0.163   363.85   0.000
 7    0.090   384.95   0.000
 8    0.099   410.77   0.000
 9    0.081   427.88   0.000
10    0.081   445.03   0.000
11    0.069   457.68   0.000
12    0.080   474.29   0.000
13    0.076   489.42   0.000
14    0.074   503.99   0.000
15    0.083   521.98   0.000

Sample: March 23, 1990 to March 23, 2000.

Table 3
GARCH(1,1)

Variance Equation
            Coef       St. Err    Z-Stat     Prob
C           1.40E-06   4.48E-07   3.1210     0.0018
ARCH(1)     0.0772     0.0179     4.3046     0.0000
GARCH(1)    0.9046     0.0196     46.1474    0.0000

Notes: Dependent Variable: PORT. Sample (adjusted): March 23, 1990 to March 23, 2000. Convergence achieved after 16 iterations. Bollerslev-Wooldridge robust standard errors and covariance.
The standardized residuals are examined for autocorrelation in Table 4. Clearly, the autocorrelation is dramatically reduced from that observed in the portfolio returns themselves. Applying the same test for autocorrelation, we now
Table 4
Autocorrelations of Squared Standardized Residuals

Lag   AC      Q-Stat   Prob
 1    0.005   0.0589   0.808
 2    0.039   4.0240   0.134
 3    0.011   4.3367   0.227
 4    0.017   5.0981   0.277
 5    0.002   5.1046   0.403
 6    0.009   5.3228   0.503
 7    0.015   5.8836   0.553
 8    0.013   6.3272   0.611
 9    0.024   7.8169   0.553
10    0.006   7.9043   0.638
11    0.023   9.3163   0.593
12    0.013   9.7897   0.634
13    0.003   9.8110   0.709
14    0.009   10.038   0.759
15    0.012   10.444   0.791
find the p-values are about 0.5 or more, indicating that we can accept the hypothesis of “no residual ARCH.”
The forecast standard deviation for the next day is 0.0146, which is almost double the average standard deviation of 0.0083 presented in the last column of Table 1. If the residuals were normally distributed, then this would be multiplied by 2.327, because 1 percent of a normal random variable lies 2.327 standard deviations below the mean. The estimated normal value at risk is $33,977. As it turns out, the standardized residuals, which are the estimated values of {ε_t}, are not very close to a normal distribution. They have a 1 percent quantile of −2.844, which reflects the fat tails of the asset price distribution. Based on the actual distribution, the estimated 1 percent value at risk is $39,996. Notice how much this value at risk has risen to reflect the increased risk in 2000.
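The normal-case calculation above is just a multiplication, sketched below with the figures quoted in the text; the small discrepancy from the reported $33,977 comes from rounding the volatility forecast to 0.0146.

```python
portfolio_value = 1_000_000
forecast_sd = 0.0146   # one-day-ahead GARCH standard deviation forecast

# Under normality, 1 percent of returns lie 2.327 sd below the mean.
z_1pct = 2.327
normal_var = portfolio_value * forecast_sd * z_1pct
```

Substituting the fat-tailed empirical quantile of the standardized residuals in place of the normal quantile gives the larger $39,996 figure reported in the text.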
Finally, the value at risk can be computed based solely on estimation of the quantile of the forecast distribution. This has recently been proposed by Engle and Manganelli (2001), adapting the quantil