5 Efficient market hypothesis
The efficient market hypothesis (EMH) is a fundamental yet controversial theory in financial economics. Roughly speaking, a financial market is said to be “efficient” if its prices always “fully reflect” available information. Consequently, stock prices should be “unpredictable”. In this section, we give a brief overview of the EMH and consider some statistical tests of predictability of stock prices based on past information. Our presentation here follows Section 1.5 and Chapter 2 of Campbell, Lo and MacKinlay (1997), as well as Chapter 3 of Linton (2019). Slides by Linton can be accessed at https://obl20.com/research/books/.
5.1 The EMH and the random walk model
Eugene Fama, a pioneer of the theory of market efficiency who received the Nobel Prize in Economics in 2013, gave (in 1970) the following definition: “a market in which prices always ‘fully reflect’ available information is said to be ‘efficient’.” A more explicit definition is provided by Malkiel (1992):
A capital market is said to be efficient if it fully and correctly reflects all relevant information in determining security prices. Formally, the market is said to be efficient with respect to some information set … if security prices would be unaffected by revealing that information to all participants. Moreover, efficiency with respect to an information set … implies that it is impossible to make economic profits by trading on the basis of [that information set].
This definition makes more explicit the role of information in market efficiency (already implicit in the earlier works by Fama and others). In the literature, there are three main forms of market efficiency depending on the choice of the information set:
Weak form: Information from historical prices is fully reflected in current prices.
Semi-strong form: All publicly available information (such as past prices, annual reports, earnings forecasts) is fully reflected in current prices.
Strong form: All private and public information is fully reflected in current prices.
Clearly, the semi-strong form is stronger than (i.e., implies) the weak form, and the strong form is stronger than the semi-strong form.
The above formulation of the EMH is still rather vague; for example, the meaning of “fully reflected” has to be specified. Statistically, this means that in order to test the efficient market hypothesis, a model for efficient market prices has to be specified. If the test rejects on some data set, it may be because the market is inefficient or because the model of “normal returns” is incorrect. This is known as the joint hypothesis problem. Furthermore, since in practice there is a cost to acquire and process information, it is unlikely that information is “instantly incorporated” in market prices. Thus, the market is unlikely to be efficient in the strictest sense of the word. It may be more meaningful and practically useful to consider the notion of relative efficiency. For example, market A is more efficient than market B (according to some measure), or a market has become more efficient over time.
A relatively uncontroversial prediction of the efficient market hypothesis is that prices should be “unpredictable”. If they were predictable, market participants would be able to use this information to profit in the market. One form of “unpredictability” is that successive price changes or returns should be statistically uncorrelated. To motivate this statement we provide an argument first given in the economic literature by Samuelson (1965). Let Pt be the price of some security at time t. Suppose that Pt is given by the “rational expectation” by market participants of some “fundamental value” V ∗, conditional on the information It available at time t. Then, we have
Pt = E[V ∗|It].
This representation implies that Pt is a martingale: by the tower property of conditional expectation, we have
E[Pt+1|It] = E[E[V ∗|It+1]|It] = E[V ∗|It] = Pt.
Iterating, we see that for all s ≥ t one has E[Ps|It] = Pt.
Proposition 5.1. Let (Pt) be a (square integrable) martingale with respect to the filtration (It). Let ∆t = Pt − Pt−1 be the increment over [t−1, t]. Then Cov(∆t, ∆t−1) = 0.
Proof. Note that
E[∆t] = E[Pt − Pt−1]
= E[E[Pt − Pt−1|It−1]]
= E[Pt−1 − Pt−1] = 0.
It follows that
Cov(∆t, ∆t−1) = E[∆t∆t−1]
= E[(Pt − Pt−1)(Pt−1 − Pt−2)]
= E[E[(Pt − Pt−1)(Pt−1 − Pt−2)|It−1]]
= E[(Pt−1 − Pt−2)E[Pt − Pt−1|It−1]]
= E[(Pt−1 − Pt−2) · 0] = 0.
Exercise 5.2. In the context of Proposition 5.1, show that Cov(∆t, ∆t−h) = 0 for any h ≥ 1.
If the different increments are uncorrelated, then any linear forecasting rule based on these increments is ineffective.
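Proposition 5.1 does not require the increments to be independent. The following is a minimal simulation sketch, not part of the original notes: it builds increments from an ARCH(1)-type recursion, so that the conditional mean is zero (the cumulative sum is a martingale) while the conditional variance depends on the past. The parameter values `a0` and `a1`, the seed and the helper names are illustrative assumptions.

```python
import random


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a list of numbers."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def simulate_arch_increments(n, a0=0.7, a1=0.3, seed=0):
    """Increments d_t = sigma_t * z_t with sigma_t^2 = a0 + a1 * d_{t-1}^2.

    The z_t are i.i.d. N(0, 1), so E[d_t | past] = 0 and the cumulative
    sum P_t = d_1 + ... + d_t is a martingale, although the d_t are
    not independent of each other.
    """
    rng = random.Random(seed)
    increments, prev = [], 0.0
    for _ in range(n):
        sigma = (a0 + a1 * prev ** 2) ** 0.5
        prev = sigma * rng.gauss(0.0, 1.0)
        increments.append(prev)
    return increments


increments = simulate_arch_increments(50_000)
rho_levels = lag1_autocorr(increments)
rho_squares = lag1_autocorr([v * v for v in increments])
print(f"lag-1 autocorrelation of increments:         {rho_levels:.3f}")
print(f"lag-1 autocorrelation of squared increments: {rho_squares:.3f}")
```

In such a simulation the first number should be close to zero, in line with the proposition, while the second is clearly positive: the increments are uncorrelated but dependent, so linear rules fail even though the volatility is partly predictable.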
The random walk model captures the absence of predictability, and may be considered independently of the efficient market hypothesis. Let Xt be either the price Pt or the log price log Pt of an asset. The random walk model states that
Xt = μ + Xt−1 + εt, (5.1)
where the drift term μ may be nonzero, and (εt) is a shock or innovation process with E[εt] = 0.
To specify more completely the random walk model (5.1), we need assumptions on the innovation process (εt). Various conditions have been considered in the literature. For example, Linton (2019) considered the following assumptions:
rw1 εt are i.i.d.
rw2 εt are independent over time.
rw3 εt is a martingale difference sequence in the sense that E[εt |εt−1 , εt−2 , . . .] = 0 for each t.
rw4 For all k, t we have E[εt|εt−k] = 0.
rw5 For all k, t we have Cov(εt, εt−k) = 0.
Exercise 5.3. Show that if E[εt²] ≤ C < ∞ for all t, then rw1 ⇒ rw2 ⇒ rw3 ⇒ rw4 ⇒ rw5.
5.2 Evidence of linear weak form predictability
We consider the random walk model (5.1) for the log price Xt = log Pt. A theoretical advantage of using the log price is that the price Pt stays positive regardless of the distribution of the innovation εt. If Pt itself follows a random walk model and the distribution of εt is supported on the whole real line, then the price may drop below zero. In Section 4.2 we discussed estimation of the autocorrelation function (ACF) and related statistical tests when the innovations are i.i.d. (this corresponds to rw1). Although it is unlikely that asset returns are truly i.i.d., tests under rw1 still provide useful insights into the behaviour of asset returns. It is possible to consider statistical tests under weaker conditions, but a detailed exposition is beyond the scope of this course.
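The remark above on positivity can be illustrated with a small simulation. This is a sketch under assumed parameter values (`sigma = 0.05`, `p0 = 1`, 100 paths of 2000 steps, all hypothetical): the same Gaussian shocks drive a random walk in the price itself and a random walk in the log price, and we record the minimum price along each path.

```python
import math
import random


def min_prices(n_steps, mu=0.0, sigma=0.05, p0=1.0, seed=0):
    """Return the minimum price along two paths driven by the same shocks.

    Path A: the price itself follows the random walk (5.1).
    Path B: the log price follows (5.1), and the price is exp(log price).
    """
    rng = random.Random(seed)
    price_a, log_price_b = p0, math.log(p0)
    min_a, min_b = p0, p0
    for _ in range(n_steps):
        eps = rng.gauss(0.0, sigma)
        price_a = mu + price_a + eps
        log_price_b = mu + log_price_b + eps
        min_a = min(min_a, price_a)
        min_b = min(min_b, math.exp(log_price_b))
    return min_a, min_b


results = [min_prices(2000, seed=s) for s in range(100)]
n_negative_a = sum(1 for min_a, _ in results if min_a < 0)
n_nonpositive_b = sum(1 for _, min_b in results if min_b <= 0)
print(f"paths with a negative price (random walk in price):      {n_negative_a} of 100")
print(f"paths with a non-positive price (random walk in log price): {n_nonpositive_b} of 100")
```

With these settings a sizeable share of the price-level paths become negative at some point, while the exponentiated log-price paths never do.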
We consider whether linear combinations of past (log) returns are helpful in predicting future prices (this is related to the weak form of the efficient market hypothesis). In Section 4.2.1 we remarked that while asset returns may exhibit statistically significant autocorrelations, the correlation coefficients are typically small in magnitude. Following Section 3.3.3 of Linton (2019), we examine the log returns of the S&P500 index. We consider data from Jan 1960 to Dec 2020 and divide it into six subperiods of roughly one decade each, starting in 1960, 1970, ..., 2010.
Figure 5.1: ACF of daily log returns for S&P500. (Six panels, one per subperiod: 1960-01-01 to 1970-12-31, 1970-01-01 to 1980-12-31, 1980-01-01 to 1990-12-31, 1990-01-01 to 2000-12-31, 2000-01-01 to 2010-12-31 and 2010-01-01 to 2020-12-31. Horizontal axis: lag, 0 to 35; vertical axis: ACF.)
In Figure 5.1 we show the sample ACFs for individual decades using daily log returns. The p-values of the Ljung-Box Q(10) statistic are all smaller than 0.00001, so there is statistically significant autocorrelation in each decade. However, the pattern of the ACF does not appear to be stationary over time, and it is questionable whether the autocorrelation can be consistently exploited by some trading strategy.

Figure 5.2: ACF of weekly log returns for S&P500. (Six panels, one per subperiod from 1960-01-01 to 1970-12-31 through 2010-01-01 to 2020-12-31. Horizontal axis: lag, 0 to 25; vertical axis: ACF.)
In Figures 5.2 and 5.3 we show the sample ACFs for weekly and monthly log returns. Essentially we see the same pattern. Also, the p-values of the Q(10) statistic are larger at lower frequencies, so the evidence for (linear) predictability is weaker at lower frequencies. In particular, for the monthly data, other than the first decade, all the p-values are larger than 0.25.
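For readers who want to reproduce this kind of test, the following is a minimal sketch of the Ljung-Box statistic, Q(m) = T(T + 2) Σ_{k=1}^{m} ρ̂_k² / (T − k), which under rw1 is approximately chi-square with m degrees of freedom. It is applied here to simulated data, not to the S&P500 series used in the figures; the AR(1) coefficient 0.3, the sample size 2500 and the seed are illustrative assumptions. To stay within the standard library, the statistic is compared with the 5% critical value of the chi-square distribution with 10 degrees of freedom (18.307) instead of computing a p-value.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 1, ..., max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x, m=10):
    """Ljung-Box statistic Q(m) based on the first m sample autocorrelations."""
    n = len(x)
    rho = sample_acf(x, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


CHI2_10_CRIT_5PCT = 18.307  # 95% quantile of the chi-square distribution, 10 d.f.

rng = random.Random(42)
shocks = [rng.gauss(0.0, 0.01) for _ in range(2500)]  # i.i.d. "returns"
ar1 = [shocks[0]]                                     # autocorrelated "returns"
for e in shocks[1:]:
    ar1.append(0.3 * ar1[-1] + e)

q_iid = ljung_box_q(shocks)
q_ar1 = ljung_box_q(ar1)
print(f"Q(10), i.i.d. series: {q_iid:.1f}")
print(f"Q(10), AR(1) series:  {q_ar1:.1f} (5% critical value {CHI2_10_CRIT_5PCT})")
```

A value of Q(10) above the critical value corresponds to a p-value below 0.05. Even a modest autocorrelation is detected easily at sample sizes typical of a decade of daily data, which is consistent with the observation that statistical significance does not by itself imply economically exploitable predictability.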
Linton (2019) (Figure 3.6) also considered the autocorrelation of Dow Jones stocks in relation to their sizes, and reported that there is a slight negative relation between size and γˆ1, meaning that large stocks tend to have a less significant autocorrelation. We reconsider this question using more recent data. In Figure 5.4 we consider daily log returns of the Dow Jones stocks from Jan 2020 to Jan 2022. We report γˆ1 (the sample ACF at lag 1) and plot it against the log of the market cap (measured in billions; retrieved from Yahoo! Finance on Jan 30, 2022). We observe that the correlations are mostly negative (the average is shown by the dashed line), but contrary to Linton (2019), there does not seem to be any strong relation between autocorrelation and size in this sample. We remark here that the Dow Jones stocks are all relatively liquid. We leave it to the interested student to investigate whether large and liquid stocks are “less predictable” than small and illiquid stocks (this may be regarded as a form of relative efficiency).
Figure 5.3: ACF of monthly log returns for S&P500. (Six panels, one per subperiod from 1960-01-01 to 1970-12-31 through 2010-01-01 to 2020-12-31. Horizontal axis: lag, 0 to 20; vertical axis: ACF.)

Figure 5.4: ACF(1) of daily log returns for Dow Jones stocks. (Plot title: ACF(1) for daily log return, Jan 2020-Jan 2022. Horizontal axis: log(market_cap); vertical axis: autocorrelation.)

5.3 Variance ratio tests
Another popular family of statistics for testing the random walk model is the variance ratio test, first introduced by Lo and MacKinlay (1988).
The main idea is to consider the volatility at different frequencies. Let pt = log Pt be the log price and let rt = pt − pt−1 = log Pt − log Pt−1 be the log return over the time interval [t − 1, t]. Suppose rt is a second order stationary process. Given q > 0, define the q-period log return
rt(q) = pt+q − pt = rt+1 + · · · + rt+q.
If rw5 holds for the log returns, meaning that the log returns are pairwise uncorrelated, then we have
$$\mathrm{Var}(r_t(q)) = \mathrm{Var}(r_{t+1}) + \mathrm{Var}(r_{t+2}) + \cdots + \mathrm{Var}(r_{t+q}) = q\,\mathrm{Var}(r_t), \tag{5.2}$$
where the last equality follows from the assumption of stationarity. It follows from (5.2) that under rw5 (and second order stationarity) we have
$$\mathrm{VR}(q) = \frac{\mathrm{Var}(r_t(q))}{q\,\mathrm{Var}(r_t)} = 1. \tag{5.3}$$
We call VR(q) the variance ratio (at lag q). If the estimated variance ratio is significantly larger than or smaller than 1, we have evidence for departure from rw5.
To understand the behaviour of the variance ratio we consider some examples. First, consider the case q = 2. Assuming still that the log return is second order stationary, we have
$$\mathrm{VR}(2) = \frac{\mathrm{Var}(r_t) + \mathrm{Var}(r_t) + 2\,\mathrm{Cov}(r_{t+1}, r_{t+2})}{2\,\mathrm{Var}(r_t)} = 1 + \gamma_1,$$
where γ1 is the lag-1 autocorrelation of the log returns. So VR(2) > 1 if γ1 > 0 and VR(2) < 1 if γ1 < 0.
To give another example, suppose rt is an AR(1) process, i.e.,
rt = φrt−1 + εt,
where |φ| < 1 and (εt) is an i.i.d. white noise process. Then
$$\mathrm{VR}(q) = 1 + \frac{2\phi}{1-\phi} - \frac{2\phi\,(1-\phi^{q})}{q\,(1-\phi)^{2}}. \tag{5.4}$$
In particular, if φ > 0 then VR(q) > 1 for q ≥ 2.
Exercise 5.4. Prove (5.4).
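As a numerical sanity check (not a proof), the closed form for the AR(1) variance ratio, read here as 1 + 2φ/(1 − φ) − 2φ(1 − φ^q)/(q(1 − φ)²), can be compared with the standard identity VR(q) = 1 + 2 Σ_{k=1}^{q−1} (1 − k/q) ρ(k), where ρ(k) is the lag-k autocorrelation (equal to φ^k for a stationary AR(1) process). This identity is not stated in the text above and is brought in as an assumption; the function names are hypothetical.

```python
def vr_ar1_closed_form(phi, q):
    """Closed-form variance ratio of a stationary AR(1) process."""
    return 1 + 2 * phi / (1 - phi) - 2 * phi * (1 - phi ** q) / (q * (1 - phi) ** 2)


def vr_from_acf(rho, q):
    """VR(q) = 1 + 2 * sum_{k=1}^{q-1} (1 - k/q) * rho(k) for an ACF rho."""
    return 1 + 2 * sum((1 - k / q) * rho(k) for k in range(1, q))


for phi in (-0.5, 0.3, 0.9):
    for q in (2, 5, 20):
        closed = vr_ar1_closed_form(phi, q)
        summed = vr_from_acf(lambda k: phi ** k, q)
        print(f"phi={phi:+.1f}  q={q:2d}  closed form={closed:.6f}  sum={summed:.6f}")
```

For q = 2 both expressions reduce to 1 + φ, in agreement with the formula VR(2) = 1 + γ1.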
We consider a basic version of the variance ratio test. Suppose we observe nq + 1 (log) prices
p0, p1, …, pnq.
From this, we compute the log returns rt = pt − pt−1, t = 1, 2, …, T = nq, at the maximum frequency, and the log returns r0(q), rq(q), …, r(n−1)q(q) sampled at lag q. Define
$$\hat\mu = \frac{1}{T}\sum_{t=1}^{T} r_t, \qquad \hat\sigma_H^2 = \frac{1}{T}\sum_{t=1}^{T} (r_t - \hat\mu)^2, \qquad \hat\sigma_L^2(q) = \frac{1}{n}\sum_{t=0}^{n-1} \bigl(r_{tq}(q) - q\hat\mu\bigr)^2,$$
and finally
$$\widehat{\mathrm{VR}}(q) = \frac{\hat\sigma_L^2(q)}{q\,\hat\sigma_H^2}.$$
(Here H and L stand respectively for high and low [frequency].)
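The estimator just described can be sketched in a few lines. The sketch assumes the normalisations 1/T for the high-frequency variance and 1/n for the low-frequency variance, with non-overlapping q-period returns; the function name `variance_ratio` and the simulated random walk (drift 0.0002, volatility 0.01, seed 7) are illustrative assumptions, not taken from the notes.

```python
import random


def variance_ratio(log_prices, q):
    """Non-overlapping variance ratio estimate from n*q + 1 log prices."""
    T = len(log_prices) - 1
    if q < 1 or T < q or T % q != 0:
        raise ValueError("need n*q + 1 log prices with n >= 1")
    n = T // q
    # High-frequency (one-period) log returns r_1, ..., r_T.
    r = [log_prices[t] - log_prices[t - 1] for t in range(1, T + 1)]
    mu = sum(r) / T
    var_high = sum((v - mu) ** 2 for v in r) / T
    if var_high == 0:
        raise ValueError("one-period returns have zero sample variance")
    # Low-frequency returns r_0(q), r_q(q), ..., r_{(n-1)q}(q).
    r_q = [log_prices[(j + 1) * q] - log_prices[j * q] for j in range(n)]
    var_low = sum((v - q * mu) ** 2 for v in r_q) / n
    return var_low / (q * var_high)


# Simulated log prices following the random walk (5.1) with i.i.d. shocks.
rng = random.Random(7)
log_prices = [0.0]
for _ in range(20_000):
    log_prices.append(log_prices[-1] + 0.0002 + rng.gauss(0.0, 0.01))

vr5 = variance_ratio(log_prices, 5)
print(f"estimated VR(5) for a simulated random walk: {vr5:.3f}")
```

Two hand-checkable cases show the direction of the effect: the log prices 0, 1, 0, 1, 0 (returns alternate in sign) give an estimate of 0 at q = 2, while 0, 1, 2, 1, 0 (returns come in runs) give an estimate of 2.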
Figure 5.5: Estimated variance ratios VR(q) for the S&P500 log returns. (Four panels: 1960-01-01 to 1979-12-31, 1980-01-01 to 1999-12-31, 2000-01-01 to 2021-12-31 and 1960-01-01 to 2021-12-31. Horizontal axis: lag q, 0 to 50; vertical axis: variance ratio.)
Theorem 5.5. Suppose that rw1 holds and the lag q is fixed. Then, as T → ∞, we have
$$\sqrt{T}\,\bigl(\widehat{\mathrm{VR}}(q) - 1\bigr) \xrightarrow{\;d\;} N\bigl(0,\, 2(q-1)\bigr).$$
Proof. Omitted.
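Theorem 5.5 can be turned into a simple two-sided test. Taking the asymptotic variance to be 2(q − 1), the standardised statistic z = √T (VR(q) − 1) / √(2(q − 1)) is approximately standard normal under rw1. The helper below is a sketch with a hypothetical name (`vr_z_test`); it is only as reliable as the i.i.d. assumption rw1, which real returns are unlikely to satisfy exactly.

```python
import math


def vr_z_test(vr_hat, q, T):
    """z-statistic and two-sided normal p-value for H0: VR(q) = 1 under rw1."""
    if q < 2:
        raise ValueError("the test needs q >= 2 (the estimate at q = 1 is identically 1)")
    z = math.sqrt(T) * (vr_hat - 1.0) / math.sqrt(2.0 * (q - 1))
    # Two-sided p-value 2 * (1 - Phi(|z|)), written with the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


z, p = vr_z_test(vr_hat=1.1, q=2, T=800)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

In this example an estimated VR(2) of 1.1 from T = 800 returns gives z = 2, which is borderline significant at the 5% level.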
We illustrate the above concepts with a basic example. Again, we consider the S&P 500 index, which has a long history. Consider daily log returns of the index from Jan 1960 to Dec 2021. We divide the data into three roughly equal intervals (1960–1979, 1980–1999 and 2000–2021, as in the panels of Figure 5.5). For each interval, we compute VR(q) for q = 1, 2, . . . , 50. We also consider VR(q) over the whole period.
The results are shown in Figure 5.5. Over the entire period 1960–2021, the variance ratios tend to be smaller than 1 for all lags. Financially, this means that the increase in volatility (as the lag increases) is smaller than predicted by the random walk model. However, in the earliest subperiod, the variance ratios are all larger than 1. Taken together, we have evidence against the random walk model. See Chapter 3 of Linton (2019) for other empirical results which claim that market efficiency (measured by consistency with the random walk model) has improved over time for some assets.
References
Campbell, J. Y., Lo, A. W., & MacKinlay, A. C. (1997). The Econometrics of Financial Markets. Princeton University Press.
Linton, O. (2019). Financial Econometrics: Models and Methods. Cambridge University Press.