1 Introduction
All models are wrong, but some are useful. – George E. P. Box
This course is an introduction to some applications of statistical and computational tools in quantitative finance. Specifically, we focus on two important and closely related topics:
(i) Financial econometrics, i.e., statistical modeling of financial data and tests of financial/economic hypotheses.
(ii) Quantitative investment, i.e., the construction, backtesting, and implementation of investment strategies using quantitative methods.
Ultimately, we hope that an improved understanding of financial data can help us make better financial decisions. For simplicity, in both parts we will limit ourselves to the analysis of stock prices and market indices. Some of the ideas and techniques to be discussed can be applied to other asset classes such as commodities, foreign exchange, fixed-income securities, and derivatives. We also note that there are many non-traditional forms of data that are relevant to financial decisions. These include, for example, weather, news, and social media data.
The ideas and methods we cover in this course are classical and well established; in particular, we do not cover recent applications of big data analytics and advanced techniques such as deep learning. It is our belief that a firm grasp of the classical ideas, which are still highly relevant and applicable in today’s world, is needed to fully appreciate the strengths and limitations of the more recent techniques.
1.1 Programming
Throughout this course we use the R programming language to analyze the data and illustrate the methods. While sample code will be provided along the way, we do assume a working knowledge of R programming. Some useful resources can be found here:
https://www.rstudio.com/resources/books/
Also, the following link provides a list of R packages that are useful for computational finance:
https://cran.r-project.org/web/views/Finance.html
1.2 Example: The S&P500
1.2.1 Background
To get started, let us discuss the Standard and Poor’s 500, or simply the S&P500.¹ It is a capitalization-weighted market index of the US stock market. While the collection of constituent stocks changes from time to time, the index essentially tracks the 500 largest stocks listed on US stock exchanges, and accounts for about 80% of the total market capitalization. Thus, it is often used as an indicator of the overall performance of the US stock market and taken as a benchmark for investment strategies.
¹ Some official documentation can be found here: https://www.spglobal.com/spdji/en/indices/equity/sp-500
The market capitalization of a stock $i$ at time $t$ is defined by
$$M_{i,t} = P_{i,t} \times Q_{i,t},$$
where $P_{i,t}$ is the price of the stock and $Q_{i,t}$ is the number of outstanding shares. The S&P500 is capitalization-weighted in the following sense: if $V_t$ denotes its value at time $t$, then, excluding corporate actions (such as share buybacks and constituent changes), its value at time $t + \Delta t$ is given by
$$V_{t+\Delta t} = V_t \, \frac{\sum_i M_{i,t+\Delta t}}{\sum_i M_{i,t}} = V_t \sum_i \frac{M_{i,t}}{\sum_j M_{j,t}} \, \frac{M_{i,t+\Delta t}}{M_{i,t}}.$$
Thus, the influence of a stock on the index is directly proportional to its capitalization (or market) weight $M_{i,t} / \sum_j M_{j,t}$. The formula above holds true for other capitalization-weighted indices such as the FTSE 100 Index (UK) and the SSE Composite Index (Shanghai).
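As a quick numerical check of the capitalization-weighted update, the following sketch uses three made-up stocks (the prices, share counts, and starting index level are assumptions chosen for illustration, not market data). It computes the new index level both as a ratio of total capitalizations and as a weighted average of individual gross returns; the two should agree.

```r
# Toy data (assumed for illustration): three stocks at times t and t + dt
price_t  <- c(A = 100, B = 50, C = 20)    # P_{i,t}
price_t1 <- c(A = 110, B = 45, C = 21)    # P_{i,t+dt}
shares   <- c(A = 10,  B = 40, C = 200)   # Q_i, unchanged (no corporate actions)

cap_t  <- price_t  * shares               # M_{i,t}
cap_t1 <- price_t1 * shares               # M_{i,t+dt}

index_t <- 1000                           # V_t, an arbitrary starting level

# Form 1: ratio of total market capitalizations
index_t1_ratio <- index_t * sum(cap_t1) / sum(cap_t)

# Form 2: market-weighted average of the individual gross returns
weights_t     <- cap_t / sum(cap_t)       # market weights, which sum to 1
gross_returns <- cap_t1 / cap_t           # M_{i,t+dt} / M_{i,t}
index_t1_weighted <- index_t * sum(weights_t * gross_returns)

print(round(weights_t, 4))
print(c(ratio = index_t1_ratio, weighted = index_t1_weighted))
```

In this toy market, stock C has the largest weight, so its 5% gain moves the index more than the 10% gain of the much smaller stock A.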
Remark 1.1. There are market indices that are not capitalization-weighted. The Dow Jones Industrial Average is a price-weighted index with 30 constituent stocks. By definition, its value is (modulo technical adjustments such as stock splits) a multiple of the sum of the constituent stock prices. Another example is the S&P 500 Equal Weight Index. Its official description² says it “is an equal-weight version of the widely-used S&P 500. The index includes the same constituents as the capitalization weighted S&P 500, but each company in the S&P 500 EWI is allocated a fixed weight – or 0.2% of the index total at each quarterly rebalance.”
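To make the three weighting schemes concrete, here is a minimal sketch on made-up data (the prices and share counts are assumptions for illustration). For a price-weighted index, whose value is a multiple of the sum of prices, the implied weight of a stock is its price divided by the sum of all prices.

```r
# Toy data (assumed): prices and shares outstanding for three stocks
price  <- c(A = 100, B = 50, C = 20)
shares <- c(A = 10,  B = 40, C = 200)

# Capitalization weights: proportional to price * shares
w_cap <- price * shares / sum(price * shares)

# Price weights: proportional to the price alone
w_price <- price / sum(price)

# Equal weights: 1 / n for each of the n constituents
w_equal <- rep(1 / length(price), length(price))
names(w_equal) <- names(price)

print(round(rbind(cap = w_cap, price = w_price, equal = w_equal), 3))

# With 500 constituents, an equal weight is 1 / 500, i.e. 0.2%
equal_weight_500 <- 1 / 500
print(equal_weight_500)
```

Stock A dominates under price weighting because it has the highest price, even though it has the smallest capitalization in this example.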
² See https://www.spglobal.com/spdji/en/indices/equity/sp-500-equal-weight-index
1.2.2 Downloading and visualizing the data
Let us visualize the time series of the S&P500 using R. An easy way to download financial data is to use the function getSymbols() from the R package quantmod. To install the package, run install.packages("quantmod"). The following script downloads daily data of the S&P500 from Yahoo! Finance (as specified by the argument src):
library(quantmod)
# download daily S&P500 data from Yahoo! Finance as an xts object
GSPC <- getSymbols(Symbols = c("^GSPC"), src = "yahoo",
                   return.class = "xts",
                   from = "2000-01-01", to = "2021-12-16",
                   periodicity = "daily", auto.assign = FALSE)
Here, ^GSPC is the symbol (a unique identifier) of the S&P500 (the prefix "^" indicates that it is a market index). The downloaded data is stored as an xts (“extensible time series”) object (as specified by return.class). The last argument auto.assign = FALSE allows us to assign the data to a variable of our choice; otherwise the data is automatically assigned to a variable named after the symbol. For each trading day (indexed by row), the data consists of 6 values: opening price, high, low, closing price, volume, and adjusted closing price. Generally, the adjusted closing price adjusts for splits and dividend and/or capital gain distributions. Since the S&P500 is a market index, it is the same as the closing price.
Let us plot the time series of the closing value. We use ggplot2, which is part of the tidyverse. A gentle introduction to the tidyverse (a collection of packages which share an underlying design philosophy) can be found in the book R for Data Science, which is available at https://r4ds.had.co.nz/index.html.
library(tidyverse)
# plot the adjusted close against the date index
ggplot(data = GSPC, aes(x = Index, y = GSPC.Adjusted)) +
  geom_line() + xlab("") + ylab("") +
  labs(title = "S&P500 daily adjusted close")
Figure 1.1: Daily closing value of S&P500 from 2000-01-03 to 2021-12-15.
The result is shown in Figure 1.1. During this period, the index grew from 1455.22 to 4709.85 which is close to its historic high. There are also many volatile periods. Here, we only note the financial crisis in 2007–2008, and the coronavirus crash in 2020.
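To put the two quoted index levels in perspective, the sketch below converts them into a total and an annualized growth rate. The start and end values and dates are taken from the text; the 365.25-day year is a convention assumed here. Since the index is based on prices, this calculation does not include dividends.

```r
v_start <- 1455.22   # closing value on 2000-01-03 (from the text)
v_end   <- 4709.85   # closing value on 2021-12-15 (from the text)

# Total simple return over the whole period
total_return <- v_end / v_start - 1

# Length of the period in years (365.25 days per year is an assumed convention)
years <- as.numeric(as.Date("2021-12-15") - as.Date("2000-01-03")) / 365.25

# Annualized (geometric average) growth rate
annualized <- (v_end / v_start)^(1 / years) - 1

print(round(c(total = total_return, years = years, annualized = annualized), 4))
```

The index roughly tripled over about 22 years, which corresponds to an annualized growth rate of roughly 5.5%.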
1.3 Index funds
Note that the S&P500 is a market index, not a traded security. While it is theoretically possible to replicate the index by holding its constituent stocks according to their market weights, doing so directly is impractical for most investors because it incurs substantial transaction costs. Instead, one can invest in an index fund which aims to track the index as closely as possible. For the S&P500 index, one can invest in an S&P500 ETF (exchange traded fund), which can be bought and sold in the market as if it were a stock. Here, we consider the SPDR S&P 500 ETF (symbol SPY), which is a very liquid security. These funds serve as investable proxies of the index.
Let us confirm empirically that investing in SPY is practically very similar to investing in the index. In particular, we compute and compare their daily and monthly simple returns, defined by
$$R_t = \frac{V_t}{V_{t-1}} - 1,$$
where $V_t$ is the value at time $t$ (indexed by an integer). Here, we take $V_t$ to be the closing value. In Sections 2 and 3 we will study asset returns and their distributional properties in more detail. The computation is performed using the following code:
# download SPY data
SPY <- getSymbols(Symbols = c("SPY"), src = "yahoo",
                  return.class = "xts",
                  from = "2000-01-01", to = "2021-12-16",
                  periodicity = "daily", auto.assign = FALSE)
# compute daily and monthly returns
all.equal(index(GSPC), index(SPY)) # both series have the same dates
returns_daily <- merge(dailyReturn(GSPC),
                       dailyReturn(SPY), all = TRUE)
colnames(returns_daily) <- c("GSPC", "SPY")
returns_monthly <- merge(monthlyReturn(GSPC),
                         monthlyReturn(SPY), all = TRUE)
colnames(returns_monthly) <- c("GSPC", "SPY")
# plot scatterplots
plot1 <- ggplot(data = returns_daily) + ylim(-0.15, 0.15) +
  geom_point(aes(x = GSPC, y = SPY)) +
  labs(x = "S&P500", y = "SPY", title = "Daily returns")
plot2 <- ggplot(data = returns_monthly) + ylim(-0.15, 0.15) +
  geom_point(aes(x = GSPC, y = SPY)) +
  labs(x = "S&P500", y = "SPY", title = "Monthly returns")
library(patchwork) # for arranging the plots side by side
plot1 + plot2
# correlations
cor(returns_daily)
cor(returns_monthly)
Figure 1.2: Comparing the simple returns of S&P500 and the SPY ETF. Left: Daily returns. Right: Monthly returns. The returns are highly correlated.
In the code,³ we checked explicitly that the two data sets have the same date sequence. This allows us to merge the two return series directly (as two columns). In general, the dates of two series need not coincide (e.g. a stock may not be traded on certain days). By default, the functions dailyReturn() and monthlyReturn() compute the arithmetic (simple) return, but the log return, which is a transformation of the simple return (given by $r = \log(1 + R)$), can also be computed.
³ In later sections, only selected code will be included in the main text. Please refer to the R files for the complete code.
We show the results in Figure 1.2. We see that the returns of ^GSPC and SPY are indeed highly correlated. For the daily returns, the correlation is 0.985. The agreement is even closer for the monthly returns, where the correlation is 0.997. We also note that the monthly returns are on average a bit larger in magnitude than the daily returns, but not by much, because positive and negative daily returns partially cancel each other out.
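The relation between simple returns, log returns, and returns over longer periods can be illustrated with a short base R sketch. The closing values below are made up for illustration; the code computes the simple returns by hand, compounds them into a return over the whole period, and checks that the log returns add up to the same answer.

```r
# Toy closing values V_0, ..., V_4 (assumed for illustration only)
v <- c(100, 102, 99.96, 104.958, 103.90842)

# Simple returns R_t = V_t / V_{t-1} - 1
simple_returns <- v[-1] / v[-length(v)] - 1

# Simple return over the whole period: compound the gross returns 1 + R_t
period_simple <- prod(1 + simple_returns) - 1

# Log returns r_t = log(1 + R_t) add up over time
log_returns <- log1p(simple_returns)
period_log  <- sum(log_returns)

print(round(simple_returns, 4))
print(c(compounded = period_simple,
        from_logs  = expm1(period_log),       # back-transform R = exp(r) - 1
        endpoints  = v[length(v)] / v[1] - 1, # directly from first and last value
        naive_sum  = sum(simple_returns)))    # only an approximation
```

The positive and negative one-period returns partly offset each other, so the return over the whole period is much smaller than the sum of their absolute values. Summing simple returns is only an approximation, whereas summing log returns is exact.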
Investing in an index fund (which tracks a market index) is a form of passive investing which usually incurs low transaction costs and management fees. Passive investing can be motivated by e.g. the capital asset pricing model (CAPM) and the efficient market hypothesis (EMH), both of which will be covered later in the course. In particular, the EMH implies that one cannot reliably predict future asset prices. In fact, index funds have historically outperformed the majority of active portfolio managers!⁴ Nevertheless, academic researchers and practitioners have worked very hard to understand how asset prices behave and devise ways to outperform the market. In this course, we will study some of the financial and statistical ideas involved.
References
Malkiel, B. G. (2019). A Random Walk down Wall Street: The Time-Tested Strategy for Successful Investing. WW Norton & Company.
⁴ A fascinating story can be found in the book Malkiel (2019).