Lectures 6 and 7:
Simulation
Computational Finance
A model is an approximate mathematical description of real-world
behaviors
E.g. model describing gravity
• $F = G\,\frac{m_1 m_2}{d^2}$ and $F = m_1 a$ imply $a = G\,\frac{m_2}{d^2}$
• The acceleration of an object towards another object (e.g. Earth) is proportional to
the mass of the other object and inversely proportional to the square of the distance
between them
• Allows you to predict how fast a ball will fall to the ground on Earth vs. on the moon
E.g. model describing stock prices (CAPM)
• $E[R_i] - R_f = \beta_i \left( E[R_m] - R_f \right)$
• The expected excess return on a stock is proportional to the expected excess return on the market
• Allows you to predict a particular stock’s return given the market’s
• Much less accurate than the gravity model! Human behavior is hard to model…
Building Models
Up to now: solve the model by solving an equation, or a system of
equations, describing what happens
• Sometimes with pen and paper, sometimes on a computer (e.g. tangency
portfolio in HW 1)
Many models cannot be solved this way
Simulation: create many different representative scenarios of what could happen
• In finance, Monte Carlo simulation: sample from a known distribution of possible scenarios and calculate the outcome for each sampled scenario to get a distribution of outcomes
• Useful when the distribution is complex and high-dimensional
• Also used in physics and engineering
Solving Models
Examples of Simulations Outside of Finance
Example: if X is a standard normally distributed random variable, what is the expected value of
$Y = g(X) = \dfrac{e^X}{1 + X^2}$
• Complicated function
• What is an expected value? Integral with
respect to the probability density:
$E[Y] = \displaystyle\int_{-\infty}^{\infty} \frac{e^x}{1 + x^2} \cdot \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx$
• Didn’t get any easier…
• Need a computer to solve
[Figure: plot of the function $g(x) = e^x/(1+x^2)$ for $x \in [-3, 3]$]
Problem: find E[g(X)]
[Figure: plot of the integrand $g(x)\,\phi(x)$ for $x \in [-3, 3]$]
Method 1: Riemann Sum
Limit your range to likely values
of X (where the integrand is
different enough from zero)
Divide the x axis into n equal
segments
• Length: $d = \frac{x_{max} - x_{min}}{n}$
• Midpoints: $x_i = x_{min} + \left(i - \frac{1}{2}\right)d$
Calculate $\sum_i g(x_i)\,\phi(x_i)\,d$ to get your approximation of E[g(X)]
Solving for the expectation numerically
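A minimal MATLAB sketch of this midpoint rule for the running example (the range $[-3, 3]$ matches the figures; the choice of n is arbitrary):

% Midpoint Riemann sum for E[g(X)], X ~ N(0,1), g(x) = exp(x)/(1+x^2)
xmin = -3; xmax = 3; n = 1000;
d  = (xmax - xmin) / n;              % segment length
xi = xmin + d/2 : d : xmax - d/2;    % segment midpoints
gx = exp(xi) ./ (1 + xi.^2);         % g evaluated at each midpoint
approx = sum(gx .* normpdf(xi)) * d  % approximately 0.8557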
[Figure, repeated over three slides with an increasingly fine grid: bars of height $g(x_i)\,\phi(x_i)$ over $x \in [-3, 3]$; the approximate integral converges as n grows: 0.85588, then 0.85572, then 0.85567]
Example: if $X_1$ and $X_2$ are jointly standard normally distributed random variables, what is the expected value of
$Y = g(X_1, X_2) = \dfrac{e^{X_1 + X_2}}{1 + X_1^2 + |X_2|}$
$E[Y] = \displaystyle\iint_{-\infty}^{\infty} \frac{e^{x_1 + x_2}}{1 + x_1^2 + |x_2|} \cdot \frac{e^{-(x_1^2 + x_2^2)/2}}{2\pi} \, dx_1 \, dx_2$
where the second factor is the PDF of the multivariate standard normal distribution
What if $g(X_1, X_2)$ is a function of two variables
Method 1: Riemann Sum
Partition both axes into equal intervals
forming squares
Compute the value of the function for the
center point of each square
Add up the volume of each bar to approximate the volume under the surface, i.e. the value of the double integral
Already getting hard
What if you had a function of 200 random variables, e.g. the value of a portfolio of 200 stocks?
Want to solve a 200-dimensional integral?
Solving for the expectation numerically
Method
• Sample random variables from their joint distribution using known methods of
generating random numbers
• Evaluate the function at each sample and take the mean
• MC_integration.m
Advantage
• By sampling from the underlying distribution of X, the Monte Carlo method
evaluates the function at values that are most likely to arise
• Avoids computing the function for values that will rarely occur
Valuable when
• Computing the function is costly
• The function is of many variables
• Both situations arise in finance all the time!
Monte Carlo Simulation is Easier and Faster
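A minimal MATLAB sketch of this recipe for the running univariate example (MC_integration.m is the lecture's demo file; this sketch is written independently of it, and the seed and sample size are arbitrary):

% Monte Carlo estimate of E[g(X)] for X ~ N(0,1), g(x) = exp(x)/(1+x^2)
rng(1);                       % fix the seed for reproducibility
N = 1e6;                      % number of simulated scenarios
x = randn(N, 1);              % sample from the distribution of X
y = exp(x) ./ (1 + x.^2);     % evaluate the function at each sample
estimate = mean(y)            % close to the Riemann-sum value of about 0.8557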
Generating
Random
Numbers
Sampling from a univariate
distribution
Because we want unpredictability
• Example: random number inside a slot machine in a casino
• Not very relevant for this class
Simulation: because we want a distribution
• Each random number represents a scenario
• Example: surface temperatures in an area of the Pacific Ocean where
typhoons may form
• Example: prices of stocks in our portfolio on a future day
Why generate random numbers?
Computers are deterministic machines, so they can't generate truly random numbers
Programming languages generate pseudorandom numbers
• These are numbers that appear and behave random enough that we
can treat them as if they were random
• With a pseudorandom number generator (PRNG), once a seed value
is passed into the algorithm, the sequence of numbers is
deterministic
• see Appendix: Pseudorandom Number Generators for how this works
• We can set the seed value for MATLAB’s PRNG using the rng()
function
• rng(n) sets the seed to the integer n
• rng('shuffle') uses the current date and time to set the seed
• rng('default') sets the seed to the value used when MATLAB starts up
Generating Random Numbers
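A quick sketch of seed-controlled reproducibility (the seed value 42 is arbitrary):

rng(42);           % seed MATLAB's PRNG
a = rand(1, 3);    % first three uniform draws
rng(42);           % reset to the same seed
b = rand(1, 3);    % the same three draws again
isequal(a, b)      % returns logical 1: the sequence is deterministic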
Do we want unpredictability, or do we want a distribution?
• Different random number generators are better for one versus the other
Pseudo-random number algorithms are designed to be as unpredictable as
possible
• Good if you’re running a casino
• But may produce “clustering” (bunching) when multiple numbers are sampled
• Require very large samples to accurately match a desired distribution
• Matlab: rand(), randn() and most other functions we will talk about today
Quasi-random numbers are designed to “fill in” a particular distribution
• Knowing the first n−1 generated numbers helps predict the n-th number: not good for casinos at all!
• But can match a desired distribution quicker (will be useful for us starting next week)
• Latin Hypercube Sampling is the most popular method of generating quasi-random
numbers
• Matlab: lhsdesign() and lhsnorm()
Pseudo- vs. quasi-random numbers
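A small sketch comparing the two (lhsnorm is in the Statistics and Machine Learning Toolbox; the sample size matches the figure that follows):

% 1,000 standard normal draws: pseudo-random vs. Latin Hypercube
n = 1000;
z_pseudo = randn(n, 1);           % pseudo-random draws
z_quasi  = lhsnorm(0, 1, n);      % quasi-random (LHS) draws from N(0,1)
histogram(z_pseudo); hold on;
histogram(z_quasi);               % the LHS histogram fills the bell more evenly
legend('pseudo-random', 'quasi-random (LHS)');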
[Figure: 1,000 draws from a standard normal distribution; left histogram: pseudo-random numbers; right histogram: quasi-random numbers]
Pseudo vs. quasi-random numbers
Which looks more like a normal distribution density to you?
The uniform distribution is the
distribution directly generated by
most PRNGs
• rand() function in MATLAB
Most simulations entail sampling
random variables from something
other than the uniform
distribution
To get a sample from a different
distribution, start with standard
uniform and transform into the
distribution you need.
This works because F(X), the CDF evaluated at the random variable itself, is uniformly distributed
Inverse transform sampling
[Figure: CDF of the uniform distribution; the cumulative probability of sampling a number between 0 and $u_1$ is $u_1$]
Suppose we want to sample x from
a distribution with the CDF given
by F
The inverse transform method sets:
$x = F^{-1}(u), \quad u \sim U(0,1)$
• where $F^{-1}(u)$ is the inverse of F, and $U(0,1)$ denotes the uniform distribution on [0,1]
• So, if we generate a u from U(0,1), then we can calculate the corresponding x that has cumulative probability u
Example:
• $x_1 = F^{-1}(u_1)$ and $x_2 = F^{-1}(u_2)$
Inverse Transform sampling
Example: distribution with $(-\infty, +\infty)$ support
MATLAB often has functions for directly sampling from common
continuous distributions
• E.g. exponential: exprnd()
• E.g. normal: randn()
Let’s see how to do inverse transforms ourselves anyway
• Can use built-in Matlab functions to check
• Good practice with known and relatively easy distributions: prepares us for
when we have to sample from a distribution that does not have a built-in
Matlab random number generator
For continuous distributions
[Figure: uniform draws $u_0, u_1, u_2, \ldots, u_n$ mapped through $F^{-1}$ to samples $x_0, x_1, x_2, \ldots, x_n$]
Given a uniformly distributed random sample u, solve F(x)=u to get
a random sample distributed with CDF F(x)
Example: exponential with mean $1/\lambda$
Let $1 - e^{-\lambda x} = u$
Solve for x:
$x = -\frac{1}{\lambda}\log(1 - u)$
We found the inverse CDF of the exponential distribution
This is what MATLAB does! Let’s look inside exprnd()
Inverse Transform Sampling in Practice
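A short sketch of this transform next to the built-in (note that exprnd is parameterized by the mean $1/\lambda$, not the rate; the sample size is arbitrary):

% Inverse-transform sampling of an Exponential(lambda) random variable
lambda = 2;
u = rand(1e5, 1);              % standard uniform draws
x = -log(1 - u) / lambda;      % x = F^{-1}(u) = -(1/lambda)*log(1-u)
mean(x)                        % close to 1/lambda = 0.5
y = exprnd(1/lambda, 1e5, 1);  % built-in check: same distribution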
Normal random variables are the building blocks of many financial simulation models
Basic properties
• PDF for N(0,1): $\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$
• PDF for N($\mu$, $\sigma$): $f(x) = \frac{1}{\sigma}\,\varphi\!\left(\frac{x - \mu}{\sigma}\right)$, i.e. $z = \frac{x - \mu}{\sigma}$ or $x = \mu + \sigma z$
• CDF does not have an analytical expression.
MATLAB functions for the normal PDF, CDF, and inverse CDF
• PDF: normpdf(x) or normpdf(x,mu,sigma)
• CDF: normcdf(x) or normcdf(x,mu,sigma)
• Inverse CDF: norminv(p) or norminv(p,mu,sigma)
The Normal Distribution
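A two-line sketch tying these functions to sampling (the parameter values are arbitrary):

% Sample N(mu, sigma^2) two equivalent ways
mu = 0.1; sigma = 0.2; n = 1e5;
x1 = mu + sigma * randn(n, 1);          % scale-and-shift a standard normal
x2 = norminv(rand(n, 1), mu, sigma);    % inverse-transform a uniform
[mean(x1), std(x1); mean(x2), std(x2)]  % both rows close to [0.1, 0.2]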
Use simulation with 10,000 samples to find the expected value of a random variable $Y = \frac{1}{1 + X}$, where the random variable X is exponentially distributed with rate parameter $\lambda = 2$ (in other words, $E[X] = \frac{1}{\lambda} = 0.5$)
• Part a: generate a random sample using MATLAB’s built-in exprnd()
• Part b: generate a random uniform sample using rand() and transform it to
exponential
Display your results.
We will reconvene in 15 minutes
If you’re done early, think about how you would try to solve this
problem without MATLAB. How would you set it up? What difficulties
would you run into?
In-Class Exercise: E[g(X)]
Example: the daily price fluctuation X of an asset
$X \in \{-\Delta_2, 0, \Delta_1\}$ with
$\text{Prob}[X = \Delta_1] = p_1$
$\text{Prob}[X = 0] = 1 - p_1 - p_2 = p_3$
$\text{Prob}[X = -\Delta_2] = p_2$
So, we generate a random variable U(0,1) and X is given by
For discrete distributions
[Diagram: the unit interval for $U \sim U(0,1)$ split at $p_1$ and $p_1 + p_2$; draws below $p_1$ transform to $\Delta_1$, draws between $p_1$ and $p_1 + p_2$ transform to $-\Delta_2$, and the rest transform to 0]
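A sketch of that transform (the parameter values match the CDF figure on the next slide):

% Discrete inverse transform for the three-point daily price move
p1 = 0.3; p2 = 0.4; delta1 = 5; delta2 = 2;
u = rand(1e5, 1);                   % standard uniform draws
X = zeros(size(u));                 % default: X = 0, probability 1 - p1 - p2
X(u < p1) = delta1;                 % up-move, probability p1
X(u >= p1 & u < p1 + p2) = -delta2; % down-move, probability p2
histogram(X)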
The “line segment” visualization makes it
easy to figure out how to code this up, but
we can also represent the problem the
same way as we do for continuous
variables
For $x < -\Delta_2$: $F(x) = \text{Prob}[X \le x] = 0$: impossible
For $-\Delta_2 \le x < 0$: $F(x) = \text{Prob}[X = -\Delta_2] = p_2$
For $0 \le x < \Delta_1$: $F(x) = \text{Prob}[X = -\Delta_2] + \text{Prob}[X = 0] = p_2 + p_3 = 1 - p_1$
For $x \ge \Delta_1$: $F(x) = 1$: guaranteed
For discrete distributions: CDF representation
[Figure: cumulative probability of X (HLS 2.1, p. 49), with $\Delta_1 = 5$, $\Delta_2 = 2$, $p_1 = 0.3$, $p_2 = 0.4$; x-axis: X, y-axis: cumulative probability]
Problem: Benford's Law says that naturally-occurring numbers tend to have a leading digit (i.e., first digit) that is not uniformly distributed, since smaller leading digits tend to occur more often than large leading digits. Instead, the probability that the first digit of a number X is equal to d, where d is 1, 2, …, 9, is given as:
$P[X = d] = \log_{10}\!\left(1 + \frac{1}{d}\right)$.
Write a script file that generates 1000 random
leading digits from this distribution using the
inverse transform method. Plot a histogram of
the values you generated.
Use MATLAB functions log10() and
histogram()
Benford’s Law can be used as a
basic fraud detector
Say you're analyzing a large dataset, e.g.
• Sales records as part of a firm
audit
• Election results: vote totals by
region
If the first digits of the numbers
deviate significantly from
Benford’s Law, it may be that the
numbers are fake
In-Class Exercise: Benford’s Law
Generating
correlated
random
numbers
Case of two jointly normally distributed random variables
$X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$, $\rho_{x_1, x_2} = \rho$
• We can write down the joint distribution of these variables in vector form
$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} \right)$
Generating Multivariate Normals
mu = [mu1, mu2];                  % 1 x 2 mean vector
Sigma = [sig1^2, rho*sig1*sig2;
         rho*sig1*sig2, sig2^2];  % 2 x 2 covariance matrix
X = mvnrnd(mu, Sigma);            % one 1 x 2 correlated draw
Goal: generate a 1 x 2 vector
• First element is a sample of random variable $X_1$
• Second element is a corresponding sample of random variable $X_2$
Built-in MATLAB function:
mvnrnd(mean_vector, covariance_matrix)
• See optional Appendix to learn how mvnrnd() works
“under the hood”
But we rarely want to sample a
distribution just once.
New goal: generate an N x 2 matrix
• First column: samples of X1
• Second column: corresponding samples of X2
Let's generate N pairs:
mvnrnd(mean_vector, covariance_matrix, N)
Produces an N x 2 matrix in exactly that layout (no transpose needed, unlike the L∗Z method in the Appendix, which yields 2 x N)
Generating Multivariate Normals: N x 2
N = 1000;
X = mvnrnd(mu, Sigma, N);   % 1000 x 2 matrix of correlated pairs
[Figure: scatter plots of the uncorrelated sample Z (axes $Z_1$, $Z_2$) and the correlated sample X (axes $X_1$, $X_2$)]
Specify a 1 x m vector of means (zeros if standard normal)
Specify an m x m covariance matrix
• correlation matrix if standard normal
Specify the length of the sample, n
Get an n x m sample
Generating Multivariate Normals: n x m
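A quick sketch for m = 3 (the correlation values are made up for illustration):

% 1000 draws of 3 correlated standard normals
C = [1.0, 0.5, 0.2;
     0.5, 1.0, 0.3;
     0.2, 0.3, 1.0];               % m x m correlation matrix
X = mvnrnd(zeros(1, 3), C, 1000);  % n x m sample (n = 1000, m = 3)
corrcoef(X)                        % sample correlations close to C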
Demo:
Simulating
Stock
Returns
Example: United Airlines
• Today’s stock price: 36.85 | expected daily log return: 0.02% | daily vol: 1.4%
What is the expected stock price a year from now? What is the
standard deviation? If we buy the stock today, how likely are we to
lose money?
Define some variables
• $S_0 = 36.85$; $\mu = 0.0002$; $\sigma = 0.014$;
• A year is 365 calendar days, but only 252 trading days, so set T = 252
Stock price at time T can be represented as
• $S_T = S_0 e^{r_{0 \to T}}$
Simulating Stock Returns
Define a one-period log return $r_i = \log R_i \sim N(\mu, \sigma^2)$
• Identically normally distributed with mean $\mu$ and variance $\sigma^2$
• Note: if you don't know the mean log return but only the log mean return $\hat{\mu} = \log E[R_i]$, you must define the distribution of $r_i = \log R_i \sim N\!\left(\hat{\mu} - \frac{\sigma^2}{2}, \sigma^2\right)$. That will ensure that $E[R_i] = E[\exp(r_i)]$ can be written as $e^{\hat{\mu}}$
Then $\log S_T = \log S_0 + r_1 + r_2 + \cdots + r_t + \cdots + r_T = \log S_0 + r_{0 \to T}$
Further assuming that they are independent (so iid),
• $E[\log S_T] = \log S_0 + \sum_{t=1}^{T} E[r_t] = \log S_0 + \mu T$
• $V[\log S_T] = \sum_{t=1}^{T} V[r_t] = \sigma^2 T$ and $\sigma(\log S_T) = \sigma\sqrt{T}$
Sum of iid normally distributed random variables is normal, so
Modeling Log Returns as Normally Distributed
The cumulative T-day log return $r_{0 \to T}$ is a normally distributed random variable with a mean of $\mu T$ and a standard deviation of $\sigma\sqrt{T}$
• We can write it as $r_{0 \to T} = \mu T + \sigma\sqrt{T} \cdot \epsilon$
• where $\epsilon$ is a standard normal random variable (we know how to generate these in MATLAB!)
Simulate annual log returns:
• generate N normally distributed random numbers with mean $\mu T$ and standard deviation $\sigma\sqrt{T}$ and store them in a vector X
Compute year-ahead stock prices $S_T = S_0 e^X$
$S_T$ now contains a distribution of end-of-year stock prices
Working with log returns, cont’d
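A sketch of the whole recipe with the United Airlines numbers from the earlier slide (the seed and N are arbitrary choices):

% Simulate year-ahead stock prices: S0 = 36.85, mu = 0.0002, sigma = 0.014
rng(1);
S0 = 36.85; mu = 0.0002; sigma = 0.014; T = 252; N = 1e5;
X  = mu*T + sigma*sqrt(T) * randn(N, 1);  % N annual log returns
ST = S0 * exp(X);                         % N year-ahead prices
[mean(ST), std(ST)]                       % expected price and its std dev
mean(ST < S0)                             % estimated probability of losing money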
Profit or loss on a portfolio (e.g. portfolio of stocks) depends on the future
prices of all assets (e.g. stocks) in the portfolio
Example: portfolio with 1 share each of Google (G), Apple (A), and
Facebook (F)
• Value today: $S_{G,0} + S_{A,0} + S_{F,0}$
• To get the value at time T, we need $S_{G,T}$, $S_{A,T}$, and $S_{F,T}$
• $S_{i,T} = S_{i,0} \exp\!\left(\mu_i T + \sigma_i \sqrt{T} \cdot \varepsilon_i\right)$ for each stock $i = G, A, F$
• What is the joint distribution of the noise $[\varepsilon_G, \varepsilon_A, \varepsilon_F]'$? In other words, how are the stocks correlated?
• Sample from the multivariate standard normal distribution
Once we simulate the correlated noise, we can compute a distribution of stock prices and hence of portfolio value and profit/loss
What about a portfolio of stocks?
$\begin{pmatrix} \varepsilon_G \\ \varepsilon_A \\ \varepsilon_F \end{pmatrix} \sim \text{Normal}\!\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho_{G,A} & \rho_{G,F} \\ \rho_{G,A} & 1 & \rho_{A,F} \\ \rho_{G,F} & \rho_{A,F} & 1 \end{pmatrix} \right)$
• Need to sample a 3 x 1 vector of multivariate normally distributed random variables with mean 0 and covariance $\Sigma$
Sample correlated normally distributed random numbers
• noise = mvnrnd(zeros(1,3), correlation_matrix)
Sampling noise $[\varepsilon_G, \varepsilon_A, \varepsilon_F]'$
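A sketch putting the pieces together (all numeric values below are hypothetical):

% Year-ahead value of a 1-share-each portfolio of 3 correlated stocks
S0  = [1500, 180, 300];  mu = [2e-4, 3e-4, 1e-4];  sig = [0.018, 0.016, 0.020];
C   = [1.0, 0.5, 0.4; 0.5, 1.0, 0.6; 0.4, 0.6, 1.0];  % assumed correlations
T = 252; N = 1e5;
noise = mvnrnd(zeros(1, 3), C, N);                % N x 3 correlated noise
ST = S0 .* exp(mu*T + sig*sqrt(T) .* noise);      % N x 3 year-ahead prices
V  = sum(ST, 2);                                  % N portfolio values
histogram(V)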
Remember: normal distribution doesn’t do a great job capturing rare
large (either positive or negative) returns
Our solution: add “jumps”
• On most days (probability $1 - p_t$), log returns are sampled from the normal distribution $r_t \sim N(\mu, \sigma^2)$
• On other days, log returns are sampled from an alternative distribution
o Good jumps: positive jumps with probability $p_t/2$: $r_t \sim Exp(\lambda_G)$
o Bad jumps: negative jumps with probability $p_t/2$: $-r_t \sim Exp(\lambda_B)$
• The probability of a jump, $p_t$, depends on whether there was a jump the previous day
o If there was a jump on day $t - 1$, $p_t = p_{JJ}$. Otherwise, $p_t = p_{NJ} < p_{JJ}$
o Jumps lead to more jumps – persistent periods of high volatility
Let's simulate the combined distribution ("mixture")
Capturing Rare Big Movements
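A sketch of one simulated path from this mixture (all parameter values are hypothetical):

% One path of T daily log returns with self-exciting jumps
T = 252; mu = 2e-4; sigma = 0.014;
lamG = 25; lamB = 20;             % jump sizes ~ Exp with mean 1/lambda
pJJ = 0.4; pNJ = 0.02;            % jump prob. after a jump / after no jump
r = zeros(T, 1); jumped = false;
for t = 1:T
    if jumped, pt = pJJ; else, pt = pNJ; end
    u = rand;
    if u < pt/2
        r(t) = exprnd(1/lamG);    % good jump (exprnd takes the mean)
        jumped = true;
    elseif u < pt
        r(t) = -exprnd(1/lamB);   % bad jump
        jumped = true;
    else
        r(t) = mu + sigma*randn;  % ordinary day
        jumped = false;
    end
end
plot(cumsum(r))                   % cumulative log return path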
Appendix
How does mvnrnd()
work?
Case of a two-dimensional random vector
$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right)$ where $\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$
Method
1) Generate 2 x 1 vector Z of standard
normal random numbers (uncorrelated)
2) Compute the lower Cholesky
decomposition L of Σ
3) Compute $X = L * Z$
* indicates matrix multiplication
Appendix: How Does mvnrnd() work?
Sigma = [1, rho; rho, 1];   % target correlation matrix
Z = randn(2,1);             % uncorrelated standard normals
L = chol(Sigma,'lower');    % lower Cholesky factor: Sigma = L*L'
X = L * Z;                  % correlated pair
The lower Cholesky decomposition of a square positive-definite
matrix Σ yields a matrix L of the same size such that
• $\Sigma = L * L'$ and L is lower triangular (all elements above the diagonal are zero)
• For a 2 x 2 matrix, $L = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}$
• Think of Cholesky decomposition as a matrix analog of the square root for
scalars
o If I asked you to generate a sample of univariate random numbers and gave you the
variance, you'd take the square root to get standard deviation.
o Square roots are unique: for any nonnegative scalar, only one nonnegative number exists such that its square is equal to that scalar
o Same isn't true for matrices. Many matrices B exist such that Σ = 𝐵𝐵 ∗ 𝐵𝐵′
o But only one is lower triangular!
Lower Cholesky decomposition
For a 2 x 2 matrix, $L = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}$, so $\Sigma = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix} \begin{pmatrix} L_{11} & L_{21} \\ 0 & L_{22} \end{pmatrix} = \begin{pmatrix} L_{11}^2 & L_{11}L_{21} \\ L_{11}L_{21} & L_{21}^2 + L_{22}^2 \end{pmatrix}$
• But we also know that $\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$
• System of 3 equations in 3 unknowns is easy to solve
o $\Sigma$ is symmetric, as any covariance matrix is, so the off-diagonal elements are the same and the extra equation is redundant
o $1 = L_{11}^2 \Rightarrow L_{11} = 1$
o $\rho = L_{11}L_{21} \Rightarrow L_{21} = \rho$
o $1 = L_{21}^2 + L_{22}^2 \Rightarrow L_{22} = \sqrt{1 - \rho^2}$
• $X = L * Z$, so $X_1 = L_{11}Z_1 = Z_1$ and $X_2 = L_{21}Z_1 + L_{22}Z_2 = \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2$
o Check means: $E[X_1] = E[Z_1] = 0$ and $E[X_2] = L_{21}E[Z_1] + L_{22}E[Z_2] = 0$
o Check variances: $Var[X_1] = Var[Z_1] = 1$ and $Var[X_2] = L_{21}^2 Var[Z_1] + L_{22}^2 Var[Z_2] = \rho^2 + 1 - \rho^2 = 1$
o Check correlation: $\frac{Cov[X_1, X_2]}{\sqrt{Var[X_1]\,Var[X_2]}} = L_{11}L_{21}\,Var[Z_1] = \rho$
o Intuition: the correlation between $X_1$ and $X_2$ is created by the presence of $Z_1$ in both
• With more than 2 variables, this gets much harder to do by hand
o With lots of variables, this can get hard even for a computer, e.g. 10,000 variables creates a 50M-dimensional system of equations
Constructing Cholesky decomposition
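A quick numeric check of the hand-derived factor against chol() (the value rho = 0.6 is arbitrary):

% Hand-built 2 x 2 lower Cholesky factor vs. MATLAB's
rho = 0.6;
L_hand = [1, 0; rho, sqrt(1 - rho^2)];
L_chol = chol([1, rho; rho, 1], 'lower');
norm(L_hand - L_chol)    % essentially zero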
N random samples from the joint distribution of $X_1$ and $X_2$
• First row: samples of $X_1$
• Second row: corresponding samples of $X_2$
• Each column vector is a sample pair
Still $X = L * Z$
• $X$ and $Z$ are now 2 x n matrices
• 2 x 2 times 2 x n is 2 x n
Generating Multivariate Normals
Sigma = [1, rho; rho, 1];   % target correlation matrix
Z = randn(2,n);             % n uncorrelated standard normal pairs
L = chol(Sigma,'lower');    % lower Cholesky factor
X = L * Z;                  % 2 x n matrix of correlated pairs
[Figure: scatter plots of the uncorrelated sample Z (axes $Z_1$, $Z_2$) and the correlated sample X (axes $X_1$, $X_2$)]
Generating m Multivariate Normals
For one sample:
$X = L * Z \;\rightarrow\; \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{pmatrix} = \begin{pmatrix} * & 0 & \ldots & 0 \\ * & * & \ldots & 0 \\ \vdots & & \ddots & \vdots \\ * & * & \ldots & * \end{pmatrix} \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_m \end{pmatrix}$
For N samples (X is m x N), it's still $X = L * Z$
A linear congruential generator (generates u)
$x_{i+1} = a\,x_i \bmod m$
$u_{i+1} = x_{i+1}/m$
• mod is the modulus (remainder) after division
o Ex: 7 mod 3 = 1, 11 mod 3 = 2, 13 mod 5 = 3
• Uses an initial value (seed) $x_0$, $0 < x_0 < m$
• $0 < x_{i+1} < m$; $0 < u_{i+1} < 1$
• Example: a = 6; m = 11; $x_0 = 1$
• For good choices of a and m (such as this example), the period length is m − 1 and the gap between possible values is 1/m
• Reproducibility and unpredictability both come down to the seed: the same seed reproduces the same sequence
• RandStream.list lists all the generator algorithms in MATLAB
• See lecture code: Simulation_randomNumberGenerator.m
• Note: the period of MATLAB's default random number generator (the Mersenne Twister) is $2^{19937} - 1$!
o See https://www.mathworks.com/help/matlab/ref/randstream.list.html
Appendix: Generating Pseudorandom Numbers
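A minimal sketch of the slide's example generator (a = 6, m = 11, $x_0$ = 1):

% Linear congruential generator: x_{i+1} = a*x_i mod m, u_{i+1} = x_{i+1}/m
a = 6; m = 11; x = 1;    % seed x0 = 1
u = zeros(1, m-1);
for i = 1:m-1
    x = mod(a*x, m);     % next state
    u(i) = x/m;          % uniform draw on (0,1), gap 1/m
end
disp(u)                  % visits every multiple of 1/11 once, then repeats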