Introduction
Last week we studied continuous time models.
• We defined the Wiener process
• We described how to understand stochastic differential equations as limits of difference equations.
I claimed that the benefit of continuous time models is the same as the benefit of using ODEs instead of difference equations: you can do calculus.
The central theme this week is Ito’s lemma, which is the most important result in stochastic calculus. Ito’s Lemma is a stochastic version of the chain rule from classical calculus.
Using Ito’s Lemma we will be able to solve some stochastic differential equations and write their solutions in closed form.
In practice solving ODEs is challenging and solving SDEs is even harder, so there are only a handful of SDEs we will be able to solve in closed form.
You will have noticed that differentiation is easy, but integration is hard. You can probably differentiate any function I give you (that is differentiable), but you can’t always compute an integral explicitly.
This isn’t because you are bad at integration, it is because there are only a certain number of tricks available for integration (substitution, integration by parts, partial fractions, contour integrals) and they cannot always be applied. There is a theory called differential Galois Theory which proves that you can’t solve certain integrals using standard functions, just as classical Galois theory shows you can’t solve the general quintic equation using only n-th roots.
There is really only one trick for completely solving SDEs and that is Ito’s Lemma.
Because Ito’s Lemma is so important I want you to be able to visualise why it is true. So we will plot lots of pictures.
In fact, this week’s course is based on a paper I wrote “Coordinate Free Stochastic Differential Equations as Jets” which showed how you can draw good pictures of SDEs. I developed these pictures so that I could understand Ito’s Lemma intuitively.
ODEs are easy to visualise as vector fields. But vector fields on R2 are much more fun to visualise than vector fields on R. The same applies to SDEs. For this reason we will also introduce higher dimensional SDEs this week.
The problem with higher dimensional problems is that calculations quickly get long, boring and fiddly. You will have experienced this when doing vector calculus: calculating the curl of a vector field takes ages by hand!
To save ourselves some boring computations, we will use sympy, a Python package for symbolic mathematics which will do all the dull calculations for us.
Learning Outcomes
In summary, this week you will be able to:
• Solve SDEs using Ito's Lemma
• Simulate higher dimensional SDEs
• Visualise an SDE as a field of curves
• Itô (1915–2008)
• Developed the theory of SDEs using the Ito integral
• Key paper: "Stochastic Integral", Proceedings of the Imperial Academy (1944)
Ito wrote his name with a hat accent on the o to indicate a longer vowel. This is tedious to do in a Jupyter notebook, so in this course I have not done so. When preparing LaTeX documents, write It\^o.
1 Higher Dimensional SDEs
• Wt is a Wiener process
• We considered the SDE
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$$
where a : R × R → R and b : R × R → R are functions.
• We interpreted the SDE as meaning that Xt is the limit of the Euler-Maruyama scheme (see the sketch after this list)
$$X_{t+\delta t} = X_t + a(X_t, t)\,\delta t + b(X_t, t)\,\delta W_t$$
• The standard definition is different and uses the Ito integral.
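As a reminder, here is a minimal sketch (my own, not from the original notes) of that Euler-Maruyama scheme in one dimension; the drift, diffusion and parameters in the usage line are purely illustrative.

```python
import numpy as np

def euler_maruyama_1d(a, b, x0, T, n_steps):
    """Simulate dX_t = a(X_t, t) dt + b(X_t, t) dW_t by Euler-Maruyama."""
    dt = T / n_steps
    X = np.zeros(n_steps + 1)
    X[0] = x0
    for i in range(n_steps):
        t = i * dt
        dW = np.sqrt(dt) * np.random.randn()  # Wiener increment over [t, t + dt]
        X[i + 1] = X[i] + a(X[i], t) * dt + b(X[i], t) * dW
    return X

# Illustrative usage: dX = 0.05 X dt + 0.2 X dW with X_0 = 100 over one unit of time
X = euler_maruyama_1d(lambda x, t: 0.05 * x, lambda x, t: 0.2 * x, 100.0, 1.0, 1000)
```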
Definition: A d-dimensional process Wt ∈ Rd is called a d-dimensional Brownian motion with drift μ and
covariance matrix Σ if
• W has independent increments.
• The increments Wt+u − Wt are normally distributed with mean μu and covariance matrix Σu.
• Wt is almost surely continuous in t.
Here Σ is a positive definite d × d symmetric matrix.
Definition: A d-dimensional Wiener process is a d-dimensional Brownian motion with mean 0 and covari-
ance matrix given by the identity.
Note that the definition of “Brownian motion” isn’t as standardized as that of a Wiener process. This is the definition used in this course.
Lemma: Let Wt1, Wt2, ..., Wtd be independent 1-d Wiener processes. Then Wt := (Wt1, ..., Wtd) is a d-dimensional Wiener process with covariance matrix 1d (the d × d identity).
Lemma: Let L be the Cholesky decomposition of a covariance matrix Σ (so LLᵀ = Σ) and let Wt be a d-dimensional Wiener process. Then Vt := LWt is a d-dimensional Brownian motion with covariance matrix Σ and drift 0.
Lemma: To simulate a d-dimensional Brownian motion V with drift 0 and covariance matrix Σ on a discrete grid {0, δt, 2δt, ..., Nδt = T} we may use the difference equation
$$V_{t+\delta t} = V_t + L\sqrt{\delta t}\,\varepsilon_t$$
where εt is a d-dimensional vector of independent standard normal random variables.
```python
import numpy as np
from math import sqrt

def simulate_brownian(T, n_steps, cov=np.identity(1)):
    L = np.linalg.cholesky(cov)
    dt = T / n_steps
    d = cov.shape[0]
    V = np.zeros([d, n_steps + 1])
    eps = np.random.randn(d, n_steps)
    for i in range(0, n_steps):
        V[:, i + 1] = V[:, i] + sqrt(dt) * L @ eps[:, i]
    return V
```
```python
import matplotlib.pyplot as plt

rho = 0.5  # correlation between the components (illustrative value; not given in the source)
P = np.array([[1, rho], [rho, 1]])
W = simulate_brownian(1, 5000, P)
ax = plt.gca()
ax.plot(W[0, :], W[1, :])
ax.set_aspect('equal')
ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$');
```
Let Wt be a d-dimensional Brownian motion. Let a : Rn × R → Rn and B : Rn × R → Rn×d (the n × d matrices) be functions. Let X0 ∈ Rn be given. Then we may interpret "the solution of the SDE"
$$dX_t = a(X_t, t)\,dt + B(X_t, t)\,dW_t$$
as referring to the limit of the discrete time Euler-Maruyama scheme
$$X_{t+\delta t} = X_t + a(X_t, t)\,\delta t + B(X_t, t)(W_{t+\delta t} - W_t).$$
• Note that a is Rn-vector valued, so a δt is an n-vector.
• B is (n × d)-matrix valued and (Wt+δt − Wt) is a d-vector, so B(Xt, t)(Wt+δt − Wt) is an n-vector.
Very often we write higher dimensional SDEs by writing an SDE for each component. Let us write Xt = (Ut, Vt), where Xt ∈ R2 and Ut, Vt ∈ R. We can write the SDE in vector notation:
$$d\begin{pmatrix} U_t \\ V_t \end{pmatrix} = \begin{pmatrix} V_t & 0 \\ 0 & -U_t \end{pmatrix} d\begin{pmatrix} W_t^1 \\ W_t^2 \end{pmatrix}$$
It is often more readable to write an equation for each component:
$$dU_t = V_t\,dW_t^1 \qquad dV_t = -U_t\,dW_t^2.$$
You may choose to have fewer noise terms than dimensions for your process, so d may be less than n. For example
$$d\begin{pmatrix} U_t \\ V_t \end{pmatrix} = \begin{pmatrix} V_t \\ -U_t \end{pmatrix} dW_t^1$$
is a valid SDE. It can be written equivalently as
$$dU_t = V_t\,dW_t^1 \qquad dV_t = -U_t\,dW_t^1.$$
A minimal simulation sketch for this system is given below.
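Here is a minimal sketch (my own illustration, not from the notes) of the Euler-Maruyama scheme for this system; the horizon, step count and initial condition are arbitrary.

```python
import numpy as np

def simulate_uv(T, n_steps, U0=1.0, V0=0.0):
    """Euler-Maruyama for dU = V dW^1, dV = -U dW^1 (a single shared noise)."""
    dt = T / n_steps
    U = np.zeros(n_steps + 1)
    V = np.zeros(n_steps + 1)
    U[0], V[0] = U0, V0
    dW = np.sqrt(dt) * np.random.randn(n_steps)  # increments of the one Wiener process
    for i in range(n_steps):
        U[i + 1] = U[i] + V[i] * dW[i]
        V[i + 1] = V[i] - U[i] * dW[i]
    return U, V

U, V = simulate_uv(T=1.0, n_steps=5000)
```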
Let μ ∈ Rn be a vector and Σ be a positive definite symmetric matrix. Given a vector v, write diag(v) for the matrix with the components of v on the diagonal.
Continuous time geometric Brownian motion is the solution of the SDE
$$dS_t = \operatorname{diag}(S_t)\,\mu\,dt + \operatorname{diag}(S_t)\,dV_t$$
where Vt is an n-dimensional Brownian motion with covariance matrix Σ.
You can use this to model stock prices, in which case it is called the n-dimensional Black-Scholes-Merton model.
We can choose a pseudo-square root σ of Σ (so σσᵀ = Σ) and write the equation in terms of a Wiener process as
$$dS_t = \operatorname{diag}(S_t)\,\mu\,dt + \operatorname{diag}(S_t)\,\sigma\,dW_t$$
Since μ is an n-vector, diag(St)μ is also an n-vector. Since σ is an n × n matrix, diag(St)σ is also an n × n matrix.
1.1 Example: 1-dimensional geometric Brownian motion
In the 1-dimensional case we get 1-dimensional continuous time geometric Brownian motion
$$dS_t = S_t(\mu_1\,dt + \sigma\,dW_t^1).$$
We will see in a later video this week how to relate this to the discrete time version of geometric Brownian motion we have already seen.
This is the classical continuous time model for stock prices, used by Black and Scholes in their famous paper on derivative pricing and by Merton in his famous paper on continuous time investment strategies.
So the equation
$$dS_t = \operatorname{diag}(S_t)\,\mu\,dt + \operatorname{diag}(S_t)\,\sigma\,dW_t$$
makes sense in vector notation. Another way to write this SDE is to use ◦ to denote elementwise multiplication:
$$dS_t = S_t \circ \mu\,dt + S_t \circ (\sigma\,dW_t)$$
1.2 Example: independent geometric Brownian motions
In the Black-Scholes-Merton model, write St = (St1, St2). Take
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix},$$
in which case we get two independent 1-dimensional geometric Brownian motions:
$$dS_t^1 = S_t^1(\mu_1\,dt + \sigma_1\,dW_t^1) \qquad dS_t^2 = S_t^2(\mu_2\,dt + \sigma_2\,dW_t^2)$$
If you are given the values of Wt then you can simulate an SDE using the Euler-Maruyama scheme. If you
are not given the values, you can simulate the increments yourself:
$$\delta W_t := \sqrt{\delta t}\begin{pmatrix} \varepsilon_t^1 \\ \vdots \\ \varepsilon_t^d \end{pmatrix}$$
where the εti are independent standard normals.
1.3 Example: simulating the Black-Scholes-Merton model
To simulate the Black-Scholes-Merton model we may write
$$S_{t+\delta t} = S_t + S_t \circ \left(\mu\,\delta t + \sqrt{\delta t}\,\sigma \begin{pmatrix} \varepsilon_t^1 \\ \vdots \\ \varepsilon_t^n \end{pmatrix}\right)$$
Recall that the multiplication between σ and the vector of ε values is matrix multiplication. This is written @ in numpy. The elementwise multiplication ◦ is written as * in numpy.
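As a quick illustration of the difference (the numbers here are made up for the example):

```python
import numpy as np

S = np.array([100.0, 150.0])                  # stock prices
sigma = np.array([[0.2, 0.0], [0.05, 0.05]])  # pseudo-square root of the covariance
eps = np.array([0.3, -1.2])                   # standard normal draws

noise = sigma @ eps  # matrix multiplication: mixes the noise components
dS = S * noise       # elementwise multiplication: scales component by component
```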
We can approximately simulate stocks in the Black-Scholes-Merton model by using the Euler-Maruyama scheme with a large number of steps. One of this week's group exercises asks you to find a better way of simulating stocks in this model.
```python
def simulate_bsm_euler_maruyama(T, S0, mu, sigma, n_steps):
    dt = T / n_steps
    n = S0.shape[0]
    S = np.zeros([n, n_steps + 1])
    S[:, 0] = S0
    epsilon = np.random.randn(n, n_steps)
    for i in range(0, n_steps):
        S[:, i + 1] = S[:, i] + \
            S[:, i] * (mu * dt + sqrt(dt) * sigma @ epsilon[:, i])
    return S
```
```python
T = 1.0  # time horizon (assumed value; not given in the source)
S0 = np.array([100, 150])
mu = np.array([0.03, 0.05])
sigma = np.array([[0.2, 0], [0.05, 0.05]])
n_steps = 1000
S = simulate_bsm_euler_maruyama(T, S0, mu, sigma, n_steps)

t = np.linspace(0, T, n_steps + 1)
ax = plt.gca()
ax.plot(t, S[0, :])
ax.plot(t, S[1, :])
ax.set_title('Two correlated stock prices in the Black-Scholes-Merton model');
```

2 Exercises

2.1 Exercise
Prove the lemma below.
Lemma: Let Wt1, Wt2, ..., Wtd be independent 1-d Wiener processes. Then Wt := (Wt1, ..., Wtd) is a d-dimensional Wiener process with covariance matrix 1d.

2.2 Exercise
Prove the lemma below.
Lemma: Let L be the Cholesky decomposition of a covariance matrix Σ and let Wt be a d-dimensional Wiener process. Then Vt := LWt is a d-dimensional Brownian motion with covariance matrix Σ and drift 0.

2.3 Exercise
Prove the lemma below.
Lemma: To simulate a d-dimensional Brownian motion V with drift 0 and covariance matrix Σ on a discrete grid {0, δt, 2δt, ..., Nδt = T} we may use the difference equation
$$V_{t+\delta t} = V_t + L\sqrt{\delta t}\,\varepsilon_t$$
where εt is a d-dimensional vector of independent standard normal random variables.

2.4 Exercise
Simulate 1000 samples of the Euler-Maruyama approximation of a geometric Brownian motion with the parameter values
$$\mu = \begin{pmatrix} 0.03 \\ 0.05 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} 0.1 & 0 \\ 0.05 & 0.2 \end{pmatrix}, \qquad S_0 = \begin{pmatrix} 100 \\ 150 \end{pmatrix}$$
over a time period of 10 years using 100 steps in the approximation. Draw a scatter plot of $S^1_{10}$ against $S^2_{10}$. Repeat your simulation with 10000 samples and estimate the covariance of $S^1_{10}$ and $S^2_{10}$, storing your result in a variable cov_est.

2.5 Exercise
Simulate the deterministic process with initial condition (U0, V0) = (1, 0)
$$dU_t = V_t\,dt \qquad dV_t = -U_t\,dt$$
for a time period of 2π. Perform the simulation with 1000 steps and store the resulting points in two vectors U and V. Plot the values of U against the values of V. What shape do you get as the step size tends to 0?

2.6 Exercise
Approximate the stochastic process
$$dU_t = V_t\,dW_t^1 \qquad dV_t = -U_t\,dW_t^1$$
with initial condition (U0, V0) = (1, 0), where Wt1 is a 1-d Wiener process, using the Euler-Maruyama scheme. You should do this by writing a function simulate_process which takes a vector containing the values of W and a vector of the associated times. Plot the values of U against V for a W simulated using Wiener's construction. See what happens as the grid size is refined.
3 Visualising ODEs
You can visualise an Ordinary Differential Equation (ODE) as a vector field.
As an example of an ODE in derivative notation, consider
$$\frac{d}{dt}\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} y_t \\ -x_t \end{pmatrix}.$$
We can also write it in differential notation
$$d\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} y_t \\ -x_t \end{pmatrix} dt.$$
Or we can write it as two equations rather than a vector equation:
$$dx_t = y_t\,dt \qquad dy_t = -x_t\,dt.$$
The subscript t just means "evaluated at time t." This ODE associates the vector (y, −x) with every point (x, y) ∈ R2.
You can see the code to draw the vector field in the notebook; it uses the arrow function of matplotlib.
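Here is a minimal sketch of how such a plot could be produced (my reconstruction; the notebook's actual code is not shown here):

```python
import numpy as np
import matplotlib.pyplot as plt

# Draw the vector field (x, y) -> (y, -x) on a grid using matplotlib arrows.
fig, ax = plt.subplots()
for x in np.linspace(-2, 2, 11):
    for y in np.linspace(-2, 2, 11):
        ax.arrow(x, y, 0.1 * y, -0.1 * x, head_width=0.05, length_includes_head=True)
ax.set_aspect('equal')
plt.show()
```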
We can approximate the ODE with the Euler scheme, which means following the current vector for a small time δt:
$$\delta x_t = y_t\,\delta t \qquad \delta y_t = -x_t\,\delta t$$
In the notebook you can see the code used to find approximate solutions and then plot the resulting trajectory. However, in this talk I want to focus on the pictures, not how they were generated.
In the limit as δt → 0 this converges to the solution of the ODE.
In Python, you can read an image as an array of numbers using cv2.imread and then perform all kinds of transformations on it, then display it using matplotlib.pyplot.imshow. For example, I deformed these pictures of Ito by shifting some rows of the array to the right following a cosine pattern.
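Here is a rough sketch of that kind of row-shifting deformation (my reconstruction; the file name is hypothetical and the exact shift pattern used in the notebook is not shown):

```python
import numpy as np
import cv2
import matplotlib.pyplot as plt

img = cv2.imread('ito.png')  # hypothetical file name
rows, cols = img.shape[:2]
warped = np.empty_like(img)
for r in range(rows):
    shift = int(20 * np.cos(2 * np.pi * r / rows))  # cosine-pattern horizontal shift
    warped[r] = np.roll(img[r], shift, axis=0)      # shift this row to the right
plt.imshow(cv2.cvtColor(warped, cv2.COLOR_BGR2RGB))
plt.show()
```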
An important reason why vector fields are a good way of visualising ODEs is that they transform correctly under deformations. If we deform the picture of an ODE, the vector field transforms to a new vector field, and this gives us a new ODE. Will the transformed solutions be solutions of this new ODE? Yes, they will.
4 Visualising SDEs
So far we’ve considered SDEs given by the limit of the discrete time scheme
$$\delta X_t = a(X_t, t)\,\delta t + b(X_t, t)\,\delta W_t$$
But what happens if we consider more general curved schemes? We could consider
$$\delta X_t = a_1(X_t, t)\,\delta t + a_2(X_t, t)(\delta t)^2 + \dots + b_1(X_t, t)\,\delta W_t + b_2(X_t, t)(\delta W_t)^2 + \dots$$
These schemes are curved because they are not linear in δt and δW.
It turns out that thinking about such schemes is the key to visualising SDEs. We will focus on the limit of stochastic difference equations of the form
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
where γ(x, s) is a smooth function satisfying γ(x, 0) = x. The last condition ensures that if δWt = 0 then δXt = 0 too. So Xt only moves when the Brownian motion moves.
Notice that although this is very general as a scheme involving δWt, there is no δt term at all.
The advantage of a curved scheme is that it is easy to visualise. Associated to the SDE for Xt ∈ Rn
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
we may define a curve γx : R → Rn at each point x ∈ Rn by
$$\gamma_x(s) = \gamma(x, s).$$
We will write points in R2 as row vectors (x, y). We can then define γ : R2 × R → R2 by
$$\gamma((x, y), s) = (x, y) + (y, -x)s + 3(x, y)s^2.$$
We can now generate a noise process W and solve the difference equation. We refine W using Wiener’s construction to ensure convergence as δt → 0.
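Here is a sketch (my own, with an arbitrary horizon and grid) of solving the difference equation for this particular γ; the Wiener-construction refinement step is omitted:

```python
import numpy as np

def gamma(p, s):
    """The example 2-jet: gamma((x, y), s) = (x, y) + (y, -x) s + 3 (x, y) s^2."""
    x, y = p
    return np.array([x + y * s + 3 * x * s**2,
                     y - x * s + 3 * y * s**2])

def solve_curved_scheme(p0, T, n_steps):
    dt = T / n_steps
    dW = np.sqrt(dt) * np.random.randn(n_steps)  # Wiener increments
    path = np.zeros((n_steps + 1, 2))
    path[0] = p0
    for i in range(n_steps):
        path[i + 1] = gamma(path[i], dW[i])  # X_{t+dt} = gamma(X_t, dW_t)
    return path

path = solve_curved_scheme(np.array([1.0, 0.0]), T=1.0, n_steps=10000)
```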
If we deform the plot just as we deformed the vector field, solutions map to solutions.
A vector field is a good way of drawing an ODE because if you deform the picture to obtain a new vector field, and hence a new ODE, the solutions are just a deformed version of the solutions to the original ODE.
The same holds if we transform a picture of an SDE drawn as a field of curves. When we deform the picture, we get a new field of curves, but the solutions will be just a deformed version of the solutions to the original SDE.
This result is trivial because we define the solution to the SDE in terms of following the curves, so of course if we apply a transform to the curves, we must apply the same transform to the solutions.
In fact, you can think of a vector field as a field of curves where we’ve just drawn the curves to first order (i.e. as their tangent vectors).
The difference between an ODE and an SDE is that for an ODE you only need to know the curves up to first order to solve the equation; for an SDE you need to know the curves up to second order. We will sketch the reason for this shortly.
For an SDE, the curvature of each γx matters. For an ODE curvature is unimportant so a vector field is all you need.
We say that two curves have the same n-jet if their polynomial expansions agree to order n. SDEs are given by 2-jets of curves. ODEs are given by 1-jets of curves.
4.1 Curved schemes and linear schemes
The relationship between curved schemes without a δt term and linear schemes with a δt term is very interesting and important. We will see that (δWt)² terms and δt terms are essentially equivalent.
We can take Taylor series to write this in terms of powers of δWt:
$$\delta X_t = \gamma(X_t, \delta W_t) - X_t = \gamma'(X_t)\,\delta W_t + \tfrac{1}{2}\gamma''(X_t)(\delta W_t)^2 + \dots$$
Here, if we write γ(x, s), then we use ′ to denote partial derivatives with respect to s taken when s = 0:
$$\gamma'(X) := \frac{\partial \gamma}{\partial s}(X, 0), \qquad \gamma''(X) := \frac{\partial^2 \gamma}{\partial s^2}(X, 0).$$
Lemma: The limit of the curved scheme
$$X_{t+\delta t} = X_t + (\delta W_t)^\alpha$$
as δt → 0 with X0 = 0 is
$$X_t = \begin{cases} W_t & \alpha = 1 \\ t & \alpha = 2 \\ 0 & \alpha \geq 3. \end{cases}$$
Proof: The case α = 1 is obvious, so suppose α ≥ 2. Taking n steps up to time t, δt = t/n. So we can simulate Xt as
$$X_t = \sum_{i=1}^{n} \left( \sqrt{\tfrac{t}{n}}\, \varepsilon_i \right)^{\alpha}$$
where the εi are independent standard normals. We did this as an exercise in week 1!
We can compute the mean of Xt,
$$\mathbb{E}(X_t) = n \left( \tfrac{t}{n} \right)^{\frac{\alpha}{2}} \mathbb{E}(\varepsilon_i^\alpha) \to \begin{cases} t & \alpha = 2 \\ 0 & \alpha \geq 3, \end{cases}$$
and the variance,
$$\operatorname{Var}(X_t) = n \left( \tfrac{t}{n} \right)^{\alpha} \operatorname{Var}(\varepsilon_i^\alpha) \to 0.$$
Since the variance tends to 0, Xt is almost-surely equal to its mean. □
We have shown the equivalence of the following schemes as δt → 0:
$$X_{t+\delta t} = X_t + (\delta W_t)^2 \quad\text{is equivalent to}\quad \delta X_t = \delta t,$$
$$X_{t+\delta t} = X_t + (\delta W_t)^3 \quad\text{is equivalent to}\quad \delta X_t = 0.$$
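As a quick numerical check of the α = 2 case (my own sketch, not from the notes): summing squared Wiener increments over [0, t] concentrates at t.

```python
import numpy as np

t, n = 1.0, 100_000
eps = np.random.randn(n)
X_t = np.sum((np.sqrt(t / n) * eps) ** 2)  # sum of (delta W)^2 over n steps
print(X_t)  # close to t = 1.0; fluctuations shrink like 1/sqrt(n)
```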
This relates these curved schemes to the classical, linear, Euler-Maruyama scheme. We are saying that two schemes are equivalent if they have the same solutions as δt → 0.
As our proof of the Lemma indicates, the scaling of Brownian motion ensures that higher powers of δWt don't have any effect on the scheme, and the effect of a (δWt)² term is deterministic and equivalent to that of a δt term.
One can prove in general that these schemes are equivalent:
$$X_{t+\delta t} = \gamma(X_t, \delta W_t)$$
and
$$\delta X_t = \gamma'(X_t)\,\delta W_t + \tfrac{1}{2}\gamma''(X_t)(\delta W_t)^2.$$
This is equivalent to the linear scheme
$$\delta X_t = \tfrac{1}{2}\gamma''(X_t)\,\delta t + \gamma'(X_t)\,\delta W_t.$$
Writing this the other way round, the linear scheme
$$\delta X_t = a(X_t)\,\delta t + b(X_t)\,\delta W_t$$
is equivalent to the curved scheme
$$\delta X_t = b(X_t)\,\delta W_t + a(X_t)(\delta W_t)^2$$
using the curves
$$\gamma(x, s) = x + b(x)s + a(x)s^2.$$
5 Ito’s : Let Xt ∈ R solve the SDE
dXt =a(Xt,t)dt+b(Xt,t)dWt
with initial condition X0 . Suppose that f : R → R is smooth, then (f (X ), Xt ) ∈ R2 solves (1)
d(f (Xt )) = f ′ (Xt )a(Xt , t) + 2 f ′′ (Xt )b(Xt , t)2 dt + f ′ (Xt )b(Xt , t) dWt dXt =a(Xt,t)dt+b(Xt,t)dWt
with initial condition (f (X0 ), X0 ).
The last equation isn’t really interesting as its the one we started with. Since we are understanding SDEs
as numerical schemes, I’m listing both equations as you need both to actually simulate f(Xt). The solution to the SDE
with initial condition X0 = W0 is Xt = Wt. Taking f(x) = x2 we compute f′(x) = x, f′′(x) = 2. Hence
by Ito’s we may write
Equivalently, the solution to the SDE
is Yt = Wt2. We’ve solved an SDE!
5.1 Example
Take f(x) = sin(x). So
d(f(X))t = dt + XtdWt. d(W2)t =dt+WtdWt dYt =dt+WtdWt
d(sin(W))t =−12sin(Wt)dt+cos(Wt)dWt. Hence Xt = Wt, Yt = sin(Wt) must be the solution to the 2-dimensional SDEs
dYt =−12sin(Xt)dt+cos(Xt)dWt
with initial condition (X0 = 0, Y0 = 0). We’ve solved another SDE!
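As a quick numerical sanity check of the first result (my own sketch, not from the notes): simulate dY = dt + 2W dW with Euler-Maruyama and compare with W².

```python
import numpy as np

n_steps, T = 100_000, 1.0
dt = T / n_steps
dW = np.sqrt(dt) * np.random.randn(n_steps)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Euler-Maruyama for dY = dt + 2 W dW with Y_0 = 0
Y = np.zeros(n_steps + 1)
for i in range(n_steps):
    Y[i + 1] = Y[i] + dt + 2 * W[i] * dW[i]

print(abs(Y[-1] - W[-1]**2))  # small, and shrinks as the grid is refined
```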
Ito’s Lemma is often described as a stochastic version of the chain rule from classical calculus. To justify
this we make the following deduction about ODEs using the chain rule.
Let Xt ∈ R solve the ODE
$$dX_t = a(X_t, t)\,dt,$$
then (f(Xt), Xt) ∈ R2 solves the ODEs
$$d(f(X))_t = f'(X_t)\,a(X_t, t)\,dt$$
$$dX_t = a(X_t, t)\,dt.$$
The function f : R → R is a deformation of R.
When transforming ODEs we only have to think about the first derivative of f as ODEs only depend on
the 1-jet. This is why you only get f ′ terms in the ODE chain rule.
When transforming SDEs we must consider second order terms, so we get an f ′′ term.
If you accept the theory I have described about the correspondence between curved schemes and linear schemes we can use this to prove this version of Ito’s Lemma.
Proof: Our plan is as follows
• We are given an SDE written as the limit of a linear scheme using δt and δWt.
• We will write this as a curved scheme using δWt and δWt2.
• We will then be able to write down a curved scheme for the pair (Xt, f(Xt)).
• We will compute derivatives of this to write it as a linear scheme, once again using δt and δWt.
According to our correspondence, the solutions to the SDE
$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t$$
are given by the limits of the solutions to the difference equation
$$X_{t+\delta t} = \gamma(X_t, \delta W_t).$$