Continuous-time Markov chains
Continuous-time Markov chains
1
Continuous-Time Markov Chains
A stochastic process (Xt)t≥0 in continuous time, taking values in a
countable state space S ⊂ R is said to be a Continuous-Time
Markov Chain (CTMC) if, for all k ≥ 1, 0 ≤ t1 < t2 < · · · < tk+1
and i1, i2, . . . , ik+1 ∈ S,
P(Xtk+1 = ik+1|Xt1 = i1, · · · ,Xtk = ik)
= P(Xtk+1 = ik+1|Xtk = ik),
whenever the left hand side is well defined.
As for DTMC, it is often convenient to assume that
S = {1, 2, . . . , n} for some n ∈ N, or that S = N.
If P(Xt+s = k |Xs = j) for j , k ∈ S do not depend on s then we say
the CTMC is time homogeneous, and we can write p
(t)
j ,k . We
consider only time-homogeneous CTMCs in this course. By
convention we assume that they are right continuous, i.e.
P(limh↓0 Xt+h = Xt for all t ≥ 0) = 1.
We can put the probabilities p
(t)
i ,j into a matrix P
(t).
2
DTMC vs CTMC
For DTMC, if Xn = i then we waited for a geometric(1− pi ,i )
amount of time before jumping to a new state. At the time we
jump to a new state, the probability of jumping to j is
bi ,j = pi ,j/(1− pi ,i ). If pi ,i = 0 then bi ,j = pi ,j .
For a CTMC, if Xt = i , we wait an exponential(λi ) time and then
jump to a new state. The probability of jumping to j is bi ,j .
This is equivalent to the following: let (Ti ,j)j∈S be independent
exponential(qi ,j) random variables. We wait time T
′
i = min`∈S Ti ,`
at state i and then jump to the state k such that Ti ,k = T
′
i .
To see the equivalence, set qi ,j = λibi ,j then
T ′i ∼exponential(
∑
`∈S qi ,`) (where
∑
`∈S qi ,` = λi ) and the
probability that we jump to j is
P(Ti ,j < min
6̀=j
Ti ,`) =
qi ,j∑
`∈S qi ,`
= bi ,j .
3
Transition diagrams
For a DTMC we drew a diagram containing the transition
probabilities pi ,j .
For a CTMC we draw a transition diagram containing the
transition rates qi ,j .
The jump chain of a CTMC (Xt)t≥0 is the DTMC (X
J
n )n∈Z+
defined by X Jn = XTn where T0 = 0, and
Ti = inf{t > Ti−1 : Xt 6= XTi−1} are the jump times of the CTMC.
The transition probabilities of the jump chain are bi ,j . If λi = 0 we
set bi ,i = 1 for the jump chain (otherwise bi ,i = 0).
The transition diagram for the CTMC contains more information
than that of its jump chain since the latter does not tell us how
long the CTMC waits (on average) at each state.
4
Basic example: Poisson process
The transition diagram for the Poisson process (Nt)t≥0 with rate λ
is
0 1 2 3 · · ·
λ λ λ λ
The transition diagram for its jump process (NJn )n∈Z+ is
0 1 2 3 · · ·
1 1 1 1
5
Remarks:
Note that we have to wait for an independent exponential(λi ) time
on each successive visit to i
The waiting times must be exponential in order for the process to
be memoryless/Markovian.
E.g. suppose X0 = j and T1 is the first time the CTMC leaves j .
Then the CTMC (Xt)t≥0 must satisfy
P(T1 > t + s|T1 > s)
= P
(
Xv = j ,∀v ∈ [0, t + s]
∣∣Xu = j ,∀u ∈ [0, s])
= P
(
Xv = j ,∀v ∈ [s, t + s]
∣∣Xs = j) (Markov)
= P
(
Xv = j ,∀v ∈ [0, t]
∣∣X0 = j) (homogeneous)
= P(T1 > t)
So T1 must have an exponential distribution.
6
Example: continuous time random walk
Consider a CTMC with state space Z that waits for an
exponential(λi ) time in state i ∈ Z before jumping to i + 1 with
probability p and i − 1 with probability 1− p. Then the transition
diagram is
0 1 2 · · ·−1−2· · ·
λ0p
λ1(1− p)
λ1p
λ2(1− p)
λ2p
λ3(1− p)λ0(1− p)λ−1(1− p)λ−2(1− p)
λ−3p λ−2p λ−1p
The transition diagram of the jump chain is
0 1 2 · · ·−1−2· · ·
p
(1− p)
p
(1− p)
p
(1− p)(1− p)(1− p)(1− p)
p p p
If λi = λ for each i we call the CTMC continuous time random
walk.
7
Exercise:
Let (Xt)t≥0 be a CTMC with state space S = {1, 2, 3} and
transition rates qi ,j = i for each i , j with j 6= i .
I Draw the transition diagram for (Xt)t≥0
I Draw the transition diagram for the jump chain (X Jn )n∈Z+ .
8
Classification of states and chains
As for DTMC we can ask about the following
(1) Whether a state i is absorbing (i.e. λi = 0)
(2) Whether i → j (i.e. p(t)i ,j > 0 for some t)
(3) Communicating classes (i ↔ j if i → j and j → i)
(4) Irreducibility (i.e. i ↔ j for every i , j ∈ S)
(5) Hitting probabilities
(i.e. hi ,A = P(Xt ∈ A for some t ≥ 0|X0 = i))
(6) Recurrence (for irreducible chains: hi ,j = 1 for every i , j ∈ S)
and transience
(7) Expected hitting times (i.e. mi ,A, which is the expected time
to reach a state in A starting from i)
(8) Positive recurrence (for irreducible chains: mi ,j <∞ for every
i , j ∈ S)
(9) The long run behaviour of the chain (limiting proportion of
time spent in state i etc.)
9
Some things are the same...
(1) A state i is an absorbing state for a CTMC if and only if
λi = 0, if and only if bi ,i = 1, if and only if i is an absorbing state
for the jump chain.
Similarly, items (2) to (6) above only depend on (bi ,j)i ,j∈S and
hence the CTMC has the given property if and only if its jump
chain has that property.
On the other hand, items (7)-(9) depend on how long we wait (on
average) at every state, so these properties will in general differ for
the CTMC and its jump chain.
10
Exercise:
Consider a CTMC with state space S = {1, 2, 3} and transition
rates q2,1 = λp, q2,3 = λ(1− p), where λ > 0 and p ∈ (0, 1), and
all other transition rates are 0.
I Which states are absorbing?
I Find m2,{1,3}.
I Find the hitting probability h2,1.
11
Remarks
One can construct CTMC that are explosive in the sense that the
process can jump infinitely many times in a finite amount of time.
E.g. if S = Z+ and λi = λi for some λ > 1 and bi ,i+1 = 1 for each
i then the CTMC is explosive. We henceforth assume that our
CTMC is not explosive.
A similar argument to the one we saw in the DTMC setting shows
that an irreducible finite-state CTMC is positive recurrent.
12
Long run behaviour
As for DTMC, the limiting proportion of time spent by a CTMC in
a given state can be random (recall our reducible examples).
For an irreducible transient CTMC, the limiting proportion of time
spent in each state is 0, and there is no limiting distribution.
An irreducible positive recurrent CTMC is ergodic, i.e. the limiting
distribution exists and does not depend on the initial distribution.
The limiting distribution can be specified in terms of a quantity
similar to the “expected return time” that appeared in DTMC.
This is also equal to the stationary distribution of the CTMC (see
next slide)
13
Stationary distribution
Recall that for a DTMC, a distribution π = (πi )i∈S is a stationary
distribution for (a DTMC with transition matrix) P if πP = π.
This can be rewritten as π(P − I ) = 0, where I is the |S| × |S|
identity matrix.
A distribution π = (πi )i∈S is called a stationary distribution for the
family (P(t))t≥0 if πP
(t) = π for each t ≥ 0.
Let qi ,i := −
∑
j 6=i qi ,j = −λi and let Q denote the matrix (called
the (infinitesimal) generator, or rate matrix) whose i , jth entry is
qi ,j .
For non-explosive CTMCs a distribution π = (πi )i∈S is a stationary
distribution for (a Markov chain with rate matrix) Q if and only if
πQ = 0.
This is equivalent to the set of equations πiλi =
∑
j 6=i πjqj ,i for
i ∈ S which are referred to as the full balance equations.
14
The main result
Theorem: An irreducible and positive recurrent CTMC has a
unique stationary distribution π. For such a CTMC, the limiting
proportion of time spent in state i is πi and the limiting
distribution is π (irrespective of the initial distribution).
Note that periodicity is not an issue for a CTMC.
For i ∈ S, let T (i)1 = inf{t > 0 : Xt 6= i} and
T i ,i = inf{t > T (i)1 : Xt = i}. Then
πi =
E[T (i)1 |X0 = i ]
E[T i ,i |X0 = i ]
.
15
The stationary distribution continued
The quantity
E[T (i)1 |X0=i ]
E[T i,i |X0=i ]
is a bit like the proportion of time spent
at i up to the first time that we return to i (start from i initially).
The numerator of this quantity is 1
λi
= − 1
qi,i
. By a first step
analysis, the denominator is
1
λi
+
∑
j∈S
bi ,jmj ,i .
This is because on average we take time 1/λi to escape from i , at
which point we go to j with probability bi ,j and then we have to
get from j to i (which takes time mj ,i on average).
16
Reversibility
Suppose that we find a distribution π = (πi )i∈S such that
πiqi ,j = πjqj ,i , for all i , j ∈ S.
Then we say that Q is reversible.
The above equations are called the detailed balance equations.
Note that (exercise!) if π satisfies the detailed balance equations
then it satisfies the full balance equations too. Thus any
distribution satisfying the detailed balance equations is a stationary
distribution.
17
Explosive CTMC revisited
For explosive CTMCs, it is possible to have a solution to
πQ = 0,
with
∑
j πj = 1 that is not the stationary distribution.
I Take the CTMC with qi ,i+1 = λip for i ≥ 0, and
qi ,i−1 = (1− p)λi for i ≥ 1 as the only non-zero transition
rates.
I If p > 1− p, then the chain is transient, but choosing λi = λi
there is a solution to
πQ = 0
of the form πi = π0
(
p
(1−p)λ
)i
. Thus we can get a solution
that sums to 1 if λ > p/(1− p).
18
Expected hitting times
Theorem: For a CTMC with state space S, and A ⊂ S, the vector
of mean hitting times (mi ,A)i∈S is the minimal non-negative
solution to
mi ,A =
{
0, if i ∈ A,
1
λi
+
∑
j∈S bi ,jmj ,A, if i /∈ A.
19
Exercise:
Let (Xt)t≥0 be a CTMC with the following transition diagram
1 2 3
10
5
5
1
I Find m1,3.
I Find the stationary distribution for this chain.
I Find the stationary distribution for the jump chain.
20
The Chapman-Kolmogorov equations
Observe that
p
(s+t)
i ,j =
∑
k
P(Xs+t = j |Xs = k,X0 = i)P(Xs = k |X0 = i)
=
∑
k
p
(s)
i ,k p
(t)
k,j .
These are the Chapman-Kolmogorov equations for a CTMC. In
matrix form, the Chapman-Kolmogorov equations can be expressed
as
P(t+s) = P(s)P(t).
21
Finding the transition probabilities
Thus far we have not actually computed P(t).
By analogy with the discrete-time case, we might hope that we can
write P(t) = Pt for some matrix P.
If t = m (a positive integer), the C-K equations tell us that
P(m) = (P(1))m and our hope is fulfilled, but if t < 1?
We want a single object (like P = P(1) in the discrete case) that
encodes the information of the chain.
It turns out that the generator Q is our single object. In fact we
have the so-called Forward and Backward equations:
d
dt
P(t) = QP(t) (‡) backward equations
and,
d
dt
P(t) = P(t)Q. (†) forward equations
22
Solving the forward and backward equations
For (non-explosive) CTMCs, the matrix Q determines the
transition probability completely by solving the backward or
forward equations to get
P(t) = exp(tQ)
:=
∞∑
k=0
1
k!
tkQk ,
subject to P(0) = I .
23
Example: The Poisson process
The Poisson process is a CTMC with generator
Q =
−λ λ 0 0 0 0
. . .
0 −λ λ 0 0 0
. . .
0 0 −λ λ 0 0
. . .
0 0 0 −λ λ 0
. . .
. . .
. . .
. . .
. . .
. . .
. . .
. . .
.
24
Poisson process transition probabilities
Can we derive the transition probabilities from Q? We could
I Compute exp(tQ).
I Solve the Kolmogorov backward or forward differential
equations.
For the first case, one can show that (Qn)i ,j = 0 if
j /∈ {i , i + 1, . . . , i + n}, and otherwise
(Qn)i ,j = λ
n
(
n
j − i
)
(−1)n−(j−i).
Then for j ≥ i ,
∞∑
n=0
tn
n!
(Qn)i ,j =
(tλ)j−i
(j − i)!
e−tλ.
Or we can directly solve e.g. the forward equations to find p
(t)
0,k .
25
Poisson process transition probabilities
Now,
d
dt
p
(t)
0,0 = −λp
(t)
0,0
=⇒ p(t)0,0 = ce
−λt .
So with the condition that p
(0)
0,0 = 1 we get p
(t)
0,0 = e
−λt .Similarly
d
dt
p
(t)
0,k =
k∑
j=0
p
(t)
0,j qj ,k = λ(p
(t)
0,k−1 − p
(t)
0,k)
By induction, we can show that p
(t)
0,k = e
−λt(λt)k/k!.
26
Interpretation of the generator
For small h,
P(Xt+h = k|Xt = j) = p
(h)
j ,k
≈ (I + hQ)j ,k
=
{
hqj ,k , if j 6= k ,
1 + hqj ,j , if j = k .
So indeed for k 6= j we can think of qj ,k as the rate of transition
from j to k .
27
Example - birth and death processes
Let (Xt)t≥0 be a CTMC on S = Z+, where:
I Xt represents the number of ‘people’ in a system at time t.
I Whenever there are n ‘people’ in the system
I new arrivals enter (by birth or immigration) the system at rate
νn
I ‘people’ leave (or die from) the system at rate µn
I arrivals and departures occur independently of one another
I (Xt)t≥0 is a birth-and-death process with arrival (or birth)
rates (νn)n∈S and departure (or death) rates (µn)n∈S .
28
Generator of a birth and death process
The generator of such a birth and death process has the form
Q =
−ν0 ν0 0 0 0
. . .
µ1 −(µ1 + ν1) ν1 0 0
. . .
0 µ2 −(µ2 + ν2) ν2 0
. . .
0 0 µ3 −(µ3 + ν3) ν3
. . .
. . .
. . .
. . .
. . .
. . .
. . .
The CTMC evolves by remaining in state k for an
exponentially-distributed time with rate νk + µk , then it moves to
state k + 1 with probability bk,k+1 = νk/(νk + µk) and state k − 1
with probability bk,k−1 = µk/(νk + µk), and so on.
29
Example
Consider a population in which there are no births (νi = 0 for
i ≥ 0) and, for i ≥ 1, the death rates are µi .
Suppose Xt is the population size at time t and so (Xt)t≥0 is a
CTMC.
1. Find expressions for p
(t)
i ,i and p
(t)
i ,i−1 for i ≥ 1.
2. Given that the population size is two at a particular time,
calculate the probability that no more than one death will
occur within the next one unit of time.
30
Birth and death stationary distribution
Assume νi > 0 and µi > 0 for all i . Assuming non-explosivity, we
derive the stationary distribution (if it exists) by solving πQ = 0.
In fact we can solve the detailed balance equations:
νkπk = µk+1πk+1, k ∈ Z+
This has solution
πk = π0
k∏
`=1
ν`−1
µ`
.
So a stationary distribution exists if and only if
∞∑
k=0
k∏
`=1
ν`−1
µ`
<∞ in which case π0 = ( ∞∑ k=0 k∏ `=1 ν`−1 µ` )−1 . 31 Exercises I Compute the stationary distribution of an ‘M/M/1 queue’, which is a birth and death process with νi = ν, i ≥ 0, and µi = µ, i ≥ 1. I Compute the stationary distribution of a birth and death process with constant birth rate νi = ν, i ≥ 0, and unit per capita death rate µi = i , i ≥ 1. 32 Finding the transition probabilities - the details If t and h are nonnegative real numbers, we can write P(t+h) − P(t) h = P(t) [ P(h) − I h ] = [ P(h) − I h ] P(t) This suggests that we should investigate the existence of the derivative Q ∗ ≡ lim h→0+ P(h) − I h . Under our assumptions about the chain (non-explosive, etc.), Q∗ exists and equals Q. 33 Forward and backward equations Since P(t+h) − P(t) h = P(t) [ P(h) − I h ] = [ P(h) − I h ] P(t) if we can take the limits through the matrix multiplication then we get d dt P(t) = QP(t) (‡) backward equations and, similarly, d dt P(t) = P(t)Q. (†) forward equations 34 Why does Q∗ = Q? Let T1 denote the time of the first jump of our process and T2 denote the time of the second jump. Then P(T1 > h|X0 = i) = e−λih = 1− λih + o(h) = 1 + qi ,ih + o(h).
P(T2 < h|X0 = i) =
∑
k∈S
P(T1 < h,T2 − T1 < h − T1,XT1 = k |X0 = i)
≤
∑
k∈S
P(T1 < h,T2 − T1 < h,XT1 = k |X0 = i)
=
∑
k∈S
(1− e−λkh)(1− e−λih)bi ,k
=
∑
k∈S
(λkh + o(h))(λih + o(h))bi ,k
= h2
∑
k∈S
(λk + o(1))(λi + o(1))bi ,k = o(h),
provided that λk do not grow too fast with k . 35
Why does Q∗ = Q?
So, P(T1 > h|X0 = i) = 1− λih + o(h). Therefore
P(T1 < h|X0 = i) = λih + o(h).
Also P(XT1 = j |X0 = i) = bi ,j = qi ,j/λi .
Also P(T2 < h|X0 = i) = o(h).
So
p
(h)
i ,i = P(T1 > h|X0 = i)+P(T2 < h,Xh = i |X0 = i) = 1−λih+o(h).
Similarly, for j 6= i ,
p
(h)
i ,j = P(XT1 = j ,T1 < h,T2 > h|X0 = i) + P(T2 < h,Xh = j |X0 = i)
=
qi ,j
λi
λih + o(h).
Now you can see that (P(h) − I )i ,j = qi ,jh + o(h). Divide by h and
take the limit...
36
Justifying the forward and backward equations
We write “hope that” since we need to justify pushing the limits
through the (possibly infinite) sums
lim
h→0
∑
k∈S
p
(t)
i ,k
[
P(h) − I
h
]
k,j
=
∑
k∈S
p
(t)
i ,k lim
h→0
[
P(h) − I
h
]
k,j
lim
h→0
∑
k∈S
[
P(h) − I
h
]
i ,k
p
(t)
k,j =
∑
k∈S
lim
h→0
[
P(h) − I
h
]
i ,k
p
(t)
k,j
for each i , j ∈ S .
If S is finite, then there is no problem and both (‡) and (†) hold.
37
Justifying the backward equations
In fact, we know from Fatou’s Lemma that, for j , k ∈ S ,
lim inf
h→0+
p
(t+h)
jk − p
(t)
jk
h
= lim inf
h→0+
∑
i∈S
(p
(h)
ji − δji )p
(t)
ik
h
≥
∑
i∈S
lim
h→0+
(p
(h)
ji − δji )p
(t)
ik
h
=
∑
i∈S
ajip
(t)
ik .
Similarly,
lim inf
h→0+
p
(t+h)
jk − p
(t)
jk
h
≥
∑
i∈S
p
(t)
ji aik .
38
Justifying the backward equations
We can show that the inequality in the first expression is, in fact,
an equality, as follows. For N > j ,
∑
i∈S
[
p
(h)
ji − δji
]
p
(t)
ik
h
=
N∑
i=1
[
p
(h)
ji − δji
]
p
(t)
ik
h
+
∞∑
i=N+1
p
(h)
ji p
(t)
ik
h
≤
N∑
i=1
[
p
(h)
ji − δji
]
p
(t)
ik
h
+
∞∑
i=N+1
p
(h)
ji
h
=
N∑
i=1
[
p
(h)
ji − δji
]
p
(t)
ik
h
+
1−
∑N
i=1
p
(h)
ji
h
=
N∑
i=1
[
p
(h)
ji − δji
] [
p
(t)
ik − 1
]
h
.
39
Justifying the backward equations
Therefore
lim sup
h→0+
∑
i∈S
[
p
(h)
ji − δji
]
p
(t)
ik
h
≤
N∑
i=1
lim
h→0+
[
p
(h)
ji − δji
] [
p
(t)
ik − 1
]
h
=
N∑
i=1
ajip
(t)
ik −
N∑
i=1
aji .
Now we let N →∞ and use the fact that
∑∞
i=1
aji = 0 to derive
lim sup
h→0+
p
(t+h)
jk − p
(t)
jk
h
≤
∑
i∈S
ajip
(t)
ik ,
which proves that p
(t)
jk is differentiable (since lim inf ≥ lim sup) and
dp
(t)
jk
dt
=
∑
i∈S
ajip
(t)
ik .
40
Justifying the forward and backward equations
So we have justified the interchange leading to (‡). However, the
interchange leading to (†) does not hold for explosive CTMCs.
41