
Chapter 2

Continuous-Time Markov Chains

A continuous-time stochastic process X = {Xt, t ∈ ℝ+} on a countable set S is a collection of random variables Xt defined on a probability space (Ω, F, ℙ), with values in S. Time is represented here by the subscript t, a non-negative real number, which is the reason we refer to continuous time. As in the discrete case, the set S is also called the state space.

A path ω ∈ Ω of the process is a function t → Xt(ω) from ℝ+ to S. To ensure that the probability of every event depending on this process can be determined from its finite-dimensional distributions, we assume that the process X is right-continuous, in other words for all ω ∈ Ω and t ≥ 0, there exists ε > 0 such that:

Xs(ω) = Xt(ω) for all t ≤ s ≤ t + ε.

For more details on this classical question falling within measure theory, see, for example, [NOR 97].

The state space S being countable and the process being right-continuous, the paths are piecewise constant and right-continuous functions and we distinguish, for each one of them, three types of possible behaviors that are described below and illustrated by Figures 2.1, 2.2 and 2.3.

1) The path has a finite number of jumps on ℝ+ and thus becomes constant after a certain time.

2) The path has an infinite number of jumps on ℝ+, but only finitely many on every finite interval.


3) The path has an infinite number of jumps on a finite interval. Then there exists a first finite explosion time ζ after which the process may start up again from a state of S; it can then explode again, maybe an infinite number of times, or not.

Figure 2.1. Example of a type 1 path. The path has a finite number of jumps (5 here) and, therefore, becomes constant after a certain time (T5 here). We have Tn = ∞ and Sn = ∞ for n ≥ 6

Figure 2.2. Example of a type 2 path. The path has an infinite number of jumps but only finitely many on every finite interval

We denote by T1, T2, . . . the jump times of the process X and by S1, S2, . . . the sequence of successive sojourn times spent in states of S, which we define by the relations:

T0 = 0, Tn+1 = inf{t ≥ Tn | Xt ≠ XTn},


for n ≥ 0 with the convention inf ∅ = ∞ and, for n ≥ 1,

Sn = Tn − Tn−1 if Tn−1 < ∞, and Sn = ∞ otherwise.

Figure 2.3. Example of a type 3 path. The path has an infinite number of jumps on a finite interval of time

If Tn+1 = ∞ then we define XTn+1 = XTn and otherwise we do not define X∞. The property of right-continuity of the paths ensures that Sn > 0, for all n ≥ 1, which means that S does not contain any instantaneous states, in other words states in which the process stays zero time. The first explosion time ζ is defined by:

ζ = lim_{n→∞} Tn = Σ_{n=1}^∞ Sn

and represents the occurrence instant of the first discontinuity, which is not a jump.

If the first explosion time ζ of the process is finite with a non-zero probability then Xζ− is not in S. We then add a boundary state, denoted by Δ, in such a way that we have Xζ− = Δ, that is ζ = inf{t > 0 | Xt− = Δ}. The process can then continue


its evolution after time ζ by a return from the boundary state Δ to the state space S with a certain probability distribution. This return to the state space S can possibly produce another explosion time, and so forth. In this case, the state Δ is actually an instantaneous state, which means that the time spent in Δ is equal to 0. The theoretical problems related to the study of these explosion times are beyond the scope of this book. They are analyzed in a very rigorous way in [CHU 67], [FRE 83] and [AND 91] among others. Nevertheless, it is important to ensure that the studied Markov chain is not explosive, or in other words that we have ζ = ∞ with probability 1.

To study this phenomenon in a simple way, we will not focus on what happens after the first explosion time ζ of the process. Specifically, we consider that from the explosion time ζ, the process stays forever in the boundary state Δ and does not, therefore, return to the state space S. We then assume that the state Δ is a state we call absorbing, in other words a state that the process never leaves. Thus we have Xt = Δ, for all t ≥ ζ. The process constructed in this way is called the minimal process to indicate that its activity period, that is its evolution period in the states of S, is minimal. Indeed, the total amount of time spent in the states of S is then equal to ζ whereas if, from time ζ, the process returns to the states of S, the total amount of time spent in S will be greater than ζ.

2.1. Definitions and properties

DEFINITION 2.1.– A stochastic process X = {Xt, t ∈ ℝ+} with values in a countable set S is a continuous-time Markov chain if for all n ≥ 0, for all instants 0 ≤ s0 < · · · < sn < s < t and for all states i0, . . . , in, i, j ∈ S, we have:

ℙ{Xt = j | Xs = i, Xsn = in, . . . , Xs0 = i0} = ℙ{Xt = j | Xs = i}.

DEFINITION 2.2.– A continuous-time Markov chain X = {Xt, t ∈ ℝ+} is homogeneous if, for all t, s ≥ 0 and i, j ∈ S, we have:

ℙ{Xt+s = j | Xs = i} = ℙ{Xt = j | X0 = i}.

In the following, every Markov chain will be considered homogeneous.

For t ≥ 0, we denote by Ft the σ-algebra generated by {Xu, 0 ≤ u ≤ t}, that is by the events {Xu = i}, for u ≤ t and i ∈ S.

DEFINITION 2.3.– A random variable T with values in [0, ∞] is called a stopping time for X if for all t ≥ 0, we have {T ≤ t} ∈ Ft, that is if the event {T ≤ t} is completely determined by the history {Xu, 0 ≤ u ≤ t} of the process X before t.


For instance, the jump times Tn are stopping times. The following theorem shows that the Markov property is not only valid at deterministic times but also at random times, provided that these times are stopping times.

THEOREM 2.1.– STRONG MARKOV PROPERTY.– If X = {Xt, t ∈ ℝ+} is a Markov chain and if T is a stopping time for X then, for all i ∈ S, conditional on {T < ∞} ∩ {XT = i}, the process {XT+t, t ∈ ℝ+} is a Markov chain with initial distribution δi, independent of the process {Xs, s ≤ T}.

PROOF.– A detailed proof of this theorem, which requires a good knowledge of measure theory, can be found in [NOR 97].

2.2. Transition functions and infinitesimal generator

Let X = {Xt, t ∈ ℝ+} be a continuous-time Markov chain on a countable state space S. For all i, j ∈ S and t ≥ 0, we set Pi,j(t) = ℙ{Xt = j | X0 = i} and we define the matrix P(t) by P(t) = (Pi,j(t))i,j∈S. The functions Pi,j(t) are called the transition functions. The joint distribution of the variables Xt is given by lemma 2.1.

LEMMA 2.1.– If X = {Xt, t ∈ ℝ+} is a continuous-time Markov chain then, for all n ≥ 1, for all instants 0 < t1 < · · · < tn and for all states i0, i1, . . . , in ∈ S, we have:

ℙ{Xtn = in, Xtn−1 = in−1, . . . , Xt1 = i1 | X0 = i0} = Pi0,i1(t1) Pi1,i2(t2 − t1) · · · Pin−1,in(tn − tn−1).

PROOF.– The result is true for n = 1 from the definition of the transition functions Pi,j(t). Let us assume that the result is true at step n − 1. By conditioning and then using the Markov property as well as the homogeneity of X, we have:

ℙ{Xtn = in, Xtn−1 = in−1, . . . , Xt1 = i1 | X0 = i0}
= ℙ{Xtn = in | Xtn−1 = in−1} ℙ{Xtn−1 = in−1, . . . , Xt1 = i1 | X0 = i0}
= Pi0,i1(t1) · · · Pin−2,in−1(tn−1 − tn−2) Pin−1,in(tn − tn−1),

which completes the proof.

At time t = 0, we have, by definition, P(0) = I, where I denotes the identity matrix whose dimension is defined by the context. The functions Pi,j(t) being undefined for t < 0, we will simply write t → 0 to represent the limit on the right at point 0. The following lemma shows that, since the paths are right-continuous, the functions Pi,j(t) are right-continuous at 0.


LEMMA 2.2.– The transition functions Pi,j(t) are right-continuous at 0, that is for all i, j ∈ S, we have:

lim_{t→0} Pi,j(t) = Pi,j(0) = 1_{i=j}.

PROOF.– For all j ∈ S, we have 0 ≤ 1_{Xt=j} ≤ 1 and, since the paths of the process X are right-continuous, we have:

lim_{t→0} 1_{Xt=j} = 1_{X0=j}.

From the dominated convergence theorem, we obtain, for all i ∈ S,

lim_{t→0} 𝔼{1_{Xt=j} | X0 = i} = 𝔼{1_{X0=j} | X0 = i},

that is

lim_{t→0} Pi,j(t) = Pi,j(0) = 1_{i=j},

which completes this proof.

The process we focus on being the minimal process, we have, in addition, for all t ≥ 0 and i, j ∈ S,

ℙ{Xt = j | X0 = Δ} = 0, ℙ{Xt = Δ | X0 = Δ} = 1

and

ℙ{Xt = Δ | X0 = i} = 1 − Σ_{j∈S} Pi,j(t).

LEMMA 2.3.– For all s, t ≥ 0, we have P(t + s) = P(t)P(s), that is for all i, j ∈ S,

Pi,j(t + s) = Σ_{k∈S} Pi,k(t) Pk,j(s). [2.1]

PROOF.– Let i, j ∈ S and s, t ≥ 0. Since the boundary state Δ is an absorbing state, that is a state that the process never leaves, we have:

ℙ{Xt+s = j, Xt = Δ | X0 = i} = 0,


hence:

Pi,j(t + s) = ℙ{Xt+s = j | X0 = i}
= Σ_{k∈S∪{Δ}} ℙ{Xt+s = j, Xt = k | X0 = i}
= Σ_{k∈S} ℙ{Xt+s = j, Xt = k | X0 = i}
= Σ_{k∈S} ℙ{Xt+s = j | Xt = k, X0 = i} ℙ{Xt = k | X0 = i}
= Σ_{k∈S} Pi,k(t) ℙ{Xt+s = j | Xt = k}
= Σ_{k∈S} Pi,k(t) ℙ{Xs = j | X0 = k}
= Σ_{k∈S} Pi,k(t) Pk,j(s),

where the fifth equality is due to the Markov property and the sixth due to the homogeneity of the process X.

Equation [2.1] is the Chapman–Kolmogorov equation. It expresses the fact that in order to be in state j at time t + s, starting from state i, the process has to be in some state k at time t and then, starting from this state k, reach state j after a further time s. As we have seen in the proof of this equation, it is based on the Markov property and the homogeneity of the Markov chain.
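As a concrete illustration, the Chapman–Kolmogorov equation can be checked numerically on a small finite chain. The sketch below is ours, not the book's: it takes P(t) = e^{tQ} for a hypothetical three-state generator Q (anticipating sections 2.3–2.5, where P(t) is shown to solve P′(t) = QP(t) with P(0) = I; a finite state space cannot explode) and compares P(t + s) with P(t)P(s).

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator on S = {0, 1, 2}: off-diagonal entries >= 0, rows sum to 0.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 4.0, -6.0,  2.0],
              [ 0.5,  0.5, -1.0]])

def transition_matrix(t):
    """P(t) = exp(tQ), valid on a finite state space."""
    return expm(t * Q)

t, s = 0.3, 1.1
lhs = transition_matrix(t + s)                      # P(t + s)
rhs = transition_matrix(t) @ transition_matrix(s)   # P(t) P(s)
print(np.max(np.abs(lhs - rhs)))                    # ~1e-16: relation [2.1] holds
```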

Theorem 2.2 shows that the time T1 spent in the first state visited by a Markov chain X follows an exponential distribution. To prove it, we will use the following lemma that allows us to express any non-negative real number as the limit of a decreasing sequence of dyadic numbers, that is numbers of the form k/2^n, with k, n ∈ ℕ. For all n ∈ ℕ, we define the set Dn of non-negative dyadic numbers of order n by:

Dn = {k/2^n, k ∈ ℕ}.

LEMMA 2.4.– For all t ∈ ℝ+, the sequence (tn)n≥0 defined by:

tn = inf{s ∈ Dn | s ≥ t}

is decreasing and converges to t.


PROOF.– Let t ∈ ℝ+. For all n ≥ 0, we have tn ≥ t and if we define the sets En by:

En = {s ∈ Dn | s ≥ t},

we have En ⊆ En+1. Indeed, if s = k/2^n ∈ En then we also have s = 2k/2^{n+1} ∈ En+1. It follows that t ≤ tn+1 ≤ tn, that is the sequence (tn)n≥0 is decreasing and bounded below, therefore, it converges. If t is itself a dyadic number, that is if there exist integers k and n such that t = k/2^n, then we have tℓ = t, for all ℓ ≥ n, therefore, the sequence (tn)n≥0 has t as limit. If t is not a dyadic number, we denote by t′ the limit of the tn's. We necessarily have, for all n ≥ 0, t ≤ t′ ≤ tn. Let us recall that, for all x ∈ ℝ, ⌊x⌋ denotes the integer part of x, that is the largest integer less than or equal to x. If t < t′ then, by defining n = max(⌊− log2(t′ − t)⌋ + 1, 0) and k = ⌊2^n t⌋ + 1, we obtain (k − 1)/2^n < t < k/2^n < t′, that is tn = k/2^n, which is impossible since we must have tn ≥ t′. Therefore, we have t′ = t.

THEOREM 2.2.– For all i ∈ S and t ≥ 0, we have:

ℙ{T1 > t | X0 = i} = e^{−νit},

where νi is a non-negative real number given by:

lim_{n→∞} (1 − Pi,i(1/2^n)) / (1/2^n) = νi. [2.2]

PROOF.– If X0 = i and if T1 = ∞ then, for all s ≥ 0, we have, by definition, Xs = i with probability 1, hence Pi,i(s) = 1, for all s ≥ 0 and in particular, Pi,i(1/2^n) = 1, for all n ≥ 0, which gives νi = 0 and the first relation is in this case satisfied.

We, therefore, assume that we have T1 < ∞. For all n ≥ 0, we define the sequence of random variables (τn)n≥0 by:

τn = inf{s ∈ Dn | s ≥ T1}.

From lemma 2.4, the sequence (τn)n≥0 is decreasing and converges to T1. Thus, we have:

T1 = lim_{n→∞} τn = inf_{n≥0} τn


with probability 1. It follows, from the monotone convergence theorem for events, that for all i ∈ S and t ≥ 0, we have:

ℙ{T1 ≥ t | X0 = i} = ℙ{inf_{n≥0} τn ≥ t | X0 = i}
= ℙ{∩_{n≥0} {τn ≥ t} | X0 = i}
= lim_{n→∞} ℙ{τn ≥ t | X0 = i}. [2.3]

For every real number x, let us recall that ⌈x⌉ denotes the smallest integer greater than or equal to x. For all n ≥ 0 and t ≥ 0, we have, since (⌈2^n t⌉ − 1)/2^n is the largest dyadic number of order n less than t,

ℙ{τn ≥ t | X0 = i} = ℙ{τn > (⌈2^n t⌉ − 1)/2^n | X0 = i}.

For t = 0, we have ℙ{τn ≥ 0 | X0 = i} = 1. For t > 0, using lemma 2.1, we obtain:

ℙ{τn ≥ t | X0 = i} = ℙ{τn > (⌈2^n t⌉ − 1)/2^n | X0 = i}
= ℙ{X_{(⌈2^n t⌉−1)/2^n} = i, X_{(⌈2^n t⌉−2)/2^n} = i, . . . , X_{1/2^n} = i | X0 = i}
= (Pi,i(1/2^n))^{⌈2^n t⌉−1}
= exp((⌈2^n t⌉ − 1) ln(Pi,i(1/2^n))). [2.4]

Since, from [2.3], the limit of this last term exists when n tends to infinity, for all t > 0, it follows that the limit:

lim_{n→∞} (− ln(Pi,i(1/2^n))) / (1/2^n)

exists. Denoting this limit by νi and since, from lemma 2.2, the functions Pi,i(t) are right-continuous at 0, we obtain:

lim_{n→∞} (1 − Pi,i(1/2^n)) / (1/2^n) = νi.

It follows that, for all i ∈ S, we have νi ≥ 0 and that for all t ≥ 0, we have, from [2.3] and [2.4],

ℙ{T1 ≥ t | X0 = i} = e^{−νit}.


The exponential function being continuous, we also have ℙ{T1 > t | X0 = i} = e^{−νit}.

Note that we have shown, in particular, that T1 is a random variable, that is a measurable function, and that T1 is a stopping time since the times τn are stopping times.

As previously seen, the property of right-continuity of the paths ensures that Sn > 0 for all n ≥ 1, and in particular S1 = T1 > 0. The state space S does not, therefore, contain any instantaneous states, which means that for all i ∈ S, we have νi < ∞, as confirmed by theorem 2.2.

It follows from theorem 2.2 that νi = 0 if and only if T1 = ∞, ℙi-a.s. We then say that the state i is absorbing. If νi > 0 then, conditional on X0 = i, T1 follows the exponential distribution with rate νi.
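Relation [2.2] can also be observed numerically. For a finite chain with generator Q (introduced at the end of this section), the transition functions are Pi,j(t) = (e^{tQ})i,j and νi = −Qi,i, so the ratio (1 − Pi,i(1/2^n))/(1/2^n) should approach νi as n grows. A sketch of ours, with a hypothetical generator:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator; nu_i = -Q[i, i] is the rate of the sojourn time in state i.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 4.0, -6.0,  2.0],
              [ 0.5,  0.5, -1.0]])
i = 1
for n in range(1, 12):
    h = 1.0 / 2**n
    P_h = expm(h * Q)
    print(n, (1.0 - P_h[i, i]) / h)  # tends to nu_1 = 6.0 as n grows
```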

LEMMA 2.5.– The transition functions Pi,j(t) have the following properties:

1) For all i ∈ S, the function fi(t) defined by fi(t) = Σ_{j∈S} Pi,j(t) is decreasing.

2) If there exists t > 0 such that P(t) is stochastic then P(t) is stochastic for all t ≥ 0.

PROOF.– For the first point, we have, for i ∈ S and s, t ≥ 0, from lemma 2.3,

fi(t + s) = Σ_{j∈S} Pi,j(t + s) = Σ_{j∈S} Σ_{k∈S} Pi,k(t) Pk,j(s).

From Fubini's theorem, we obtain:

fi(t + s) = Σ_{k∈S} Pi,k(t) Σ_{j∈S} Pk,j(s) ≤ Σ_{k∈S} Pi,k(t) = fi(t).

For the second point, let t > 0 such that P(t) is stochastic. Point 1 shows that for all s < t, we have P(s) stochastic. If s > t then by choosing an integer n such that s/n < t, we have, from lemma 2.3,

P(s) = (P(s/n))^n.

Since s/n < t, the matrix P(s/n) is, as we have just seen, stochastic, therefore, the matrix P(s) is also stochastic.

Note that the proof of the first point does not use the fact that the process X is minimal.


LEMMA 2.6.– For all t ≥ 0 and i ∈ S, we have:

Pi,i(t) ≥ e^{−νit} > 0.

PROOF.– From theorem 2.2, we have, using the fact that, if X0 = i then {T1 > t} ⊆ {Xt = i} and since, for all i ∈ S, νi < ∞,

Pi,i(t) = ℙ{Xt = i, T1 > t | X0 = i} + ℙ{Xt = i, T1 ≤ t | X0 = i}
≥ ℙ{Xt = i, T1 > t | X0 = i}
= ℙ{T1 > t | X0 = i}
= e^{−νit}
> 0,

which completes the proof.

LEMMA 2.7.– Let i be a state of S. If there exists a time t > 0 such that Pi,i(t) = 1 then Pi,i(t) = 1, for all t ≥ 0.

PROOF.– Let i ∈ S and t > 0 such that Pi,i(t) = 1. For s < t, we have, from lemma 2.3, for all j ∈ S,

Pi,j(t) = Σ_{k∈S} Pi,k(s) Pk,j(t − s) ≥ Pi,j(s) Pj,j(t − s).

It follows that:

0 = 1 − Pi,i(t) ≥ Σ_{j∈S, j≠i} Pi,j(t) ≥ Σ_{j∈S, j≠i} Pi,j(s) Pj,j(t − s) ≥ 0.

We then have:

Σ_{j∈S, j≠i} Pi,j(s) Pj,j(t − s) = 0.

From lemma 2.6, we have, for all j ∈ S, Pj,j(t − s) > 0, therefore, Pi,j(s) = 0, for all j ≠ i. It follows, from lemma 2.5, that:

1 = Pi,i(t) = Σ_{j∈S} Pi,j(t) ≤ Σ_{j∈S} Pi,j(s) = Pi,i(s),


that is Pi,i(s) = 1. For s > t, by choosing an integer n such that s/n < t, we have, from lemma 2.3, P(s) = (P(s/n))^n, which leads to:

Pi,i(s) ≥ (Pi,i(s/n))^n.

Since s/n < t, we have, as we have just seen, Pi,i(s/n) = 1, hence Pi,i(s) = 1.

Let us now determine the distribution of the pair (XT1, T1). The variable XT1 represents the state occupied by the Markov chain X at time T1. If T1 < ∞ then XT1 is the second state visited by X, since the paths are right-continuous, and if T1 = ∞ then we have XT1 = X0.

We denote by P = (Pi,j)i,j∈S the matrix whose coefficients are given by:

Pi,j = ℙ{XT1 = j | X0 = i}.

When X0 = i, at time T1 the process X is in state i if T1 = ∞ and in a state j ∈ S, j ≠ i, if T1 < ∞. This shows that the matrix P is a stochastic matrix, that is:

Pi,j ≥ 0 and Σ_{j∈S} Pi,j = 1.

If state i is not absorbing, that is if T1 < ∞, ℙi-a.s., then we have, by definition of T1, Pi,i = 0. If state i is absorbing, that is if T1 = ∞, ℙi-a.s., then we have Pi,i = 1. In this last case, we have Pi,j = 0 for j ≠ i.

This means, in particular, that the boundary state Δ, which is involved in the case of a possible explosion of the Markov chain X, cannot be reached in a finite number of transitions or in a finite number of jumps from a state of S.

THEOREM 2.3.– For all i, j ∈ S and t ≥ 0, we have:

ℙ{XT1 = j, T1 > t | X0 = i} = Pi,j e^{−νit}

and, for i ≠ j,

lim_{n→∞} Pi,j(1/2^n) / (1/2^n) = νi Pi,j.

PROOF.– If X0 = i and if T1 = ∞ then we have, from theorem 2.2, νi = 0 and we obtain, from definition of the matrix P, Pi,j = 1_{i=j}, therefore, the first relation is satisfied. Moreover, in this case, we have Xt = i with probability 1, for all t ≥ 0, therefore, Pi,i(t) = 1, for all t ≥ 0, which means that Pi,j(t) = 0, for all t ≥ 0 and j ≠ i. The second relation is thus in this case satisfied as well.


Let us now assume that we have T1 < ∞, which means that if X0 = i then i is not absorbing. If i = j then, since T1 is finite, the left-hand side of the first relation is equal to 0 and the right-hand side is also equal to 0 since Pi,i = 0 if i is not absorbing. Note that for i = j, the limit in the second relation is equal to ∞ since the numerator tends to 1 by the right-continuity of the transition functions.

Therefore, we assume now that we have T1 < ∞ and i ≠ j. As in the proof of theorem 2.2, we define, for all n ≥ 0, the sequence of random variables (τn)n≥0 by:

τn = inf{s ∈ Dn | s ≥ T1}.

From lemma 2.4, the sequence (τn)n≥0 is decreasing and converges to T1. We then have:

T1 = lim_{n→∞} τn = inf_{n≥0} τn,

with probability 1. From the right-continuity of the paths of the Markov chain X, we also have:

XT1 = lim_{n→∞} Xτn,

with probability 1. It follows, since T1 is finite with probability 1, that the integer N, defined by:

N = inf{n ≥ 0 | τn < T2},

is finite with probability 1. For all n ≥ N, we define the events An by:

An = {Xτn = j, τn ≥ t}.

Since the sequence (τn)n≥0 is decreasing and since Xτn = XτN = XT1 for all n ≥ N, the sequence An is itself decreasing from rank N. We then have, from the monotone convergence theorem for events, for all i, j ∈ S, i ≠ j and, for all t ≥ 0,

ℙ{XT1 = j, T1 ≥ t | X0 = i} = ℙ{XT1 = j, inf_{n≥0} τn ≥ t | X0 = i}
= ℙ{∩_{n=N}^∞ An | X0 = i}
= lim_{n→∞} ℙ{Xτn = j, τn ≥ t | X0 = i}. [2.5]


As we have already seen in the proof of theorem 2.2, we have, for all n ≥ 0 and t ≥ 0,

ℙ{Xτn = j, τn ≥ t | X0 = i} = ℙ{Xτn = j, τn > (⌈2^n t⌉ − 1)/2^n | X0 = i}.

For all n ≥ 0, t > 0 and i ≠ j, we then have, using lemma 2.1,

ℙ{Xτn = j, τn ≥ t | X0 = i}
= ℙ{Xτn = j, τn > (⌈2^n t⌉ − 1)/2^n | X0 = i}
= Σ_{k=⌈2^n t⌉}^∞ ℙ{X_{k/2^n} = j, τn = k/2^n | X0 = i}
= Σ_{k=⌈2^n t⌉}^∞ ℙ{X_{k/2^n} = j, X_{(k−1)/2^n} = i, . . . , X_{1/2^n} = i | X0 = i}
= Σ_{k=⌈2^n t⌉}^∞ (Pi,i(1/2^n))^{k−1} Pi,j(1/2^n).

It is easy to see that if t = 0 then the latter series starts from k = 1 instead of k = 0. Here again, we necessarily have Pi,i(1/2^n) < 1, for all n ≥ 0; indeed, if there exists an integer n such that Pi,i(1/2^n) = 1 then, from lemma 2.7, we have Pi,i(t) = 1, for all t ≥ 0, which means that T1 = ∞, which contradicts the hypothesis. Therefore, we have, for all t > 0,

ℙ{Xτn = j, τn ≥ t | X0 = i} = (Pi,i(1/2^n))^{⌈2^n t⌉−1} Pi,j(1/2^n) / (1 − Pi,i(1/2^n))
= exp((⌈2^n t⌉ − 1) ln(Pi,i(1/2^n))) Pi,j(1/2^n) / (1 − Pi,i(1/2^n))

and, for t = 0, since τn ≥ 0 with probability 1,

ℙ{Xτn = j | X0 = i} = ℙ{Xτn = j, τn ≥ 0 | X0 = i} = Pi,j(1/2^n) / (1 − Pi,i(1/2^n)).

Since, from [2.5], the limit of ℙ{Xτn = j, τn ≥ t | X0 = i} exists when n tends to infinity, it follows, from theorem 2.2, that the following limit:

lim_{n→∞} Pi,j(1/2^n) / (1/2^n)


exists. Therefore, again from theorem 2.2, taking the limit in [2.5], we have:

ℙ{XT1 = j, T1 ≥ t | X0 = i} = (e^{−νit}/νi) lim_{n→∞} Pi,j(1/2^n) / (1/2^n),

for all t ≥ 0. Let us recall that νi > 0, since we have assumed that state i is not absorbing. By taking t = 0 in this last relation, we obtain, from definition of the matrix P,

lim_{n→∞} Pi,j(1/2^n) / (1/2^n) = νi Pi,j,

hence, for all t ≥ 0,

ℙ{XT1 = j, T1 ≥ t | X0 = i} = Pi,j e^{−νit}.

The exponential function being continuous, we also have:

ℙ{XT1 = j, T1 > t | X0 = i} = Pi,j e^{−νit},

which completes the proof.

This theorem also shows us that, conditional on X0 = i, the random variables XT1 and T1 are independent. We now define the discrete-time stochastic process Y = {Yn, n ∈ ℕ} by Yn = XTn for all n ≥ 0. Recall that for n ≥ 1, Sn = Tn − Tn−1 represents the time spent by the process X in the nth state visited by X, which is Yn−1.

THEOREM 2.4.– The process Y is a discrete-time Markov chain on the state space S, with transition probability matrix P and, for all n ≥ 1, i0, . . . , in ∈ S and t1, . . . , tn ∈ ℝ+, we have:

ℙ{Yn = in, Sn > tn, . . . , Y1 = i1, S1 > t1 | Y0 = i0} = ∏_{ℓ=1}^n P_{iℓ−1,iℓ} e^{−ν_{iℓ−1} tℓ}. [2.6]


PROOF.– For all n ≥ 0, we have, by definition of Y, Yn ∈ S. Using the strong Markov property at the stopping time Tn−1, we have, for n ≥ 1,

ℙ{XTn = in, Sn > tn, . . . , XT1 = i1, S1 > t1 | X0 = i0}
= ℙ{XTn−1 = in−1, Sn−1 > tn−1, . . . , XT1 = i1, S1 > t1 | X0 = i0} × ℙ{XTn = in, Sn > tn | XTn−1 = in−1}
= ℙ{XTn−1 = in−1, Sn−1 > tn−1, . . . , XT1 = i1, S1 > t1 | X0 = i0} × ℙ{XT1 = in, T1 > tn | X0 = in−1}
= ℙ{XTn−1 = in−1, Sn−1 > tn−1, . . . , XT1 = i1, S1 > t1 | X0 = i0} × P_{in−1,in} e^{−ν_{in−1} tn},

where the first equality uses the strong Markov property, the second uses the homogeneity of X and the third is due to theorem 2.3.

By induction, we obtain:

ℙ{XTn = in, Sn > tn, . . . , XT1 = i1, S1 > t1 | X0 = i0} = ∏_{ℓ=1}^n P_{iℓ−1,iℓ} e^{−ν_{iℓ−1} tℓ},

that is:

ℙ{Yn = in, Sn > tn, . . . , Y1 = i1, S1 > t1 | Y0 = i0} = ∏_{ℓ=1}^n P_{iℓ−1,iℓ} e^{−ν_{iℓ−1} tℓ}.

By taking t1 = . . . = tn = 0, we obtain:

ℙ{Yn = in, . . . , Y1 = i1 | Y0 = i0} = ∏_{ℓ=1}^n P_{iℓ−1,iℓ},

which means, from theorem 1.1, that Y is a Markov chain on the state space S, with transition probability matrix P.

The following corollary shows that the sequence of sojourn times in the states visited successively by X is a sequence of independent and exponentially distributed random variables.

COROLLARY 2.1.– For all n ≥ 1, t1, . . . , tn ∈ ℝ+ and i0, . . . , in ∈ S, we have:

ℙ{Sn > tn, . . . , S1 > t1 | Yn = in, . . . , Y0 = i0} = ∏_{ℓ=1}^n e^{−ν_{iℓ−1} tℓ}.


PROOF.– Using relations [2.6] and [1.1], since Y is a Markov chain with transition probability matrix P, we obtain:

ℙ{Sn > tn, . . . , S1 > t1 | Yn = in, . . . , Y0 = i0}
= ℙ{Yn = in, Sn > tn, . . . , Y1 = i1, S1 > t1, Y0 = i0} / ℙ{Yn = in, . . . , Y0 = i0}
= ℙ{Yn = in, Sn > tn, . . . , Y1 = i1, S1 > t1 | Y0 = i0} ℙ{Y0 = i0} / ℙ{Yn = in, . . . , Y0 = i0}
= (∏_{ℓ=1}^n P_{iℓ−1,iℓ} e^{−ν_{iℓ−1} tℓ}) ℙ{Y0 = i0} / (ℙ{Y0 = i0} ∏_{ℓ=1}^n P_{iℓ−1,iℓ})
= ∏_{ℓ=1}^n e^{−ν_{iℓ−1} tℓ},

which completes the proof.

The following theorem shows that the transition functions of the Markov chain X are continuous on ℝ+.

THEOREM 2.5.– For all i, j ∈ S, the transition functions Pi,j(t) are continuous on ℝ+ and we have, for all h, t ≥ 0,

|Pi,j(t + h) − Pi,j(t)| ≤ 1 − e^{−νih}.

PROOF.– Let h > 0. We have, from equation [2.1],

Pi,j(t + h) − Pi,j(t) = Σ_{k∈S} Pi,k(h) Pk,j(t) − Pi,j(t)
= Σ_{k∈S, k≠i} Pi,k(h) Pk,j(t) − Pi,j(t)[1 − Pi,i(h)].

This leads to:

Pi,j(t + h) − Pi,j(t) ≥ −Pi,j(t)[1 − Pi,i(h)] ≥ −[1 − Pi,i(h)]

and

Pi,j(t + h) − Pi,j(t) ≤ Σ_{k∈S, k≠i} Pi,k(h) Pk,j(t) ≤ Σ_{k∈S, k≠i} Pi,k(h) ≤ 1 − Pi,i(h).


In total, we have:

|Pi,j(t + h) − Pi,j(t)| ≤ 1 − Pi,i(h),

which proves the right-continuity of Pi,j(t) since, from lemma 2.2, 1 − Pi,i(h) tends to 0 when h → 0.

As for left-continuity, let us consider Pi,j(t − h) − Pi,j(t) with 0 < h < t so that Pi,j(t − h) is defined. We then have, by reusing the previous argument,

|Pi,j(t − h) − Pi,j(t)| = |Pi,j(t) − Pi,j(t − h)| ≤ 1 − Pi,i(t − (t − h)) = 1 − Pi,i(h),

which proves the left-continuity of Pi,j(t). The functions Pi,j(t) are, therefore, continuous on ℝ+. Actually, we have shown that for all h, we have:

|Pi,j(t + h) − Pi,j(t)| ≤ 1 − Pi,i(|h|),

which means that Pi,j(t) is uniformly continuous on ℝ+. From lemma 2.6, we obtain, for all h, t ≥ 0,

|Pi,j(t + h) − Pi,j(t)| ≤ 1 − e^{−νih},

which completes the proof.

Theorem 2.4 shows, in particular, that we can specify the probabilistic behavior of the minimal process X by the joint distributions of the sequence of sojourn times S1, . . . , Sn and of the states visited successively Y0, . . . , Yn. Indeed, the process being minimal, we have, for all t ≥ 0 and j ∈ S,

Xt = j =⇒ ζ > t

and

Xt ∈ S ⇐⇒ ζ > t.

Since (Tk)k≥0 is an increasing sequence of random variables, since X0 = Y0 and ζ = lim_{k→∞} Tk = sup_{k≥1} Tk with probability 1, it follows, for example, that for all i, j ∈ S and t ≥ 0, we have:

{ζ > t} = {sup_{k≥1} Tk > t} = ∪_{k=1}^∞ {Tk > t} = ∪_{k=0}^∞ {Tk ≤ t < Tk+1}


and, since this last union is a disjoint union,

ℙ{Xt = j | X0 = i} = ℙ{Xt = j, ζ > t | X0 = i}
= Σ_{k=0}^∞ ℙ{Xt = j, Tk ≤ t < Tk+1 | Y0 = i} [2.7]
= Σ_{k=0}^∞ ℙ{Yk = j, Tk ≤ t < Tk+1 | Y0 = i}. [2.8]

REMARK 2.1.– This relation is of course false if the Markov chain X is not the minimal chain because, in case of explosion, a new sequence of jump times must be considered starting from time ζ and possibly leading to a new explosion, and so forth.

Thus, the necessary parameters for the study of the behavior of the minimal Markov chain X are:

– The transition probability matrix P of the discrete-time Markov chain Y, which handles the jumps of X. The Markov chain Y is also called the Markov chain embedded at the transition instants of the chain X.

– The non-negative real numbers νj, j ∈ S. The time spent by X in the state j, during a visit of X to j, follows the exponential distribution with rate νj. The expectation of this time is equal to 1/νj. If νj = 0 then this time is infinite with probability 1. These two ingredients lend themselves directly to simulation, as in the sketch below.
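Indeed, theorem 2.4 and corollary 2.1 translate into a simulation recipe for the minimal chain: starting from Y0, alternate an exponential sojourn time with the current rate νj and a jump drawn from the corresponding row of P. The sketch below is ours (finite state space assumed; νj = 0 marks an absorbing state):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_minimal_path(nu, P, i0, horizon):
    """Simulate the jump times T_n and states Y_n of the minimal chain up to
    a finite time horizon. nu[j] is the sojourn rate in state j and P is the
    transition matrix of the embedded chain Y."""
    times, states = [0.0], [i0]
    t, i = 0.0, i0
    while True:
        if nu[i] == 0.0:                    # absorbing state: no further jump
            break
        t += rng.exponential(1.0 / nu[i])   # sojourn time S ~ Exp(nu_i)
        if t > horizon:
            break
        i = rng.choice(len(nu), p=P[i])     # next state drawn from row i of P
        times.append(t)
        states.append(i)
    return times, states

# Hypothetical three-state example.
nu = np.array([3.0, 6.0, 1.0])
P = np.array([[0.0, 2/3, 1/3],
              [2/3, 0.0, 1/3],
              [0.5, 0.5, 0.0]])
print(simulate_minimal_path(nu, P, i0=0, horizon=2.0))
```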

We now introduce the matrix Q defined by:

Q = −Λ(I − P), [2.9]

where Λ is the diagonal matrix containing the rates νj, j ∈ S, and where P is the transition probability matrix of the discrete-time Markov chain Y embedded at the transition instants of X. The coefficients of the matrix Q are, therefore, given by:

Qi,j = νi Pi,j if i ≠ j, and Qi,i = −νi.

Indeed, for all i ∈ S, we have Pi,i = 1 if i is absorbing and 0 otherwise. In particular, if i is an absorbing state, we have Qi,j = 0 for all j ∈ S. The properties of the stochastic matrix P lead to the following properties for the matrix Q. For all i, j ∈ S, we have:

Qi,j ≥ 0 for i ≠ j, Qi,i ≤ 0 and Σ_{j∈S} Qi,j = 0.


Thus, giving matrices Λ and P is equivalent to giving matrix Q, which is called the infinitesimal generator of the Markov chain X or the transition rate matrix of the Markov chain X.
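For a finite state space, the correspondence [2.9] between (Λ, P) and Q is a one-line computation in each direction. The following sketch, ours, builds Q = −Λ(I − P) and checks the sign and row-sum properties stated above:

```python
import numpy as np

nu = np.array([3.0, 6.0, 1.0])   # rates nu_j, the diagonal of Lambda
P = np.array([[0.0, 2/3, 1/3],   # jump matrix of the embedded chain Y
              [2/3, 0.0, 1/3],
              [0.5, 0.5, 0.0]])

Q = -np.diag(nu) @ (np.eye(3) - P)           # relation [2.9]

assert np.all(Q - np.diag(np.diag(Q)) >= 0)  # Q_{i,j} >= 0 for i != j
assert np.all(np.diag(Q) <= 0)               # Q_{i,i} = -nu_i <= 0
assert np.allclose(Q.sum(axis=1), 0.0)       # every row sums to 0

# Conversely, (Lambda, P) is recovered from Q (no absorbing state here;
# an absorbing state i, with nu_i = 0, would instead get P_{i,i} = 1).
nu_back = -np.diag(Q)
P_back = Q / nu_back[:, None]
np.fill_diagonal(P_back, 0.0)                # P_{i,i} = 0 for non-absorbing i
assert np.allclose(nu_back, nu) and np.allclose(P_back, P)
```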

2.3. Kolmogorov’s backward equation

Kolmogorov's backward integral equation describes the evolution of the transition functions at time t when we condition with respect to the time of the first jump and the time spent in the initial state X0. It is expressed by theorem 2.6.

LEMMA 2.8.– For all i, j ∈ S, the functions βi,j, defined for all t ≥ 0 by:

βi,j(t) = Σ_{k∈S, k≠i} Qi,k Pk,j(t),

are finite and continuous on ℝ+.

PROOF.– From theorem 2.5, the functions Pk,j(t) are continuous and we have:

βi,j(t) = Σ_{k∈S, k≠i} Qi,k Pk,j(t) ≤ Σ_{k∈S, k≠i} Qi,k = νi < ∞.

It follows, from the dominated convergence theorem, that the functions βi,j(t) are continuous on ℝ+.

THEOREM 2.6.– For all t ≥ 0 and i, j ∈ S, we have:

Pi,j(t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{−νi(t−u)} du. [2.10]

PROOF.– From theorem 2.2, the time T1 of the first jump of the Markov chain X follows the exponential distribution with rate νi when the initial state is state i. If state i is absorbing, that is if T1 = ∞, we have νi = 0 and Pi,j(t) = 1_{i=j} for all t ≥ 0. On the other hand, the right-hand side of relation [2.10] is equal to 1_{i=j} if νi = 0, since in this case, we have Qi,k = 0, for all k ∈ S.

Let us now assume that the state i is not absorbing, that is T1 < ∞ and, therefore, νi > 0. By conditioning with respect to the stopping time T1, we have, from theorem 2.2,

Pi,j(t) = ∫_0^∞ ℙ{Xt = j | T1 = u, X0 = i} νi e^{−νiu} du.


We distinguish two cases in this integral: u > t and u ≤ t.

If T1 = u > t and X0 = i then Xt = i. Therefore, for all u > t, we have:

ℙ{Xt = j | T1 = u, X0 = i} = 1_{i=j},

which leads to:

Pi,j(t) = ∫_0^t ℙ{Xt = j | T1 = u, X0 = i} νi e^{−νiu} du + 1_{i=j} ∫_t^∞ νi e^{−νiu} du,

that is:

Pi,j(t) = e^{−νit} 1_{i=j} + ∫_0^t ℙ{Xt = j | T1 = u, X0 = i} νi e^{−νiu} du. [2.11]

From theorem 2.3, we have, for all u > 0,

ℙ{Xu = k | T1 = u, X0 = i} = Pi,k.

If 0 < u ≤ t then, using this relation, the Markov property and the homogeneity of X, we obtain:

ℙ{Xt = j | T1 = u, X0 = i}
= Σ_{k∈S} ℙ{Xt = j, Xu = k | T1 = u, X0 = i}
= Σ_{k∈S} ℙ{Xt = j | Xu = k, T1 = u, X0 = i} ℙ{Xu = k | T1 = u, X0 = i}
= Σ_{k∈S} Pi,k ℙ{Xt = j | Xu = k}
= Σ_{k∈S} Pi,k Pk,j(t − u).

By carrying this expression into [2.11], we obtain:

Pi,j(t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S} Pi,k Pk,j(t − u) νi e^{−νiu} du.


Performing the change of variable u := t − u and since, for k ≠ i, we have Qi,k = νi Pi,k, we obtain:

Pi,j(t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{−νi(t−u)} du,

which completes the proof.

Kolmogorov's backward differential equation also describes the evolution of the transition functions but through the use of their derivative instead of their integral. It is expressed by the following theorem.

THEOREM 2.7.– Let X = {Xt, t ∈ ℝ+} be a Markov chain with infinitesimal generator Q. For all t ≥ 0, the transition functions Pi,j(t) are of class C1 (that is differentiable with continuous derivative) on ℝ+ and we have:

P′(t) = QP(t). [2.12]

PROOF.– Here again, the continuity and the differentiability at 0 are to be understood as being to the right of 0. Equation [2.10] can also be written as:

Pi,j(t) = e^{−νit} [ 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{νiu} du ]. [2.13]

From lemma 2.8, the functions βi,j, defined by:

βi,j(t) = Σ_{k∈S, k≠i} Qi,k Pk,j(t),

are continuous on ℝ+. This proves that the functions Pi,j(t), given by [2.13], are differentiable on ℝ+. Differentiating relation [2.13], we obtain:

P′i,j(t) = −νi Pi,j(t) + e^{−νit} Σ_{k∈S, k≠i} Qi,k Pk,j(t) e^{νit}
= −νi Pi,j(t) + Σ_{k∈S, k≠i} Qi,k Pk,j(t)
= Σ_{k∈S} Qi,k Pk,j(t),


that is P′(t) = QP(t). It follows, using lemma 2.8 once again, that the functions P′i,j(t) are continuous on ℝ+.

Note that, in particular, at point t = 0, since P(0) = I, we have:

P′(0) = Q,

which explains why the matrix Q is also called the infinitesimal generator of the Markov chain X.

We have shown in theorem 2.7 that the differential equation [2.12] can be obtained from the integral equation [2.13]. Conversely, the integral equation [2.13] can be obtained easily from [2.12] by writing:

P′i,j(u) e^{νiu} = Σ_{k∈S} Qi,k Pk,j(u) e^{νiu} = −νi Pi,j(u) e^{νiu} + Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{νiu},

that is:

P′i,j(u) e^{νiu} + Pi,j(u) νi e^{νiu} = Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{νiu},

or

(Pi,j(u) e^{νiu})′ = Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{νiu}.

Integrating this last relation between 0 and t, we obtain:

Pi,j(t) e^{νit} − Pi,j(0) = ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(u) e^{νiu} du,

which is relation [2.13], since Pi,j(0) = 1_{i=j}.

Let us now consider, for all i, j ∈ S, t ≥ 0 and n ≥ 0, the sequence of functions Pi,j(n, t) defined by:

Pi,j(n, t) = ℙ{Xt = j, Tn+1 > t | X0 = i}, [2.14]

where we recall that Tn+1 is the time of the (n + 1)th jump of X. These functions, together with their derivatives, are expressed by induction as shown by the following theorem.


THEOREM 2.8.– For all t ≥ 0, n ≥ 0 and i, j ∈ S, the functions Pi,j(n, t) are of class C1 on ℝ+ and we have Pi,j(0, t) = e^{−νit} 1_{i=j} and, for all n ≥ 1,

Pi,j(n, t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, u) e^{−νi(t−u)} du. [2.15]

Moreover, the derivatives P′i,j(n, t) satisfy:

P′i,j(n, t) = −νi e^{−νit} 1_{i=j} if n = 0,
P′i,j(n, t) = −νi Pi,j(n, t) + Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, t) if n ≥ 1. [2.16]

PROOF.– The proof follows the same lines as that of theorem 2.6. From theorem 2.2, the time T1 of the first jump of the Markov chain X follows the exponential distribution with rate νi when the initial state is state i. If the state i is absorbing, that is if T1 = ∞, we have νi = 0, Tn = ∞ for all n ≥ 1 and, by definition, Pi,j(n, t) = 1_{i=j} for all t ≥ 0 and n ≥ 0. Relation [2.15] is then valid since, in this case, we have Qi,k = 0 for all k ∈ S.

Let us now assume that the state i is not absorbing, that is T1 < ∞ and, therefore, νi > 0. For n = 0, we have, by definition of T1 and from theorem 2.2,

Pi,j(0, t) = ℙ{Xt = j, T1 > t | X0 = i} = e^{−νit} 1_{i=j},

which is differentiable on ℝ+.

For n ≥ 1, by conditioning with respect to the stopping time T1, we have, again from theorem 2.2,

Pi,j(n, t) = ∫_0^∞ ℙ{Xt = j, Tn+1 > t | T1 = u, X0 = i} νi e^{−νiu} du.

We distinguish in this integral the two cases where u > t and u ≤ t.

If T1 = u > t and X0 = i then Xt = i and Tn+1 > t. Therefore, for all u > t, we have:

ℙ{Xt = j, Tn+1 > t | T1 = u, X0 = i} = 1_{i=j},


which leads to:

Pi,j(n, t) = ∫_0^t ℙ{Xt = j, Tn+1 > t | T1 = u, X0 = i} νi e^{−νiu} du + 1_{i=j} ∫_t^∞ νi e^{−νiu} du,

that is:

Pi,j(n, t) = e^{−νit} 1_{i=j} + ∫_0^t ℙ{Xt = j, Tn+1 > t | T1 = u, X0 = i} νi e^{−νiu} du. [2.17]

From theorem 2.3, we have, for all u > 0,

ℙ{Xu = k | T1 = u, X0 = i} = Pi,k.

If 0 < u ≤ t then, using this relation, the strong Markov property and the homogeneity of X, we obtain, for n ≥ 1,

ℙ{Xt = j, Tn+1 > t | T1 = u, X0 = i}
= Σ_{k∈S} ℙ{Xt = j, Tn+1 > t, Xu = k | T1 = u, X0 = i}
= Σ_{k∈S} Pi,k ℙ{Xt = j, Tn+1 > t | Xu = k, T1 = u, X0 = i}
= Σ_{k∈S} Pi,k ℙ{Xt−u = j, Tn > t − u | X0 = k}
= Σ_{k∈S} Pi,k Pk,j(n − 1, t − u).

Replacing this relation in [2.17], we obtain, for all n ≥ 1,

Pi,j(n, t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S} Pi,k Pk,j(n − 1, t − u) νi e^{−νiu} du.

Performing the change of variable u := t − u and since, for k ≠ i, we have Qi,k = νi Pi,k, we obtain:

Pi,j(n, t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, u) e^{−νi(t−u)} du,


which is relation [2.15]. This relation can also be written, for all n ≥ 1, as:

Pi,j(n, t) = e^{−νit} [ 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, u) e^{νiu} du ]. [2.18]

Regarding differentiability, we proceed by induction. For n = 0, the functions Pi,j(0, t) = e^{−νit} 1_{i=j} are of class C1 on ℝ+ and we have:

P′i,j(0, t) = −νi e^{−νit} 1_{i=j}.

Let us assume that, for an integer n ≥ 1, the functions Pk,j(n − 1, t) are differentiable on ℝ+. From the dominated convergence theorem, the functions fi,j, defined by:

fi,j(t) = Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, t),

are continuous on ℝ+. The functions Pi,j(n, t), given by [2.18], are thus differentiable on ℝ+. By differentiating relation [2.18], we obtain:

P′i,j(n, t) = −νi Pi,j(n, t) + Σ_{k∈S, k≠i} Qi,k Pk,j(n − 1, t),

which is relation [2.16]. This series, as we have seen, being a continuous function on ℝ+, it follows that the functions P′i,j(n, t) are continuous on ℝ+, which completes the proof.

2.4. Kolmogorov’s forward equation

Kolmogorov's forward integral equation describes the evolution of the transition functions at time t when we condition with respect to the last jump time before t. Note that this last jump time before t does not necessarily exist if the process X in question is not the minimal process. Indeed, in this case, there can be infinitely many jumps before time t, therefore, it is possible that there is no last jump before t. On the other hand, if the process in question is the minimal process, as is the case in this book then, for all t ≥ 0 and j ∈ S, the occurrence of the event {Xt = j} implies that the number of jumps before t is finite and thus that the last jump before t exists. Kolmogorov's forward integral equation is expressed by theorem 2.9. The proof of


this theorem uses several lemmas allowing the study of the distribution of the pair (Yn, Tn).

LEMMA 2.9.– For all i, j ∈ S, the functions ρi,j, defined for all t ≥ 0 by:

ρi,j(t) = Σ_{k∈S, k≠j} Pi,k(t) Qk,j,

are finite and continuous on ℝ+.

PROOF.– Let i and j be two states of S. For all k ∈ S, we define the quantity μi,k by:

μi,k = ∫_0^∞ e^{−t} Pi,k(t) dt.

Since, for all t ≥ 0, Pi,k(t) ∈ [0, 1], we also have μi,k ∈ [0, 1]. Let Si be the subset of S defined by:

Si = {k ∈ S | μi,k > 0}.

On the other hand, since Pi,k(t) is non-negative and continuous on ℝ+, we have:

μi,k = 0 ⇐⇒ Pi,k(t) = 0, for all t ≥ 0. [2.19]

From lemma 2.6, we have Pi,i(t) > 0 for all t ≥ 0, therefore, μi,i > 0 and i ∈ Si. We also have, from the monotone convergence theorem and lemma 2.3, for all ℓ ∈ S and t ≥ 0,

Σ_{k∈S} μi,k e^{−t} Pk,ℓ(t) = Σ_{k∈S} ∫_0^∞ e^{−u} Pi,k(u) e^{−t} Pk,ℓ(t) du
= ∫_0^∞ e^{−(t+u)} Σ_{k∈S} Pi,k(u) Pk,ℓ(t) du
= ∫_0^∞ e^{−(t+u)} Pi,ℓ(t + u) du
= ∫_t^∞ e^{−s} Pi,ℓ(s) ds
≤ ∫_0^∞ e^{−s} Pi,ℓ(s) ds
= μi,ℓ. [2.20]


Taking ℓ = j, this inequality shows that if μi,j = 0, then we have, for all t ≥ 0,

Σ_{k∈S} μi,k e^{−t} Pk,j(t) = Σ_{k∈Si} μi,k e^{−t} Pk,j(t) = 0,

which implies that Pk,j(t) = 0 for all k ∈ Si and t ≥ 0. It follows that P′k,j(t) = 0 and, therefore, in particular, that P′k,j(0) = Qk,j = 0 for all k ∈ Si. From equivalence [2.19], if k ∉ Si then we have Pi,k(t) = 0, for all t ≥ 0. We then obtain, if μi,j = 0, that is if j ∈ S \ Si,

ρi,j(t) = Σ_{k∈S, k≠j} Pi,k(t) Qk,j = Σ_{k∈Si} Pi,k(t) Qk,j + Σ_{k∈S\Si, k≠j} Pi,k(t) Qk,j = 0.

Thus, we have ρi,j(t) = 0 for all t ≥ 0, which completes the proof in the case where μi,j = 0.

If μi,j ≠ 0, that is if μi,j > 0, then we have j ∈ Si and, since equivalence [2.19] shows us that if k ∉ Si, then we have Pi,k(t) = 0, for all t ≥ 0, we obtain:

ρi,j(t) = Σ_{k∈S, k≠j} Pi,k(t) Qk,j = Σ_{k∈Si, k≠j} Pi,k(t) Qk,j.

We introduce the matrices P̃(t) and Q̃, of size (|Si| × |Si|), defined, for all ℓ, k ∈ Si, by:

P̃ℓ,k(t) = μi,k e^{−t} Pk,ℓ(t) / μi,ℓ [2.21]

and

Q̃ℓ,k = μi,k (Qk,ℓ − 1_{ℓ=k}) / μi,ℓ. [2.22]

These matrices are well-defined since if ℓ ∈ Si then μi,ℓ > 0. Using inequality [2.20], it can be easily checked that the matrix P̃(t) satisfies:

P̃ℓ,k(t) ≥ 0 and Σ_{k∈Si} P̃ℓ,k(t) ≤ 1. [2.23]


Following theorem 2.7, the matrix P̃(t) is differentiable, and we can easily check that Q̃ = P̃′(0). Using this result, the inequalities [2.23], relation [2.22] and Fatou's lemma, we have:

Σ_{k∈Si, k≠j} Q̃j,k = Σ_{k∈Si, k≠j} lim_{h→0} P̃j,k(h)/h
≤ lim_{h→0} Σ_{k∈Si, k≠j} P̃j,k(h)/h
≤ lim_{h→0} (1 − P̃j,j(h))/h
= −Q̃j,j
= −Qj,j + 1
= νj + 1. [2.24]

Relations [2.21] and [2.22] lead to:

Pi,k(t) = μi,k e^{t} P̃k,i(t) / μi,i and Qk,j = μi,j Q̃j,k / μi,k + 1_{j=k},

which gives:

ρi,j(t) = Σ_{k∈Si, k≠j} Pi,k(t) Qk,j = (μi,j e^{t} / μi,i) Σ_{k∈Si, k≠j} Q̃j,k P̃k,i(t).

It follows that ρi,j(t) is finite since, from [2.24], we have:

ρi,j(t) ≤ (μi,j e^{t} / μi,i) Σ_{k∈Si, k≠j} Q̃j,k ≤ μi,j e^{t} (νj + 1) / μi,i < ∞.

The functions P̃k,i(t) being continuous, using [2.24] and the dominated convergence theorem, it follows that the functions fi,j, defined by:

fi,j(t) = Σ_{k∈Si, k≠j} Q̃j,k P̃k,i(t),

are also continuous and thus that the functions ρi,j are continuous.



For all n ≥ 0, t ≥ 0 and i, j ∈ S, we define the functions Ri,j(n, t) and Ri(n, t) by:

Ri,j(n, t) = ℙ{Yn = j, Tn ≤ t | Y0 = i} and Ri(n, t) = ℙ{Tn ≤ t | X0 = i}.

Recall that we have Yn = XTn and thus, Y0 = X0. We have:

Ri(n, t) = Σ_{j∈S} Ri,j(n, t).

For n = 0, we have, since T0 = 0,

Ri,j(0, t) = 1_{i=j} and Ri(0, t) = 1

and for n = 1, we have, from theorem 2.3,

Ri,j(1, t) = Pi,j(1 − e^{−νit}) and Ri(1, t) = 1 − e^{−νit}. [2.25]

LEMMA 2.10.– For all t ≥ 0, n ≥ 0 and i, j ∈ S, the functions Ri,j(n, t) and Ri(n, t) are of class C1 on ℝ+ and we have, for all n ≥ 1,

Ri,j(n, t) = ∫_0^t Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, u) e^{−νi(t−u)} du, [2.26]

R′i,j(n, t) = −νi Ri,j(n, t) + Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, t), [2.27]

Ri(n, t) = ∫_0^t Σ_{k∈S, k≠i} Qi,k Rk(n − 1, u) e^{−νi(t−u)} du, [2.28]

R′i(n, t) = −νi Ri(n, t) + Σ_{k∈S, k≠i} Qi,k Rk(n − 1, t). [2.29]

PROOF.– We proceed by induction. The functions Ri,j(0, t) = 1_{i=j} and Ri,j(1, t), given by relation [2.25], are of class C1 on ℝ+, understood on the right at 0. By


conditioning with respect to the pair (Y1, T1), we have:

Ri,j(n, t) = ℙ{Yn = j, Tn ≤ t | Y0 = i}
= ∫_0^t Σ_{k∈S} ℙ{Yn = j, Tn ≤ t | Y1 = k, T1 = u, Y0 = i} R′i,k(1, u) du
= ∫_0^t Σ_{k∈S} ℙ{Yn−1 = j, Tn−1 ≤ t − u | Y0 = k} Pi,k νi e^{−νiu} du
= ∫_0^t Σ_{k∈S} Rk,j(n − 1, t − u) Pi,k νi e^{−νiu} du
= ∫_0^t Σ_{k∈S} Pi,k Rk,j(n − 1, u) νi e^{−νi(t−u)} du
= ∫_0^t Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, u) e^{−νi(t−u)} du,

where the third equality uses the strong Markov property at time T1 and relation [2.25]. The fifth equality comes from the change of variable u := t − u and the sixth from the definition of the infinitesimal generator Q. We have thus obtained relation [2.26], which can also be written as:

Ri,j(n, t) = e^{−νit} ∫_0^t Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, u) e^{νiu} du. [2.30]

Let us assume that, for an integer n ≥ 2, the functions Ri,j(n − 1, t) are differentiable on ℝ+. From the dominated convergence theorem, the functions fi,j, defined by:

fi,j(t) = Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, t),

are continuous on ℝ+. The functions Ri,j(n, t), given by [2.30], are, therefore, differentiable on ℝ+. Differentiating this relation, we obtain:

R′i,j(n, t) = −νi Ri,j(n, t) + Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, t),

which is relation [2.27]. This series, as we have just seen, being continuous on ℝ+, it follows that the functions R′i,j(n, t) are continuous on ℝ+.


By summing relation [2.26] over j ∈ S, we obtain [2.28] from Fubini's theorem and the monotone convergence theorem. We then show, in the same way, by induction and using the dominated convergence theorem, that the functions fi, defined by:

fi(t) = Σ_{k∈S, k≠i} Qi,k Rk(n − 1, t),

are continuous on ℝ+. The functions Ri(n, t), given by [2.28], are, therefore, differentiable on ℝ+. Differentiating this relation, we obtain:

R′i(n, t) = −νi Ri(n, t) + Σ_{k∈S, k≠i} Qi,k Rk(n − 1, t),

which is relation [2.29]. This series, as we have just seen, being a continuous function on ℝ+, it follows that the functions R′i(n, t) are continuous on ℝ+.

Note that, by definition, the functions Ri,j(n, t) and Ri(n, t) are non-negative functions with values in [0, 1] and increasing with t, and that their derivatives R′i,j(n, t) and R′i(n, t) are non-negative. Moreover, we have, for all t ≥ 0, n ≥ 0 and i, j ∈ S,

Ri,j(n, t) ≤ ℙ{Yn = j | Y0 = i} = (P^n)i,j.

Relation [2.27] then allows us to obtain, for all n ≥ 1, i, j ∈ S and t ≥ 0, the inequality:

R′i,j(n, t) ≤ Σ_{k∈S, k≠i} Qi,k Rk,j(n − 1, t) ≤ Σ_{k∈S, k≠i} Qi,k (P^{n−1})k,j. [2.31]

LEMMA 2.11.– For all t ≥ 0 and i, j ∈ S, we have, for n ≥ 1 and ℓ = 1, . . . , n,

Ri,j(n, t) = Σ_{k∈S} ∫_0^t R′i,k(ℓ, u) Rk,j(n − ℓ, t − u) du [2.32]

and, for n ≥ 2 and ℓ = 1, . . . , n − 1,

R′i,j(n, t) = Σ_{k∈S} ∫_0^t R′i,k(ℓ, u) R′k,j(n − ℓ, t − u) du. [2.33]


PROOF.– We have seen in lemma 2.10 that the functions Ri,j(n, t) are of class C1 on ℝ+. For n ≥ 1 and ℓ = 1, . . . , n, by conditioning with respect to the pair (Yℓ, Tℓ), we have:

Ri,j(n, t) = ℙ{Yn = j, Tn ≤ t | Y0 = i}
= Σ_{k∈S} ∫_0^t ℙ{Yn = j, Tn ≤ t | Yℓ = k, Tℓ = u, Y0 = i} R′i,k(ℓ, u) du
= Σ_{k∈S} ∫_0^t ℙ{Yn−ℓ = j, Tn−ℓ ≤ t − u | Y0 = k} R′i,k(ℓ, u) du
= Σ_{k∈S} ∫_0^t R′i,k(ℓ, u) Rk,j(n − ℓ, t − u) du,

where the third equality uses the strong Markov property at time Tℓ. We have thus proved relation [2.32]. Note that we recover relation [2.26] for ℓ = 1 by performing the change of variable u := t − u.

Denoting by uk(t) the general term of the series [2.32], we have, for n ≥ 2 and ℓ = 1, . . . , n − 1, using inequality [2.31] and since Rk,j(n − ℓ, t) ≤ 1,

u′k(t) = ∫_0^t R′i,k(ℓ, u) R′k,j(n − ℓ, t − u) du
≤ ∫_0^t Σ_{h∈S, h≠i} Qi,h (P^{ℓ−1})h,k R′k,j(n − ℓ, t − u) du
= Σ_{h∈S, h≠i} Qi,h (P^{ℓ−1})h,k ∫_0^t R′k,j(n − ℓ, t − u) du
= Σ_{h∈S, h≠i} Qi,h (P^{ℓ−1})h,k Rk,j(n − ℓ, t)
≤ Σ_{h∈S, h≠i} Qi,h (P^{ℓ−1})h,k

and, from Fubini's theorem,

Σ_{k∈S} Σ_{h∈S, h≠i} Qi,h (P^{ℓ−1})h,k = Σ_{h∈S, h≠i} Qi,h Σ_{k∈S} (P^{ℓ−1})h,k
= Σ_{h∈S, h≠i} Qi,h
= νi.


From the dominated convergence theorem, it follows that we can differentiate relation [2.32] term by term, which means that we have, for ℓ = 1, . . . , n − 1,

R′i,j(n, t) = Σ_{k∈S} ∫_0^t R′i,k(ℓ, u) R′k,j(n − ℓ, t − u) du,

which completes the proof.

For all n ≥ 0, t ≥ 0 and i, j ∈ S, we define the functions Hi,j(n, t) by:

Hi,j(n, t) = ℙ{Yn = j, Tn ≤ t < Tn+1 | Y0 = i}.

LEMMA 2.12.– For all t ≥ 0, n ≥ 0 and i, j ∈ S, the functions Hi,j(n, t) are of class C1 on ℝ+ and we have:

Hi,j(n, t) = e^{−νjt} 1_{i=j} if n = 0,
Hi,j(n, t) = ∫_0^t Σ_{k∈S, k≠j} Hi,k(n − 1, u) Qk,j e^{−νj(t−u)} du if n ≥ 1. [2.34]

Moreover, the derivatives H′i,j(n, t) satisfy:

H′i,j(n, t) = −νj e^{−νjt} 1_{i=j} if n = 0,
H′i,j(n, t) = −νj Hi,j(n, t) + Σ_{k∈S, k≠j} Hi,k(n − 1, t) Qk,j if n ≥ 1. [2.35]

PROOF.– For all t ≥ 0 and i, j ∈ S, we have, from theorem 2.2,

Hi,j(0, t) = ℙ{Y0 = j, T1 > t | Y0 = i} = e^{−νit} 1_{i=j} = e^{−νjt} 1_{i=j},

which are functions of class C1 satisfying relations [2.34] and [2.35].


For n ≥ 1, by conditioning with respect to the pair (Yn, Tn), using the strong Markov property at time Tn as well as theorem 2.2, we have:

Hi,j(n, t) = ℙ{Yn = j, Tn ≤ t < Tn+1 | Y0 = i}
= ∫_0^t ℙ{Tn ≤ t < Tn+1 | Yn = j, Tn = u, Y0 = i} R′i,j(n, u) du
= ∫_0^t ℙ{Tn+1 > t | Yn = j, Tn = u, Y0 = i} R′i,j(n, u) du
= ∫_0^t ℙ{Tn+1 − Tn > t − u | Yn = j, Tn = u, Y0 = i} R′i,j(n, u) du
= ∫_0^t ℙ{T1 > t − u | Y0 = j} R′i,j(n, u) du
= ∫_0^t R′i,j(n, u) e^{−νj(t−u)} du
= e^{−νjt} ∫_0^t R′i,j(n, u) e^{νju} du, [2.36]

where the fifth equality is due to the strong Markov property. Following lemma 2.10, the functions R′i,j(n, t) are continuous, therefore, relation [2.36] shows that the functions Hi,j(n, t) are differentiable, with derivatives given by:

H′i,j(n, t) = −νj Hi,j(n, t) + R′i,j(n, t), [2.37]

which shows that the functions H′i,j(n, t) are continuous on ℝ+. We have thus proved that the functions Hi,j(n, t) are of class C1 on ℝ+.

From relation [2.33] of lemma 2.11, we have, for all n ≥ 2, taking ℓ = n − 1 and using relations [2.25] and [2.36],

R′i,j(n, t) = Σ_{k∈S} ∫_0^t R′i,k(n − 1, u) R′k,j(1, t − u) du
= Σ_{k∈S} ∫_0^t R′i,k(n − 1, u) νk Pk,j e^{−νk(t−u)} du
= Σ_{k∈S} Hi,k(n − 1, t) νk Pk,j
= Σ_{k∈S, k≠j} Hi,k(n − 1, t) Qk,j. [2.38]


It can be easily observed that relation [2.38] is also valid for n = 1. Combining relations [2.36] and [2.38] gives, for n ≥ 1,

Hi,j(n, t) = ∫_0^t Σ_{k∈S, k≠j} Hi,k(n − 1, u) Qk,j e^{−νj(t−u)} du,

which is relation [2.34]. Replacing relation [2.38] in [2.37], we obtain relation [2.35].

THEOREM 2.9.– For all t ≥ 0 and i, j ∈ S, we have:

Pi,j(t) = e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{−νj(t−u)} du. [2.39]

PROOF.– By definition of the functions Hi,j(n, t), we have, by relation [2.8],

Pi,j(t) = Σ_{n=0}^∞ Hi,j(n, t) = e^{−νjt} 1_{i=j} + Σ_{n=1}^∞ Hi,j(n, t). [2.40]

Summing relation [2.34] over n ≥ 1 gives, from Fubini's theorem and the monotone convergence theorem,

Σ_{n=1}^∞ Hi,j(n, t) = ∫_0^t Σ_{k∈S, k≠j} Σ_{n=1}^∞ Hi,k(n − 1, u) Qk,j e^{−νj(t−u)} du,

that is:

Pi,j(t) − e^{−νjt} 1_{i=j} = ∫_0^t Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{−νj(t−u)} du,

which completes the proof.
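Relation [2.39] can be verified numerically for a finite chain by quadrature, again taking Pi,j(t) = (e^{tQ})i,j. The sketch below is ours: it evaluates the right-hand side of [2.39] with the trapezoidal rule on a hypothetical generator and compares it with Pi,j(t):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0,  2.0,  1.0],
              [ 4.0, -6.0,  2.0],
              [ 0.5,  0.5, -1.0]])
nu = -np.diag(Q)
i, j, t = 0, 2, 1.5

us = np.linspace(0.0, t, 2001)
# Integrand of [2.39]: sum over k != j of P_{i,k}(u) Q_{k,j} e^{-nu_j (t - u)}.
vals = np.array([
    sum(expm(u * Q)[i, k] * Q[k, j] for k in range(3) if k != j)
    * np.exp(-nu[j] * (t - u))
    for u in us
])
integral = np.sum((vals[:-1] + vals[1:]) * np.diff(us)) / 2.0  # trapezoidal rule
rhs = (i == j) * np.exp(-nu[j] * t) + integral
print(rhs, expm(t * Q)[i, j])  # the two values agree to quadrature accuracy
```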

The following corollary gives the forward version of the equations satisfied by the functions Pi,j(n, t), the corresponding backward equations having already been obtained in theorem 2.8. Recall that these functions have been defined in [2.14] by:

Pi,j(n, t) = ℙ{Xt = j, Tn+1 > t | X0 = i}.

COROLLARY 2.2.– For all t ≥ 0 and i, j ∈ S, we have Pi,j(0, t) = e^{−νjt} 1_{i=j} and, for all n ≥ 1,

Pi,j(n, t) = e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Pi,k(n − 1, u) Qk,j e^{−νj(t−u)} du [2.41]


and

P′i,j(n, t) = −νj e^{−νjt} 1_{i=j} if n = 0,
P′i,j(n, t) = −νj Pi,j(n, t) + Σ_{k∈S, k≠j} Pi,k(n − 1, t) Qk,j if n ≥ 1.

PROOF.– We have already seen in theorem 2.8 that:

Pi,j(0, t) = e^{−νit} 1_{i=j} = e^{−νjt} 1_{i=j}.

Taking the definition of the Hi,j(ℓ, t) used in relation [2.40], we have:

Hi,j(ℓ, t) = ℙ{Yℓ = j, Tℓ ≤ t < Tℓ+1 | Y0 = i} = ℙ{Xt = j, Tℓ ≤ t < Tℓ+1 | X0 = i}.

We, therefore, obtain, for all n ≥ 0,

Pi,j(n, t) = Σ_{ℓ=0}^n Hi,j(ℓ, t).

The functions Hi,j(ℓ, t) satisfy relation [2.34], that is:

Hi,j(ℓ, t) = ∫_0^t Σ_{k∈S, k≠j} Hi,k(ℓ − 1, u) Qk,j e^{−νj(t−u)} du.

For n ≥ 1, summing over ℓ = 1, . . . , n, from Fubini's theorem, we obtain:

Pi,j(n, t) = e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Σ_{ℓ=1}^n Hi,k(ℓ − 1, u) Qk,j e^{−νj(t−u)} du
= e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Σ_{ℓ=0}^{n−1} Hi,k(ℓ, u) Qk,j e^{−νj(t−u)} du
= e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Pi,k(n − 1, u) Qk,j e^{−νj(t−u)} du,


which is relation [2.41] and we can also write:

Pi,j(n, t) = e^{−νjt} [ 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Pi,k(n − 1, u) Qk,j e^{νju} du ]. [2.42]

Note that the integrand of this last relation is continuous since we have, from relation [2.38], for n ≥ 1,

Σ_{k∈S, k≠j} Pi,k(n − 1, t) Qk,j = Σ_{ℓ=0}^{n−1} Σ_{k∈S, k≠j} Hi,k(ℓ, t) Qk,j
= Σ_{ℓ=0}^{n−1} R′i,j(ℓ + 1, t)
= Σ_{ℓ=1}^n R′i,j(ℓ, t)

and since the functions R′i,j(ℓ, t) are continuous from lemma 2.10. Differentiating relation [2.42], we get, for n ≥ 1,

P′i,j(n, t) = −νj Pi,j(n, t) + Σ_{k∈S, k≠j} Pi,k(n − 1, t) Qk,j,

which completes the proof.

As in the case of Kolmogorov's backward integral equation, we can obtain the forward differential equation from the forward integral equation. This is given in the following theorem.

THEOREM 2.10.– Let X = {Xt, t ∈ ℝ+} be a Markov chain with infinitesimal generator Q. For all t ≥ 0, we have:

P′(t) = P(t)Q. [2.43]

PROOF.– Taking equation [2.39], we have, for all t ≥ 0 and i, j ∈ S,

Pi,j(t) = e^{−νjt} [ 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{νju} du ].


From lemma 2.9, the integrand of this relation is a continuous function, therefore, differentiating this equation gives:

P′i,j(t) = −νj Pi,j(t) + Σ_{k∈S, k≠j} Pi,k(t) Qk,j = Σ_{k∈S} Pi,k(t) Qk,j,

which completes the proof.

Conversely, the integral equation [2.39] is obtained from equation [2.43] by writing:

P′i,j(u) e^{νju} = Σ_{k∈S} Pi,k(u) Qk,j e^{νju} = −νj Pi,j(u) e^{νju} + Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{νju},

that is:

P′i,j(u) e^{νju} + Pi,j(u) νj e^{νju} = Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{νju},

or

(Pi,j(u) e^{νju})′ = Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{νju}.

Integrating this last relation between 0 and t gives:

Pi,j(t) e^{νjt} − Pi,j(0) = ∫_0^t Σ_{k∈S, k≠j} Pi,k(u) Qk,j e^{νju} du,

which is the integral equation [2.39], since Pi,j(0) = 1_{i=j}.

2.5. Existence and uniqueness of the solutions

Let X be a continuous-time Markov chain on a state space S, with infinitesimal generator Q. As we have seen in the previous two sections, the transition function matrix P(t), not necessarily stochastic, satisfies at the same time Kolmogorov's backward equations, as shown in theorems 2.6 and 2.7, as well as Kolmogorov's forward equations, as shown in theorems 2.9 and 2.10. Let us recall that Kolmogorov's backward equations have the integral form:

Gi,j(t) = e^{−νit} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠i} Qi,k Gk,j(u) e^{−νi(t−u)} du [2.44]


and the differential form:

G′(t) = QG(t), G(0) = I. [2.45]

Likewise, Kolmogorov's forward equations have the integral form:

Gi,j(t) = e^{−νjt} 1_{i=j} + ∫_0^t Σ_{k∈S, k≠j} Gi,k(u) Qk,j e^{−νj(t−u)} du [2.46]

and the differential form:

G′(t) = G(t)Q, G(0) = I, [2.47]

where G(t) is the matrix of the Gi,j(t)'s, for i, j ∈ S. The following theorem discusses the existence and the uniqueness of the solutions to these equations.

THEOREM 2.11.– Let X be a continuous-time Markov chain on a countable state space S, with infinitesimal generator Q.

1) The transition function matrix P(t), not necessarily stochastic, satisfies both the backward equations [2.44] and the forward equations [2.46].

2) The matrix P(t) is the minimal solution to each of these equations, in the sense that every other non-negative solution G(t) of the backward or forward equations, G(t) not necessarily being a transition function matrix, satisfies Pi,j(t) ≤ Gi,j(t), for all i, j ∈ S and t ≥ 0.

3) If the matrix P(t) is stochastic then it is the unique solution to the backward equations [2.44] as well as the unique solution to the forward equations [2.46], in the sense that these equations do not have any other non-negative solution G(t) such that, for all i ∈ S, Σ_{j∈S} Gi,j(t) ≤ 1. These results apply to the backward [2.45] and forward [2.47] equations.

PROOF.– Point 1 has already been proved by means of theorems 2.6 and 2.9.

For point 2, let us consider the sequence of functions Pi,j(n, t) defined in [2.14], for all t ≥ 0, n ≥ 0 and i, j ∈ S, by:

Pi,j(n, t) = ℙ{Xt = j, Tn+1 > t | X0 = i}.

We have, using relation [2.7],

Pi,j(t) = lim_{n→∞} Σ_{k=0}^n ℙ{Xt = j, Tk ≤ t < Tk+1 | Y0 = i} = lim_{n→∞} Pi,j(n, t). [2.48]


From theorem 2.8, we have, for all t ≥ 0 and i, j ∈ S, P_{i,j}(0, t) = e^{-ν_i t} 1_{i=j} and, for all n ≥ 1,

P_{i,j}(n, t) = e^{-ν_i t} 1_{i=j} + ∫_0^t ∑_{k∈S, k≠i} Q_{i,k} P_{k,j}(n−1, u) e^{-ν_i (t-u)} du.

Let G(t) = (G_{i,j}(t))_{i,j∈S} be a non-negative solution to the backward equations [2.44]. We then have:

G_{i,j}(t) ≥ e^{-ν_i t} 1_{i=j} = P_{i,j}(0, t).

Let us assume that for an integer n ≥ 1, we have, for all t ≥ 0 and i, j ∈ S, G_{i,j}(t) ≥ P_{i,j}(n−1, t). It follows that:

G_{i,j}(t) = e^{-ν_i t} 1_{i=j} + ∫_0^t ∑_{k∈S, k≠i} Q_{i,k} G_{k,j}(u) e^{-ν_i (t-u)} du
≥ e^{-ν_i t} 1_{i=j} + ∫_0^t ∑_{k∈S, k≠i} Q_{i,k} P_{k,j}(n−1, u) e^{-ν_i (t-u)} du
= P_{i,j}(n, t).

We have thus shown, by induction, that for all t ≥ 0, n ≥ 0 and i, j ∈ S, we have:

G_{i,j}(t) ≥ P_{i,j}(n, t).

Taking the limit when n tends to infinity, we obtain:

G_{i,j}(t) ≥ P_{i,j}(t),

which proves the minimality of the solution P(t) for the backward equations. We proceed exactly in the same way for the forward equations, starting from equations [2.41] instead of equations [2.15].

As for point 3, let us assume that the matrix P(t) is stochastic and let G(t) = (G_{i,j}(t))_{i,j∈S} be a non-negative solution to the backward equations [2.44] or to the forward equations [2.46] such that, for all i ∈ S, ∑_{j∈S} G_{i,j}(t) ≤ 1. As P(t) is minimal, we have, for all t ≥ 0 and i, j ∈ S, G_{i,j}(t) − P_{i,j}(t) ≥ 0. Summing over j, we obtain, for all i ∈ S,

0 ≤ ∑_{j∈S} [G_{i,j}(t) − P_{i,j}(t)] = ∑_{j∈S} G_{i,j}(t) − 1 ≤ 0.

Therefore, we have G(t) = P(t), which proves uniqueness.


These results apply to the backward [2.45] and forward [2.47] equations since, as we have previously seen, equations [2.44] and [2.45] are equivalent and equations [2.46] and [2.47] are equivalent too.

Extensions of these results to non-homogeneous Markov chains are studied in [REU 53].
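For a finite state space, these results can be checked numerically: the minimal solution is then P(t) = exp(tQ), which is stochastic and hence, by theorem 2.11, the unique solution of both systems. The following minimal sketch verifies the backward and forward differential equations by finite differences; the 3-state generator Q used here is an arbitrary illustrative choice, not an example taken from this book.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state generator (rows sum to 0); any conservative Q would do.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])

def P(t):
    # For a finite state space, the transition function matrix is exp(tQ).
    return expm(t * Q)

t, h = 0.7, 1e-6
dP = (P(t + h) - P(t - h)) / (2 * h)          # numerical derivative P'(t)
print(np.allclose(dP, Q @ P(t), atol=1e-4))   # backward equations: P'(t) = Q P(t)
print(np.allclose(dP, P(t) @ Q, atol=1e-4))   # forward equations:  P'(t) = P(t) Q
print(np.allclose(P(t).sum(axis=1), 1.0))     # P(t) is stochastic here
```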

2.6. Recurrent and transient states

As in the discrete case, we study in this section the notion of recurrent and transient states. We add to the already existing notations, when we deem it necessary, the chain X or Y concerned by these notations in subscript or superscript. For the discrete-time Markov chain Y embedded at the transition instants of the continuous-time Markov chain X, we denote, for all j ∈ S, by τY(j) the random variable that counts the number of transitions necessary to reach state j. Recall that this variable is defined by:

τY(j) = inf{n ≥ 1 | Y_n = j},

where τY(j) = ∞ if this set is empty. This random variable was studied in section 1.3. For all i, j ∈ S, we define:

f^Y_{i,j} = ℙ{τY(j) < ∞ | X_0 = i}.

For an analogous study in the continuous-time case, we recall that T_1 is the time of the first jump of the Markov chain X and we denote, for all j ∈ S, by τX(j) the first time after T_1 at which the chain X is in state j, that is:

τX(j) = inf{t ≥ T_1 | X_t = j},

again with the convention inf ∅ = ∞. If the initial state is different from j, τX(j) is the time of the first visit to state j and, if the initial state is state j, τX(j) is the first return time to state j. For all i, j ∈ S, we define:

f^X_{i,j} = ℙ{τX(j) < ∞ | X_0 = i}.

If the initial state i is an absorbing state, we have, by definition since T_1 = ∞, τX(j) = ∞, whereas in this case we have, for the chain Y, τY(j) = ∞ if i ≠ j and τY(j) = 1 if i = j. If the initial state is not an absorbing state, we have, for every state j ∈ S, since the process X is minimal,

τX(j) < ∞ ⇐⇒ τY(j) < ∞. [2.49]


Indeed, if τY(j) < ∞ then we have:

τX(j) = ∑_{k=1}^{τY(j)} S_k, [2.50]

where S_k is the time spent by the chain X in the kth state it visits. Therefore, τY(j) < ∞ leads to τX(j) < ∞. Conversely, if τX(j) < ∞ then, since the process X is the minimal process, it cannot explode before time τX(j); therefore, the number n of visited states between the times 0 and τX(j) is finite, which means that we have τY(j) = n < ∞.

Relation [2.50] is still valid if τY(j) = ∞ (or, equivalently, if τX(j) = ∞) and if X is not explosive, since in this case the sum of the series with terms S_k, which is equal to the explosion time ζ, is infinite. However, if τY(j) = ∞ (or, equivalently, if τX(j) = ∞) and if X is explosive then the sum of the series with terms S_k, which is equal to the explosion time ζ, can be finite. For example, for the continuous-time pure birth process, described in section 3.4, by taking X_0 = 0, we have, by definition, τY(0) = τX(0) = ∞ whereas, if the series with terms 1/λ_k converges, the series with terms S_k converges, which means that the process is explosive and, therefore, relation [2.50] is not satisfied.
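A small simulation sketch can make this explosion visible. The birth rates chosen below, λ_k = (k+1)² (summable reciprocals, hence explosive) and λ_k = k+1 (non-summable, hence non-explosive), are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def pure_birth_jump_times(lam, n_jumps):
    # The sojourn time in state k is exponential with rate lam(k);
    # the jump times are the partial sums T_n = S_1 + ... + S_n.
    sojourns = [rng.exponential(1.0 / lam(k)) for k in range(n_jumps)]
    return np.cumsum(sojourns)

# lam(k) = (k+1)^2: sum 1/lam(k) < infinity, so the jump times accumulate
# at a finite explosion time zeta (E[zeta] = pi^2/6, about 1.645).
print(pure_birth_jump_times(lambda k: (k + 1) ** 2, 10_000)[-1])

# lam(k) = k+1: sum 1/lam(k) diverges, so T_n grows without bound
# (here roughly like log n after n jumps): no explosion.
print(pure_birth_jump_times(lambda k: k + 1.0, 10_000)[-1])
```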

It follows from equivalence [2.49] that for all i, j ∈ S, if i is not absorbing then we have:

f^X_{i,j} = f^Y_{i,j}. [2.51]

If i is absorbing then, as we have seen above, we have f^X_{i,j} = 0 and f^Y_{i,j} = 1_{i=j}.

DEFINITION 2.4.– An absorbing state i ∈ S, that is such that ν_i = 0, is said to be recurrent. A non-absorbing state i ∈ S, that is such that ν_i > 0, is said to be recurrent for X if f^X_{i,i} = 1 and is said to be transient for X if f^X_{i,i} < 1. A Markov chain is said to be recurrent (respectively transient) if all its states are recurrent (respectively transient).

We define, for all i, j ∈ S and t ≥ 0, F_{i,j}(t) = ℙ{τX(j) ≤ t | X_0 = i}.

THEOREM 2.12.– For all i, j ∈ S and t ≥ 0, we have:

F_{i,j}(t) = P_{i,j}(1 − e^{-ν_i t}) + e^{-ν_i t} ∫_0^t ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(u) ν_i e^{ν_i u} du. [2.52]

Moreover, the functions F_{i,j} are of class C¹ over ℝ+ and we have:

F′_{i,j}(t) = ν_i [ P_{i,j} − F_{i,j}(t) + ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(t) ].


PROOF.– If the initial state i is absorbing then relation [2.52] is satisfied since, in this case, we have ν_i = 0, T_1 = ∞, τX(j) = ∞ and F_{i,j}(t) = 0. We therefore assume that the initial state i is not absorbing, that is T_1 < ∞ with probability 1. Let us note first that if X_{T_1} = j then τX(j) = T_1. We then obtain, from theorem 2.3,

F_{i,j}(t) = ℙ{τX(j) ≤ t | X_0 = i}
= ℙ{τX(j) ≤ t, X_{T_1} = j | X_0 = i} + ∑_{k∈S, k≠j} ℙ{τX(j) ≤ t, X_{T_1} = k | X_0 = i}
= ℙ{T_1 ≤ t, X_{T_1} = j | X_0 = i} + ∑_{k∈S, k≠j} ℙ{τX(j) ≤ t, X_{T_1} = k | X_0 = i}
= P_{i,j}(1 − e^{-ν_i t}) + ∑_{k∈S, k≠j} ℙ{τX(j) ≤ t, X_{T_1} = k | X_0 = i}.

Conditioning with respect to T_1 and using the strong Markov property at time T_1 and theorem 2.3, we have, for k ≠ j,

ℙ{τX(j) ≤ t, X_{T_1} = k | X_0 = i}
= ∫_0^t ℙ{τX(j) ≤ t | X_{T_1} = k, T_1 = u, X_0 = i} P_{i,k} ν_i e^{-ν_i u} du
= ∫_0^t ℙ{τX(j) ≤ t − u | X_0 = k} P_{i,k} ν_i e^{-ν_i u} du
= ∫_0^t P_{i,k} F_{k,j}(t − u) ν_i e^{-ν_i u} du
= e^{-ν_i t} ∫_0^t P_{i,k} F_{k,j}(u) ν_i e^{ν_i u} du,

where the second equality uses the strong Markov property at time T_1 and the last equality is due to the change of variable u := t − u. Regrouping these terms leads to:

F_{i,j}(t) = P_{i,j}(1 − e^{-ν_i t}) + e^{-ν_i t} ∑_{k∈S, k≠j} ∫_0^t P_{i,k} F_{k,j}(u) ν_i e^{ν_i u} du.

From the monotone convergence theorem, this gives:

F_{i,j}(t) = P_{i,j}(1 − e^{-ν_i t}) + e^{-ν_i t} ∫_0^t ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(u) ν_i e^{ν_i u} du,


which is relation [2.52]. The functions F_{k,j}(t) are cumulative distribution functions, that is non-negative, right-continuous and bounded by 1. The series:

∑_{k∈S, k≠j} P_{i,k} F_{k,j}(t)

is thus convergent, bounded by 1, and the function f_{i,j} defined by:

f_{i,j}(t) = ∫_0^t ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(u) ν_i e^{ν_i u} du

is continuous over ℝ+. This proves, following relation [2.52], that the functions F_{i,j}(t) are continuous over ℝ+. These functions being continuous, from the dominated convergence theorem, the series of functions:

∑_{k∈S, k≠j} P_{i,k} F_{k,j}(t)

is also a continuous function over ℝ+. The function f_{i,j} is, therefore, differentiable and, following relation [2.52], the functions F_{i,j}(t) are also differentiable. Differentiating [2.52] gives:

F′_{i,j}(t) = P_{i,j} ν_i e^{-ν_i t} − ν_i [ F_{i,j}(t) − P_{i,j}(1 − e^{-ν_i t}) ] + ν_i ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(t),

that is:

F′_{i,j}(t) = ν_i [ P_{i,j} − F_{i,j}(t) + ∑_{k∈S, k≠j} P_{i,k} F_{k,j}(t) ].

This last series, as we have just seen, being continuous over ℝ+, the functions F′_{i,j} are continuous as well.
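The distribution F_{i,j}(t) can also be estimated empirically by simulating the jump chain and the exponential sojourn times, which gives a quick sanity check on [2.52]. The sketch below is such a Monte Carlo estimate, again with a hypothetical 3-state generator; the function tau_X and the cutoff t_max are our own illustrative devices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
nu = -np.diag(Q)                      # rates nu_i
P = Q / nu[:, None] + np.eye(3)       # embedded chain: P_{i,k} = Q_{i,k}/nu_i, P_{i,i} = 0

def tau_X(i, j, t_max=50.0):
    # First time after T_1 at which the chain is in state j (t_max as cutoff).
    state, t = i, 0.0
    while True:
        t += rng.exponential(1.0 / nu[state])   # sojourn time in current state
        state = rng.choice(3, p=P[state])       # jump of the embedded chain
        if state == j or t > t_max:
            return t

i, j, t = 0, 2, 1.0
samples = np.array([tau_X(i, j) for _ in range(20_000)])
print((samples <= t).mean())   # Monte Carlo estimate of F_{i,j}(t)
```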

The following theorem gives an expression of P_{i,j}(t) as a function of the density F′_{i,j}(t) of τX(j), conditional on X_0 = i.

THEOREM 2.13.– For all i, j ∈ S and t ≥ 0, we have:

P_{i,j}(t) = e^{-ν_i t} 1_{i=j} + ∫_0^t F′_{i,j}(u) P_{j,j}(t − u) du. [2.53]


PROOF.– Recall that T_1 is the first jump time of the Markov chain X. Observing that {X_t = j, T_1 ≤ t} = {X_t = j, τX(j) ≤ t} and conditioning with respect to τX(j), which is a stopping time for X, we have:

P_{i,j}(t) = ℙ{X_t = j, T_1 > t | X_0 = i} + ℙ{X_t = j, T_1 ≤ t | X_0 = i}
= e^{-ν_i t} 1_{i=j} + ℙ{X_t = j, τX(j) ≤ t | X_0 = i}
= e^{-ν_i t} 1_{i=j} + ∫_0^t ℙ{X_t = j | τX(j) = u, X_0 = i} F′_{i,j}(u) du
= e^{-ν_i t} 1_{i=j} + ∫_0^t ℙ{X_{t−u} = j | X_0 = j} F′_{i,j}(u) du
= e^{-ν_i t} 1_{i=j} + ∫_0^t P_{j,j}(t − u) F′_{i,j}(u) du,

where the fourth equality uses the strong Markov property at time τX(j).

COROLLARY 2.3.– The state j ∈ S is recurrent for X if and only if:

∫_0^∞ P_{j,j}(t) dt = ∞.

PROOF.– If ν_j = 0, that is if the state j is absorbing, therefore recurrent by definition, then we have P_{j,j}(t) = 1 for all t ≥ 0 and thus ∫_0^∞ P_{j,j}(t) dt = ∞.

If ν_j > 0 then, from theorem 2.13, taking i = j in equation [2.53], we have:

P_{j,j}(t) = e^{-ν_j t} + ∫_0^t F′_{j,j}(u) P_{j,j}(t − u) du.

Integrating with respect to t and using Fubini’s theorem, we obtain:

∫_0^∞ P_{j,j}(t) dt = 1/ν_j + ∫_{t=0}^{∞} ∫_{u=0}^{t} F′_{j,j}(u) P_{j,j}(t − u) du dt
= 1/ν_j + ∫_{u=0}^{∞} F′_{j,j}(u) ∫_{t=u}^{∞} P_{j,j}(t − u) dt du
= 1/ν_j + ∫_{u=0}^{∞} F′_{j,j}(u) du ∫_{t=0}^{∞} P_{j,j}(t) dt
= 1/ν_j + f^X_{j,j} ∫_0^∞ P_{j,j}(t) dt.


This shows, in particular, that ∫_0^∞ P_{j,j}(t) dt ≥ 1/ν_j.

If ∫_0^∞ P_{j,j}(t) dt < ∞ then we have:

f^X_{j,j} = ( ∫_0^∞ P_{j,j}(t) dt − 1/ν_j ) / ∫_0^∞ P_{j,j}(t) dt < 1,

which means that state j is transient for X.

Conversely, let us assume that ∫_0^∞ P_{j,j}(t) dt = ∞. Integrating [2.53], for i = j, between 0 and T, for all T > 0, we have:

∫_0^T P_{j,j}(t) dt = (1 − e^{-ν_j T})/ν_j + ∫_{t=0}^{T} ∫_{u=0}^{t} F′_{j,j}(u) P_{j,j}(t − u) du dt
= (1 − e^{-ν_j T})/ν_j + ∫_{u=0}^{T} F′_{j,j}(u) ∫_{t=u}^{T} P_{j,j}(t − u) dt du
= (1 − e^{-ν_j T})/ν_j + ∫_{u=0}^{T} F′_{j,j}(u) ∫_{t=0}^{T−u} P_{j,j}(t) dt du
≤ (1 − e^{-ν_j T})/ν_j + ∫_{u=0}^{T} F′_{j,j}(u) du ∫_{t=0}^{T} P_{j,j}(t) dt
≤ (1 − e^{-ν_j T})/ν_j + f^X_{j,j} ∫_0^T P_{j,j}(t) dt.

This shows, in particular, that ∫_0^T P_{j,j}(t) dt ≥ (1 − e^{-ν_j T})/ν_j. We therefore obtain:

f^X_{j,j} ≥ ( ∫_0^T P_{j,j}(t) dt − (1 − e^{-ν_j T})/ν_j ) / ∫_0^T P_{j,j}(t) dt −→ 1 when T −→ ∞,

which proves that f^X_{j,j} = 1, which means that state j is recurrent for X.


COROLLARY 2.4.– If the state j is transient for X then, for all i ∈ S, we have:

∫_0^∞ P_{i,j}(t) dt < ∞,

hence:

lim_{t→∞} P_{i,j}(t) = 0 and lim_{t→∞} ℙ{X_t = j} = 0.

PROOF.– Let us take equation [2.53], that is:

P_{i,j}(t) = e^{-ν_i t} 1_{i=j} + ∫_0^t F′_{i,j}(u) P_{j,j}(t − u) du.

Integrating with respect to t and using Fubini’s theorem, we obtain, since e^{-ν_i t} 1_{i=j} = e^{-ν_j t} 1_{i=j},

∫_0^∞ P_{i,j}(t) dt = (1/ν_j) 1_{i=j} + ∫_{t=0}^{∞} ∫_{u=0}^{t} F′_{i,j}(u) P_{j,j}(t − u) du dt
= (1/ν_j) 1_{i=j} + ∫_{u=0}^{∞} F′_{i,j}(u) ∫_{t=u}^{∞} P_{j,j}(t − u) dt du
= (1/ν_j) 1_{i=j} + ∫_{u=0}^{∞} F′_{i,j}(u) du ∫_{t=0}^{∞} P_{j,j}(t) dt
= (1/ν_j) 1_{i=j} + f^X_{i,j} ∫_0^∞ P_{j,j}(t) dt.

If j is transient then ν_j > 0 and, from corollary 2.3, we have ∫_0^∞ P_{j,j}(t) dt < ∞. Since f^X_{i,j} ≤ 1, we obtain ∫_0^∞ P_{i,j}(t) dt < ∞, hence lim_{t→∞} P_{i,j}(t) = 0, since, from theorem 2.5, P_{i,j}(t) is uniformly continuous over ℝ+. If α is the initial distribution of X then we have:

ℙ{X_t = j} = ∑_{i∈S} α_i P_{i,j}(t)

and, from the dominated convergence theorem, we obtain:

lim_{t→∞} ℙ{X_t = j} = 0,

which completes the proof.


THEOREM 2.14.– Let i be a state of S.

1) State i is recurrent (respectively transient) for the chain Y if and only if state i is recurrent (respectively transient) for the chain X.

2) X is recurrent (respectively transient) if and only if Y is recurrent (respectively transient).

PROOF.– Point 2 is immediately derived from point 1. Let i be a state of S. It is easy to see that state i is absorbing for X if and only if state i is absorbing for Y. If state i is not absorbing then, from relation [2.51], we have f^X_{i,i} = f^Y_{i,i}, which completes the proof.

2.7. State classification

DEFINITION 2.5.– A state j ∈ S is said to be accessible from a state i ∈ S, for the chain X, if there exists a real number t ≥ 0 such that P_{i,j}(t) > 0. We then write i −→ j for X.

DEFINITION 2.6.– We say that two states i and j communicate, for the chain X, if they are accessible from one another. We then write i ←→ j for X. The chain X is said to be irreducible if for all i, j ∈ S, we have i ←→ j for X.

The notion of accessibility between two states of S for discrete-time Markov chains, for example Y, has been defined in Chapter 1. The following theorem shows that these notions are related to each other for the chains X and Y.

THEOREM 2.15.– Let i and j be two states of S. We have:

1) i −→ j for X ⇐⇒ i −→ j for Y.

2) i −→ j for X =⇒ P_{i,j}(t) > 0 for all t > 0.

3) X is irreducible ⇐⇒ Y is irreducible.

PROOF.– If i = j then we have i −→ i for X and i −→ i for Y, since P_{i,i}(0) = (P^0)_{i,i} = 1. Moreover, from lemma 2.6, we have P_{i,i}(t) > 0, for all t ≥ 0.

Let us therefore assume that i ≠ j. If i −→ j for X then there exists a real number t ≥ 0 such that P_{i,j}(t) > 0. This number t is positive since P_{i,j}(0) = 0. Following relation [2.40], there exists an integer n ≥ 1 such that:

ℙ{Y_n = j, T_n ≤ t < T_{n+1} | Y_0 = i} > 0.


For this integer n, we have:

0 < ℙ{Y_n = j, T_n ≤ t < T_{n+1} | Y_0 = i} ≤ ℙ{Y_n = j | Y_0 = i} = (P^n)_{i,j},

which proves that i −→ j for Y.

Conversely, if i −→ j for Y then, by definition, there exists an integer n ≥ 0 such that (P^n)_{i,j} > 0. This integer n is positive since (P^0)_{i,j} = 0. For n ≥ 2, we have:

(P^n)_{i,j} = ∑_{i_1, i_2, ..., i_{n−1} ∈ S} P_{i,i_1} P_{i_1,i_2} · · · P_{i_{n−1},j} > 0.

We then either have P_{i,j} > 0 if n = 1, or there exist states i_1, i_2, . . . , i_{n−1}, for n ≥ 2, such that:

P_{i,i_1} P_{i_1,i_2} · · · P_{i_{n−1},j} > 0.

Since i ≠ j and since, for all k ∈ S, we have P_{k,k} = 1 or P_{k,k} = 0 depending on whether k is absorbing or not, we can assume that the states i, i_1, i_2, . . . , i_{n−1} are not absorbing. Indeed, if i is absorbing then necessarily i = j, which is contrary to the hypothesis, and if the state i_ℓ is absorbing then necessarily i_ℓ = · · · = i_{n−1} = j, and we have:

P_{i,i_1} P_{i_1,i_2} · · · P_{i_{n−1},j} = P_{i,i_1} P_{i_1,i_2} · · · P_{i_{ℓ−1},j} > 0,

which only consists of replacing the integer n with the integer ℓ.

Since the states i, i_1, i_2, . . . , i_{n−1} are not absorbing, we have, using relation [2.40] and lemma 2.12, for all t > 0 and k = 1, . . . , n, by taking i_0 = i and i_n = j, and if i_k ≠ i_{k−1},

P_{i_{k−1},i_k}(t) ≥ H_{i_{k−1},i_k}(1, t)
= ∫_0^t ∑_{ℓ∈S, ℓ≠i_k} H_{i_{k−1},ℓ}(0, u) Q_{ℓ,i_k} e^{-ν_{i_k}(t−u)} du
= ν_{i_{k−1}} P_{i_{k−1},i_k} e^{-ν_{i_k} t} ∫_0^t e^{-(ν_{i_{k−1}} − ν_{i_k})u} du
> 0.

If i_k = i_{k−1} then we have, from lemma 2.6, P_{i_{k−1},i_k}(t) > 0. The Chapman–Kolmogorov equation [2.1] shows us that P(t) = (P(t/n))^n. We then have, again by taking i_0 = i and i_n = j, for all t > 0,

P_{i,j}(t) ≥ P_{i_0,i_1}(t/n) P_{i_1,i_2}(t/n) · · · P_{i_{n−1},i_n}(t/n) > 0,

which completes at the same time the proof of points 1 and 2. Point 3 is directly deduced from point 1.


Lemma 2.6 states that for all t ≥ 0 and i ∈ S, we have P_{i,i}(t) > 0. This shows that the notion of periodicity is meaningless for continuous-time Markov chains.

Theorems 2.14 and 2.15 show that the continuous-time Markov chain X inherits some class properties of its embedded Markov chain Y. In particular, the two chains have the same equivalence classes in the sense of communication between states; in particular, X is irreducible if and only if Y is irreducible. The recurrent and transient states are the same for both chains, as well as the class properties of recurrence and transience. Likewise, the closed classes and sets are the same since, for all i ≠ j, P_{i,j} = 0 is equivalent to Q_{i,j} = 0. It follows that the general structure of an infinitesimal generator Q of a continuous-time Markov chain is the same as that of a transition probability matrix P of a discrete-time Markov chain. It is, therefore, also given by [1.3] where P is of course replaced by Q.

On the other hand, as we will see, the positive or null recurrent classes are not the same, in general, for a continuous-time Markov chain X and its embedded Markov chain Y.

As in the discrete case, among the recurrent states, we distinguish positive recurrent and null recurrent states. To do so, we denote by m_i the expected return time to state i, that is:

m_i = 𝔼{τX(i) | X_0 = i}.

DEFINITION 2.7.– A recurrent, non-absorbing state i ∈ S is said to be positive recurrent if m_i < ∞ and null recurrent if m_i = ∞. An absorbing state i ∈ S is said to be positive recurrent. A Markov chain is said to be positive recurrent (respectively null recurrent) if all its states are positive recurrent (respectively null recurrent).

Note that if a non-absorbing state i is such that mi < ∞ then this state is recurrent.

Let us define, for all i, j ∈ S,

m_{i,j} = 𝔼{τX(j) | X_0 = i}.

The following theorem gives an equation that is satisfied by the m_{i,j}. Note that we have m_i = m_{i,i}.

THEOREM 2.16.– Let X be a continuous-time Markov chain. For all i, j ∈ S, we have:

m_{i,j} = 1/ν_i + ∑_{k∈S, k≠j} P_{i,k} m_{k,j}, [2.54]

where we define 1/ν_i = ∞ if ν_i = 0.


PROOF.– If the initial state i is absorbing then ν_i = 0 and thus T_1 = ∞ with probability 1. It then follows, by definition, that τX(j) = ∞ with probability 1, hence m_{i,j} = ∞ and relation [2.54] is satisfied.

If the initial state i is not absorbing then we have ν_i > 0 and T_1 < ∞ with probability 1. Proceeding in the same way as in the proof of theorem 2.12, we have, for all i, j ∈ S,

m_{i,j} = 𝔼{τX(j) | X_0 = i}
= P_{i,j} 𝔼{τX(j) | X_{T_1} = j, X_0 = i} + ∑_{k∈S, k≠j} P_{i,k} 𝔼{τX(j) | X_{T_1} = k, X_0 = i}
= (1/ν_i) P_{i,j} + ∑_{k∈S, k≠j} P_{i,k} 𝔼{τX(j) | X_{T_1} = k, X_0 = i}.

By conditioning with respect to T_1, using theorem 2.3 and the strong Markov property at time T_1, we have, for k ≠ j,

𝔼{τX(j) | X_{T_1} = k, X_0 = i}
= ∫_0^∞ 𝔼{τX(j) | X_{T_1} = k, T_1 = s, X_0 = i} ν_i e^{-ν_i s} ds
= ∫_0^∞ ( s + 𝔼{τX(j) | X_0 = k} ) ν_i e^{-ν_i s} ds
= 1/ν_i + 𝔼{τX(j) | X_0 = k}
= 1/ν_i + m_{k,j}.

In total, we obtain:

m_{i,j} = (1/ν_i) P_{i,j} + (1/ν_i) ∑_{k∈S, k≠j} P_{i,k} + ∑_{k∈S, k≠j} P_{i,k} m_{k,j}
= 1/ν_i + ∑_{k∈S, k≠j} P_{i,k} m_{k,j},

which completes the proof.
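On a finite state space with all ν_i > 0, equation [2.54] is a linear system: fixing the target state j and zeroing out column j of P, it reads (I − P_j) m_{·,j} = 1/ν. A minimal sketch, reusing the hypothetical 3-state generator from the earlier examples:

```python
import numpy as np

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
n = Q.shape[0]
nu = -np.diag(Q)
P = Q / nu[:, None] + np.eye(n)       # embedded transition matrix

def mean_hitting_times(j):
    # Solve m_{i,j} = 1/nu_i + sum_{k != j} P_{i,k} m_{k,j} for all i,
    # i.e. (I - P_j) m = 1/nu, where P_j is P with column j set to zero.
    Pj = P.copy()
    Pj[:, j] = 0.0
    return np.linalg.solve(np.eye(n) - Pj, 1.0 / nu)

m = mean_hitting_times(2)
print(m)   # m[i] = E[tau_X(2) | X0 = i]; in particular m[2] is the return time m_2
```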


2.8. Explosion

The explosion phenomenon appears when the Markov chain X produces infinitely many jumps in a finite time. We denote by ζ the first explosion time and we only consider the minimal process here, which consists of leaving the process X in an absorbing state Δ, which is not in S, from the time ζ. Thus, if ζ < ∞ then we take X_t = Δ for all t ≥ ζ. As we have seen at the beginning of this chapter, the explosion time ζ is defined by:

ζ = lim_{n→∞} T_n = ∑_{n=1}^{∞} S_n.

DEFINITION 2.8.– A continuous-time Markov chain X is said to be explosive if there exists a state i ∈ S such that ℙ{ζ < ∞ | X_0 = i} > 0 and non-explosive otherwise.

As we have previously seen, since X is the minimal process, we have, for all t ≥ 0,

X_t ∈ S ⇐⇒ ζ > t.

We then have, for all t ≥ 0,

∑_{j∈S} P_{i,j}(t) = ℙ{X_t ∈ S | X_0 = i} = ℙ{ζ > t | X_0 = i}.

It follows that:

X non-explosive ⇐⇒ for all i ∈ S, ℙ{ζ = ∞ | X_0 = i} = 1
⇐⇒ for all i ∈ S and t ≥ 0, ℙ{ζ > t | X_0 = i} = 1
⇐⇒ for all i ∈ S and t ≥ 0, ∑_{j∈S} P_{i,j}(t) = 1. [2.55]

LEMMA 2.13.– Let (V_n)_{n≥1} be a sequence of independent and identically exponentially distributed random variables, with rate ν > 0. The sum V_1 + · · · + V_n follows the Erlang distribution with n phases and with rate ν, that is, for all t ≥ 0, we have:

ℙ{V_1 + · · · + V_n > t} = ∫_t^∞ ν e^{-νx} (νx)^{n−1}/(n−1)! dx = ∑_{k=0}^{n−1} e^{-νt} (νt)^k / k!.

Moreover, if V = ∑_{n=1}^{∞} V_n then we have ℙ{V = ∞} = 1.


PROOF.– The result holds for n = 1 since, in this case, we obtain the exponential distribution with rate ν. For n ≥ 2, since the variables V_k are independent, the density g_n of V_1 + · · · + V_n is obtained by convolution as:

g_n(x) = ∫_0^x g_{n−1}(y) g_1(x − y) dy.

By induction, we obtain, for all n ≥ 1,

g_n(x) = ν e^{-νx} (νx)^{n−1}/(n−1)!.

The result is then obtained by integrating g_n between t and ∞.

For all k ≥ 1, we have:

𝔼{e^{-V_k}} = ∫_0^∞ e^{-x} ν e^{-νx} dx = ν/(ν + 1),

hence, by the independence of the V_k,

𝔼{ exp( −∑_{k=1}^{n} V_k ) } = ∏_{k=1}^{n} 𝔼{e^{-V_k}} = ( ν/(ν + 1) )^n.

Since 0 ≤ e^{-(V_1+···+V_n)} ≤ 1, we have, from the dominated convergence theorem,

𝔼{e^{-V}} = 𝔼{ lim_{n→∞} exp( −∑_{k=1}^{n} V_k ) } = lim_{n→∞} 𝔼{ exp( −∑_{k=1}^{n} V_k ) } = lim_{n→∞} ( ν/(ν + 1) )^n = 0,

that is V = ∞ with probability 1.
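The closed-form Erlang tail in lemma 2.13 is easy to cross-check by simulation; the parameter values below are arbitrary illustrative choices.

```python
import math
import numpy as np

rng = np.random.default_rng(2)
nu, n, t = 1.5, 4, 2.0

# Closed form from lemma 2.13: P{V1+...+Vn > t} = sum_{k<n} e^{-nu t}(nu t)^k / k!
tail = sum(math.exp(-nu * t) * (nu * t) ** k / math.factorial(k) for k in range(n))

# Monte Carlo estimate of the same probability.
samples = rng.exponential(1.0 / nu, size=(200_000, n)).sum(axis=1)
print(tail, (samples > t).mean())   # the two values should be close
```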

THEOREM 2.17.– Let X be a continuous-time Markov chain on a state space S. If one of the following conditions is satisfied then X is non-explosive.


1) S is finite.

2) sup_{i∈S} ν_i < ∞.

3) X_0 = i and state i is recurrent.

PROOF.– If S is finite then sup_{i∈S} ν_i < ∞; therefore, it is sufficient to consider conditions 2 and 3. For all n ≥ 1, let us define V_n = ν_{Y_{n−1}} S_n if S_n < ∞ and V_n = ∞ otherwise. From corollary 2.1, we have, if V_n < ∞,

ℙ{V_n > t_n, . . . , V_1 > t_1 | Y_n = i_n, . . . , Y_0 = i_0} = ∏_{ℓ=1}^{n} e^{-t_ℓ}.

By deconditioning, we obtain:

ℙ{V_n > t_n, . . . , V_1 > t_1} = ∏_{ℓ=1}^{n} e^{-t_ℓ},

which shows that if V_n < ∞ then the random variables V_1, . . . , V_n are independent and identically exponentially distributed with rate 1. Defining:

V = ∑_{n=1}^{∞} V_n,

we have, from lemma 2.13, ℙ{V = ∞} = 1.

Let us define ν = sup_{i∈S} ν_i. If ν = 0 then every state is absorbing and there is no explosion. Indeed, we have S_n = ∞ for all n ≥ 1 and thus ℙ{ζ = ∞} = 1.

If 0 < ν < ∞ then we have:

V = ∑_{n=1}^{∞} V_n = ∑_{n=1}^{∞} ν_{Y_{n−1}} S_n ≤ ν ∑_{n=1}^{∞} S_n = νζ.

Hence:

ℙ{ζ = ∞} = ℙ{νζ = ∞} ≥ ℙ{V = ∞} = 1,

which proves that condition 2 implies that the chain X does not explode.


Let us assume that X_0 = i and that state i is recurrent. If ν_i = 0 then state i is absorbing and there is no explosion, as we have seen in the proof of condition 2. If ν_i > 0 then, from theorem 2.14, state i being recurrent for X, it is also recurrent for Y. Since X_0 = Y_0 = i, from corollary 1.3, the total number of visits N_i of Y to state i is infinite ℙ_i-a.s., that is ℙ{N_i = ∞ | Y_0 = i} = 1. We denote, in order to simplify the writing, by τ = τY(i) the number of transitions necessary to return to state i, that is:

τ = inf{n ≥ 1 | Y_n = i}.

For ℓ ≥ 1, we define the sequence τ_ℓ of passage times to state i by setting τ_0 = 0 and

τ_ℓ = inf{n ≥ τ_{ℓ−1} + 1 | Y_n = i}.

Note that we have τ = τ_1. Since i is recurrent, we have, for all ℓ ≥ 0, τ_ℓ < ∞ and, therefore, Y_{τ_ℓ} = i with probability 1. By definition of V_n, we have:

V_{τ_ℓ+1} = ν_{Y_{τ_ℓ}} S_{τ_ℓ+1} = ν_i S_{τ_ℓ+1}.

By the strong Markov property, the sequence (S_{τ_ℓ+1})_{ℓ≥0} of the successive sojourn times in state i is a sequence of independent and identically exponentially distributed random variables with rate ν_i. The sequence (V_{τ_ℓ+1})_{ℓ≥0} thus forms a sequence of independent and identically exponentially distributed random variables with rate 1.

By taking V = ∑_{ℓ=0}^{∞} V_{τ_ℓ+1}, we have, from lemma 2.13, ℙ{V = ∞} = 1. We also have:

V = ∑_{ℓ=0}^{∞} V_{τ_ℓ+1} = ν_i ∑_{ℓ=0}^{∞} S_{τ_ℓ+1} ≤ ν_i ∑_{ℓ=1}^{∞} S_ℓ = ν_i ζ.

Hence, since ν_i > 0,

ℙ{ζ = ∞} = ℙ{ν_i ζ = ∞} ≥ ℙ{V = ∞} = 1,

which proves that condition 3 implies that the chain X does not explode.

In particular, this theorem shows that, among irreducible chains, only the transient chains can explode.


THEOREM 2.18.– Let X be a continuous-time Markov chain on a state space S, with infinitesimal generator Q. For all θ > 0, we define:

z_i = 𝔼{e^{-θζ} | X_0 = i}

and we denote by z = (z_i)_{i∈S} the column vector with the z_i as coefficients. We then have:

1) for all i ∈ S, 0 ≤ z_i ≤ 1,

2) Qz = θz.

Moreover, if z̃ = (z̃_i)_{i∈S} is a column vector satisfying points 1 and 2 then we have z̃_i ≤ z_i, for all i ∈ S.

PROOF.– For all θ > 0, we have 0 ≤ e^{-θζ} ≤ 1 and, therefore, we have 0 ≤ 𝔼{e^{-θζ}} ≤ 1, for every initial distribution of X. By starting in state i, we obtain 0 ≤ z_i ≤ 1.

Conditioning with respect to the pair (X_{T_1}, T_1) and by the strong Markov property applied at time T_1, we have, for all i ∈ S, using the fact that ζ = ∑_{k=1}^{∞} S_k and that T_1 = S_1,

𝔼{e^{-θζ} | X_0 = i}
= ∑_{j∈S} P_{i,j} ∫_0^∞ 𝔼{e^{-θζ} | X_{T_1} = j, T_1 = t, X_0 = i} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} ∫_0^∞ 𝔼{e^{-θt − θ(ζ−T_1)} | X_{T_1} = j, T_1 = t, X_0 = i} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} ∫_0^∞ e^{-θt} 𝔼{e^{-θζ} | X_0 = j} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} 𝔼{e^{-θζ} | X_0 = j} ∫_0^∞ e^{-θt} ν_i e^{-ν_i t} dt
= ( ν_i/(ν_i + θ) ) ∑_{j∈S} P_{i,j} 𝔼{e^{-θζ} | X_0 = j},

that is:

z_i = ( ν_i/(ν_i + θ) ) ∑_{j∈S} P_{i,j} z_j,


which can also be written as:

(ν_i + θ) z_i = ∑_{j∈S, j≠i} Q_{i,j} z_j,

that is, since ν_i = −Q_{i,i},

θ z_i = ∑_{j∈S} Q_{i,j} z_j.

We have thus shown that Qz = θz.

For n ≥ 0, let us consider the time of the (n+1)th jump, T_{n+1}. We have:

T_{n+1} = ∑_{k=1}^{n+1} S_k

and, conditioning with respect to the pair (X_{T_1}, T_1) and by the strong Markov property applied at time T_1, we have, for all i ∈ S, since T_1 = S_1,

𝔼{e^{-θT_{n+1}} | X_0 = i}
= ∑_{j∈S} P_{i,j} ∫_0^∞ 𝔼{e^{-θT_{n+1}} | X_{T_1} = j, T_1 = t, X_0 = i} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} ∫_0^∞ 𝔼{e^{-θt − θ(T_{n+1} − T_1)} | X_{T_1} = j, T_1 = t, X_0 = i} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} ∫_0^∞ e^{-θt} 𝔼{e^{-θT_n} | X_0 = j} ν_i e^{-ν_i t} dt
= ∑_{j∈S} P_{i,j} 𝔼{e^{-θT_n} | X_0 = j} ∫_0^∞ e^{-θt} ν_i e^{-ν_i t} dt
= ( ν_i/(ν_i + θ) ) ∑_{j∈S} P_{i,j} 𝔼{e^{-θT_n} | X_0 = j}.

Let z̃ = (z̃_i)_{i∈S} be a column vector that satisfies points 1 and 2. Since T_0 = 0, we have 𝔼{e^{-θT_0} | X_0 = i} = 1 and, therefore, using point 1, we have, for all i ∈ S,

z̃_i ≤ 𝔼{e^{-θT_0} | X_0 = i}.


We now proceed by induction. Assume that, for all i ∈ S, we have:

z̃_i ≤ 𝔼{e^{-θT_n} | X_0 = i},

for an integer n ≥ 0. Since z̃ satisfies condition 2, we have, for all i ∈ S,

z̃_i = ( ν_i/(ν_i + θ) ) ∑_{j∈S} P_{i,j} z̃_j
≤ ( ν_i/(ν_i + θ) ) ∑_{j∈S} P_{i,j} 𝔼{e^{-θT_n} | X_0 = j}
= 𝔼{e^{-θT_{n+1}} | X_0 = i}.

We have thus shown that for all n ≥ 0 and for all i ∈ S, we have:

z̃_i ≤ 𝔼{e^{-θT_n} | X_0 = i}.

From the dominated convergence theorem, we have, since e^{-θT_n} ≤ 1 and ζ = lim_{n→∞} T_n,

lim_{n→∞} 𝔼{e^{-θT_n} | X_0 = i} = 𝔼{e^{-θζ} | X_0 = i} = z_i,

thus we have z̃_i ≤ z_i, for all i ∈ S.

The next theorem is called Reuter’s criterion and gives a necessary and sufficient condition for a Markov chain to be non-explosive.

THEOREM 2.19.– REUTER’S CRITERION.– Let X be a continuous-time Markov chain on a state space S, with infinitesimal generator Q. Let z = (z_i)_{i∈S} be a column vector indexed by the states of S. The following conditions are equivalent:

1) X is not explosive.

2) For all θ > 0, Qz = θz and 0 ≤ z_i ≤ 1 for all i ∈ S =⇒ z = 0.

PROOF.– If X is not explosive then, for all i ∈ S, we have ℙ{ζ = ∞ | X_0 = i} = 1 and thus 𝔼{e^{-θζ} | X_0 = i} = 0. From theorem 2.18, Qz = θz and 0 ≤ z_i ≤ 1 for all i ∈ S implies that z_i ≤ 𝔼{e^{-θζ} | X_0 = i} for all i ∈ S. Therefore, z_i = 0 for all i ∈ S and condition 2 is satisfied.


Conversely, if condition 2 is satisfied then, from theorem 2.18, we have 𝔼{e^{-θζ} | X_0 = i} = 0 for all i ∈ S and thus ℙ{ζ = ∞ | X_0 = i} = 1, which means that X is not explosive.
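For a pure birth chain with rates λ_i, the eigenvalue equation Qz = θz reads θz_i = λ_i(z_{i+1} − z_i), so z can be built by the recursion z_{i+1} = z_i(1 + θ/λ_i): starting from z_0 = 1, z stays bounded (and, after rescaling by its supremum, gives a non-zero solution with 0 ≤ z_i ≤ 1, hence explosion) exactly when ∑ 1/λ_i < ∞. A minimal numeric sketch of this dichotomy, with illustrative rate choices:

```python
import numpy as np

def reuter_profile(lam, theta=1.0, n=2000):
    # Pure birth chain: theta*z_i = lam(i)*(z_{i+1} - z_i), i.e.
    # z_{i+1} = z_i * (1 + theta/lam(i)), starting from z_0 = 1.
    z = np.empty(n)
    z[0] = 1.0
    for i in range(n - 1):
        z[i + 1] = z[i] * (1.0 + theta / lam(i))
    return z

# lam(i) = (i+1)^2: the product converges, z is bounded; rescaling gives a
# non-zero solution with 0 <= z_i <= 1, so the chain is explosive.
print(reuter_profile(lambda i: (i + 1) ** 2)[-1])

# lam(i) = i+1: the product diverges (z[n-1] = n here), so only z = 0 is
# bounded by 1: the chain is non-explosive.
print(reuter_profile(lambda i: i + 1.0)[-1])
```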

2.9. Irreducible and recurrent Markov chains

Let X = {X_t, t ∈ ℝ+} be a continuous-time Markov chain on a state space S, with infinitesimal generator Q. For s > 0, let us consider the discrete-time process W^{(s)} = {W_n^{(s)}, n ≥ 0} defined, for all n ≥ 0, by:

W_n^{(s)} = X_{ns}.

This process is called the s-skeleton of the process X since it is extracted from the process X at times that are multiples of s.

LEMMA 2.14.– If X is not explosive then, for all s > 0, W^{(s)} is a homogeneous discrete-time Markov chain on S, with transition probability matrix P(s). Moreover, W^{(s)} is aperiodic and, if X is irreducible, then W^{(s)} is irreducible.

PROOF.– Let s > 0. If X is not explosive then, from equivalence [2.55], the matrix P(s) is stochastic.

For all n ∈ ℕ and for all i_0, . . . , i_n ∈ S, since X is a homogeneous Markov chain, we have:

ℙ{W_n^{(s)} = i_n | W_{n−1}^{(s)} = i_{n−1}, . . . , W_1^{(s)} = i_1, W_0^{(s)} = i_0}
= ℙ{X_{ns} = i_n | X_{(n−1)s} = i_{n−1}, . . . , X_s = i_1, X_0 = i_0}
= ℙ{X_{ns} = i_n | X_{(n−1)s} = i_{n−1}}
= ℙ{X_s = i_n | X_0 = i_{n−1}}
= P_{i_{n−1},i_n}(s);

therefore, W^{(s)} is a homogeneous discrete-time Markov chain on S. The chain W^{(s)} is aperiodic since, from lemma 2.6, we have, for all i ∈ S, P_{i,i}(s) > 0. If X is irreducible then, from point 2 of theorem 2.15, we have, for all i, j ∈ S, P_{i,j}(s) > 0; therefore, W^{(s)} is irreducible. In this case, the matrix P(s) does not contain any null elements.
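For a finite chain (hence non-explosive), the skeleton transition matrix is P(s) = exp(sQ), and the properties stated in lemma 2.14 can be observed directly. A minimal sketch with the same hypothetical generator as before:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
s = 0.3
Ps = expm(s * Q)                         # transition matrix of the s-skeleton

print(np.allclose(Ps.sum(axis=1), 1.0))  # stochastic: no explosion
print((Ps > 0).all())                    # all entries positive: the skeleton
                                         # is irreducible and aperiodic
```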

THEOREM 2.20.– If X is not explosive then, for all s > 0, we have:

1) The state i is recurrent for X if and only if i is recurrent for W^{(s)}.

2) X is recurrent if and only if W^{(s)} is recurrent.


PROOF.– If X is not explosive then, from lemma 2.14, for all s > 0, W^{(s)} is a Markov chain with transition probability matrix P(s). Point 2 is immediately deduced from point 1. To prove point 1, it is sufficient to show, from theorem 1.8, corollary 2.3 and since, from the Chapman–Kolmogorov equation [2.1], [P(s)]^n = P(ns), that:

∫_0^∞ P_{i,i}(t) dt < ∞ ⇐⇒ ∑_{n=0}^{∞} P_{i,i}(ns) < ∞. [2.56]

To show [2.56], we write:

∫_0^∞ P_{i,i}(t) dt = ∑_{n=0}^{∞} ∫_{ns}^{(n+1)s} P_{i,i}(t) dt.

Using the change of variable u = t − ns, we obtain:

∫_{ns}^{(n+1)s} P_{i,i}(t) dt = ∫_0^s P_{i,i}(ns + u) du

and

s min_{u∈[0,s]} P_{i,i}(ns + u) ≤ ∫_0^s P_{i,i}(ns + u) du ≤ s max_{u∈[0,s]} P_{i,i}(ns + u).

In total:

s ∑_{n=0}^{∞} min_{u∈[0,s]} P_{i,i}(ns + u) ≤ ∫_0^∞ P_{i,i}(t) dt ≤ s ∑_{n=0}^{∞} max_{u∈[0,s]} P_{i,i}(ns + u).

The Chapman–Kolmogorov equation [2.1] allows us to write, for u ∈ [0, s],

P_{i,i}(ns + u) = ∑_{k∈S} P_{i,k}(ns) P_{k,i}(u) ≥ P_{i,i}(ns) P_{i,i}(u) ≥ P_{i,i}(ns) γ,

where:

γ = min_{u∈[0,s]} P_{i,i}(u).

We then obtain:

min_{u∈[0,s]} P_{i,i}(ns + u) ≥ P_{i,i}(ns) γ.


From lemma 2.6 and theorem 2.7, the function P_{i,i}(u) is continuous and positive; therefore, we have γ > 0. In the same way, we have, for u ∈ [0, s],

P_{i,i}((n+1)s) = P_{i,i}(ns + u + s − u) = ∑_{k∈S} P_{i,k}(ns + u) P_{k,i}(s − u) ≥ P_{i,i}(ns + u) P_{i,i}(s − u).

From this, it follows that:

P_{i,i}(ns + u) ≤ P_{i,i}((n+1)s) / P_{i,i}(s − u) ≤ P_{i,i}((n+1)s) / min_{u∈[0,s]} P_{i,i}(s − u) = P_{i,i}((n+1)s) / γ,

that is:

max_{u∈[0,s]} P_{i,i}(ns + u) ≤ P_{i,i}((n+1)s) / γ.

By regrouping these results, we get:

γ s ∑_{n=0}^{∞} P_{i,i}(ns) ≤ ∫_0^∞ P_{i,i}(t) dt ≤ (s/γ) ∑_{n=0}^{∞} P_{i,i}((n+1)s) ≤ (s/γ) ∑_{n=0}^{∞} P_{i,i}(ns),

hence we deduce equivalence [2.56].

DEFINITION 2.9.– We say that the Markov chain X has an invariant measure v = (v_j, j ∈ S) if v is a measure on S, that is 0 ≤ v_j < ∞ for all j ∈ S, and if vQ = 0. The measure v is said to be positive if v_j > 0 for all j ∈ S.

THEOREM 2.21.– If the Markov chain X is irreducible and recurrent then it has, up to a multiplicative constant, a unique, positive invariant measure, denoted by γ^X. If |S| ≥ 2, we have, for all i ∈ S,

γ^X_i = γ^Y_i / ν_i,

where γ^Y is the unique, positive invariant measure, up to a multiplicative constant, of the embedded Markov chain Y.

PROOF.– The case where |S| = 1 is trivial. If |S| ≥ 2, we necessarily have ν_j > 0 for all j ∈ S, since X is irreducible. From theorems 2.14 and 2.15, the embedded Markov chain Y is also irreducible and recurrent. From theorem 1.17, Y has, up to a multiplicative constant, a unique, positive invariant measure. If we denote this unique


measure by γ^Y, we have γ^Y_j > 0 for all j ∈ S and γ^Y = γ^Y P. Let us define, for all i ∈ S, γ^X_i = γ^Y_i / ν_i. Since Q = Λ(P − I), we have, for all j ∈ S, γ^X_j > 0 and

(γ^X Q)_j = ∑_{i∈S} γ^X_i Q_{i,j}
= ∑_{i∈S} γ^X_i ν_i (P_{i,j} − 1_{i=j})
= ∑_{i∈S} γ^Y_i (P_{i,j} − 1_{i=j})
= (γ^Y P)_j − γ^Y_j
= 0,

which proves that γ^X is an invariant measure for X and that it is positive. The measure γ^Y being unique, up to a multiplicative constant, γ^X is also unique, up to a multiplicative constant.
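The relation γ^X_i = γ^Y_i / ν_i is easy to exercise numerically on a finite chain. A minimal sketch, again with a hypothetical 3-state generator (finite and irreducible, hence recurrent):

```python
import numpy as np

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
nu = -np.diag(Q)
P = Q / nu[:, None] + np.eye(3)           # embedded chain Y

# gamma^Y: left eigenvector of P for eigenvalue 1 (invariant measure of Y).
w, vl = np.linalg.eig(P.T)
gY = np.real(vl[:, np.argmin(np.abs(w - 1.0))])
gY = gY / gY.sum()

gX = gY / nu                              # theorem 2.21: gamma^X_i = gamma^Y_i / nu_i
print(np.allclose(gX @ Q, 0.0))           # gamma^X Q = 0, as claimed
```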

THEOREM 2.22.– Let X be an irreducible and recurrent Markov chain, with infinitesimal generator Q, and let μ be a measure on S. Then:

μQ = 0 ⇐⇒ μP(s) = μ for all s > 0.

PROOF.– The result is trivial if |S| = 1. In the case where |S| ≥ 2, since X is irreducible, we have ν_j > 0 for all j ∈ S. Let s > 0. Since X is recurrent, X is not explosive, from theorem 2.17. From lemma 2.14, the matrix P(s) is stochastic and the discrete-time Markov chain W^{(s)}, defined, for n ≥ 0, by W_n^{(s)} = X_{ns}, is irreducible since X is irreducible. Finally, from theorem 2.20, the chain W^{(s)} is also recurrent. From theorem 1.17, W^{(s)} has, up to a multiplicative constant, a unique, positive invariant measure. From theorem 2.21, X also has, up to a multiplicative constant, a unique, positive invariant measure. To prove the theorem, it is sufficient to find a positive measure μ such that μQ = 0 and μP(s) = μ.

Recall that the random variable τX(i) is defined by:

τX(i) = inf{t ≥ T_1 | X_t = i}

and that, in the same way, we have defined, for the embedded Markov chain Y,

τY(i) = inf{n ≥ 1 | Y_n = i}.

From theorem 1.16, we have τY(i) < ∞ with probability 1. From this, it follows, from equivalence [2.49], that we have τX(i) < ∞ with probability 1. For all i, j ∈ S, let us consider the random variable U_{i,j} defined by:

U_{i,j} = ∫_0^{τX(i)} 1_{X_t=j} dt.


Using relation [2.50], we have:

U_{i,j} = ∑_{n=1}^{τY(i)} S_n 1_{Y_{n−1}=j} = ∑_{n=1}^{∞} S_n 1_{Y_{n−1}=j, τY(i)>n−1}.

For all i ∈ S, we define the measure μ^i = (μ^i_j, j ∈ S) by:

μ^i_j = 𝔼{U_{i,j} | X_0 = i}.

We then have, from the monotone convergence theorem and the strong Markov property, and since X_0 = Y_0,

μ^i_j = 𝔼{ ∑_{n=1}^{∞} S_n 1_{Y_{n−1}=j, τY(i)>n−1} | Y_0 = i }
= ∑_{n=1}^{∞} 𝔼{ S_n 1_{Y_{n−1}=j, τY(i)>n−1} | Y_0 = i }
= ∑_{n=1}^{∞} 𝔼{S_n | Y_{n−1} = j, τY(i) > n−1, Y_0 = i} ℙ{Y_{n−1} = j, τY(i) > n−1 | Y_0 = i}
= ∑_{n=1}^{∞} 𝔼{S_n | Y_{n−1} = j} 𝔼{1_{Y_{n−1}=j, τY(i)>n−1} | Y_0 = i}
= (1/ν_j) ∑_{n=1}^{∞} 𝔼{1_{Y_{n−1}=j, τY(i)>n−1} | Y_0 = i}
= (1/ν_j) 𝔼{ ∑_{n=1}^{∞} 1_{Y_{n−1}=j, τY(i)>n−1} | Y_0 = i }
= (1/ν_j) 𝔼{ ∑_{n=1}^{τY(i)} 1_{Y_{n−1}=j} | Y_0 = i }
= (1/ν_j) 𝔼{ ∑_{n=1}^{τY(i)} 1_{Y_n=j} | Y_0 = i },

where the second and the sixth equalities are due to the monotone convergence theorem and the fourth equality uses the strong Markov property, since τY(i) is a


stopping time. The fifth equality uses corollary 2.1. The last equality is due to the fact that Y_{τY(i)} = i, which gives 𝔼{1_{Y_0=j} | Y_0 = i} = 1_{i=j} = 𝔼{1_{Y_{τY(i)}=j} | Y_0 = i}. We have seen, in the proof of theorem 1.17, that the measure γ^i = (γ^i_j, j ∈ S), defined by:

γ^i_j = 𝔼{ ∑_{n=1}^{τY(i)} 1_{Y_n=j} | Y_0 = i },

is, up to a multiplicative constant, the unique, positive invariant measure of the chain Y. We then have γ^i = γ^i P and, since we have just shown that μ^i_j = γ^i_j / ν_j, we have, from theorem 2.21, μ^i Q = 0.

It remains to show that μ^i P(s) = μ^i, for all s > 0. By the strong Markov property applied at the stopping time τX(i), we have, for all s > 0, since X_{τX(i)} = i,

𝔼{ ∫_{τX(i)}^{τX(i)+s} 1_{X_t=j} dt | X_0 = i } = 𝔼{ ∫_{τX(i)}^{τX(i)+s} 1_{X_t=j} dt | X_{τX(i)} = i } = 𝔼{ ∫_0^s 1_{X_t=j} dt | X_0 = i }.

We then have, for all s > 0, from Fubini’s theorem,

μ^i_j = 𝔼{ ∫_0^{τX(i)} 1_{X_t=j} dt | X_0 = i }
= 𝔼{ ∫_0^s 1_{X_t=j} dt + ∫_s^{τX(i)} 1_{X_t=j} dt | X_0 = i }
= 𝔼{ ∫_{τX(i)}^{τX(i)+s} 1_{X_t=j} dt + ∫_s^{τX(i)} 1_{X_t=j} dt | X_0 = i }
= 𝔼{ ∫_s^{τX(i)+s} 1_{X_t=j} dt | X_0 = i }
= 𝔼{ ∫_0^{τX(i)} 1_{X_{t+s}=j} dt | X_0 = i }
= 𝔼{ ∫_0^∞ 1_{X_{t+s}=j, τX(i)>t} dt | X_0 = i }


= ∫_0^∞ ℙ{X_{t+s} = j, τX(i) > t | X_0 = i} dt
= ∫_0^∞ ∑_{k∈S} ℙ{X_{t+s} = j, X_t = k, τX(i) > t | X_0 = i} dt
= ∫_0^∞ ∑_{k∈S} ℙ{X_{t+s} = j | X_t = k, τX(i) > t, X_0 = i} ℙ{X_t = k, τX(i) > t | X_0 = i} dt.

From this it follows, by the strong Markov property, since τX(i) is a stopping time, and from the monotone convergence theorem and Fubini’s theorem, that:

μ^i_j = ∫_0^∞ ∑_{k∈S} ℙ{X_{t+s} = j | X_t = k} 𝔼{1_{X_t=k, τX(i)>t} | X_0 = i} dt
= ∫_0^∞ ∑_{k∈S} P_{k,j}(s) 𝔼{1_{X_t=k, τX(i)>t} | X_0 = i} dt
= ∑_{k∈S} ∫_0^∞ 𝔼{1_{X_t=k, τX(i)>t} | X_0 = i} dt P_{k,j}(s)
= ∑_{k∈S} 𝔼{ ∫_0^∞ 1_{X_t=k, τX(i)>t} dt | X_0 = i } P_{k,j}(s)
= ∑_{k∈S} 𝔼{ ∫_0^{τX(i)} 1_{X_t=k} dt | X_0 = i } P_{k,j}(s)
= ∑_{k∈S} μ^i_k P_{k,j}(s).

We have thus shown that μ^i = μ^i P(s).

As in the discrete case, we denote by 𝟙 the column vector whose entries are all equal to 1 and whose dimension is determined by the context of its use.

DEFINITION 2.10.– An invariant probability for X on S is an invariant measure for X, v = (v_j, j ∈ S), such that v𝟙 = 1. The invariant probability v is said to be positive if v_j > 0 for all j ∈ S.

THEOREM 2.23.– Let X be a continuous-time irreducible and recurrent Markov chain and let γ^X be the unique, positive invariant measure, up to a multiplicative constant, for X. We have:

X positive recurrent ⇐⇒ ∑_{i∈S} γ^X_i < ∞.


In this case, X has a unique invariant probability, denoted by π^X, positive and given, for all j ∈ S, if |S| ≥ 2, by:

π^X_j = 1/(ν_j m_j).

PROOF.– Let Y be the Markov chain embedded at the transition instants of X and let P be its transition probability matrix. The chain X being irreducible and recurrent, Y is also irreducible and recurrent, from theorems 2.15 and 2.14. If |S| = 1, the result is trivial, by definition. If |S| ≥ 2, we necessarily have ν_i > 0 for all i ∈ S. From theorem 1.17, Y has, up to a multiplicative constant, a unique, positive invariant measure. If we denote by γ^Y this unique measure, we have γ^Y_i > 0 for all i ∈ S and γ^Y = γ^Y P. Multiplying each member of equation [2.54] by γ^Y_i and summing over i ∈ S, we obtain, for all j ∈ S,

∑_{i∈S} γ^Y_i m_{i,j} = ∑_{i∈S} γ^Y_i / ν_i + ∑_{i∈S} γ^Y_i ∑_{k∈S, k≠j} P_{i,k} m_{k,j}.

From Fubini’s theorem and since:

γ^Y_k = ∑_{i∈S} γ^Y_i P_{i,k},

we have:

∑_{i∈S} γ^Y_i m_{i,j} = ∑_{i∈S} γ^Y_i / ν_i + ∑_{k∈S, k≠j} ∑_{i∈S} γ^Y_i P_{i,k} m_{k,j}
= ∑_{i∈S} γ^Y_i / ν_i + ∑_{k∈S, k≠j} γ^Y_k m_{k,j},

which gives:

γ^Y_j m_j = ∑_{i∈S} γ^Y_i / ν_i,

and, since γ^Y_j > 0,

m_j = (1/γ^Y_j) ∑_{i∈S} γ^Y_i / ν_i.


From theorem 2.21, we have γ^X_j = γ^Y_j / ν_j. It follows that:

m_j = ( 1/(ν_j γ^X_j) ) ∑_{i∈S} γ^X_i. [2.57]

We thus have, for all j ∈ S,

m_j < ∞ ⇐⇒ ∑_{i∈S} γ^X_i < ∞, [2.58]

that is:

X positive recurrent ⇐⇒ ∑_{i∈S} γ^X_i < ∞.

In this case, X has a unique, positive invariant probability π^X = (π^X_j, j ∈ S) given by:

π^X_j = γ^X_j / ∑_{i∈S} γ^X_i,

that is, from relation [2.57],

π^X_j = 1/(ν_j m_j),

which completes the proof.
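The identity π^X_j = 1/(ν_j m_j) can be verified numerically by combining the hitting-time system [2.54] with the null space of Q. A minimal sketch, reusing the hypothetical 3-state generator and the hitting-time solver sketched after theorem 2.16:

```python
import numpy as np

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
n = Q.shape[0]
nu = -np.diag(Q)
P = Q / nu[:, None] + np.eye(n)

def mean_return_time(j):
    # Solve equation [2.54] for the column j and read off m_j = m_{j,j}.
    Pj = P.copy()
    Pj[:, j] = 0.0
    m = np.linalg.solve(np.eye(n) - Pj, 1.0 / nu)
    return m[j]

# pi^X as the normalized left null vector of Q (pi Q = 0, pi 1 = 1).
w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()

for j in range(n):
    print(pi[j], 1.0 / (nu[j] * mean_return_time(j)))   # the two columns agree
```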

COROLLARY 2.5.– If the Markov chain X is irreducible and recurrent then the states are either all positive recurrent or all null recurrent.

PROOF.– From theorem 2.21, X has, up to a multiplicative constant, a unique, positive invariant measure γ^X = (γ^X_i, i ∈ S). From theorem 2.23, if ∑_{i∈S} γ^X_i < ∞ then X is positive recurrent and, if ∑_{i∈S} γ^X_i = ∞, then we have, from equivalence [2.58], m_j = ∞ for all j ∈ S, which means, by definition, that X is null recurrent.


This corollary proves that positive recurrence and null recurrence are class properties. Nevertheless, contrary to the recurrent and transient cases, a continuous-time Markov chain X can be positive recurrent (respectively null recurrent) even though its embedded chain Y is null recurrent (respectively positive recurrent). We will see examples of such chains in the following chapter.

THEOREM 2.24.– Let X be a continuous-time irreducible Markov chain on a state space S. If X has an invariant probability π^X = (π^X_j, j ∈ S) such that ∑_{j∈S} ν_j π^X_j < ∞ then X is positive recurrent.

PROOF.– If |S| = 1, the result is trivial, by definition. Let us assume that |S| ≥ 2. Since X is irreducible, we have ν_j > 0, for all j ∈ S. If X has an invariant probability π^X, we have π^X Q = 0 and π^X 𝟙 = 1. Let μ = (μ_j, j ∈ S) be the row vector defined, for all j ∈ S, by μ_j = ν_j π^X_j. The vector μ can also be written as μ = π^X Λ, where Λ is the diagonal matrix containing the rates ν_j, j ∈ S. By hypothesis, we have μ𝟙 < ∞ and, since |S| ≥ 2, we have μ𝟙 > 0. If P is the transition probability matrix of the discrete-time Markov chain Y embedded at the transition instants of X, we have Q = Λ(P − I). We then have:

0 = π^X Q = π^X Λ P − π^X Λ = μP − μ.

Let us recall that, for infinite matrices, multiplication is distributive over addition, under the condition that the matrices involved are well-defined, and that multiplication is associative for positive infinite matrices. More extensive details on these questions are given in the first chapter of [KEM 66] and in section 4.3.

The vector v = μ/(μ𝟙) is, therefore, an invariant probability for the chain Y, which is irreducible, since X is irreducible, from theorem 2.15. Corollary 1.4 then states that Y is positive recurrent and, therefore, recurrent, which implies that X is also recurrent, from theorem 2.14. It then follows, from theorem 2.23, that X is positive recurrent.

THEOREM 2.25.– Let X be a continuous-time irreducible Markov chain. X is positive recurrent if and only if it is non-explosive and has an invariant probability. In this case, the invariant probability is unique and positive; it is denoted by π^X = (π^X_j, j ∈ S) and is given, for |S| ≥ 2, by:

π^X_j = 1/(ν_j m_j).

PROOF.– Let X be a continuous-time irreducible Markov chain on the state space S. If |S| = 1, the result is trivial. For |S| ≥ 2, since X is irreducible, we have ν_i > 0, for all i ∈ S.


If X is positive recurrent then, from theorem 2.17, X is non-explosive. Moreover, from theorem 2.23, X has a unique, positive invariant probability π^X given, for all j ∈ S, by:

π^X_j = 1/(ν_j m_j).

Conversely, let us assume that X is non-explosive and that it has an invariant probability, which we denote by π^X = (π^X_j, j ∈ S). For all t ≥ 0, n ≥ 0 and i, j ∈ S, we have defined in [2.14] the functions P_{i,j}(n, t) by:

P_{i,j}(n, t) = ℙ{X_t = j, T_{n+1} > t | X_0 = i}.

From relation [2.41] of corollary 2.2, we have P_{i,j}(0, t) = e^{-ν_j t} 1_{i=j} and, for all n ≥ 1,

P_{i,j}(n, t) = e^{-ν_j t} 1_{i=j} + ∫_0^t ∑_{k∈S, k≠j} P_{i,k}(n−1, u) Q_{k,j} e^{-ν_j(t−u)} du.

Multiplying each side of this equation by π^X_i and summing over i ∈ S, we obtain, from Fubini’s theorem and the monotone convergence theorem, for n ≥ 1,

∑_{i∈S} π^X_i P_{i,j}(n, t) = π^X_j e^{-ν_j t} + ∫_0^t ∑_{k∈S, k≠j} ∑_{i∈S} π^X_i P_{i,k}(n−1, u) Q_{k,j} e^{-ν_j(t−u)} du. [2.59]

For n = 0, we have P_{i,j}(0, t) = e^{-ν_j t} 1_{i=j}, that is, for all j ∈ S and t ≥ 0,

∑_{i∈S} π^X_i P_{i,j}(0, t) = π^X_j e^{-ν_j t} ≤ π^X_j.

Assume that, for an integer n ≥ 1, we have, for all j ∈ S and t ≥ 0,

∑_{i∈S} π^X_i P_{i,j}(n−1, t) ≤ π^X_j;

then, from relation [2.59], we obtain:

∑_{i∈S} π^X_i P_{i,j}(n, t) ≤ π^X_j e^{-ν_j t} + ∫_0^t ∑_{k∈S, k≠j} π^X_k Q_{k,j} e^{-ν_j(t−u)} du.


π^X being an invariant probability for X, we have π^X Q = 0, that is:

∑_{k∈S, k≠j} π^X_k Q_{k,j} = −π^X_j Q_{j,j} = ν_j π^X_j.

This gives, by replacing this quantity in the previous inequality,

∑_{i∈S} π^X_i P_{i,j}(n, t) ≤ π^X_j e^{-ν_j t} + ν_j π^X_j ∫_0^t e^{-ν_j(t−u)} du = π^X_j.

We have thus shown that, for all t ≥ 0, n ≥ 0 and j ∈ S, we have:

∑_{i∈S} π^X_i P_{i,j}(n, t) ≤ π^X_j. [2.60]

The Markov chain X being minimal, we have, from relation [2.48],

lim_{n→∞} P_{i,j}(n, t) = P_{i,j}(t).

From the dominated convergence theorem and inequality [2.60], we obtain:

lim_{n→∞} ∑_{i∈S} π^X_i P_{i,j}(n, t) = ∑_{i∈S} π^X_i P_{i,j}(t) ≤ π^X_j. [2.61]

On the other hand, since X is non-explosive, we have, from equivalences [2.55],

∑_{j∈S} P_{i,j}(t) = ℙ{ζ > t | X_0 = i} = 1.

Summing over j ∈ S in both sides of inequality [2.61], we obtain, from Fubini’s theorem, 1 on each side, since π^X is a probability. We then have, for all j ∈ S,

∑_{i∈S} π^X_i P_{i,j}(t) = π^X_j.

If X is transient then, as t tends to infinity and from corollary 2.4 and the dominated convergence theorem, we obtain π^X_j = 0 for all j ∈ S, which is in contradiction with the fact that π^X is a probability. It then follows that X is recurrent. π^X being an invariant probability for X, it follows, from theorem 2.23, that X is positive recurrent.


This previous theorem is very important since it shows that the existence of an invariant probability is not a sufficient condition for the chain to be positive recurrent. In addition, it also has to be non-explosive. We will see in section 3.5 an example of a Markov chain having an invariant probability without being positive recurrent, or even recurrent.

COROLLARY 2.6.– Let X be a continuous-time irreducible Markov chain. If one of the following conditions is satisfied then X does not have an invariant probability:

1) X is null recurrent.

2) X is transient and non-explosive.

PROOF.– If X is null recurrent then, by definition, X is recurrent and, since X is irreducible, it has, from theorem 2.21, up to a multiplicative constant, a unique, positive invariant measure, which we denote by γ^X. From theorem 2.23, we have, since X is null recurrent,

∑_{i∈S} γ^X_i = ∞.

It follows that X cannot have an invariant probability.

If X is transient and non-explosive then, from theorem 2.25, X cannot have an invariant probability. Indeed, in this case, if X has an invariant probability then X is positive recurrent, which is in contradiction with the hypothesis.

In the case of discrete-time irreducible Markov chains, we have seen that only the positive recurrent ones have an invariant probability. In the continuous-time case, the explosion phenomenon makes things a little more complex. If I denotes the set of continuous-time irreducible Markov chains with countably infinite state spaces, we can decompose I following the partition:

I = R+ ∪ R0 ∪ Te ∪ Tne,

where R+ (respectively R0) denotes the set of chains of I that are positive recurrent (respectively null recurrent), where Te denotes the set of chains of I that are transient and explosive and where Tne denotes the set of chains of I that are transient and non-explosive. If Pinv denotes the set of chains of I that have an invariant probability then we have seen, from corollary 2.6, that:

R0 ∩ Pinv = ∅ and Tne ∩ Pinv = ∅.

We then have, from theorem 2.25,

R+ ⊂ Pinv ⊂ R+ ∪ Te.


We will see in the examples of section 3.5 that these two inclusions are strict, that is, there exist irreducible, transient and explosive Markov chains having an invariant probability and there also exist irreducible, transient and explosive Markov chains without an invariant probability.

THEOREM 2.26.– Let X be a continuous-time irreducible Markov chain. X is positive recurrent if and only if, for all s > 0, the process W^{(s)} = {X_{ns}, n ≥ 0} is a discrete-time positive recurrent Markov chain.

PROOF.– Let X be a continuous-time irreducible Markov chain. If X is positive recurrent then, from theorem 2.25, X is non-explosive and has a unique invariant probability π^X. We then have π^X Q = 0 and π^X 𝟙 = 1. From lemma 2.14, W^{(s)} is a discrete-time irreducible Markov chain. From theorem 2.22, we have, for all s > 0, π^X P(s) = π^X and, therefore, W^{(s)} has a unique invariant probability. Corollary 1.4 then ensures that W^{(s)} is positive recurrent.

Conversely, let us assume that, for all s > 0, W^{(s)} is a discrete-time positive recurrent Markov chain. The matrix P(s) is thus a stochastic matrix, which implies, from [2.55], that X is non-explosive and, from lemma 2.14, that W^{(s)} is irreducible since X is irreducible. From corollary 1.4, W^{(s)} has a unique invariant probability μ. From theorem 2.20, X is recurrent since W^{(s)} is also recurrent and, from theorem 2.22, we have μQ = 0. X is thus irreducible, non-explosive and has an invariant probability. Theorem 2.25 then allows us to conclude that X is positive recurrent.

THEOREM 2.27.– Let X be a continuous-time Markov chain on the state space S, with infinitesimal generator Q. Let π be a measure on S. The following two assertions are equivalent:

1) πQ = 0.

2) For every partition A, B of S, we have:

∑_{i∈A} π_i ∑_{j∈B} Q_{i,j} = ∑_{i∈B} π_i ∑_{j∈A} Q_{i,j}.

PROOF.– Let π be a measure on S. Assume that πQ = 0. We then have, for all j ∈ S,

∑_{i∈S} π_i Q_{i,j} = 0.

Let A, B be two sets forming a partition of S, that is two non-empty sets such that A ∩ B = ∅ and A ∪ B = S. Summing over j ∈ A, we obtain, by Fubini’s theorem,

∑_{i∈S} π_i ∑_{j∈A} Q_{i,j} = 0,


and also, since A and B form a partition of S,

∑_{i∈A} π_i ∑_{j∈A} Q_{i,j} + ∑_{i∈B} π_i ∑_{j∈A} Q_{i,j} = 0.

However, since Q is an infinitesimal generator, we have:

∑_{j∈A} Q_{i,j} = −∑_{j∈B} Q_{i,j}.

Hence:

−∑_{i∈A} π_i ∑_{j∈B} Q_{i,j} + ∑_{i∈B} π_i ∑_{j∈A} Q_{i,j} = 0.

Conversely, by taking successively, for all ℓ ∈ S, A = {ℓ} and B = S \ {ℓ}, we obtain:

π_ℓ ∑_{j∈S\{ℓ}} Q_{ℓ,j} = ∑_{i∈S\{ℓ}} π_i Q_{i,ℓ},

that is:

−π_ℓ Q_{ℓ,ℓ} = ∑_{i∈S\{ℓ}} π_i Q_{i,ℓ},

or, for all ℓ ∈ S,

0 = ∑_{i∈S} π_i Q_{i,ℓ},

hence πQ = 0.
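Assertion 2 says that, under an invariant measure, the probability flow across every cut of the state space balances. A minimal numeric check of this flow balance, enumerating all non-trivial cuts of a hypothetical 3-state generator:

```python
import numpy as np
from itertools import combinations

# Hypothetical 3-state generator and its invariant probability pi (pi Q = 0).
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()

# Theorem 2.27: for every partition (A, B), flow A->B equals flow B->A.
S = range(3)
for r in (1, 2):
    for A in combinations(S, r):
        B = tuple(k for k in S if k not in A)
        flow_AB = sum(pi[i] * Q[i, j] for i in A for j in B)
        flow_BA = sum(pi[i] * Q[i, j] for i in B for j in A)
        print(A, B, np.isclose(flow_AB, flow_BA))
```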

2.10. Convergence to equilibrium

THEOREM 2.28.– Let X be a continuous-time irreducible and non-explosive Markov chain having an invariant probability π^X. For all i, j ∈ S, we have:

lim_{t→∞} P_{i,j}(t) = π^X_j.


PROOF.– Let s > 0. From lemma 2.14, since X is non-explosive, the discrete-time process W^{(s)} = {W_n^{(s)}, n ≥ 0} defined, for all n ≥ 0, by:

W_n^{(s)} = X_{ns},

is a homogeneous discrete-time Markov chain on S, with transition probability matrix P(s). Moreover, W^{(s)} is aperiodic and W^{(s)} is irreducible since X is irreducible. From theorem 2.25, X is positive recurrent; therefore, the invariant probability π^X is unique. Theorem 2.22 then states that the invariant probability π^X satisfies, for all s ≥ 0,

π^X P(s) = π^X.

Thus, for all s > 0, the discrete-time Markov chain W^{(s)} is irreducible, aperiodic and has an invariant probability π^X. It then follows, from theorem 1.22, that for all i, j ∈ S, we have:

lim_{n→∞} P_{i,j}(ns) = π^X_j.

Therefore, for all ε > 0, there exists an integer N such that:

n ≥ N =⇒ |P_{i,j}(ns) − π^X_j| ≤ ε/2.

The function F(s) = 1 − e^{-ν_i s} is continuous, increasing and non-negative with F(0) = 0; therefore, for all ε > 0, there exists s > 0 such that:

1 − e^{-ν_i s} ≤ ε/2.

Let us recall that for all x ∈ ℝ, ⌊x⌋ denotes the integer part of x, that is the largest integer less than or equal to x.

For t ≥ Ns, by taking n = ⌊t/s⌋, we have n ≥ N and ns ≤ t < (n + 1)s. It follows that for t ≥ Ns, from theorem 2.5, we have:

|P_{i,j}(t) − P_{i,j}(ns)| ≤ 1 − e^{-ν_i(t−ns)} ≤ 1 − e^{-ν_i s} ≤ ε/2.

In total, we have, for all t ≥ Ns,

|P_{i,j}(t) − π^X_j| ≤ |P_{i,j}(t) − P_{i,j}(ns)| + |P_{i,j}(ns) − π^X_j| ≤ ε,

which completes the proof.
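On a finite chain, this convergence is easy to observe: every row of P(t) = exp(tQ) approaches the invariant probability. A minimal sketch, with the same hypothetical generator as in the earlier examples:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()

for t in (0.1, 1.0, 5.0, 20.0):
    # Largest deviation of any row of P(t) from pi: decreases to 0 with t.
    print(t, np.abs(expm(t * Q) - pi).max())
```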


THEOREM 2.29.– Let X be a continuous-time irreducible Markov chain. If |S| ≥ 2 then we have, for all j ∈ S,

lim_{t→∞} ℙ{X_t = j} = 1/(ν_j m_j),

for every initial distribution, with the convention 1/∞ = 0.

REMARK 2.2.– Note that the case where |S| = 1 is of no interest. Indeed, in this case, the limit is trivially equal to 1 whereas the right-hand side is not defined since, by definition, ν_j = 0 and m_j = ∞.

PROOF.– If |S| ≥ 2, since X is irreducible, we have νj > 0 for all j ∈ S.

If X is transient then we have m_j = ∞, for all j ∈ S, and from corollary 2.4, we have:

lim_{t→∞} ℙ{X_t = j} = 0,

which completes the proof in this case.

Assume that X is recurrent. If X is positive recurrent then, from theorem 2.25, it is non-explosive and has a unique invariant probability π^X given by:

π^X_j = 1/(ν_j m_j).

Theorem 2.28 states that:

lim_{t→∞} P_{i,j}(t) = π^X_j.

If α is the initial distribution of X, we obtain, from the dominated convergence theorem,

lim_{t→∞} ℙ{X_t = j} = lim_{t→∞} ∑_{i∈S} α_i P_{i,j}(t) = π^X_j = 1/(ν_j m_j),

which completes the proof in this case.


If X is null recurrent then we have m_j = ∞, for all j ∈ S. It is then sufficient to show that:

lim_{t→∞} ℙ{X_t = j} = 0.

X being recurrent, X is non-explosive, from theorem 2.17. X being irreducible, the process W^{(s)} = {X_{ns}, n ≥ 0} is an aperiodic and irreducible Markov chain, from lemma 2.14. Moreover, theorems 2.20 and 2.26 allow us to conclude that W^{(s)} is null recurrent. Theorem 1.23 then ensures that, for all i, j ∈ S,

lim_{n→∞} P_{i,j}(ns) = 0.

We then proceed in the same way as in the proof of theorem 2.28.

For all ε > 0, there exists an integer N such that:

n ≥ N =⇒ P_{i,j}(ns) ≤ ε/2.

The function F(s) = 1 − e^{-ν_i s} is continuous, increasing and non-negative with F(0) = 0; therefore, for all ε > 0, there exists s > 0 such that:

1 − e^{-ν_i s} ≤ ε/2.

For t ≥ Ns, by taking n = ⌊t/s⌋, we have n ≥ N and ns ≤ t < (n + 1)s. It follows that for t ≥ Ns, from theorem 2.5, we have:

|P_{i,j}(t) − P_{i,j}(ns)| ≤ 1 − e^{-ν_i(t−ns)} ≤ 1 − e^{-ν_i s} ≤ ε/2.

In total, we have, for all t ≥ Ns,

P_{i,j}(t) ≤ |P_{i,j}(t) − P_{i,j}(ns)| + P_{i,j}(ns) ≤ ε.

If α is the initial distribution of X, we obtain, from the dominated convergence theorem,

lim_{t→∞} ℙ{X_t = j} = lim_{t→∞} ∑_{i∈S} α_i P_{i,j}(t) = 0,

which completes the proof.


As in the discrete case, the limiting probability distribution π is often called the stationary distribution or stationary probability of X. The study of the process X, or of a function of it, at finite instants with an initial distribution different from the stationary distribution is referred to as transient regime analysis; when time approaches infinity, or when the initial distribution is equal to the stationary distribution, it is referred to as stationary regime analysis.

2.11. Ergodic theorem

The following result gives the asymptotic behavior of the random variable that represents the percentage of time spent in a given state on the interval [0, t] when t tends to infinity.

If the state space is reduced to a single state, this percentage of time is equal to 1for all t > 0.

THEOREM 2.30.– ERGODIC THEOREM.– Let X be an irreducible Markov chain on a state space S, with |S| ≥ 2. For all j ∈ S, we have:

lim_{t→∞} (1/t) ∫_0^t 1_{X_s=j} ds = 1/(ν_j m_j), ℙ-a.s.,

where m_j = 𝔼{τX(j) | X_0 = j} is the expected return time to state j, with the convention 1/∞ = 0. If, moreover, X is positive recurrent then, for every bounded function r : S −→ ℝ, we have:

lim_{t→∞} (1/t) ∫_0^t r_{X_s} ds = ∑_{j∈S} r_j π_j, ℙ-a.s.,

where π = (π_j, j ∈ S) is the unique invariant probability of X.
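Before the proof, here is a minimal simulation sketch of the first statement: along one long trajectory of a (hypothetical, finite, irreducible) chain, the fraction of time spent in each state approaches π_j = 1/(ν_j m_j).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 3-state generator, for illustration only.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 0.5, 0.5, -1.0]])
nu = -np.diag(Q)
P = Q / nu[:, None] + np.eye(3)

occupation = np.zeros(3)
state, t_max, t = 0, 50_000.0, 0.0
while t < t_max:
    sojourn = rng.exponential(1.0 / nu[state])
    occupation[state] += min(sojourn, t_max - t)   # time spent in `state` within [0, t_max]
    t += sojourn
    state = rng.choice(3, p=P[state])

w, vl = np.linalg.eig(Q.T)
pi = np.real(vl[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()
print(occupation / t_max)   # empirical time averages over [0, t_max]
print(pi)                   # close to the above, by the ergodic theorem
```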

PROOF.– For all j ∈ S and for all t ≥ 0, we denote by W_j(t) the total time spent by the chain X in state j on the interval [0, t] and by W_j the total time spent by the chain X in state j, that is:

W_j(t) = ∫_0^t 1_{X_s=j} ds and W_j = ∫_0^∞ 1_{X_s=j} ds.

If X is transient then, from Fubini’s theorem and corollary 2.4, we have, for all i, j ∈ S,

𝔼{W_j | X_0 = i} = ∫_0^∞ 𝔼{1_{X_s=j} | X_0 = i} ds = ∫_0^∞ P_{i,j}(s) ds < ∞.


It then follows that ℙ{W_j < ∞ | X_0 = i} = 1 and, if α = (α_i, i ∈ S) denotes the initial distribution of X, that:

ℙ{W_j < ∞} = ∑_{i∈S} α_i ℙ{W_j < ∞ | X_0 = i} = 1.

Since:

W_j(t)/t ≤ W_j/t

and since W_j is finite with probability 1, we obtain, by taking the limit,

lim_{t→∞} W_j(t)/t = 0, ℙ-a.s.

On the other hand, since X is irreducible and |S| ≥ 2, we have ν_j > 0 for all j ∈ S. Moreover, X being transient, we have m_j = ∞ for all j ∈ S, that is 1/(ν_j m_j) = 0, which gives the result in the transient case.

Let us now assume that X is recurrent. From theorem 2.17, X is not explosive. Since X is irreducible and $|S| \geq 2$, we have $\nu_j > 0$ for all $j \in S$.

Let j be a fixed state of S. We denote by $T_j^0$ the first time t such that $X_t = j$, that is:

$$T_j^0 = \inf\{t \geq 0 \mid X_t = j\}.$$

If $X_0 = j$ then $T_j^0 = 0$ and if $X_0 \neq j$ then $T_j^0 = \tau_X(j)$, where we recall that $\tau_X(j)$ is defined by:

$$\tau_X(j) = \inf\{t \geq T_1 \mid X_t = j\}.$$

We also recall that Y denotes the Markov chain embedded at the transition instants of X and that the first passage time, $\tau_Y(j)$, to state j for the chain Y is defined by:

$$\tau_Y(j) = \inf\{n \geq 1 \mid Y_n = j\}.$$

The chain X being irreducible and recurrent, the chain Y is also irreducible and recurrent, from theorems 2.15 and 2.14. Theorem 1.16 then states that $\tau_Y(j)$ is finite with probability 1. It follows, from relation [2.50], that $\tau_X(j)$ is finite with probability 1. Therefore, $T_j^0$ is also finite with probability 1.

For $n \geq 1$, we denote by $M_j^n$ the time spent by X during its nth visit to state j, by $L_j^n$ the time spent by X between two successive visits to state j and by $T_j^n$ the time of the nth return of X to state j. These random variables are defined, for all $n \geq 0$, by:

$$M_j^{n+1} = \inf\{t > T_j^n \mid X_t \neq j\} - T_j^n$$
$$T_j^{n+1} = \inf\{t > T_j^n + M_j^{n+1} \mid X_t = j\}$$
$$L_j^{n+1} = T_j^{n+1} - T_j^n.$$

From corollary 2.1, the sojourn times in the states visited successively by X are independent and have an exponential distribution. By the strong Markov property applied at times $T_j^n$, $n \geq 0$, we show, as it has already been done in [RUB 89b], that the random variables $L_j^1, L_j^2, \ldots$ are independent and identically distributed with mean $m_j = \mathbb{E}\{\tau_X(j) \mid X_0 = j\}$. In the same way, the random variables $M_j^1, M_j^2, \ldots$ are independent and identically exponentially distributed with rate $\nu_j$.

By the strong law of large numbers as formulated in theorem 1.25, we obtain:

$$\lim_{n \to \infty} \frac{L_j^1 + \cdots + L_j^n}{n} = m_j \quad \text{with probability 1}$$

and

$$\lim_{n \to \infty} \frac{M_j^1 + \cdots + M_j^n}{n} = \frac{1}{\nu_j} \quad \text{with probability 1.}$$

Let $t \geq T_j^1$. Such a t exists since, for all $n \geq 0$, $T_j^n$ is finite with probability 1. Then there exists an integer $n \geq 1$, depending on t, such that $T_j^n \leq t < T_j^{n+1}$. It follows that:

$$\sum_{k=1}^{n} M_j^k = \int_0^{T_j^n} 1_{\{X_s = j\}}\,ds \leq \int_0^t 1_{\{X_s = j\}}\,ds \leq \int_0^{T_j^{n+1}} 1_{\{X_s = j\}}\,ds = \sum_{k=1}^{n+1} M_j^k$$

and

$$T_j^n = T_j^0 + L_j^1 + \cdots + L_j^n \leq t < T_j^0 + L_j^1 + \cdots + L_j^{n+1} = T_j^{n+1}.$$


We then obtain:

$$\frac{M_j^1 + \cdots + M_j^n}{T_j^0 + L_j^1 + \cdots + L_j^{n+1}} \leq \frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds \leq \frac{M_j^1 + \cdots + M_j^{n+1}}{T_j^0 + L_j^1 + \cdots + L_j^n}.$$

The upper bound can be written as:

$$\frac{M_j^1 + \cdots + M_j^{n+1}}{T_j^0 + L_j^1 + \cdots + L_j^n} = \frac{n+1}{n} \cdot \frac{\dfrac{M_j^1 + \cdots + M_j^{n+1}}{n+1}}{\dfrac{T_j^0}{n} + \dfrac{L_j^1 + \cdots + L_j^n}{n}}.$$

$T_j^0$ being finite with probability 1, we have $T_j^0/n \longrightarrow 0$ with probability 1 when $n \longrightarrow \infty$. Hence:

$$\lim_{n \to \infty} \frac{M_j^1 + \cdots + M_j^{n+1}}{T_j^0 + L_j^1 + \cdots + L_j^n} = \frac{1}{\nu_j m_j} \quad \text{with probability 1.}$$

In the same way, for the lower bound, we have:

$$\frac{M_j^1 + \cdots + M_j^n}{T_j^0 + L_j^1 + \cdots + L_j^{n+1}} = \frac{n}{n+1} \cdot \frac{\dfrac{M_j^1 + \cdots + M_j^n}{n}}{\dfrac{T_j^0}{n+1} + \dfrac{L_j^1 + \cdots + L_j^{n+1}}{n+1}},$$

that is:

$$\lim_{n \to \infty} \frac{M_j^1 + \cdots + M_j^n}{T_j^0 + L_j^1 + \cdots + L_j^{n+1}} = \frac{1}{\nu_j m_j} \quad \text{with probability 1.}$$

Since we have $T_j^n \leq t < T_j^{n+1}$, the variable $V_j(t) = n + 1$, depending on t, represents the total number of visits to state j on the interval $[0, t]$. When t tends to infinity, $V_j(t)$ tends to the total number $V_j$ of visits to state j. From theorem 1.13, we have $f_{i,j} = 1$, for all $i, j \in S$, since X is recurrent. From corollary 1.3, it follows that $\mathbb{P}\{V_j = \infty \mid X_0 = i\} = 1$, that is $V_j = \infty$ with probability 1. Since, when t tends to infinity, n tends to infinity with probability 1, we obtain:

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds = \frac{1}{\nu_j m_j}, \quad \mathbb{P}\text{-a.s.}$$

which completes the first part of the proof.


If, in addition, X is positive recurrent then, from theorem 2.25, X has a unique, positive invariant probability, which we denote by $\pi = (\pi_j, j \in S)$ and which is given, for all $j \in S$, by:

$$\pi_j = \frac{1}{\nu_j m_j}.$$

Let r be a bounded function from S to $\mathbb{R}$. To prove the second part, we write:

$$\frac{1}{t} \int_0^t r_{X_s}\,ds = \sum_{j \in S} r_j\,\frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds = \sum_{j \in S} r_j\,\frac{W_j(t)}{t}$$

and we pick up the second part of the proof of theorem 1.27 by replacing $V_j(n)/n$ with $W_j(t)/t$. Indeed, by taking $c = \sup_{j \in S} |r_j|$ and considering a subset $F \subset S$, we have:

$$\left|\frac{1}{t}\int_0^t r_{X_s}\,ds - \sum_{j \in S} r_j\pi_j\right| = \left|\sum_{j \in S} r_j\left(\frac{W_j(t)}{t} - \pi_j\right)\right| \leq c\sum_{j \in F}\left|\frac{W_j(t)}{t} - \pi_j\right| + c\sum_{j \in S \setminus F}\left(\frac{W_j(t)}{t} - \pi_j\right) + 2c\sum_{j \in S \setminus F}\pi_j.$$

Noting that:

$$\sum_{j \in S} W_j(t) = t \quad \text{and} \quad \sum_{j \in S} \pi_j = 1,$$

we obtain:

$$\sum_{j \in S \setminus F}\left(\frac{W_j(t)}{t} - \pi_j\right) = -\sum_{j \in F}\left(\frac{W_j(t)}{t} - \pi_j\right) \leq \sum_{j \in F}\left|\frac{W_j(t)}{t} - \pi_j\right|,$$

which leads to:

$$\left|\frac{1}{t}\int_0^t r_{X_s}\,ds - \sum_{j \in S} r_j\pi_j\right| \leq 2c\sum_{j \in F}\left|\frac{W_j(t)}{t} - \pi_j\right| + 2c\sum_{j \in S \setminus F}\pi_j.$$


We have seen that, for all $j \in S$, we have:

$$\lim_{t \to \infty} \frac{W_j(t)}{t} = \pi_j, \quad \mathbb{P}\text{-a.s.}$$

Let $\varepsilon > 0$. We choose F finite such that:

$$\sum_{j \in S \setminus F}\pi_j \leq \frac{\varepsilon}{4c}$$

and we choose $T = T(\omega)$ such that for $t \geq T$, we have:

$$\left|\frac{W_j(t)}{t} - \pi_j\right| \leq \frac{\varepsilon}{4c|F|}, \quad \text{for all } j \in F,$$

where $|F|$ denotes the number of states of F. We thus ensure that, for all $t \geq T$, we have:

$$2c\sum_{j \in F}\left|\frac{W_j(t)}{t} - \pi_j\right| + 2c\sum_{j \in S \setminus F}\pi_j \leq \varepsilon,$$

which completes the proof.
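
When S is finite, theorem 2.30 can be checked numerically by simulating the chain through its embedded jump chain and exponential sojourn times, and comparing the occupation frequencies $W_j(t)/t$ with the invariant probability $\pi$. The following is a minimal sketch, assuming a hypothetical three-state generator Q chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state generator Q (rows sum to 0); any irreducible Q would do.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 1.0, 2.0, -3.0]])
nu = -np.diag(Q)                      # sojourn rates nu_i
P = Q / nu[:, None] + np.eye(3)       # embedded jump chain Y

# Simulate X on [0, t_max]: exponential sojourns, jumps driven by P.
t_max, t, i = 10_000.0, 0.0, 0
occupation = np.zeros(3)              # W_j(t): time spent in each state
while t < t_max:
    s = rng.exponential(1.0 / nu[i])  # sojourn time in current state i
    occupation[i] += min(s, t_max - t)
    t += s
    i = rng.choice(3, p=P[i])         # next state of the embedded chain

# Stationary distribution: solve pi Q = 0 with sum(pi) = 1 (least squares).
A = np.vstack([Q.T, np.ones(3)])
pi = np.linalg.lstsq(A, np.array([0.0, 0.0, 0.0, 1.0]), rcond=None)[0]

print("time averages W_j(t)/t:", occupation / t_max)
print("stationary pi        :", pi)
```

For a long horizon the two printed vectors should agree to a few decimal places, in line with $W_j(t)/t \to \pi_j = 1/(\nu_j m_j)$.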

COROLLARY 2.7.– Let X be an irreducible Markov chain on a state space S, with $|S| \geq 2$. For all $j \in S$ and for every initial distribution $\alpha$, we have:

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t \mathbb{P}\{X_s = j\}\,ds = \frac{1}{\nu_j m_j}.$$

In particular, for all $i, j \in S$, we have:

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t P_{i,j}(s)\,ds = \frac{1}{\nu_j m_j}.$$

PROOF.– From theorem 2.30, we have:

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds = \frac{1}{\nu_j m_j}, \quad \mathbb{P}\text{-a.s.}$$


and since:

$$0 \leq \frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds \leq 1,$$

from the dominated convergence theorem, we obtain:

$$\lim_{t \to \infty} \mathbb{E}\left\{\frac{1}{t} \int_0^t 1_{\{X_s = j\}}\,ds\right\} = \frac{1}{\nu_j m_j},$$

that is, from Fubini's theorem,

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t \mathbb{P}\{X_s = j\}\,ds = \frac{1}{\nu_j m_j}.$$

The second result is obtained simply by choosing $\alpha = \delta_i$.

2.12. First passage times

2.12.1. First passage time to a state

Recall that the random variable $\tau_X(j)$, which represents the time taken by the chain X to reach or to return to state j, is defined by:

$$\tau_X(j) = \inf\{t \geq T_1 \mid X_t = j\},$$

where $\tau_X(j) = \infty$ if this set is empty and where $T_1$ is the first jump time of chain X. The time $\tau_X(j)$ is also referred to as the first passage time to state j. This random variable has already been studied, in part, in section 2.6. We have shown in theorem 2.12 that if we define, for all $i, j \in S$ and $t \geq 0$, $F_{i,j}(t) = \mathbb{P}\{\tau_X(j) \leq t \mid X_0 = i\}$ then the functions $F_{i,j}$ are of class $C^1$ over $\mathbb{R}_+$ and we have:

$$F_{i,j}'(t) = \nu_i\left[P_{i,j} - F_{i,j}(t) + \sum_{k \in S,\, k \neq j} P_{i,k}F_{k,j}(t)\right], \qquad [2.62]$$

where P is the transition probability matrix of the discrete-time Markov chain Y embedded at the transition instants of the chain X.


As in the discrete case, let us fix a state $j_0 \in S$ and let us define, for all $i \in S$ and $t \geq 0$,

$$G_i(t) = \mathbb{P}\{\tau_X(j_0) > t \mid X_0 = i\}.$$

We then have $G_i(t) = 1 - F_{i,j_0}(t)$ and relation [2.62] allows us to obtain, for all $i \in S$ and $t \geq 0$, by taking $S_0 = S \setminus \{j_0\}$,

$$G_i'(t) = -\nu_i\left[P_{i,j_0} - 1 + G_i(t) + \sum_{k \in S_0} P_{i,k}(1 - G_k(t))\right] = -\nu_i\left[G_i(t) - \sum_{k \in S_0} P_{i,k}G_k(t)\right].$$

The matrix Q being defined by $Q = -\Lambda(I - P)$, we obtain, for all $i \in S_0$,

$$G_i'(t) = \sum_{k \in S_0} -\nu_i(I_{i,k} - P_{i,k})G_k(t) = \sum_{k \in S_0} Q_{i,k}G_k(t) \qquad [2.63]$$

and

$$G_{j_0}'(t) = -\nu_{j_0}G_{j_0}(t) + \nu_{j_0}\sum_{k \in S_0} P_{j_0,k}G_k(t) = \sum_{k \in S} Q_{j_0,k}G_k(t).$$

We define the column vector $G(t)$ by:

$$G(t) = (G_i(t), i \in S_0)$$

and we denote by A the matrix obtained from Q by removing the row and the column corresponding to state $j_0$, that is $A = (Q_{i,j})_{i,j \in S_0}$. We then have $G(0) = \mathbb{1}$ and relation [2.63] can be written in matrix notation, for all $t \geq 0$, as:

$$G'(t) = AG(t). \qquad [2.64]$$
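
When $S_0$ is finite, equation [2.64] with the initial condition $G(0) = \mathbb{1}$ has the solution $G(t) = e^{At}\mathbb{1}$. As a minimal sketch, assuming a hypothetical generator Q and taking $j_0 = 0$, this can be evaluated with a matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator Q on S = {0, 1, 2}; we take j0 = 0, so S0 = {1, 2}.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 1.0, 2.0, -3.0]])
A = Q[1:, 1:]                     # A = (Q_{i,j})_{i,j in S0}

def G(t: float) -> np.ndarray:
    """Survival probabilities G_i(t) = P{tau_X(j0) > t | X_0 = i}, i in S0.
    Solution of G'(t) = A G(t) with G(0) = 1, that is G(t) = exp(At) 1."""
    return expm(A * t) @ np.ones(A.shape[0])

for t in (0.5, 1.0, 2.0):
    print(t, G(t))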

Defining, for all $i \in S$, $w_i = \mathbb{E}\{\tau_X(j_0) \mid X_0 = i\}$, we have $w_i = m_{i,j_0}$ and relation [2.54] can then be written as:

$$w_i = \frac{1}{\nu_i} + \sum_{k \in S_0} P_{i,k}w_k, \qquad [2.65]$$


with $w_i = \infty$ if $\nu_i = 0$. If w is the column vector defined by $w = (w_i, i \in S_0)$, this relation can be written in matrix notation as:

$$w = u + Mw, \qquad [2.66]$$

where $u = (u_i, i \in S_0)$ is the column vector whose coefficient $u_i$ is given by $u_i = 1/\nu_i$ and where M is the matrix obtained from P by removing the row and the column corresponding to state $j_0$, that is $M = (P_{i,j})_{i,j \in S_0}$. For all $i \in S$, we define, as in the discrete case, $v_i$ by:

$$v_i = \mathbb{P}\{\tau_X(j_0) = \infty \mid X_0 = i\}.$$

We then have $v_i = 1 - f^X_{i,j_0}$.

Note that, from theorem 1.13, using theorems 2.14, 2.15 and relation [2.51], if $i \longrightarrow j_0$ and if i is recurrent then we have $f^X_{i,j_0} = 1$, that is $v_i = 0$.

REMARK 2.3.– For the calculation of the expected first passage time $w_i$ to state $j_0$, starting from state $i \in S_0 = S \setminus \{j_0\}$, we only consider the states i such that $v_i = 0$, that is $f^X_{i,j_0} = 1$. Indeed, if i is such that $v_i > 0$, that is $f^X_{i,j_0} < 1$, then, starting from state i, $\tau_X(j_0)$ has a non-zero probability of being infinite, which means that $w_i = \infty$.

THEOREM 2.31.– Let X be a continuous-time Markov chain and let $j_0$ be a state of S such that $v_i = 0$, that is $f^X_{i,j_0} = 1$, for all $i \in S_0 = S \setminus \{j_0\}$. We then have:

$$w = \sum_{k=0}^{\infty} M^k u.$$

PROOF.– Since $v_i = 0$, that is since $f^X_{i,j_0} = 1$, the time $\tau_X(j_0)$ is finite with probability 1 when $X_0 = i$. Moreover, $v_i = 0$ implies that for all $i \in S_0$, we have $\nu_i > 0$. Indeed, if $\nu_i = 0$ then the state i is absorbing and we have $f^X_{i,j_0} = 0$, that is $v_i = 1$. We denote by $\tau_Y(j_0)$ the first passage time to state $j_0$ for the embedded chain Y, that is:

$$\tau_Y(j_0) = \inf\{n \geq 1 \mid Y_n = j_0\}.$$

By definition of $\tau_X(j_0)$ and of $\tau_Y(j_0)$ and since $\tau_X(j_0) < \infty$, $\mathbb{P}_i$-a.s., we have, by [2.50] and when $X_0 = i \in S_0$,

$$\tau_X(j_0) = \sum_{k=1}^{\tau_Y(j_0)} S_k = \sum_{k=1}^{\infty} S_k 1_{\{\tau_Y(j_0) \geq k\}},$$


where, for $k \geq 1$, $S_k$ is the time spent by the chain X in the kth state visited. Note that we must distinguish the set $S_0$ and the random variables $S_k$, $k \geq 1$. For all $k \geq 1$ and $i \in S_0$, we have, since $X_0 = Y_0$,

$$\begin{aligned}
\mathbb{E}\{S_k 1_{\{\tau_Y(j_0) \geq k\}} \mid X_0 = i\}
&= \mathbb{E}\{S_k 1_{\{Y_{k-1} \in S_0, \ldots, Y_1 \in S_0\}} \mid Y_0 = i\} \\
&= \sum_{\ell \in S_0} \mathbb{E}\{S_k \mid Y_{k-1} = \ell, Y_{k-2} \in S_0, \ldots, Y_1 \in S_0, Y_0 = i\} \\
&\qquad \times\; \mathbb{P}\{Y_{k-1} = \ell, Y_{k-2} \in S_0, \ldots, Y_1 \in S_0 \mid Y_0 = i\} \\
&= \sum_{\ell \in S_0} \mathbb{E}\{S_k \mid Y_{k-1} = \ell\} \sum_{i_1, \ldots, i_{k-2} \in S_0} P_{i,i_1}P_{i_1,i_2} \cdots P_{i_{k-2},\ell} \\
&= \sum_{\ell \in S_0} (M^{k-1})_{i,\ell}\,\frac{1}{\nu_\ell} \\
&= (M^{k-1}u)_i,
\end{aligned}$$

where the third equality is due to the Markov property and theorem 1.1 and the fourth equality is due to the definition of M. Summing this relation over $k \geq 1$, we obtain, for all $i \in S_0$, from the monotone convergence theorem,

$$w_i = \mathbb{E}\{\tau_X(j_0) \mid X_0 = i\} = \sum_{k=1}^{\infty} \mathbb{E}\{S_k 1_{\{\tau_Y(j_0) \geq k\}} \mid X_0 = i\} = \sum_{k=1}^{\infty} (M^{k-1}u)_i,$$

which completes the proof.

REMARK 2.4.– 1) The condition $v_i = 0$ is important in the previous theorem. Indeed, in the case of the continuous-time pure birth process described in section 3.4, by taking $j_0 = 0$ and $X_0 = 1$, we have $v_1 = 1$, which gives $\tau_X(0) = \infty$ with probability 1, that is, in particular, $w_1 = \infty$. On the other hand, for all $i, j \geq 1$, we have $M_{i,j} = 1_{\{j = i+1\}}$, which leads, for all $k \geq 0$, to $(M^k)_{i,j} = 1_{\{j = i+k\}}$. It follows that:

$$\left(\sum_{k=0}^{\infty} M^k u\right)_1 = \sum_{k=1}^{\infty} \frac{1}{\nu_k}$$

and this last series converges in the case where the chain X is explosive (see the numerical sketch after this remark).

2) In the case where some states of $S_0$ are such that $v_i > 0$, we consider the sets $S_0^+$ and $S_0^0$ defined by $S_0^+ = \{\ell \in S_0 \mid v_\ell > 0\}$ and $S_0^0 = \{\ell \in S_0 \mid v_\ell = 0\}$. These two sets of states form a partition of $S_0$ and we decompose matrix M following this partition by writing:

$$M = \begin{pmatrix} M^+ & M^{+,0} \\ M^{0,+} & M^0 \end{pmatrix},$$


where the matrix $M^+$ (respectively $M^0$) contains the transition probabilities between the states of $S_0^+$ (respectively $S_0^0$) and where the matrix $M^{+,0}$ (respectively $M^{0,+}$) contains the transition probabilities from the states of $S_0^+$ (respectively $S_0^0$) to the states of $S_0^0$ (respectively $S_0^+$).

Let $i \in S_0^0$. The state i cannot be absorbing since $v_i = 0$. Let us assume that there exists a state $j \in S_0^+$ such that $M_{i,j} > 0$. From relation [2.51] and theorem 1.30, we have, since $v_\ell = 1 - f_{\ell,j_0}$,

$$0 = v_i = \sum_{\ell \in S_0} M_{i,\ell}v_\ell = \sum_{\ell \in S_0^+} M_{i,\ell}v_\ell \geq M_{i,j}v_j > 0,$$

which is a contradiction. It then follows that $M^{0,+} = 0$. Theorem 2.31 can then be extended to the case where some $v_i$ are non-zero and we have $w_i = \infty$ if $i \in S_0^+$ and

$$w^0 = \sum_{k=0}^{\infty} (M^0)^k u^0,$$

where $w^0 = (w_i, i \in S_0^0)$ and $u^0 = (u_i, i \in S_0^0)$.
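
Point 1 of remark 2.4 can be made concrete numerically. For a pure birth process with, say, the hypothetical rates $\nu_k = k^2$ (an explosive case), the series $\sum_{k \geq 1} 1/\nu_k$, which is the expected explosion time starting from state 1, converges; a quick check:

```python
import numpy as np

# Remark 2.4(1), illustrated: pure birth process with hypothetical rates
# nu_k = k**2 (explosive). The series sum_{k>=1} 1/nu_k is the expected
# explosion time starting from state 1; here it converges to pi^2 / 6.
k = np.arange(1, 1_000_001)
print(np.sum(1.0 / k**2))   # ~1.6449, close to pi^2 / 6 ~ 1.64493
```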

THEOREM 2.32.– Let X be a continuous-time Markov chain and let $j_0$ be a state of S such that $v_i = 0$, that is $f^X_{i,j_0} = 1$, for all $i \in S_0 = S \setminus \{j_0\}$. The column vector w is the smallest non-negative solution to the linear system $z = u + Mz$.

PROOF.– We have already seen by relation [2.66] that w satisfies the system $z = u + Mz$. Let z be a solution to the system $z = u + Mz$ such that $z \geq 0$. By iterating this relation n times, we obtain:

$$z = \sum_{k=0}^{n} M^k u + M^{n+1}z.$$

Since $z \geq 0$, we get:

$$z \geq \sum_{k=0}^{n} M^k u,$$

which gives, by taking the limit when n tends to infinity and from theorem 2.31,

$$z \geq w,$$

which completes the proof.

Note that remark 2.4 applies here as well in the case where some states i of $S_0$ are such that $v_i > 0$. In this case, with the notation introduced above, the column vector $w^0$ is the smallest non-negative solution to the linear system $z = u^0 + M^0 z$.
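
Numerically, when $S_0$ is finite and all $v_i = 0$, the minimality in theorem 2.32 is not an obstacle: $I - M$ is then invertible and w is obtained directly by solving $(I - M)w = u$. A minimal sketch, reusing a hypothetical generator Q:

```python
import numpy as np

# Hypothetical generator Q on S = {0, 1, 2}; target state j0 = 0.
Q = np.array([[-2.0, 1.0, 1.0],
              [ 3.0, -4.0, 1.0],
              [ 1.0, 2.0, -3.0]])
nu = -np.diag(Q)
P = Q / nu[:, None] + np.eye(3)    # embedded jump chain

S0 = [1, 2]                        # S0 = S \ {j0}
M = P[np.ix_(S0, S0)]              # M = (P_{i,j})_{i,j in S0}
u = 1.0 / nu[S0]                   # u_i = 1 / nu_i

# w = sum_k M^k u is the minimal non-negative solution of z = u + M z;
# with S0 finite and all v_i = 0, it solves (I - M) w = u directly.
w = np.linalg.solve(np.eye(len(S0)) - M, u)
print("expected first passage times to j0 = 0:", dict(zip(S0, w)))
```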


2.12.2. First passage time to a subset of states

As we have seen in the discrete case, the results obtained in the previous section can be easily generalized to a subset of states. Let us consider a partition of the state space S into two non-empty subsets B and $B^c$. Recall that, by definition of a partition, we have $B \cap B^c = \emptyset$ and $B \cup B^c = S$. We then define the random variable $\tau(B^c)$, which represents the time taken to reach the set $B^c$, by:

$$\tau(B^c) = \inf\{t \geq T_1 \mid X_t \in B^c\},$$

where $\tau(B^c) = \infty$ if this set is empty. The time $\tau(B^c)$ is also called the first passage time to the subset $B^c$ or the hitting time of the subset $B^c$. For all $i \in S$ and $t \geq 0$, we define $F_{i,B^c}(t) = \mathbb{P}\{\tau(B^c) \leq t \mid X_0 = i\}$. The following theorem is then a generalization of theorem 2.12, which is recovered by taking $B^c = \{j\}$.

THEOREM 2.33.– For all $i \in S$ and $t \geq 0$, we have:

$$F_{i,B^c}(t) = (1 - e^{-\nu_i t})\sum_{k \in B^c} P_{i,k} + e^{-\nu_i t}\int_0^t \sum_{k \in B} P_{i,k}F_{k,B^c}(u)\nu_i e^{\nu_i u}\,du. \qquad [2.67]$$

Moreover, the functions $F_{i,B^c}$ are of class $C^1$ over $\mathbb{R}_+$ and we have:

$$F_{i,B^c}'(t) = \nu_i\left[\sum_{k \in B^c} P_{i,k} - F_{i,B^c}(t) + \sum_{k \in B} P_{i,k}F_{k,B^c}(t)\right]. \qquad [2.68]$$

PROOF.– The proof is almost identical to that of theorem 2.12. If the initial state i is absorbing then relation [2.67] is satisfied since, in this case, we have $\nu_i = 0$, $T_1 = \infty$, $\tau(B^c) = \infty$ and thus $F_{i,B^c}(t) = 0$. We then assume that the state i is not absorbing, that is $T_1 < \infty$ with probability 1. Let us first note that if $X_{T_1} \in B^c$ then $\tau_X(B^c) = T_1$. We then have, from theorem 2.3,

$$\begin{aligned}
F_{i,B^c}(t) &= \mathbb{P}\{\tau_X(B^c) \leq t \mid X_0 = i\} \\
&= \sum_{k \in B^c} \mathbb{P}\{\tau_X(B^c) \leq t, X_{T_1} = k \mid X_0 = i\} + \sum_{k \in B} \mathbb{P}\{\tau_X(B^c) \leq t, X_{T_1} = k \mid X_0 = i\} \\
&= \sum_{k \in B^c} \mathbb{P}\{T_1 \leq t, X_{T_1} = k \mid X_0 = i\} + \sum_{k \in B} \mathbb{P}\{\tau_X(B^c) \leq t, X_{T_1} = k \mid X_0 = i\} \\
&= (1 - e^{-\nu_i t})\sum_{k \in B^c} P_{i,k} + \sum_{k \in B} \mathbb{P}\{\tau_X(B^c) \leq t, X_{T_1} = k \mid X_0 = i\}.
\end{aligned}$$


Conditioning with respect to $T_1$ and using the strong Markov property at time $T_1$ as well as theorem 2.3, we have, for $k \in B$,

$$\begin{aligned}
\mathbb{P}\{\tau_X(B^c) \leq t, X_{T_1} = k \mid X_0 = i\}
&= \int_0^t \mathbb{P}\{\tau_X(B^c) \leq t \mid X_{T_1} = k, T_1 = u, X_0 = i\}P_{i,k}\nu_i e^{-\nu_i u}\,du \\
&= \int_0^t \mathbb{P}\{\tau_X(B^c) \leq t - u \mid X_0 = k\}P_{i,k}\nu_i e^{-\nu_i u}\,du \\
&= \int_0^t F_{k,B^c}(t-u)P_{i,k}\nu_i e^{-\nu_i u}\,du \\
&= e^{-\nu_i t}\int_0^t F_{k,B^c}(u)P_{i,k}\nu_i e^{\nu_i u}\,du,
\end{aligned}$$

where the second equality uses the strong Markov property at time $T_1$ and the last equality is due to the change of variable $u := t - u$. Regrouping the terms, we obtain:

$$F_{i,B^c}(t) = (1 - e^{-\nu_i t})\sum_{k \in B^c} P_{i,k} + e^{-\nu_i t}\sum_{k \in B}\int_0^t P_{i,k}F_{k,B^c}(u)\nu_i e^{\nu_i u}\,du.$$

The monotone convergence theorem then allows us to interchange the sum and the integral, that is:

$$F_{i,B^c}(t) = (1 - e^{-\nu_i t})\sum_{k \in B^c} P_{i,k} + e^{-\nu_i t}\int_0^t \sum_{k \in B} P_{i,k}F_{k,B^c}(u)\nu_i e^{\nu_i u}\,du,$$

which completes the first part of the proof. The second part is identical to that of theorem 2.12.

For all $i \in S$ and $t \geq 0$, let us define:

$$G_i(t) = \mathbb{P}\{\tau_X(B^c) > t \mid X_0 = i\}.$$

Note that in order to simplify the notation, we have removed the reference to the subset $B^c$. We then have $G_i(t) = 1 - F_{i,B^c}(t)$ and relation [2.68] allows us to write, for all $i \in S$ and $t \geq 0$,

$$G_i'(t) = -\nu_i\left[\sum_{k \in B^c} P_{i,k} - 1 + G_i(t) + \sum_{k \in B} P_{i,k}(1 - G_k(t))\right] = -\nu_i\left[G_i(t) - \sum_{k \in B,\, k \neq i} P_{i,k}G_k(t)\right].$$


The matrix Q being defined by $Q = -\Lambda(I - P)$, we obtain, for all $i \in B$,

$$G_i'(t) = \sum_{k \in B} Q_{i,k}G_k(t) \qquad [2.69]$$

and, for all $i \in B^c$,

$$G_i'(t) = -\nu_i G_i(t) + \sum_{k \in B} Q_{i,k}G_k(t). \qquad [2.70]$$

We decompose the transition rate matrix Q following the partition $B, B^c$ by writing:

$$Q = \begin{pmatrix} Q_B & Q_{BB^c} \\ Q_{B^cB} & Q_{B^c} \end{pmatrix}.$$

The matrix $Q_B$ (respectively $Q_{B^c}$) regroups the transition rates between the states of B (respectively $B^c$) and the matrix $Q_{BB^c}$ (respectively $Q_{B^cB}$) regroups the transition rates from the states of B (respectively $B^c$) to the states of $B^c$ (respectively B). We then denote, for all $t \geq 0$, by $G_B(t)$ and by $G_{B^c}(t)$ the column vectors defined by:

$$G_B(t) = (G_i(t), i \in B) \quad \text{and} \quad G_{B^c}(t) = (G_i(t), i \in B^c).$$

Relations [2.69] and [2.70] can then be written in matrix notation as:

$$G_B'(t) = Q_B G_B(t) \quad \text{and} \quad G_{B^c}'(t) = -\Lambda_{B^c}G_{B^c}(t) + Q_{B^cB}G_B(t), \qquad [2.71]$$

where $\Lambda_{B^c}$ is the diagonal matrix containing the rates $\nu_i$, for $i \in B^c$.

Let us now define, for all $i \in S$, $w_i = \mathbb{E}\{\tau_X(B^c) \mid X_0 = i\}$. We then obtain the generalization of theorem 2.16, where P denotes the transition probability matrix of the Markov chain Y embedded at the transition instants of X.

THEOREM 2.34.– For all $i \in S$, we have:

$$w_i = \frac{1}{\nu_i} + \sum_{k \in B} P_{i,k}w_k, \qquad [2.72]$$

where we define $1/\nu_i = \infty$ if $\nu_i = 0$.


PROOF.– If the initial state i is absorbing then $\nu_i = 0$ and, therefore, $T_1 = \infty$ with probability 1. It then follows, by definition, that $\tau_X(B^c) = \infty$ with probability 1, hence $w_i = \infty$ and relation [2.72] is satisfied.

If the initial state i is not absorbing then we have $\nu_i > 0$ and $T_1 < \infty$ with probability 1. Proceeding in the same way as in the proof of theorem 2.16, we obtain, for all $i \in S$,

$$\begin{aligned}
w_i &= \mathbb{E}\{\tau_X(B^c) \mid X_0 = i\} \\
&= \sum_{k \in B} P_{i,k}\,\mathbb{E}\{\tau_X(B^c) \mid X_{T_1} = k, X_0 = i\} + \sum_{k \in B^c} P_{i,k}\,\mathbb{E}\{\tau_X(B^c) \mid X_{T_1} = k, X_0 = i\} \\
&= \sum_{k \in B} P_{i,k}\,\mathbb{E}\{\tau_X(B^c) \mid X_{T_1} = k, X_0 = i\} + \frac{1}{\nu_i}\sum_{k \in B^c} P_{i,k}.
\end{aligned}$$

Conditioning with respect to $T_1$, using theorem 2.3 and the Markov property at time $T_1$, we have, for $k \in B$,

$$\begin{aligned}
\mathbb{E}\{\tau_X(B^c) \mid X_{T_1} = k, X_0 = i\}
&= \int_0^\infty \mathbb{E}\{\tau_X(B^c) \mid X_{T_1} = k, T_1 = s, X_0 = i\}\,\nu_i e^{-\nu_i s}\,ds \\
&= \int_0^\infty \left(s + \mathbb{E}\{\tau_X(B^c) \mid X_0 = k\}\right)\nu_i e^{-\nu_i s}\,ds \\
&= \frac{1}{\nu_i} + \mathbb{E}\{\tau_X(B^c) \mid X_0 = k\} \\
&= \frac{1}{\nu_i} + w_k.
\end{aligned}$$

In total, we obtain:

$$w_i = \frac{1}{\nu_i}\sum_{k \in B} P_{i,k} + \sum_{k \in B} P_{i,k}w_k + \frac{1}{\nu_i}\sum_{k \in B^c} P_{i,k} = \frac{1}{\nu_i} + \sum_{k \in B} P_{i,k}w_k,$$

which completes the proof.


We denote by $w_B$ and $w_{B^c}$ the column vectors defined by:

$$w_B = (w_i, i \in B) \quad \text{and} \quad w_{B^c} = (w_i, i \in B^c).$$

Equation [2.72] can be written in matrix notation as:

$$w_B = u_B + P_B w_B \quad \text{and} \quad w_{B^c} = u_{B^c} + P_{B^cB}w_B,$$

where $u_B = (u_i, i \in B)$ and $u_{B^c} = (u_i, i \in B^c)$ are the column vectors whose coefficients $u_i$ are given by $u_i = 1/\nu_i$, where $P_B$ is the sub-matrix of P containing the transition probabilities between the states of B and where $P_{B^cB}$ is the sub-matrix of P containing the transition probabilities from the states of $B^c$ to the states of B. For all $i \in S$, we define, as in the discrete case, $v_i$ by:

$$v_i = \mathbb{P}\{\tau_X(B^c) = \infty \mid X_0 = i\}.$$

Defining $f^X_{i,B^c} = \mathbb{P}\{\tau_X(B^c) < \infty \mid X_0 = i\}$, we have $v_i = 1 - f^X_{i,B^c}$.

REMARK 2.5.– Theorem 1.32 can be easily extended to the continuous case. Indeed, for all $j \in B^c$, we have $\{t \geq T_1 \mid X_t = j\} \subseteq \{t \geq T_1 \mid X_t \in B^c\}$, hence $\tau_X(B^c) \leq \tau_X(j)$. It follows that $\tau_X(j) < \infty \Longrightarrow \tau_X(B^c) < \infty$, which means that for all $i \in S$, we have $f^X_{i,j} \leq f^X_{i,B^c}$. In particular, if $f^X_{i,j} = 1$ then $f^X_{i,B^c} = 1$ and $v_i = 0$.

As in the previous section, in order to calculate the expected first passage time $w_i$ to the subset $B^c$, starting from state $i \in B$, we only consider states i such that $v_i = 0$, that is $f^X_{i,B^c} = 1$. Indeed, if i is such that $v_i > 0$, that is $f^X_{i,B^c} < 1$, then, starting from state i, $\tau_X(B^c)$ has a non-zero probability of being infinite, which means that $w_i = \infty$.

THEOREM 2.35.– Let X be a continuous-time Markov chain and let $B, B^c$ be a partition of S such that $v_i = 0$, that is $f^X_{i,B^c} = 1$, for all $i \in B$. We then have:

$$w_B = \sum_{k=0}^{\infty} (P_B)^k u_B.$$

PROOF.– The proof is almost identical to that of theorem 2.31. Since $v_i = 0$, that is since $f^X_{i,B^c} = 1$, the time $\tau_X(B^c)$ is finite with probability 1 when $X_0 = i$. Moreover, $v_i = 0$ for all $i \in B$ implies that we have $\nu_i > 0$ for all $i \in B$. Indeed, if $\nu_i = 0$ then the state i is absorbing and we have $f^X_{i,B^c} = 0$ and $v_i = 1$. We denote by $\tau_Y(B^c)$ the first passage time to the subset $B^c$ for the embedded chain Y, that is:

$$\tau_Y(B^c) = \inf\{n \geq 1 \mid Y_n \in B^c\}.$$


By definition of $\tau_X(B^c)$ and $\tau_Y(B^c)$ and since $\tau_X(B^c) < \infty$, $\mathbb{P}_i$-a.s., we have, as in [2.50] and when $X_0 = i \in B$,

$$\tau_X(B^c) = \sum_{k=1}^{\tau_Y(B^c)} S_k = \sum_{k=1}^{\infty} S_k 1_{\{\tau_Y(B^c) \geq k\}},$$

where $S_k$, $k \geq 1$, is the time spent by the chain X in the kth state visited. For all $k \geq 1$ and $i \in B$, we have, since $X_0 = Y_0$,

$$\begin{aligned}
\mathbb{E}\{S_k 1_{\{\tau_Y(B^c) \geq k\}} \mid X_0 = i\}
&= \mathbb{E}\{S_k 1_{\{Y_{k-1} \in B, \ldots, Y_1 \in B\}} \mid Y_0 = i\} \\
&= \sum_{\ell \in B} \mathbb{E}\{S_k \mid Y_{k-1} = \ell, Y_{k-2} \in B, \ldots, Y_1 \in B, Y_0 = i\} \\
&\qquad \times\; \mathbb{P}\{Y_{k-1} = \ell, Y_{k-2} \in B, \ldots, Y_1 \in B \mid Y_0 = i\} \\
&= \sum_{\ell \in B} \mathbb{E}\{S_k \mid Y_{k-1} = \ell\} \sum_{i_1, \ldots, i_{k-2} \in B} P_{i,i_1}P_{i_1,i_2} \cdots P_{i_{k-2},\ell} \\
&= \sum_{\ell \in B} \left((P_B)^{k-1}\right)_{i,\ell}\frac{1}{\nu_\ell} \\
&= \left((P_B)^{k-1}u_B\right)_i,
\end{aligned}$$

where the third equality is due to the Markov property and theorem 1.1 and the fourth equality is due to the definition of $P_B$. Summing this relation over k, we obtain, for all $i \in B$, by the monotone convergence theorem and since $w_i = \mathbb{E}\{\tau_X(B^c) \mid X_0 = i\}$,

$$w_i = \sum_{k=1}^{\infty} \mathbb{E}\{S_k 1_{\{\tau_Y(B^c) \geq k\}} \mid X_0 = i\} = \sum_{k=1}^{\infty} \left((P_B)^{k-1}u_B\right)_i,$$

which completes the proof.

Remark 2.4 is also valid here. For point 1, it is sufficient to take $B^c = \{j_0\}$ and, for point 2, it is sufficient to replace $\{j_0\}$ with $B^c$ and the matrix M with the matrix $P_B$.

THEOREM 2.36.– Let X be a continuous-time Markov chain and let $B, B^c$ be a partition of S such that $v_i = 0$, that is $f^X_{i,B^c} = 1$, for all $i \in B$. The column vector $w_B$ is the smallest non-negative solution to the linear system $z = u_B + P_Bz$.


PROOF.– The proof is identical to that of theorem 2.32. We have already seen that $w_B$ satisfies the system $z = u_B + P_Bz$. Let z be a solution to the system $z = u_B + P_Bz$ such that $z \geq 0$. By iterating this relation n times, we obtain:

$$z = \sum_{k=0}^{n} (P_B)^k u_B + (P_B)^{n+1}z.$$

Since $z \geq 0$, we obtain:

$$z \geq \sum_{k=0}^{n} (P_B)^k u_B,$$

which gives, by taking the limit when $n \longrightarrow \infty$ and from theorem 2.35,

$$z \geq w_B,$$

which completes this proof.

Note that here again, we can apply remark 2.4 to the case where some states i of B are such that $v_i > 0$. In this case, by defining $B^0 = \{\ell \in B \mid v_\ell = 0\}$ and denoting by $P_{B^0}$ the sub-matrix of $P_B$ containing the transition probabilities between the states of $B^0$, the column vector $w_{B^0} = (w_i, i \in B^0)$ is the smallest non-negative solution to the linear system $z = u_{B^0} + P_{B^0}z$, where $u_{B^0} = (u_i, i \in B^0)$.
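
The following sketch illustrates theorems 2.35 and 2.36 on a hypothetical finite example: the partial sums of $\sum_k (P_B)^k u_B$ are compared with the direct solution of $z = u_B + P_B z$.

```python
import numpy as np

# Hypothetical 4-state chain; the hitting target is Bc = {3}, B = {0, 1, 2}.
# Any P, nu with f_{i,Bc} = 1 on B would do.
nu = np.array([2.0, 3.0, 1.5, 1.0])
P = np.array([[0.0, 0.5, 0.3, 0.2],
              [0.4, 0.0, 0.4, 0.2],
              [0.3, 0.3, 0.0, 0.4],
              [0.5, 0.3, 0.2, 0.0]])
B = [0, 1, 2]
PB = P[np.ix_(B, B)]               # sub-matrix P_B
uB = 1.0 / nu[B]                   # u_i = 1 / nu_i

# Direct solve of z = uB + PB z (theorem 2.36: wB is its minimal solution).
wB = np.linalg.solve(np.eye(len(B)) - PB, uB)

# Partial sums of the series wB = sum_k (PB)^k uB (theorem 2.35).
s, term = np.zeros(len(B)), uB.copy()
for k in range(200):
    s += term
    term = PB @ term
print("series:", s, " direct:", wB)
```

Since the row sums of $P_B$ are strictly less than 1 here, the series converges geometrically and the two vectors agree.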

We can often see in the literature, as well as in practice, a variant of the variable $\tau_X(B^c) = \inf\{t \geq T_1 \mid X_t \in B^c\}$, which we denote by $\Gamma(B^c)$ and which is defined by:

$$\Gamma(B^c) = \inf\{t \geq 0 \mid X_t \in B^c\},$$

where, as usual, these variables are infinite if the corresponding sets are empty. These two variables are different, but if $X_0 \in B$ then we have $\tau_X(B^c) = \Gamma(B^c)$. On the other hand, if $X_0 \in B^c$ then we have $\Gamma(B^c) = 0$ and $\tau_X(B^c) \geq T_1$. Defining, for all $i \in S$ and $t \geq 0$,

$$K_i(t) = \mathbb{P}\{\Gamma(B^c) > t \mid X_0 = i\}$$

and defining the column vectors $K_B(t)$ and $K_{B^c}(t)$ by:

$$K_B(t) = (K_i(t), i \in B) \quad \text{and} \quad K_{B^c}(t) = (K_i(t), i \in B^c),$$

it follows that $K_B(0) = \mathbb{1}$, $K_{B^c}(0) = 0$ and, from the first relation of [2.71], that:

$$K_B'(t) = Q_B K_B(t). \qquad [2.73]$$


2.13. Absorbing Markov chains

We now consider a continuous-time Markov chain X whose countable state space S contains a finite, non-empty set B of transient states and J, $1 \leq J \leq \infty$, classes of recurrent states, denoted $C_1, \ldots, C_J$ if $J < \infty$ and $C_1, C_2, \ldots$ if $J = \infty$. The infinitesimal generator Q of X can then be written as:

$$Q = \begin{pmatrix}
Q_B & Q_{B,C_1} & Q_{B,C_2} & \cdots & Q_{B,C_{J-1}} & Q_{B,C_J} \\
0 & Q_{C_1} & 0 & \cdots & 0 & 0 \\
0 & 0 & Q_{C_2} & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & Q_{C_{J-1}} & 0 \\
0 & 0 & 0 & \cdots & 0 & Q_{C_J}
\end{pmatrix}, \qquad [2.74]$$

where $Q_{C_j}$ (respectively $Q_B$) is the transition rate matrix between the states of $C_j$ (respectively B) and $Q_{B,C_j}$ is the transition rate matrix from the states of B to the states of $C_j$. If $J = \infty$ then the structure of the matrix Q is identical to that of [1.3]. In the rest of this section, the notation $j = 1, \ldots, J$ means $j \geq 1$ if $J = \infty$.

We assume, without any loss of generality, that for all $j = 1, \ldots, J$, we have $Q_{B,C_j} \neq 0$. Indeed, the condition $Q_{B,C_j} = 0$ means that the class $C_j$ is not accessible from the set B; it can, therefore, be studied independently of the other classes. Such a Markov chain is referred to as an absorbing Markov chain. The subset B being finite and the classes $C_j$ being recurrent, from theorem 2.17, the Markov chain X is non-explosive. The transition function matrix $P(t) = (P_{i,j}(t), i, j \in S)$, where $P_{i,j}(t) = \mathbb{P}\{X_t = j \mid X_0 = i\}$, is, therefore, stochastic for all $t \geq 0$. It is thus, from theorem 2.11, the unique solution to the backward equation [2.45] and the forward equation [2.47].

Following the form [2.74] of the matrix Q, it is easy to see that the matrix $P(t)$ has the following form:

$$P(t) = \begin{pmatrix}
P_B(t) & R_1(t) & R_2(t) & \cdots & R_{J-1}(t) & R_J(t) \\
0 & P_{C_1}(t) & 0 & \cdots & 0 & 0 \\
0 & 0 & P_{C_2}(t) & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & P_{C_{J-1}}(t) & 0 \\
0 & 0 & 0 & \cdots & 0 & P_{C_J}(t)
\end{pmatrix}, \qquad [2.75]$$

where the matrix $R_j(t)$ contains the transition functions from the states of B to the states of $C_j$ and where the matrix $P_{C_j}(t)$ (respectively $P_B(t)$) contains the transition functions between the states of $C_j$ (respectively B).


The backward equations $P'(t) = QP(t)$ can then be written as:

$$P_B'(t) = Q_B P_B(t)$$

and, for all $j = 1, \ldots, J$,

$$R_j'(t) = Q_B R_j(t) + Q_{B,C_j}P_{C_j}(t).$$

Since B is finite, we have:

$$P_B(t) = e^{Q_B t}.$$

The states of B being transient, it follows, from corollary 2.4, that:

$$\lim_{t \to \infty} e^{Q_B t} = 0. \qquad [2.76]$$

For all $j = 1, \ldots, J$, we have, since the matrices $Q_B$ and $e^{Q_B t}$ commute,

$$\begin{aligned}
\left(e^{-Q_B t}R_j(t)\right)' &= -Q_B e^{-Q_B t}R_j(t) + e^{-Q_B t}R_j'(t) \\
&= -Q_B e^{-Q_B t}R_j(t) + e^{-Q_B t}\left(Q_B R_j(t) + Q_{B,C_j}P_{C_j}(t)\right) \\
&= e^{-Q_B t}Q_{B,C_j}P_{C_j}(t).
\end{aligned}$$

Since $R_j(0) = 0$, it follows that:

$$e^{-Q_B t}R_j(t) = \int_0^t e^{-Q_B u}Q_{B,C_j}P_{C_j}(u)\,du,$$

that is:

$$R_j(t) = \int_0^t e^{Q_B(t-u)}Q_{B,C_j}P_{C_j}(u)\,du.$$

We assume that $X_0 \in B$. The initial probability distribution $\alpha$ of X can then be written as:

$$\alpha = (\alpha_B, 0, \ldots, 0),$$

where $\alpha_B$ is the row vector of dimension $|B|$ containing the initial probabilities corresponding to the states of B. The hypothesis $X_0 \in B$ ensures that $\alpha_B\mathbb{1} = 1$. By taking $B^c = C_1 \cup \cdots \cup C_J$ and since $X_0 \in B$, we define the total time $\tau_X(B^c)$ spent in the transient states by:

$$\tau_X(B^c) = \inf\{t \geq T_1 \mid X_t \in B^c\} = \inf\{t \geq 0 \mid X_t \in B^c\}.$$

We then have, for all $t \geq 0$,

$$\tau_X(B^c) \leq t \Longleftrightarrow X_t \in B^c.$$

From this it follows that, for all $t \geq 0$,

$$\mathbb{P}\{\tau_X(B^c) \leq t\} = \mathbb{P}\{X_t \in B^c\} = 1 - \mathbb{P}\{X_t \in B\} = 1 - \sum_{i \in B}(\alpha P(t))_i = 1 - \alpha_B P_B(t)\mathbb{1} = 1 - \alpha_B e^{Q_B t}\mathbb{1}.$$

This corresponds to the solution to equation [2.73] when B is finite. Since B is finite, it follows, by taking the limit when t tends to infinity and from relation [2.76], that:

$$\mathbb{P}\{\tau_X(B^c) < \infty\} = 1.$$

If P denotes the transition probability matrix of the Markov chain Y embedded at the transition instants of X and if $\Lambda_B$ is the diagonal matrix containing the rates $\nu_i$, for $i \in B$, then we have:

$$Q_B = -\Lambda_B(I - P_B),$$

where the matrix $P_B$ is the sub-matrix of P containing the transition probabilities between states of B.

LEMMA 2.15.– The matrix $Q_B$ is invertible and we have:

$$(Q_B)^{-1} = -(I - P_B)^{-1}(\Lambda_B)^{-1}.$$

PROOF.– Since B does not contain absorbing states, the matrix $\Lambda_B$ is invertible. Moreover, B being finite and containing only transient states, the matrix $I - P_B$ is invertible, from theorem 1.36. The matrix $Q_B$ is, therefore, also invertible and, since $Q_B = -\Lambda_B(I - P_B)$, we have:

$$(Q_B)^{-1} = -(I - P_B)^{-1}(\Lambda_B)^{-1},$$

which completes the proof.

We then have, from relation [2.76],

$$\mathbb{E}\{\tau_X(B^c)\} = \int_0^\infty \mathbb{P}\{\tau_X(B^c) > t\}\,dt = \alpha_B\int_0^\infty e^{Q_B t}\,dt\,\mathbb{1} = -\alpha_B(Q_B)^{-1}\mathbb{1} < \infty.$$
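
Both quantities derived above are straightforward to evaluate numerically when B is finite: $\mathbb{P}\{\tau_X(B^c) > t\} = \alpha_B e^{Q_B t}\mathbb{1}$ and $\mathbb{E}\{\tau_X(B^c)\} = -\alpha_B(Q_B)^{-1}\mathbb{1}$. A minimal sketch, assuming a hypothetical two-state transient block $Q_B$:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical absorbing chain: transient set B = {0, 1}.
# Q_B is the generator restricted to B; row sums are < 0 since mass leaks to Bc.
QB = np.array([[-3.0, 1.0],
               [ 2.0, -5.0]])
alphaB = np.array([1.0, 0.0])      # start in the first transient state
ones = np.ones(2)

def survival(t: float) -> float:
    """P{tau_X(Bc) > t} = alpha_B exp(Q_B t) 1."""
    return alphaB @ expm(QB * t) @ ones

mean_absorption = -alphaB @ np.linalg.solve(QB, ones)   # -alpha_B (Q_B)^{-1} 1
print("P{tau > 1} =", survival(1.0))
print("E{tau}     =", mean_absorption)
```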

We have thus handled the case where $J = 1$, since in this case, we have $B^c = C_1$. We now assume that $J \geq 2$. The hitting time of the class $C_j$, defined by:

$$\tau_X(C_j) = \inf\{t \geq T_1 \mid X_t \in C_j\},$$

satisfies, for all $t \geq 0$ and since $X_0 \in B$,

$$\tau_X(C_j) \leq t \Longleftrightarrow X_t \in C_j.$$

We then have, from the form [2.75] of the matrix $P(t)$, for all $t \geq 0$,

$$\mathbb{P}\{\tau_X(C_j) \leq t\} = \mathbb{P}\{X_t \in C_j\} = \sum_{i \in C_j}(\alpha P(t))_i = \alpha_B R_j(t)\mathbb{1} = \alpha_B\int_0^t e^{Q_B(t-u)}Q_{B,C_j}P_{C_j}(u)\,du\,\mathbb{1}.$$

The matrix $P(t)$ being stochastic, the matrices $P_{C_j}(t)$ are also stochastic, and we have $P_{C_j}(t)\mathbb{1} = \mathbb{1}$, for all $t \geq 0$. It then follows that:

$$\mathbb{P}\{\tau_X(C_j) \leq t\} = \alpha_B\int_0^t e^{Q_B(t-u)}\,du\,Q_{B,C_j}\mathbb{1} = \alpha_B(Q_B)^{-1}\left(e^{Q_B t} - I\right)Q_{B,C_j}\mathbb{1}.$$


By taking the limit when t tends to infinity, we obtain, from relation [2.76] and since B is finite,

$$\mathbb{P}\{\tau_X(C_j) < \infty\} = -\alpha_B(Q_B)^{-1}Q_{B,C_j}\mathbb{1}. \qquad [2.77]$$

Let $C_j$ be a fixed class. There exists a state $i \in B$ such that $\mathbb{P}\{\tau_X(C_j) < \infty \mid X_0 = i\} < 1$. Indeed, since:

$$\mathbb{P}\{\tau_X(C_j) < \infty \mid X_0 = i\} = -\left((Q_B)^{-1}Q_{B,C_j}\mathbb{1}\right)_i,$$

if for all $i \in B$ we had $\mathbb{P}\{\tau_X(C_j) < \infty \mid X_0 = i\} = 1$ then we would obtain $-(Q_B)^{-1}Q_{B,C_j}\mathbb{1} = \mathbb{1}$, that is $Q_{B,C_j}\mathbb{1} = -Q_B\mathbb{1}$, which means, following the form [2.74] of the matrix Q, that, for every class $C_\ell$, $\ell \neq j$, we have $Q_{B,C_\ell} = 0$, which contradicts the hypothesis, since we have $J \geq 2$. For this state i, such that $\mathbb{P}\{\tau_X(C_j) < \infty \mid X_0 = i\} < 1$, we then have $\mathbb{E}\{\tau_X(C_j) \mid X_0 = i\} = \infty$.

Let us now calculate the probability that absorption occurs in class $C_j$. This probability, which we denote by $p_j$, is defined by:

$$p_j = \mathbb{P}\{X_{\tau_X(B^c)} \in C_j\}.$$

Since $X_0 \in B$, we have the following equivalence, for all $t \geq 0$,

$$\tau_X(C_j) \leq t \Longleftrightarrow \tau_X(B^c) \leq t \text{ and } X_{\tau_X(B^c)} \in C_j, \qquad [2.78]$$

which corresponds to equivalence [1.35] in the discrete case. We then have, for all $t \geq 0$,

$$\mathbb{P}\{\tau_X(C_j) \leq t\} = \mathbb{P}\{\tau_X(B^c) \leq t, X_{\tau_X(B^c)} \in C_j\}. \qquad [2.79]$$

Taking the limit when t tends to infinity, we obtain, since $\mathbb{P}\{\tau_X(B^c) < \infty\} = 1$ and using [2.77],

$$p_j = \mathbb{P}\{\tau_X(C_j) < \infty\} = -\alpha_B(Q_B)^{-1}Q_{B,C_j}\mathbb{1}.$$

Summing over j, we get:

$$\sum_{j=1}^{J} p_j = -\alpha_B(Q_B)^{-1}\sum_{j=1}^{J}Q_{B,C_j}\mathbb{1} = -\alpha_B(Q_B)^{-1}(-Q_B\mathbb{1}) = \alpha_B\mathbb{1} = 1.$$


We also have, for all $t \geq 0$, using [2.79],

$$\begin{aligned}
\mathbb{P}\{\tau_X(B^c) > t, X_{\tau_X(B^c)} \in C_j\}
&= \mathbb{P}\{X_{\tau_X(B^c)} \in C_j\} - \mathbb{P}\{\tau_X(B^c) \leq t, X_{\tau_X(B^c)} \in C_j\} \\
&= p_j - \mathbb{P}\{\tau_X(C_j) \leq t\} \\
&= -\alpha_B(Q_B)^{-1}e^{Q_B t}Q_{B,C_j}\mathbb{1}.
\end{aligned}$$

The expected absorption time in class $C_j$ is then equal, from relation [2.76] and since B is finite, to:

$$\mathbb{E}\{\tau_X(B^c)1_{\{X_{\tau_X(B^c)} \in C_j\}}\} = \int_0^\infty \mathbb{P}\{\tau_X(B^c) > t, X_{\tau_X(B^c)} \in C_j\}\,dt = -\alpha_B(Q_B)^{-1}\int_0^\infty e^{Q_B t}\,dt\,Q_{B,C_j}\mathbb{1} = \alpha_B(Q_B)^{-2}Q_{B,C_j}\mathbb{1} < \infty.$$

Using equivalence [2.78], this expected time can also be written as $\mathbb{E}\{\tau_X(C_j)1_{\{\tau_X(C_j) < \infty\}}\}$. Note that, in this way, we recover the calculation of $\mathbb{E}\{\tau_X(B^c)\}$, that is:

$$\mathbb{E}\{\tau_X(B^c)\} = \sum_{j=1}^{J}\mathbb{E}\{\tau_X(B^c)1_{\{X_{\tau_X(B^c)} \in C_j\}}\} = \alpha_B(Q_B)^{-2}\sum_{j=1}^{J}Q_{B,C_j}\mathbb{1} = \alpha_B(Q_B)^{-2}(-Q_B\mathbb{1}) = -\alpha_B(Q_B)^{-1}\mathbb{1}.$$

Finally, the expected absorption time given that absorption occurs in class $C_j$ can be written as $\mathbb{E}\{\tau_X(B^c) \mid X_{\tau_X(B^c)} \in C_j\} = \mathbb{E}\{\tau_X(C_j) \mid \tau_X(C_j) < \infty\}$ and is equal, when $p_j \neq 0$, to:

$$\mathbb{E}\{\tau_X(B^c) \mid X_{\tau_X(B^c)} \in C_j\} = \frac{\mathbb{E}\{\tau_X(B^c)1_{\{X_{\tau_X(B^c)} \in C_j\}}\}}{p_j} = \frac{\alpha_B(Q_B)^{-2}Q_{B,C_j}\mathbb{1}}{-\alpha_B(Q_B)^{-1}Q_{B,C_j}\mathbb{1}}.$$


Note that, as in the discrete case, we can have $p_j = 0$ in some particular cases, for instance when the initial distribution is such that the class $C_j$ is only accessible from states of B that are never visited by the chain.
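
To summarize the computations of this section, the sketch below evaluates, for a hypothetical chain with two transient states and two absorbing classes (each reduced to a single absorbing state, which is an illustrative choice), the absorption probabilities $p_j = -\alpha_B(Q_B)^{-1}Q_{B,C_j}\mathbb{1}$, the conditional expected absorption times and the overall mean $-\alpha_B(Q_B)^{-1}\mathbb{1}$:

```python
import numpy as np

# Hypothetical absorbing chain: B = {0, 1} transient, two absorbing classes
# C1 = {2} and C2 = {3}. Rows of Q sum to 0.
Q = np.array([[-3.0, 1.0, 2.0, 0.0],
              [ 2.0, -4.0, 1.0, 1.0],
              [ 0.0, 0.0, 0.0, 0.0],
              [ 0.0, 0.0, 0.0, 0.0]])
QB = Q[:2, :2]
QBC = [Q[:2, 2:3], Q[:2, 3:4]]          # Q_{B,C1}, Q_{B,C2}
alphaB = np.array([0.5, 0.5])

QB_inv = np.linalg.inv(QB)
for j, QBCj in enumerate(QBC, start=1):
    ones = np.ones(QBCj.shape[1])
    p = -alphaB @ QB_inv @ QBCj @ ones              # p_j
    num = alphaB @ QB_inv @ QB_inv @ QBCj @ ones    # alpha_B (Q_B)^{-2} Q_{B,Cj} 1
    print(f"p_{j} = {p:.4f},  E[tau | absorbed in C{j}] = {num / p:.4f}")
print("E[tau] =", -alphaB @ QB_inv @ np.ones(2))
```

The printed probabilities sum to 1 and the weighted conditional means recover the overall mean absorption time, in agreement with the identities derived above.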

2.14. Bibliographical notes

As in the discrete case, there are a number of books dealing with continuous-time Markov chains with finite and countably infinite state spaces. Some of these volumes are less recent but nevertheless fundamental, such as [BHA 60], [COX 65], [CHU 67], [ÇIN 75], [FEL 57], [FEL 66], [FRE 83], [KAR 75], [KAR 81] and [ROS 83]. Among the latest books on this subject, let us cite in particular [AND 91], [ASM 03], [BHA 90], [BOL 98], [BRÉ 98], [KIJ 97], [KUL 10], [NOR 97], [TIJ 03] and [TRI 02], which present the main results and, for some, offer a number of applications related to dynamic systems modeling. The study of the performance and dependability of these systems has given rise to numerous new developments, for example [KOB 78], [TRI 02] and [KOB 09], in particular within queuing theory, which we will tackle in Chapter 5, and to many calculation techniques such as uniformization, which will be presented in Chapter 4. Each of these books deals in more or less detail with the theoretical aspects of the analysis of continuous-time Markov chains. Let us cite, in addition, [SYS 92], which proposes a very detailed study of passage times in continuous-time Markov chains. A large number of numerical techniques exist to solve Markov chains and are proposed, in particular, in [STE 94] and [STE 09].

Our approach in this chapter consisted of dealing with Markov chains in a very precise way, in particular the study of the first explosion time, the first passage times as well as Kolmogorov's backward and forward equations. A number of applications are presented in Chapters 3 and 5, whereas continuous-time Markov chains with finite state spaces are discussed in Chapter 4 together with uniformization.