Introduction
In this part, we will introduce Continuous-Time Markov Chains, which are stochastic processes that evolve over continuous time and possess the Markov property. They are widely used in various fields, including queueing theory, reliability engineering, and population dynamics.
Continuous-Time Markov Chains
Definition: Continuous-Time Markov Chain
A continuous-time stochastic process $\{X_t\}_{t \geq 0}$ with discrete state space $\mathcal{S}$ is a continuous-time Markov chain if, for all $s, t \geq 0$, all states $i, j \in \mathcal{S}$, and any history $\{X_u : 0 \leq u < s\}$, $$ P(X_{t + s} = j \mid X_s = i, X_u, 0 \leq u < s) = P(X_{t + s} = j \mid X_s = i). $$
The process is time-homogeneous if for $s, t \geq 0$ and all $i, j \in \mathcal{S}$, $$ P(X_{t + s} = j \mid X_s = i) = P(X_t = j \mid X_0 = i). $$ We then define the transition function as the matrix function $P(t)$ with entries of the matrix given by, $$ P(t)_{ij} = P(X_t = j \mid X_0 = i). $$
The Chapman-Kolmogorov Equations
Theorem: Chapman-Kolmogorov Equations
For the transition function $P(t)$ we have,
- $P(s + t) = P(s) P(t)$
- $P(0) = I$ (identity matrix)
Proof
For the first property, it suffices to show that $P(s + t)$ and $P(s) P(t)$ agree in an arbitrary entry $i, j$. Using time-homogeneity, the law of total probability, and the Markov property, $$ \begin{align*} P(s + t)_{ij} & = P(X_{s + t} = j \mid X_0 = i) \newline & = \sum_k P(X_{s + t} = j, X_s = k \mid X_0 = i) \newline & = \sum_k P(X_s = k \mid X_0 = i) P(X_{s + t} = j \mid X_s = k, X_0 = i) \newline & = \sum_k P(X_s = k \mid X_0 = i) P(X_{s + t} = j \mid X_s = k) \newline & = \sum_k P(X_s = k \mid X_0 = i) P(X_t = j \mid X_0 = k) \newline & = \sum_k P(s)_{ik} P(t)_{kj} \newline & = (P(s) P(t))_{ij} \ _\blacksquare \end{align*} $$ The second property follows directly from the definition of the transition function.
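The semigroup property can be checked numerically for a concrete chain. The sketch below uses `scipy.linalg.expm` and the matrix-exponential form $P(t) = e^{tQ}$ derived later in these notes; the 3-state generator is an assumed example, not from the text above.

```python
import numpy as np
from scipy.linalg import expm

# Assumed example rate matrix Q for a 3-state chain (rows sum to zero).
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

def P(t):
    # Transition function P(t) = e^{tQ}.
    return expm(t * Q)

s, t = 0.4, 0.7
# Chapman-Kolmogorov: P(s + t) = P(s) P(t).
assert np.allclose(P(s + t), P(s) @ P(t))
# P(0) = I.
assert np.allclose(P(0.0), np.eye(3))
```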
Holding Times
Intuition: Holding Times
Suppose the chain $\{X_t\}_{t \geq 0}$ starts in state $i$, and define $T_i$ as the time it stays in $i$ before moving to a different state, so that for any $s > 0$, $$ P(T_i > s) = P(X_u = i, 0 \leq u \leq s \mid X_0 = i). $$ By the Markov property and time-homogeneity, the distribution of $T_i$ is memoryless and thus exponential. We define $q_i$ such that, $$ T_i \sim \mathrm{Exponential}(q_i) $$ Remember that this means that the average time the process stays in $i$ is $\frac{1}{q_i}$; the rate of transition out of the state is $q_i$.
Note
We can have $q_i = 0$, meaning that state $i$ is absorbing: for all $s > 0$, $$ P(T_i > s) = 1. $$
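The memorylessness that forces $T_i$ to be exponential can be checked directly from the survival function $P(T_i > s) = e^{-q_i s}$; a small sketch, with an assumed rate $q_i = 2$:

```python
import math

q_i = 2.0  # assumed transition rate out of state i

def survival(s):
    # P(T_i > s) for T_i ~ Exponential(q_i)
    return math.exp(-q_i * s)

s, t = 0.5, 1.3
# Memorylessness: P(T_i > s + t | T_i > s) = P(T_i > t).
conditional = survival(s + t) / survival(s)
assert math.isclose(conditional, survival(t))

# Mean holding time is 1 / q_i: integrate the survival function numerically.
mean = sum(survival(k * 1e-4) * 1e-4 for k in range(200_000))
assert abs(mean - 1 / q_i) < 1e-3
```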
The Embedded Markov Chain
Intuition: Embedded Markov Chain
We can define a new stochastic process by listing the states the chain visits. This will be a discrete time Markov chain called the embedded Markov chain with transition matrix $\tilde{P}$.
Note
The transition matrix $\tilde{P}$ has zeros along its diagonal! Why? Because the process cannot stay in the same state when it makes a transition.
Further, note that the continuous-time Markov chain is completely determined by the expected holding times $(\frac{1}{q_1}, \ldots, \frac{1}{q_k})$ and the transition matrix of the embedded Markov chain $\tilde{P}$.
Intuition: Constructing a Continuous-Time Markov Chain
A way to describe a continuous-time Markov chain is to describe $k \times (k - 1)$ independent “alarm clocks”.
For states $i$ and $j$ so that $i \neq j$, let $q_{ij}$ be the parameter of an Exponentially distributed random variable representing the time until an “alarm clock” rings.
When in state $i$, wait until the first of these alarm clocks rings, then move to the state given by the index $j$ of that clock. This defines a continuous-time Markov chain.
The time until the first alarm clock rings is the minimum of independent Exponential random variables, and is therefore Exponentially distributed with parameter given by, $$ q_i = q_{i1} + q_{i2} + \ldots + q_{i, i - 1} + q_{i, i + 1} + \ldots + q_{ik}, $$ i.e., the parameter of the holding time distribution at $i$.
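The alarm-clock construction can be simulated to check that the time until the first clock rings behaves like an $\mathrm{Exponential}(q_i)$ holding time; a sketch with two assumed clock rates out of state $i$:

```python
import numpy as np

rng = np.random.default_rng(0)
q_ij = np.array([2.0, 3.0])   # assumed alarm-clock rates out of state i
q_i = q_ij.sum()              # holding-time parameter, here 5.0

# Ring times of the independent alarm clocks; the chain jumps at the minimum.
n = 200_000
clocks = rng.exponential(scale=1.0 / q_ij, size=(n, 2))
holding = clocks.min(axis=1)

# The minimum of independent Exponentials is Exponential(q_i),
# so the sample mean holding time should be close to 1 / q_i.
assert abs(holding.mean() - 1.0 / q_i) < 0.005
```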
The chain is completely described by the rates $q_{ij}$.
Further, we also saw that the chain is completely described by the holding-time parameters $q_i$ and the entries $p_{ij}$ of the embedded transition matrix $\tilde{P}$. The relationship is given by the equation above together with, $$ p_{ij} = \frac{q_{ij}}{q_{i1} + q_{i2} + \ldots + q_{i, i - 1} + q_{i, i + 1} + \ldots + q_{ik}} = \frac{q_{ij}}{q_i}, $$ for $i \neq j$.
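Passing between the two descriptions is a one-liner in code; a sketch with an assumed 3-state set of rates $q_{ij}$:

```python
import numpy as np

# Assumed rates q_ij for a 3-state chain; entry [i, j] is the rate from i to j
# (diagonal entries are unused and set to 0).
rates = np.array([[0.0, 2.0, 1.0],
                  [1.0, 0.0, 3.0],
                  [2.0, 2.0, 0.0]])

q = rates.sum(axis=1)          # holding-time parameters q_i = sum over j of q_ij
P_tilde = rates / q[:, None]   # embedded chain: p_ij = q_ij / q_i

# Each row of the embedded transition matrix sums to 1, with zero diagonal.
assert np.allclose(P_tilde.sum(axis=1), 1.0)
assert np.allclose(np.diag(P_tilde), 0.0)
# e.g. p_12 = q_12 / q_1 = 2 / 3.
assert np.isclose(P_tilde[0, 1], 2.0 / 3.0)
```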
Relationship Between $P(t)$ and $Q$
Intuition: Infinitesimal Generator Matrix
To relate $P(t)$ to the $q_{ij}$'s, we first relate them to $P^{\prime}(0)$.
Assuming $P(t)$ is differentiable, we can show that, $$ P^{\prime}(0) = \begin{bmatrix} -q_1 & q_{12} & q_{13} & \ldots & q_{1k} \newline q_{21} & -q_2 & q_{23} & \ldots & q_{2k} \newline \vdots & \vdots & \vdots & \ddots & \vdots \newline q_{k1} & q_{k2} & q_{k3} & \ldots & -q_k \newline \end{bmatrix} = Q. $$
Proof: Lazy “proof”
Firstly, let’s show that $P^{\prime}(0)_{12} = q_{12}$. Using that, for small $h$, the probability of having jumped from $1$ to $2$ by time $h$ is approximately $q_{12} h$, $$ \begin{align*} q_{12} & = \lim_{h \to 0} \frac{P(X_h = 2 \mid X_0 = 1)}{h} \newline & = \lim_{h \to 0} \frac{P(h)_{12}}{h} \newline & = \lim_{h \to 0} \frac{P(h)_{12} - P(0)_{12}}{h} \newline & = P^{\prime}(0)_{12}. \newline \end{align*} $$ Secondly, let’s show that $P^{\prime}(0)_{11} = -q_1$. $$ \begin{align*} -q_1 & = -[q_{12} + q_{13} + \ldots + q_{1k}] \newline & = - \lim_{h \to 0} \frac{P(h)_{12} + P(h)_{13} + \ldots + P(h)_{1k}}{h} \newline & = - \lim_{h \to 0} \frac{1 - P(h)_{11}}{h} \newline & = \lim_{h \to 0} \frac{P(h)_{11} - 1}{h} \newline & = \lim_{h \to 0} \frac{P(h)_{11} - P(0)_{11}}{h} \newline & = P^{\prime}(0)_{11}. \newline \end{align*} $$ The other elements can be shown similarly $_\blacksquare$.
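The identification $Q = P^{\prime}(0)$ can be checked with a finite difference, assuming the matrix-exponential form $P(h) = e^{hQ}$ derived later in these notes and an example generator:

```python
import numpy as np
from scipy.linalg import expm

# Assumed example generator (rows sum to zero).
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

h = 1e-6
# Finite-difference approximation of P'(0) = (P(h) - P(0)) / h with P(0) = I.
Q_est = (expm(h * Q) - np.eye(3)) / h
assert np.allclose(Q_est, Q, atol=1e-4)
```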
Note that the rows of $P^{\prime}(0)$, i.e., $Q$, sum to zero!
Kolmogorov Forward and Backward Equations
Theorem: Kolmogorov Forward and Backward Equations
The Kolmogorov Forward and Backward equations describe how the transition function $P(t)$ evolves over time.
We get that for all $t \geq 0$, $$ P^{\prime}(t) = P(t) Q = Q P(t). $$ In terms of the components of $P(t)$, these are the forward equations, $$ P^{\prime}(t)_{ij} = -P_{ij}(t) q_j + \sum_{k \neq j} P_{ik}(t) q_{kj}, $$ and the backward equations, $$ P^{\prime}(t)_{ij} = -q_i P_{ij}(t) + \sum_{k \neq i} q_{ik} P_{kj}(t). $$ Together they define a set of differential equations which the components of the matrix function $P(t)$ must satisfy.
Proof
$$ \begin{align*} P^{\prime}(t) & = \lim_{h \to 0} \frac{P(t + h) - P(t)}{h} \newline & = \lim_{h \to 0} \frac{P(t) P(h) - P(t)}{h} \newline & = \lim_{h \to 0} P(t) \frac{P(h) - I}{h} \newline & = P(t) \lim_{h \to 0} \frac{P(h) - I}{h} \newline & = P(t) \underbrace{\lim_{h \to 0} \frac{P(h) - P(0)}{h}}_{P^{\prime}(0) = Q} \newline & = P(t) Q \ _\blacksquare \end{align*} $$ Since we arbitrarily chose to write $P(t + h)$ as $P(t) P(h)$, we could have equally written it as $P(h) P(t)$, leading to the other equation, thus completing the proof.
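The forward equation can also be verified numerically: integrating $P^{\prime}(t) = P(t) Q$ with small Euler steps from $P(0) = I$ should reproduce $P(t) = e^{tQ}$. A sketch with an assumed example generator:

```python
import numpy as np
from scipy.linalg import expm

# Assumed example generator (rows sum to zero).
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

# Euler integration of the forward equation P'(t) = P(t) Q from P(0) = I.
t_end, steps = 1.0, 100_000
dt = t_end / steps
P = np.eye(3)
for _ in range(steps):
    P = P + dt * (P @ Q)

# The integrated solution matches the matrix exponential e^{tQ}.
assert np.allclose(P, expm(t_end * Q), atol=1e-3)
```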
The Matrix Exponential
Intuition: Matrix Exponential Solution
The solution to the Kolmogorov Forward and Backward equations is given by the matrix exponential. For any square matrix $A$ we can define the matrix exponential as, $$ e^A \coloneqq \sum_{n = 0}^{\infty} \frac{1}{n!} A^n = I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \ldots $$ The series converges for all square matrices $A$ (but we will not prove this here).
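Partial sums of the defining series converge quickly for small matrices; a sketch comparing them against `scipy.linalg.expm`, with an assumed example matrix:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[ 0.0, 1.0],
              [-1.0, 0.0]])  # assumed example matrix

# Partial sum of e^A = sum_n A^n / n!, built up term by term.
S = np.zeros((2, 2))
term = np.eye(2)  # A^0 / 0!
for n in range(30):
    S = S + term
    term = term @ A / (n + 1)  # A^{n+1} / (n+1)!

assert np.allclose(S, expm(A))
```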
Some important properties of the matrix exponential are:
- $e^0 = I$
- $e^A e^{-A} = I$
- $e^{(s + t) A} = e^{sA} e^{tA}$
- If $AB = BA$, then $e^{A + B} = e^A e^B = e^B e^A$
- $\frac{\partial}{\partial t} e^{tA} = A e^{tA} = e^{tA} A$
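These properties can be checked numerically; a sketch using `scipy.linalg.expm` with an assumed example matrix:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0,  2.0],
              [0.0, -1.0]])  # assumed example matrix

# e^0 = I
assert np.allclose(expm(np.zeros((2, 2))), np.eye(2))
# e^A e^{-A} = I
assert np.allclose(expm(A) @ expm(-A), np.eye(2))
# e^{(s + t) A} = e^{sA} e^{tA}
s, t = 0.3, 0.9
assert np.allclose(expm((s + t) * A), expm(s * A) @ expm(t * A))
# d/dt e^{tA} = A e^{tA} (finite-difference check)
h = 1e-6
deriv = (expm((t + h) * A) - expm(t * A)) / h
assert np.allclose(deriv, A @ expm(t * A), atol=1e-4)
```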
Note that $P(t) = e^{tQ}$ satisfies $P^{\prime}(t) = P(t) Q = Q P(t)$ for all $t \geq 0$ with $P(0) = I$; it is the unique solution to the Kolmogorov Forward and Backward equations with this initial condition.
In R one can use the expm package to compute matrix exponentials.
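In Python, one analogous route (an assumption, not mentioned in the notes) is `scipy.linalg.expm`; a minimal sketch computing $P(t) = e^{tQ}$ for an assumed example generator and checking that the result is a stochastic matrix:

```python
import numpy as np
from scipy.linalg import expm

# Assumed example generator (rows sum to zero).
Q = np.array([[-3.0,  2.0,  1.0],
              [ 1.0, -4.0,  3.0],
              [ 2.0,  2.0, -4.0]])

P_t = expm(0.5 * Q)  # P(t) at t = 0.5

# P(t) is a stochastic matrix: nonnegative entries, rows summing to 1.
assert np.all(P_t >= 0)
assert np.allclose(P_t.sum(axis=1), 1.0)
```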