Introduction
In this part, we will introduce and define Markov Chains along with some examples. We will conduct some basic computations using them and investigate their long term evolution using powers of matrices. Finally, we will discuss induction and limiting and stationary distributions.
Markov Chains
Example 1 (Number Game)
Consider a game, at each time step , you are at position or . We write , , or for . At each time step, you move to a higher number (or from to ) with probability , or stay put with probability .
The transitions can be specified with,
A more succinct specification is with the transition matrix,
The sequence is an example of a Markov chain.
Definition 1 (Markov Chain)
Let be a discrete set (not necessarily finite), called the state space. A Markov Chain is a sequence of random variables taking values in , with the property,
for all .
The chain is said to be time-homogeneous if, for all 1We generally assume this unless otherwise stated.,
The transition matrix is defined with,
Further, a stochastic matrix is a square matrix with non-negative entries, satisfying , where is a row vector of ones (i.e., all rows sum to one).
Lastly, all transition matrices are stochastic matrices, and all stochastic matrices can be used as transition matrices.
Basic Computations and Long Term Evolution
Intuition (Basic Computations with Markov Chains)
If is a vector describing the probability distribution of states at stage , then is the vector describing the probability distribution of states at stage . Further, if is a vector describing the probability distribution of states at stage , then is the vector describing the probability distribution of states at stage .
Thus, the probability to go from state to state in steps is given by , but we will write .
The Champman-Kolmogorov [^1] relationship is derived from .
Lastly, the probability of being at at stage , and then at in stage and so on up to at stage , with , is by the product of corresponding entries of powers of the transition matrix,
Intuition (Long Term Evolution of Markov Chains)
When the number of states in is finite and not too big, we can investigate the long term behavior by computing for large . In some cases, the powers stabilize into a matrix where all rows are identical. It may also stabilize without identical rows, or it may not stabilize at all.
Note, that if is block-diagonal [^2], it may combine several of these behaviors,
Note (Long Term Evolution Using Simulation)
If is large of infinite, we may instead investigate the long term behavior using simulation.
Algorithm (Simulating a Markov Chain)
Repeat many times:
-
Draw according to
-
For in through :
-
Draw according to Use the distribution of the to approximate the distribution of .
Induction, Limiting and Stationary Distributions
Intuition (Proving stuff using Induction)
Many results about Markov chains can be proved using induction.
-
Formulate a statement depending on a non-negative integer .
-
Prove (the base case).
-
Prove that if is true, then is true (the inductive step).
With this, one may conclude that is true for all non-negative integers .
Example 2 (2-state Markov Chain)
Any 2-state Markov Chain has the transition matrix for some and .
We can prove by induction that, for any ,
We can use this to study what happens when goes to infinity.
Solution
We proceed by induction.
- Base case ():
This is true, since the right-hand side simplifies to,
- Inductive step: Assume the statement is true for some . We need to show that it is true for . We have,
Thus, by induction, the statement is true for all non-negative integers . From this, we can see that as goes to infinity, the powers of the transition matrix converge to,
Definition 2 (Limiting Distribution)
A limiting distribution for a Markov chain with transition matrix is a probability vector such that,
for all states and .
An equivalent formulation is, the limit exists and does not depend on .
Another equivalent formulation is, is a stochastic matrix with all rows identical.
Further, a Markov chain has either no or one unique limiting distribution. If a limiting distribution exists, its probabilities correspond to the proportion of time steps the chain spends at each state in the long run.
Definition 3 (Stationary Distribution)
A stationary distribution for a Markov chain is a distribution that is unchanged when applying one step of the Markov chain. If is the transition matrix, then a probability vector represents a stationary distribution if and only if,
A Markov chain can have zero, one, or many stationary distributions. Limiting distributions are always stationary distributions (but not necessarily vice versa).
Definition 4 (Regular Transition Matrices)
A stochastic matrix is positive if all entries are strictly positive, i.e., for all and . A stochastic matrix is regular if is positive for some .
Theorem 1 (Limit Theorem for Regular Markov Chains)
If the transition matrix is regular, the limiting distribution exists. There are no other stationary distributions and the limiting distribution is positive, i.e., all states have positive probability in the limiting distribution.
Intuition (Finding Stationary Distributions)
Find the satisfying by, for example,
-
Solving the linear system
-
Guessing at a , and showing that , or,
-
Computing an eigenvector for the transpose belonging to the eigenvalue 1.
Having found a satisfying , if the transition matrix is regular, we know represents the unique limiting distribution and the unique stationary distribution.
Example 3 (Random Walks on Undirected Graphs)
An undirected graph consists of nodes and undirected edges connecting them. An undirected graph defines a random walk Markov chain by, at every time step, following one of the edges out of a node, with equal probability.
When the graph is finite, we show that the vector is a stationary distribution, where , where is the number of edges going into node and is the sum over all nodes of their degrees.
If we further generalize, a weighted undirected graph is a graph with a positive weight at any edge between and for all and . If we define the Markov chain by choosing the next node with probabilities according to the weights. Then, when the graph is finite, the vector is a stationary distribution, where , where is the sum of the weights of the edges going into , and is the total sum over all nodes of their values.
Intuition (Moving Between States)
State is accessible from state if for some . States and communicate if is accessible from and is accessible from .
“Communication” is transitive, i.e., if communicates with and communicates with , then communicates with .
Communication is an equivalence relation, subdividing all states into communication classes. Communication classes can be found for example by drawing transition graphs.
Lastly, a Markov chain is irreducible if it has exactly one communication class.
Definition 5 (Recurrence and Transience)
Let be the first passage time to state ,
Let be the probability that a chain starting at will return to ,
A state is recurrent if a chain starting at will eventually revisit , i.e., if .
A state is transient if a chain starting at has a positive probability of never revisiting , i.e., if .
Note
The expected number of visits at when the chain starts at is given by .
is recurrent if and only if .
is transient if and only if .
Note (Communication Classes)
The states of a communication class are either all recurrent or all transient. The states of a finite irreducible Markov chain are all recurrent.
If a state is recurrent, only states inside its communication class are accessible from it. If no states outside a finite communication class are accessible from it, then the class consists of recurrent states.