Markov Chains I

36-467/667

29 October 2020 (Lecture 18)

\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Prob}[1]{\mathbb{P}\left[ #1 \right]} \newcommand{\Probwrt}[2]{\mathbb{P}_{#1}\left( #2 \right)} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\Expectwrt}[2]{\mathbb{E}_{#1}\left[ #2 \right]} \newcommand{\InitDist}{\vec{p}_{\mathrm{init}}} \newcommand{\InvDist}{\vec{p}^{*}} \]

In our previous episodes…

Markov Processes

The Markov property includes two extremes

Markov processes are generative models

rmarkov <- function(n, rinitial, rtransition) {
  # simulate n steps of a Markov process, given functions that draw from
  # the initial distribution and from the transition distribution
  x <- vector(length=n)
  x[1] <- rinitial()
  for (t in 2:n) {
    x[t] <- rtransition(x[t-1])  # the next state depends only on the current one
  }
  return(x)
}
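
As a quick illustration (an assumed example, not from the slides): a Gaussian AR(1) process is Markov, so `rmarkov` can simulate it once we supply the two ingredients.

# Illustrative use of rmarkov: a Gaussian AR(1) process, started from its
# stationary N(0,1) distribution (assumed example, not from the slides)
x.ar1 <- rmarkov(n=100,
                 rinitial=function() { rnorm(1) },
                 rtransition=function(x) { 0.9*x + rnorm(1, sd=sqrt(1-0.9^2)) })
plot(x.ar1, type="l", xlab="t", ylab="X(t)")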

Markov Chains

Graph vs. matrix

[Transition graph of the two-state chain] \(\Leftrightarrow \mathbf{q}=\left[\begin{array}{cc} 0.5 & 0.5 \\ 0.75 & 0.25 \end{array} \right]\)

Your Basic Markov Chain

rmarkovchain <- function(n, p0, q) {
  k <- length(p0)
  # q must be a k x k stochastic matrix: each row sums to 1
  stopifnot(k==nrow(q), k==ncol(q), all.equal(rowSums(q), rep(1,times=k)))
  rinitial <- function() { sample(1:k, size=1, prob=p0) }
  # row x of q is the distribution of the next state, given current state x
  rtransition <- function(x) { sample(1:k, size=1, prob=q[x,]) }
  return(rmarkov(n, rinitial, rtransition))
}
q <- matrix(c(0.5, 0.5, 0.75, 0.25), byrow=TRUE, nrow=2)
x <- rmarkovchain(1e4,c(0.5,0.5),q)
head(x)
## [1] 1 1 2 2 1 2

Checking that it works

ones <- which(x[-1e4]==1)  # Why omit the last step?
twos <- which(x[-1e4]==2)
signif(table(x[ones+1])/length(ones),3)
## 
##     1     2 
## 0.504 0.496
signif(table(x[twos+1])/length(twos),3)
## 
##     1     2 
## 0.758 0.242

versus the true transition probabilities \((0.5, 0.5)\) and \((0.75, 0.25)\)

Why is this a check?

How trajectories evolve

Where trajectories end in the long run

How distributions evolve
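
A minimal sketch of the idea (assuming, as is standard for Markov chains, that a distribution over states updates by right-multiplication, \(\vec{p}(t+1) = \vec{p}(t) \mathbf{q}\)):

# Sketch: push a distribution over states forward one step at a time
p <- c(0.5, 0.5)       # the initial distribution
for (t in 1:10) {
  p <- p %*% q         # p(t) = p(t-1) %*% q
}
p                      # already very close to (0.6, 0.4)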

More fun with eigenvalues and eigenvectors

Special properties of stochastic matrices
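
For instance (a quick check, not from the slides): each row of \(\mathbf{q}\) sums to 1, so the all-ones vector is always a right eigenvector with eigenvalue 1, and no eigenvalue of a stochastic matrix exceeds 1 in modulus.

# Rows of a stochastic matrix sum to 1, so q %*% (1,1) = (1,1)
q %*% rep(1, 2)
eigen(q)$values    # 1.00 and -0.25; both have modulus <= 1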

Irreducible, aperiodic Markov chains


Invariant distributions
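
An invariant distribution \(\InvDist\) is one the chain leaves alone, \(\InvDist = \InvDist \mathbf{q}\). For our two-state chain this is \((0.6, 0.4)\) (computed from the eigendecomposition below), which we can verify directly:

# Check invariance: p* %*% q should return p* unchanged
p.star <- c(0.6, 0.4)
p.star %*% q          # = (0.6, 0.4)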

Ergodicity and Weak Dependence

Sample frequencies vs. probabilities

eigen(t(q))
## eigen() decomposition
## $values
## [1]  1.00 -0.25
## 
## $vectors
##           [,1]       [,2]
## [1,] 0.8320503 -0.7071068
## [2,] 0.5547002  0.7071068
# the invariant distribution: the leading left eigenvector of q (i.e., the
# leading eigenvector of t(q)), normalized to sum to 1
eigen(t(q))$vectors[,1]/sum(eigen(t(q))$vectors[,1])
## [1] 0.6 0.4

table(rmarkovchain(1e4,c(0.5,0.5),q))
## 
##    1    2 
## 5983 4017
table(rmarkovchain(1e4,c(0.5,0.5),q))
## 
##    1    2 
## 6038 3962
table(rmarkovchain(1e4,c(0,1),q))
## 
##    1    2 
## 5956 4044
table(rmarkovchain(1e4,c(1,0),q))
## 
##    1    2 
## 5987 4013

Central limit theorem

time.avgs <- replicate(100, mean(rmarkovchain(1e4, c(0.5, 0.5), q)))
qqnorm(time.avgs); qqline(time.avgs)
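
As a follow-up (not from the slides): the time averages look roughly Gaussian, and their spread can be compared with what an iid sample of the same size would give. Because this chain's second eigenvalue is negative, successive states are negatively correlated, so the Markov standard error should come out somewhat smaller.

# Under the invariant distribution, Var(X) = 0.6*0.4*(2-1)^2 = 0.24, so an
# iid sample mean of size 1e4 would have sd sqrt(0.24/1e4), about 0.0049
sd(time.avgs)
sqrt(0.24/1e4)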

What if there’s more than one irreducible set?
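
For concreteness (an illustrative example, not from the slides), here is a chain whose state space splits into two closed irreducible sets, \(\{1,2\}\) and \(\{3,4\}\): where the chain ends up depends on where it starts, so there is no longer a unique invariant distribution.

# A reducible chain: no transitions between {1,2} and {3,4}
q2 <- rbind(c(0.50, 0.50, 0.00, 0.00),
            c(0.75, 0.25, 0.00, 0.00),
            c(0.00, 0.00, 0.50, 0.50),
            c(0.00, 0.00, 0.90, 0.10))
table(rmarkovchain(1e4, c(0, 0, 1, 0), q2))  # started in {3,4}, stays there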

What if the state space isn’t finite?
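
One classic example (illustrative, not from the slides) is the simple random walk on the integers; `rmarkov` handles it without trouble, because it never needs a list of all the states.

# Simple random walk on the (countably infinite) integers
rw <- rmarkov(n=100,
              rinitial=function() { 0 },
              rtransition=function(x) { x + sample(c(-1, 1), size=1) })
plot(rw, type="s")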

What if the state space is continuous?

Higher-order Markov processes

First- vs. higher-order Markov chains
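
A minimal sketch of a second-order analogue of `rmarkov` (the name `rmarkov2` and its arguments are assumptions for illustration): the new state depends on the last two states, so the initial draw must supply two values. Equivalently, the pair \((X(t-1), X(t))\) follows a first-order Markov process.

# Sketch of a second-order Markov simulator; rinitial2 must return the
# first two values, and rtransition2 takes the two previous states
rmarkov2 <- function(n, rinitial2, rtransition2) {
  x <- vector(length=n)
  x[1:2] <- rinitial2()
  for (t in 3:n) {
    x[t] <- rtransition2(x[t-2], x[t-1])
  }
  return(x)
}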

Variations on the theme

Summary

Backup: The philosophical origins of the Markov property


Backup: TL;DR on the origins of the Markov property

Backup: Markov chain Monte Carlo


  1. Set \(X(0)\) however we like, and initialize \(t \leftarrow 0\).
  2. Draw a proposal \(Z(t)\) from some conditional distribution \(r(\cdot|X(t))\) — for instance a Gaussian or uniform distribution centered on \(X(t)\), or anything else that is easy to draw from and symmetric (\(r(z|x) = r(x|z)\)), which is what the acceptance rule in step 4 assumes.
  3. Draw \(U(t)\) independently from a uniform distribution on \([0,1)\).
  4. If \(U(t) < p(Z(t))/p(X(t))\), then set \(X(t+1) = Z(t)\); otherwise \(X(t+1) = X(t)\).
  5. Increase \(t\) by 1 and go to step 2.
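
A minimal R sketch of these steps (the function name rmetropolis, the Gaussian proposal, and the example target dnorm are illustrative choices, not part of the algorithm):

# Metropolis sampling with a symmetric Gaussian proposal; p is the target
# density (up to a constant), x0 the starting point (step 1)
rmetropolis <- function(n, p, x0, prop.sd=1) {
  x <- vector(length=n)
  x[1] <- x0
  for (t in 1:(n-1)) {
    z <- rnorm(1, mean=x[t], sd=prop.sd)  # step 2: proposal centered on X(t)
    u <- runif(1)                         # step 3: independent Uniform[0,1)
    if (u < p(z)/p(x[t])) {               # step 4: accept, or stay put
      x[t+1] <- z
    } else {
      x[t+1] <- x[t]
    }
  }
  return(x)
}
# e.g., draw (approximately) from a standard Gaussian:
# hist(rmetropolis(1e4, dnorm, x0=0))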

Backup: Further reading

References

Basharin, Gely P., Amy N. Langville, and Valeriy A. Naumov. 2004. “The Life and Work of A. A. Markov.” Linear Algebra and Its Applications 386:3–26. http://langvillea.people.cofc.edu/MarkovReprint.pdf.

Cassirer, Ernst. 1944. An Essay on Man: An Introduction to a Philosophy of Human Culture. New Haven, Connecticut: Yale University Press.

Graham, Loren, and Jean-Michel Kantor. 2009. Naming Infinity: A True Story of Religious Mysticism and Mathematical Creativity. Cambridge, Massachusetts: Harvard University Press.

Grimmett, G. R., and D. R. Stirzaker. 1992. Probability and Random Processes. 2nd ed. Oxford: Oxford University Press.

Guttorp, Peter. 1995. Stochastic Modeling of Scientific Data. London: Chapman & Hall.

Hacking, Ian. 1990. The Taming of Chance. Cambridge, England: Cambridge University Press.

Hoek, John van der, and Robert J. Elliott. 2018. Introduction to Hidden Semi-Markov Models. Cambridge, England: Cambridge University Press.

Lasota, Andrzej, and Michael C. Mackey. 1994. Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics. Berlin: Springer-Verlag.

Mackey, Michael C. 1992. Time’s Arrow: The Origins of Thermodynamic Behavior. Berlin: Springer-Verlag.

Metropolis, Nicholas, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. 1953. “Equation of State Calculations by Fast Computing Machines.” Journal of Chemical Physics 21:1087–92. https://doi.org/10.1063/1.1699114.

Meyn, S. P., and R. L. Tweedie. 1993. Markov Chains and Stochastic Stability. Berlin: Springer-Verlag.

Porter, Theodore M. 1986. The Rise of Statistical Thinking, 1820–1900. Princeton, New Jersey: Princeton University Press.