11/18/2014

Previously…

We introduced: Markov Chains, Monte Carlo simulation, Bootstrap method

Today: Applications to Probability

Where the Bayesian revolution takes over: drawing from complex probability distributions

  • rejection sampling
  • The Metropolis algorithm
  • General introduction to "Markov Chain Monte Carlo"

Optional reading: Geyer, ``Practical Markov Chain Monte Carlo'', Statistical Science 7 (1992): 473–483

One-dimension probability distributions

We have [dpqr]{norm, binom, pois, …} for known distributions. For many good reasons, we'd like to be able to:

  • draw from complex distributions
  • find the solution to relevant integrals
  • eventually, go to many higher dimensions

What's available?

Integration by Simulation

Law of large numbers: if \(X_1, X_2, \ldots X_n\) all IID with p.d.f. \(p\),

\[ \frac{1}{n}\sum_{i=1}^{n}{f(X_i)} \rightarrow \mathbb{E}_{p}[f(X)] = \int{f(x) p(x) dx} \]

The Monte Carlo principle: to find \(\int{g(x) dx}\), draw from \(p\) and take the sample mean of \(f(x) = g(x)/p(x)\).

And it works, too

Central limit theorem:

\[ \frac{1}{n}\sum_{i=1}^{n}{\frac{g(x_i)}{p(x_i)}} \rightsquigarrow \mathcal{N}\left(\int{g(x)dx}, \frac{\sigma^2_{g/p}}{n}\right) \]

The Monte Carlo approximation to the integral is unbiased, with root mean square error \(\propto n^{-1/2}\) – just keep taking Monte Carlo draws, and the error gets as small as you like, even if \(g\) or \(x\) are very complicated.

Taking unknown samples

Generating from \(p\) is easy if it's a standard distribution or we have a nice, invertible CDF (quantile method). What can we do if all we've got is the probability density function \(p\)?

Suppose the pdf \(f\) is zero outside an interval \([c,d]\), and \(\leq M\) on the interval.

plot of chunk unnamed-chunk-1

Rejection sampling

We know how to draw from uniform distributions in any dimension. Do it in two:

x1 <- runif(100000, 0, 1); y1 <- runif(100000, 0, 3); 
selected <- y1 < dbeta(x1, 3, 6)

plot of chunk unnamed-chunk-3

Rejection sampling

mean(selected)
## [1] 0.329
accepted.points <- x1[selected]
mean(accepted.points < 0.5)
## [1] 0.8524
pbeta(0.5, 3, 6)
## [1] 0.8555

Rejection sampling

For this to work efficiently: we have to cover the target distribution with one that sits close to it.

x2 <- runif(100000, 0, 1); y2 <- runif(100000, 0, 10); 
selected <- y2 < dbeta(x2, 3, 6)
mean(selected)
## [1] 0.101

Rejection sampling

plot of chunk unnamed-chunk-6

Metropolis Algorithm

(Really: Metropolis, Rosenbluth x2, Teller x2, 1953)

Rejection sampling works if we know the complete distribution \(p(x)\), with \(\int p(x) = 1\). For many, many problems, we know the function up to the normalizing constant.

Example: Equations of the canonical ensemble of energy formations in the state of uncontrolled hydrogen fusion,

\[ p(E) = \frac{1}{|Z|} exp(f(E)/kT) \]

\(|Z|\) is legendarily hard to calculate precisely.

Basic Metropolis

On integer steps:

metropolis.one <- function (xx, fun=dpois, ...) {
  prop <- 2*rbinom(1,1,0.5) - 1 + xx
  acc.ratio <- fun(prop, ...)/fun(xx, ...)
  output <- if (acc.ratio > runif(1)) prop else xx
  output
}
replicate (10, metropolis.one(10, lambda=5))
##  [1]  9 11 10  9  9 11 11 10 10 10

Basic Metropolis

On integer steps:

start <- 50
draws <- rep(NA, 10000)
for (iteration in 1:10000) {
  start <- metropolis.one (start, lambda=10)
  draws[iteration] <- start
}

Basic Metropolis

plot(draws)

plot of chunk unnamed-chunk-9

Basic Metropolis

Have to discard the initial state for "burn-in"

plot(draws[-(1:100)])

plot of chunk unnamed-chunk-10

Basic Metropolis

Have to discard the initial state for "burn-in"

hist(draws[-(1:100)])
hist(rpois(10000, 10), border=2, add=TRUE)

plot of chunk unnamed-chunk-11

Basic Metropolis

On integer steps:

metropolis.cts <- function (xx, fun=dgamma, ...) {
  prop <- rnorm(1, xx)
  acc.ratio <- fun(prop, ...)/fun(xx, ...)
  output <- if (acc.ratio > runif(1)) prop else xx
  output
}
replicate (20, metropolis.cts(20, shape=100, rate=10))
##  [1] 20.00 19.54 19.64 19.42 19.61 20.00 20.00 19.71 19.60 20.00 18.79
## [12] 18.23 20.00 19.99 19.91 19.42 18.29 20.00 20.00 20.00

Basic Metropolis

start <- 50
draws <- rep(NA, 10000)
for (iteration in 1:10000) {
  start <- metropolis.cts (start, shape=100, rate=10)
  draws[iteration] <- start
}

Basic Metropolis

plot(draws)

plot of chunk unnamed-chunk-14

Basic Metropolis

Have to discard the initial state for "burn-in"

plot(draws[-(1:200)])

plot of chunk unnamed-chunk-15

Basic Metropolis

Have to discard the initial state for "burn-in"

hist(draws[-(1:200)])
hist(rgamma(10000, 100, 10), border=2, add=TRUE)

plot of chunk unnamed-chunk-16