11/18/2014
We introduced: Markov Chains, Monte Carlo simulation, Bootstrap method
Where the Bayesian revolution takes over: drawing from complex probability distributions
Optional reading: Geyer, "Practical Markov Chain Monte Carlo", Statistical Science 7 (1992): 473–483
We have [dpqr]{norm, binom, pois, …} for known distributions. For many good reasons, we'd like to be able to draw from much less standard distributions, and to use those draws to evaluate expectations and integrals.
What's available?
Law of large numbers: if \(X_1, X_2, \ldots X_n\) all IID with p.d.f. \(p\),
\[ \frac{1}{n}\sum_{i=1}^{n}{f(X_i)} \rightarrow \mathbb{E}_{p}[f(X)] = \int{f(x) p(x) dx} \]
The Monte Carlo principle: to find \(\int{g(x) dx}\), draw from \(p\) and take the sample mean of \(f(x) = g(x)/p(x)\).
Central limit theorem:
\[ \frac{1}{n}\sum_{i=1}^{n}{\frac{g(X_i)}{p(X_i)}} \rightsquigarrow \mathcal{N}\left(\int{g(x)dx}, \frac{\sigma^2_{g/p}}{n}\right) \]
The Monte Carlo approximation to the integral is unbiased, with root mean square error \(\propto n^{-1/2}\): just keep taking Monte Carlo draws, and the error gets as small as you like, even if \(g\) is very complicated or \(x\) lives in a high-dimensional space.
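A minimal sketch (my example, not from the slides): estimate \(\int_0^1 e^{-x^3} dx\) by drawing from \(p = \mathrm{Unif}(0,1)\), so that \(g/p = g\) on the interval:
g <- function(x) exp(-x^3)   # made-up integrand for illustration
x <- runif(100000)
mean(g(x))                   # Monte Carlo estimate of the integral
sd(g(x))/sqrt(100000)        # standard error, shrinking like n^(-1/2)
integrate(g, 0, 1)$value     # deterministic check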
Generating from \(p\) is easy if it's a standard distribution, or if we have a nice, invertible CDF (the quantile method; see the sketch below).
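For instance, a sketch of the quantile method, assuming an Exponential(2) target: its CDF \(F(x) = 1 - e^{-2x}\) inverts to \(F^{-1}(u) = -\log(1-u)/2\), so uniform draws pushed through \(F^{-1}\) become exponential draws:
u <- runif(100000)
x <- -log(1 - u)/2   # inverse CDF of Exponential(rate=2)
c(mean(x), sd(x))    # both should be near 1/rate = 0.5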
What can we do if all we've got is the probability density function \(p\)? Suppose the pdf \(p\) is zero outside an interval \([c,d]\), and \(\leq M\) on that interval.
We know how to draw from uniform distributions in any dimension. Do it in two:
x1 <- runif(100000, 0, 1)
y1 <- runif(100000, 0, 3)
selected <- y1 < dbeta(x1, 3, 6)
mean(selected)
## [1] 0.329
accepted.points <- x1[selected]
mean(accepted.points < 0.5)
## [1] 0.8524
pbeta(0.5, 3, 6)
## [1] 0.8555
For this to work efficiently, we have to cover the target density with an envelope that sits close above it; with a loose envelope, most proposals are wasted:
x2 <- runif(100000, 0, 1)
y2 <- runif(100000, 0, 10)
selected <- y2 < dbeta(x2, 3, 6)
mean(selected)
## [1] 0.101
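The recipe generalizes; here is a sketch of a generic rejection sampler (the function name and arguments are mine, not the slides'):
reject.sample <- function(n, dtarget, lo, hi, M) {
  x <- runif(n, lo, hi)   # propose uniformly over the bounding box
  y <- runif(n, 0, M)
  x[y < dtarget(x)]       # keep the points that fall under the density
}
beta.draws <- reject.sample(100000, function(x) dbeta(x, 3, 6), 0, 1, 3)
mean(beta.draws < 0.5)    # should be near pbeta(0.5, 3, 6)
Note it returns only about \(n/(M(hi-lo))\) accepted draws, which is exactly the efficiency point above.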
The Metropolis algorithm (really: Metropolis, Rosenbluth ×2, Teller ×2, 1953)
Rejection sampling works if we know the complete density \(p(x)\), with \(\int p(x)\,dx = 1\). For many, many problems, we only know the density up to the normalizing constant.
Example: the Boltzmann distribution of the canonical ensemble over energy states, as in modeling uncontrolled hydrogen fusion,
\[ p(E) = \frac{1}{Z} \exp(-E/kT) \]
The normalizing constant \(Z\), the partition function, is legendarily hard to calculate precisely.
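This is the point the Metropolis algorithm exploits: acceptance depends only on the ratio \(p(x')/p(x)\), so \(Z\) cancels. A minimal sketch (not from the slides), using a made-up unnormalized density \(q(x) = e^{-x^4}\):
q <- function(x) exp(-x^4)   # unnormalized: its integral is never computed
x <- 0
draws <- numeric(10000)
for (i in 1:10000) {
  prop <- rnorm(1, x)                       # random-walk proposal
  if (q(prop)/q(x) > runif(1)) x <- prop    # any constant in q cancels here
  draws[i] <- x
}
mean(draws)   # should be near 0, by symmetry of q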
On integer steps:
metropolis.one <- function (xx, fun=dpois, ...) {
  prop <- xx + 2*rbinom(1,1,0.5) - 1         # propose a move to xx-1 or xx+1
  acc.ratio <- fun(prop, ...)/fun(xx, ...)   # Metropolis acceptance ratio
  if (acc.ratio > runif(1)) prop else xx     # accept the proposal, or stay put
}
replicate(10, metropolis.one(10, lambda=5))
## [1] 9 11 10 9 9 11 11 10 10 10
Now run the chain from a deliberately bad starting point:
start <- 50
draws <- rep(NA, 10000)
for (iteration in 1:10000) {
  start <- metropolis.one(start, lambda=10)
  draws[iteration] <- start
}
plot(draws)
We have to discard the chain's initial transient as "burn-in":
plot(draws[-(1:100)])
After dropping the burn-in, the histogram of draws matches the Poisson(10) target (drawn in red):
hist(draws[-(1:100)])
hist(rpois(10000, 10), border=2, add=TRUE)
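As a rough extra check (my addition, not in the slides), the post-burn-in draws should reproduce the Poisson(10) moments:
post <- draws[-(1:100)]
c(mean(post), var(post))   # a Poisson(10) has mean = variance = 10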
In continuous spaces:
metropolis.cts <- function (xx, fun=dgamma, ...) {
  prop <- rnorm(1, xx)                       # Gaussian random-walk proposal
  acc.ratio <- fun(prop, ...)/fun(xx, ...)   # Metropolis acceptance ratio
  if (acc.ratio > runif(1)) prop else xx     # accept the proposal, or stay put
}
replicate(20, metropolis.cts(20, shape=100, rate=10))
## [1] 20.00 19.54 19.64 19.42 19.61 20.00 20.00 19.71 19.60 20.00 18.79
## [12] 18.23 20.00 19.99 19.91 19.42 18.29 20.00 20.00 20.00
start <- 50
draws <- rep(NA, 10000)
for (iteration in 1:10000) {
  start <- metropolis.cts(start, shape=100, rate=10)
  draws[iteration] <- start
}
plot(draws)
Again, we have to discard the chain's initial transient as "burn-in":
plot(draws[-(1:200)])
After dropping the burn-in, the histogram of draws matches the Gamma(100, 10) target (drawn in red):
hist(draws[-(1:200)])
hist(rgamma(10000, 100, 10), border=2, add=TRUE)
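And the same rough check here (again my addition): a Gamma(shape=100, rate=10) has mean 10 and standard deviation 1:
post <- draws[-(1:200)]
c(mean(post), sd(post))   # should be near 10 and 1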