Simulation and the Monte Carlo Method

36-467/667

20 October 2020 (Lecture 15)

\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Prob}[1]{\mathbb{P}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \]

In recent episodes

Agenda

Simulation

The general strategy

(The Gibbs sampler is a way of adapting this strategy to variables which don’t have a natural order)

The general strategy applied to AR(1)

\[ X(t) = \alpha + \beta X(t-1) + \epsilon(t) \]

It’s easy to turn this into code

# Simulate a time series of length n from a Gaussian AR(1) model with given
# coefficients, noise level and initial condition
  # Based on Lecture 11 code
# Inputs: length of output time series (n)
  # intercept of AR(1) (a)
  # coefficient of AR(1) (b)
  # standard deviation of the innovations (sd.innovation)
  # initial value of time series (x.start)
# Output: vector of length n
# Presumes: innovations are Gaussian white noise
  # a, b, sd.innovation, x.start are all length-1 numerical values
ar1.sim <- function(n, a, b, sd.innovation, x.start) {
    x <- vector(length=n)
    x[1] <- x.start # Because R enumerates from 1, not 0
    for (t in 2:n) {
        x[t] <- a + b*x[t-1] + rnorm(1, sd=sd.innovation)
    }
    return(x)
}
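
As a quick illustration (not part of the original slides; the parameter values are arbitrary choices for the example), we can simulate a stationary case and check that the series fluctuates around the long-run mean a/(1-b):

# Illustrative use of ar1.sim with a stationary choice of coefficients
x.example <- ar1.sim(n=200, a=1, b=0.5, sd.innovation=1, x.start=0)
plot(x.example, type="l", xlab="t", ylab="X(t)") # trace plot of the simulated series
abline(h=1/(1-0.5), lty="dashed") # long-run mean a/(1-b) when |b| < 1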

It’s easy to extend this to higher-order models

# Simulate a time series of length n from a Gaussian AR(2) model with given
# coefficients, noise level and initial condition
  # Derived from ar1.sim above
# Inputs: length of output time series (n)
  # intercept of AR(2) (a)
  # vector of coefficients of AR(2) (b)
  # standard deviation of the innovations (sd.innovation)
  # initial two values of time series (x.start)
# Output: vector of length n
# Presumes: innovations are Gaussian white noise
  # a and sd.innovation are length-1 numerical values
  # b and x.start are length-2 numerical vectors
  # b has coefficient on X(t-1) first
  # x.start has oldest value first
ar2.sim <- function(n, a, b, sd.innovation, x.start) {
    x <- vector(length=n)
    x[1] <- x.start[1] # Because R enumerates from 1, not 0
    x[2] <- x.start[2]
    for (t in 3:n) {
        x[t] <- a + b[1]*x[t-1] + b[2]*x[t-2] + rnorm(1, sd=sd.innovation)
    }
    return(x)
}

For you to think through offline: How would you make this work for an AR\((p)\)?
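
One possible sketch (not the official answer): keep the same presumptions as ar2.sim, but let b and x.start be length-p vectors and have the loop take an inner product with the p most recent values.

# Simulate a time series of length n from a Gaussian AR(p) model
  # Sketch following the pattern of ar1.sim and ar2.sim above
# Presumes: innovations are Gaussian white noise
  # a and sd.innovation are length-1 numerical values
  # b and x.start are length-p numerical vectors, with p < n
  # b has coefficient on X(t-1) first
  # x.start has oldest value first
arp.sim <- function(n, a, b, sd.innovation, x.start) {
    p <- length(b)
    x <- vector(length=n)
    x[1:p] <- x.start
    for (t in (p+1):n) {
        # x[(t-1):(t-p)] lists the most recent value first, lining up with b[1]
        x[t] <- a + sum(b * x[(t-1):(t-p)]) + rnorm(1, sd=sd.innovation)
    }
    return(x)
}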

“Math is hard; let’s go simulate”

Three big reasons to simulate a model:

  1. We can’t easily see what it’s going to do;
  2. We can’t easily calculate its exact behavior;
  3. We can’t easily see what would happen if something changed.

See what it’s going to do

Replacing calculations with simulations

“The Monte Carlo method”

The power of Monte Carlo

What’s the sample correlation between two independent random walks of length 100?

x <- ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1)
z <- ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1)
cor(x, z)
## [1] 0.2034217

cor(ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1),
    ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1))
## [1] -0.6896914
var(replicate(1000, cor(ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1),
    ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1))))
## [1] 0.2340391
var(replicate(1000, cor(ar1.sim(n=1e4, a=0, b=1, x.start=0, sd.innovation=1),
    ar1.sim(n=1e4, a=0, b=1, x.start=0, sd.innovation=1))))
## [1] 0.2462083
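
(An illustrative extension, not from the original slides: the same replicate() trick approximates the whole Monte Carlo distribution of this correlation, not just its variance. The choice of 1000 replicates is arbitrary; more replicates means less Monte Carlo error.)

r <- replicate(1000, cor(ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1),
                         ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1)))
quantile(r, probs=c(0.025, 0.5, 0.975)) # median and central 95% range of the correlations
hist(r, breaks=30, xlab="Sample correlation of two independent random walks", main="")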

Why is this happening?

What good is the math then?

What if?

Summary

Backup: The origin of Monte Carlo

Backup: Stanislaw Ulam holding the Fermiac

