36-467/667
20 October 2020 (Lecture 15)
\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Prob}[1]{\mathbb{P}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \]
(The Gibbs sampler is a way of adapting this strategy to variables which don’t have a natural order)
\[ X(t) = \alpha + \beta X(t-1) + \epsilon(t) \]
# Simulate a time series of length n from a Gaussian AR(1) model with given
# coefficients, noise level and initial condition
# Based on Lecture 11 code
# Inputs: length of output time series (n)
#   intercept of AR(1) (a)
#   coefficient of AR(1) (b)
#   standard deviation of the innovations (sd.innovation)
#   initial value of time series (x.start)
# Output: vector of length n
# Presumes: innovations should be Gaussian white noise
#   a, b, sd.innovation, x.start are all length-1 numerical values
ar1.sim <- function(n, a, b, sd.innovation, x.start) {
  x <- vector(length=n)
  x[1] <- x.start # Because R enumerates from 1, not 0
  for (t in 2:n) {
    x[t] <- a + b*x[t-1] + rnorm(1, sd=sd.innovation)
  }
  return(x)
}
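As a quick sanity check (not part of the lecture code): when \(|b| < 1\) the AR(1) process is stationary with long-run mean \(a/(1-b)\), so the average of a long simulated series should land near that value.

```r
# Sanity check: a stationary AR(1) (|b| < 1) has long-run mean a/(1-b)
# Here a = 2, b = 0.5, so the mean of a long series should be near 4
x <- ar1.sim(n=1e5, a=2, b=0.5, sd.innovation=1, x.start=0)
mean(x) # should be close to a/(1-b) = 4
```

If the simulated mean were far from \(a/(1-b)\), that would flag a bug in either the simulator or our algebra, which is one of the main uses of simulation.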
# Simulate a time series of length n from a Gaussian AR(2) model with given
# coefficients, noise level and initial condition
# Derived from ar1.sim above
# Inputs: length of output time series (n)
#   intercept of AR(2) (a)
#   vector of coefficients of AR(2) (b)
#   standard deviation of the innovations (sd.innovation)
#   initial two values of time series (x.start)
# Output: vector of length n
# Presumes: innovations should be Gaussian white noise
#   a and sd.innovation are length-1 numerical values
#   b and x.start are length-2 numerical vectors
#   b has coefficient on X(t-1) first
#   x.start has oldest value first
ar2.sim <- function(n, a, b, sd.innovation, x.start) {
  x <- vector(length=n)
  x[1] <- x.start[1] # Because R enumerates from 1, not 0
  x[2] <- x.start[2]
  for (t in 3:n) {
    x[t] <- a + b[1]*x[t-1] + b[2]*x[t-2] + rnorm(1, sd=sd.innovation)
  }
  return(x)
}
For you to think through offline: How would you make this work for an AR\((p)\)?
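If you want to check your answer afterwards, here is one possible sketch (the function name `arp.sim` and its argument conventions are my choices, following the AR(1) and AR(2) conventions above):

```r
# Simulate a time series of length n from a Gaussian AR(p) model
# b is a length-p vector of coefficients, coefficient on X(t-1) first
# x.start is a length-p vector of initial values, oldest first
arp.sim <- function(n, a, b, sd.innovation, x.start) {
  p <- length(b)
  stopifnot(length(x.start) == p) # initial values must match the order p
  x <- vector(length=n)
  x[1:p] <- x.start
  for (t in (p+1):n) {
    # x[(t-1):(t-p)] runs backwards in time, so b[1] multiplies x[t-1],
    # b[2] multiplies x[t-2], and so on, matching the AR(2) convention
    x[t] <- a + sum(b * x[(t-1):(t-p)]) + rnorm(1, sd=sd.innovation)
  }
  return(x)
}
```

With `p = 2` this reproduces `ar2.sim` above, which is a useful check on the generalization.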
Three big reasons to simulate a model:
What’s the sample correlation between two independent random walks of length 100?
x <- ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1)
z <- ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1)
cor(x, z)
## [1] 0.2034217
cor(ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1),
    ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1))
## [1] -0.6896914
var(replicate(1000, cor(ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1),
                        ar1.sim(n=100, a=0, b=1, x.start=0, sd.innovation=1))))
## [1] 0.2340391
var(replicate(1000, cor(ar1.sim(n=1e4, a=0, b=1, x.start=0, sd.innovation=1),
                        ar1.sim(n=1e4, a=0, b=1, x.start=0, sd.innovation=1))))
## [1] 0.2462083
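For contrast (my addition, not in the lecture output): the failure of the variance to shrink with \(n\) is special to the random-walk case \(b = 1\). Repeating the experiment with a stationary AR(1), say \(b = 0.5\), the sampling variance of the cross-correlation between two independent series does shrink as the series get longer, roughly in proportion to \(1/n\).

```r
# Same experiment, but with stationary AR(1) series (b = 0.5 instead of b = 1)
# The variance of the spurious correlation should now shrink as n grows
var(replicate(1000, cor(ar1.sim(n=100, a=0, b=0.5, x.start=0, sd.innovation=1),
                        ar1.sim(n=100, a=0, b=0.5, x.start=0, sd.innovation=1))))
var(replicate(1000, cor(ar1.sim(n=1e4, a=0, b=0.5, x.start=0, sd.innovation=1),
                        ar1.sim(n=1e4, a=0, b=0.5, x.start=0, sd.innovation=1))))
```

This is the heart of Yule's "nonsense correlation": with random walks, even arbitrarily long independent series stay noticeably correlated in sample.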
Ernst, Philip A., Larry A. Shepp, and Abraham J. Wyner. 2017. “Yule’s ‘Nonsense Correlation’ Solved!” Annals of Statistics 45:1789–1809. https://doi.org/10.1214/16-AOS1509.
Metropolis, Nicholas, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. 1953. “Equation of State Calculations by Fast Computing Machines.” Journal of Chemical Physics 21:1087–92. https://doi.org/10.1063/1.1699114.
Schwartz, David N. 2017. The Last Man Who Knew Everything: The Life and Times of Enrico Fermi, Father of the Nuclear Age. New York: Basic Books.
Yule, G. Udny. 1926. “Why Do We Sometimes Get Nonsense-Correlations Between Time-Series? — a Study in Sampling and the Nature of Time-Series.” Journal of the Royal Statistical Society 89:1–63. https://doi.org/10.2307/2341482.