Linear Methods for Noise Removal and Filtering

36-467/36-667

1 October 2020 (Lecture 10)

\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\TrueRegFunc}{\mu} \newcommand{\EstRegFunc}{\widehat{\TrueRegFunc}} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator{\det}{det} \newcommand{\TrueNoise}{\epsilon} \newcommand{\EstNoise}{\widehat{\TrueNoise}} \newcommand{\SignalNoise}{N} \]

In our previous episodes

Optimal linear filtering for time series

Optimal linear filtering for time series

\[\begin{eqnarray} \hat{S}(t_0) & = & \alpha + \vec{\beta} \cdot \left[\begin{array}{c} X(t_1) \\ X(t_2) \\ \vdots \\ X(t_n) \end{array}\right]\\ \alpha & = & \Expect{S(t_0)} - \vec{\beta} \cdot \left[\begin{array}{c} \Expect{X(t_1)}\\ \Expect{X(t_2)} \\ \vdots \\ \Expect{X(t_n)}\end{array}\right]\\ \vec{\beta} & = & {\left[\begin{array}{cccc} \Var{X(t_1)} & \Cov{X(t_1), X(t_2)} & \ldots & \Cov{X(t_1), X(t_n)}\\ \Cov{X(t_1), X(t_2)} & \Var{X(t_2)} & \ldots & \Cov{X(t_2), X(t_n)}\\ \vdots & \vdots & \ldots & \vdots\\ \Cov{X(t_1), X(t_n)} & \Cov{X(t_2), X(t_n)} & \ldots & \Var{X(t_n)}\end{array}\right]}^{-1} \left[\begin{array}{c} \Cov{S(t_0), X(t_1)}\\ \Cov{S(t_0), X(t_2)}\\ \vdots \\ \Cov{S(t_0), X(t_n)}\end{array}\right] \end{eqnarray}\]

This should look familiar

Simple case, to build intuition

\[\begin{eqnarray} \Cov{X(t), X(t+h)} & = & \Cov{S(t)+\SignalNoise(t), S(t+h)+\SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{\SignalNoise(t), \SignalNoise(t+h)}\\ & & + \Cov{S(t), \SignalNoise(t+h)} + \Cov{S(t+h), \SignalNoise(t)}\\ & = & \gamma(h) + \sigma^2\mathbf{1}(h=0)\\ \Cov{S(t), X(t+h)} & = & \Cov{S(t), S(t+h) + \SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{S(t), \SignalNoise(t+h)}\\ & = & \gamma(h) \end{eqnarray}\]

(Last time I said it’s common to add an extra “nugget” of auto-covariance at lag 0 for the observations; that’s handling measurement noise)

One observation, one estimate

\[\begin{eqnarray} \hat{S}(t) & = & \beta X(t)\\ \beta & = & \frac{\Cov{S(t), X(t)}}{\Var{X(t)}}\\ &= & \frac{\gamma(0)}{\gamma(0) + \sigma^2} < 1\\ \hat{S}(t) & = & \frac{\gamma(0)}{\gamma(0) + \sigma^2} X(t) \end{eqnarray}\]

Two observations, one estimate

Two observations, one estimate

\[\begin{eqnarray} \left[ \begin{array}{cc} \beta_0 \\ \beta_1 \end{array}\right] & = & \left[\begin{array}{cc} \Var{X(t)} & \Cov{X(t), X(t-1)}\\ \Cov{X(t-1), X(t)} & \Var{X(t-1)}\end{array}\right]^{-1} \left[\begin{array}{c} \Cov{X(t), S(t)} \\ \Cov{X(t-1), S(t)}\end{array} \right]\\ & = & \left[\begin{array}{cc} \gamma(0)+\sigma^2 & \gamma(1)\\ \gamma(1) & \gamma(0)+\sigma^2\end{array}\right]^{-1} \left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ &=& \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{cc} \gamma(0)+\sigma^2 & -\gamma(1)\\ -\gamma(1) & \gamma(0)+\sigma^2\end{array}\right]\left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ & = & \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{c} (\gamma(0)+\sigma^2)\gamma(0) - \gamma^2(1) \\ \gamma(1)\sigma^2\end{array}\right] \end{eqnarray}\]

What’s going on here?

The general pattern: the Wiener filter

Extracting periodic or cyclic patterns

Estimating the periodic component

For simplicity, say we have \(X(0), X(1), \ldots X(n-1)\), \(n=kT\)

\[\begin{eqnarray} \EstRegFunc(0) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(iT)}\\ \EstRegFunc(1) & = & \frac{1}{k}\sum_{i=0}^{k -1}{X(1+iT)}\\ \EstRegFunc(t) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(t+iT)}\\ & \vdots & \\ \EstRegFunc(T-1) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(T-1+iT)} \end{eqnarray}\]

Estimating the periodic component

library(gstat)
data(wind)
dub.jan1 <- wind[wind$month == 1 & wind$day == 1, "DUB"]
mean(dub.jan1)
## [1] 12.67722
colMeans(wind[wind$month == 1 & wind$day == 1, -(1:3)])
##       RPT       VAL       ROS       KIL       SHA       BIR       DUB       CLA 
## 14.610000 11.200556 12.931667  6.845556 10.798333  7.678333 12.677222  8.952778 
##       MUL       CLO       BEL       MAL 
## 10.057222  9.936667 13.987222 18.536667

Now repeat for every day of the calendar

After extracting the periodic component

Summary

Trend + periodicity + fluctuations

\[\begin{eqnarray} X(t) & = & m(t) + p(t) + \TrueNoise(t)\\ & = & \text{long-run trend} + \text{periodic component} + \text{random fluctuations} \end{eqnarray}\]

Backup: Trend + Period for CO2 Concentration

Photo: Mauna Loa Observatory by Rick Peterson

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

quad.trend.plus.cycle <- lm(co2ppm ~ date + I(as.numeric(date)^2) + factor(month), 
    data = mauna_loa_monthly)
signif(coefficients(quad.trend.plus.cycle), 2)
##           (Intercept)                  date I(as.numeric(date)^2) 
##               3.3e+02               3.0e-03               9.7e-08 
##        factor(month)2        factor(month)3        factor(month)4 
##               6.3e-01               1.4e+00               2.5e+00 
##        factor(month)5        factor(month)6        factor(month)7 
##               3.0e+00               2.2e+00               6.1e-01 
##        factor(month)8        factor(month)9       factor(month)10 
##              -1.5e+00              -3.2e+00              -3.3e+00 
##       factor(month)11       factor(month)12 
##              -2.1e+00              -9.5e-01

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration

Backup: Trend + Period for CO2 Concentration


  1. In photosynthesis, plants use carbon dioxide and water (and sunlight) to make the sugar glucose and oxygen; like us, they later metabolize that sugar for energy, which involves taking in oxygen and releasing carbon dioxide. So plants are going to remove carbon dioxide from the atmosphere, on net, when they photosynthesize more than they burn sugar. In other words, plants will be net carbon dioxide removers when they are green and growing in the spring and summer, and net carbon dioxide releasers when they aren’t, in the fall and winter. Since there are more plants in the northern hemisphere than the southern, because there’s more land mass in the northern hemisphere, the annual carbon dioxide cycle tracks the northern hemisphere seasons rather than the southern.