Linear Methods for Noise Removal and Filtering

36-467/36-667

27 September 2018

\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\TrueRegFunc}{\mu} \newcommand{\EstRegFunc}{\widehat{\TrueRegFunc}} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator{\det}{det} \newcommand{\TrueNoise}{\epsilon} \newcommand{\EstNoise}{\widehat{\TrueNoise}} \newcommand{\SignalNoise}{N} \]

In our previous episodes

Optimal linear filtering for time series

Optimal linear filtering for time series

\[\begin{eqnarray} \hat{S}(t_0) & = & \alpha + \vec{\beta} \cdot \left[\begin{array}{c} X(t_1) \\ X(t_2) \\ \vdots \\ X(t_n) \end{array}\right]\\ \alpha & = & \Expect{S(t_0)} - \vec{\beta} \cdot \left[\begin{array}{c} \Expect{X(t_1)}\\ \Expect{X(t_2)} \\ \vdots \\ \Expect{X(t_n)}\end{array}\right] ~ \text{(goes away if everything's centered)}\\ \vec{\beta} & = & {\left[\begin{array}{cccc} \Var{X(t_1)} & \Cov{X(t_1), X(t_2)} & \ldots & \Cov{X(t_1), X(t_n)}\\ \Cov{X(t_1), X(t_2)} & \Var{X(t_2)} & \ldots & \Cov{X(t_2), X(t_n)}\\ \vdots & \vdots & \ldots & \vdots\\ \Cov{X(t_1), X(t_n)} & \Cov{X(t_2), X(t_n)} & \ldots & \Var{X(t_n)}\end{array}\right]}^{-1} \left[\begin{array}{c} \Cov{S(t_0), X(t_1)}\\ \Cov{S(t_0), X(t_2)}\\ \vdots \\ \Cov{S(t_0), X(t_n)}\end{array}\right] \end{eqnarray}\]

This should look familiar

Simple case, to build intuition

\[\begin{eqnarray} \Cov{X(t), X(t+h)} & = & \Cov{S(t)+\SignalNoise(t), S(t+h)+\SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{\SignalNoise(t), \SignalNoise(t+h)}\\ & & + \Cov{S(t), \SignalNoise(t+h)} + \Cov{S(t+h), \SignalNoise(t)}\\ & = & \gamma(h) + \sigma^2\mathbf{1}(h=0)\\ \Cov{S(t), X(t+h)} & = & \Cov{S(t), S(t+h) + \SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{S(t), \SignalNoise(t+h)}\\ & = & \gamma(h) \end{eqnarray}\]

One observation, one estimate

\[\begin{eqnarray} \hat{S}(t) & = & \beta X(t)\\ \beta & = & \frac{\Cov{S(t), X(t)}}{\Var{X(t)}}\\ &= & \frac{\gamma(0)}{\gamma(0) + \sigma^2} < 1\\ \hat{S}(t) & = & \frac{\gamma(0)}{\gamma(0) + \sigma^2} X(t) \end{eqnarray}\]

Two observations, one estimate

EXERCISE: Find \(\beta_0, \beta_1\) in terms of the function \(\gamma\) and \(\sigma^2\)

Hint 1: \(\left[\begin{array}{cc} a & b \\ c & d\end{array}\right]^{-1} = \frac{1}{ad - bc}\left[\begin{array}{cc} d & -b \\ -c & a\end{array}\right]\)

Hint 2: \(\Cov{X(t), X(t+h)} = \gamma(h) + \sigma^2\mathbf{1}(h=0)\) and \(\Cov{S(t), X(t+h)} = \gamma(h)\)

Two observations, one estimate

\[\begin{eqnarray} \left[ \begin{array}{cc} \beta_0 \\ \beta_1 \end{array}\right] & = & \left[\begin{array}{cc} \Var{X(t)} & \Cov{X(t), X(t-1)}\\ \Cov{X(t-1), X(t)} & \Var{X(t-1)}\end{array}\right]^{-1} \left[\begin{array}{c} \Cov{X(t), S(t)} \\ \Cov{X(t-1), S(t)}\end{array} \right]\\ & = & \left[\begin{array}{cc} \gamma(0)+\sigma^2 & \gamma(1)\\ \gamma(1) & \gamma(0)+\sigma^2\end{array}\right]^{-1} \left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ &=& \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{cc} \gamma(0)+\sigma^2 & -\gamma(1)\\ -\gamma(1) & \gamma(0)+\sigma^2\end{array}\right]\left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ & = & \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{c} (\gamma(0)+\sigma^2)\gamma(0) - \gamma^2(1) \\ \gamma(1)\sigma^2\end{array}\right] \end{eqnarray}\]

What’s going on here?

The general pattern: the Wiener filter

Extracting periodic or cyclic patterns

Estimating the periodic component

For simplicity, say we have \(X(0), X(1), \ldots X(n)\), \(n=kT\)

\[\begin{eqnarray} \EstRegFunc(0) & = & \frac{1}{k-1}\sum_{i=0}^{k-1}{X(iT)}\\ \EstRegFunc(1) & = & \frac{1}{k-1}\sum_{i=0}^{k -1}{X(1+iT)}\\ \EstRegFunc(t) & = & \frac{1}{k-1}\sum_{i=0}^{k-1}{X(t+iT)}\\ & \vdots & \\ \EstRegFunc(T-1) & = & \frac{1}{k-1}\sum_{i=0}^{k-1}{X(T-1+iT)} \end{eqnarray}\]

Estimating the periodic component

library(gstat)
data(wind)
dub.jan1 <- wind[wind$month == 1 & wind$day == 1, "DUB"]
mean(dub.jan1)
## [1] 12.67722
colMeans(wind[wind$month == 1 & wind$day == 1, -(1:3)])
##       RPT       VAL       ROS       KIL       SHA       BIR       DUB 
## 14.610000 11.200556 12.931667  6.845556 10.798333  7.678333 12.677222 
##       CLA       MUL       CLO       BEL       MAL 
##  8.952778 10.057222  9.936667 13.987222 18.536667

Now repeat for every day of the calendar

After extracting the periodic component

Summary

Trend + periodicity + fluctuations

\[\begin{eqnarray} X(t) & = & m(t) + p(t) + \TrueNoise(t)\\ & = & \text{long-run trend} + \text{periodic component} + \text{random fluctuations} \end{eqnarray}\]