Principal Components Analysis II

36-467/667

11 September 2018

\[ \newcommand{\X}{\mathbf{x}} \newcommand{\w}{\mathbf{w}} \newcommand{\V}{\mathbf{v}} \newcommand{\S}{\mathbf{s}} \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\TrueRegFunc}{\mu} \newcommand{\EstRegFunc}{\widehat{\TrueRegFunc}} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator{\dof}{DoF} \DeclareMathOperator{\det}{det} \newcommand{\TrueNoise}{\epsilon} \newcommand{\EstNoise}{\widehat{\TrueNoise}} \]

In our last episode…

Some properties of the PCs

Some properties of the eigenvalues

Some properties of PC scores

\[\begin{eqnarray} \Var{\text{scores}} & = & \frac{1}{n} \S^T \S\\ & = & \frac{1}{n} (\X\w)^T(\X\w)\\ & = & \frac{1}{n}\w^T \X^T \X \w\\ & = & \w^T \V\w = \w^T \w \mathbf{\Lambda}\\ & = & \mathbf{\Lambda} \end{eqnarray}\]

(using \(\V\w = \w\mathbf{\Lambda}\), since the columns of \(\w\) are eigenvectors of \(\V\), and \(\w^T\w = \mathbf{I}\))
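The derivation above says the scores are uncorrelated with each other and that their variances are the eigenvalues. A minimal numerical check, assuming we reuse the built-in `state.x77` data that appears later in these slides; note that `prcomp` and `cov` both divide by \(n-1\) rather than \(n\), so the two sides still agree. The names `score.cov`, `eigenvalues` and `off.diag` are just for illustration.

```r
# Assumption: use the built-in state.x77 data, standardized as in the later slides
state.pca <- prcomp(state.x77, scale. = TRUE)

score.cov <- cov(state.pca$x)        # covariance matrix of the PC scores
eigenvalues <- state.pca$sdev^2      # prcomp stores square roots of the eigenvalues

# Off-diagonal entries: covariances between different scores
off.diag <- score.cov - diag(diag(score.cov))
max(abs(off.diag))                   # should be zero up to rounding error

# Diagonal entries: variances of the scores, which should match the eigenvalues
all.equal(unname(diag(score.cov)), eigenvalues)
```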

Some properties of PCA as a whole

Another way to think about PCA

PCA can be used for any multivariate data

PCA with spatial data

Recall the states…

state.pca <- prcomp(state.x77, scale. = TRUE)
signif(state.pca$rotation[, 1:2], 2)
##               PC1    PC2
## Population  0.130  0.410
## Income     -0.300  0.520
## Illiteracy  0.470  0.053
## Life Exp   -0.410 -0.082
## Murder      0.440  0.310
## HS Grad    -0.420  0.300
## Frost      -0.360 -0.150
## Area       -0.033  0.590

states are locations, PCs are patterns of variables

Each score is spatially distributed

Try it the other way

Turn the data on its side

state.vars.pca <- prcomp(t(scale(state.x77)))  # What's t()?
length(state.vars.pca$sdev)  # Why 8?
## [1] 8
head(signif(state.vars.pca$rotation[, 1:2]), 2)
##                PC1       PC2
## Alabama -0.2801370 0.0316183
## Alaska   0.0147876 0.5653260
signif(state.vars.pca$x[, 1], 2)
## Population     Income Illiteracy   Life Exp     Murder    HS Grad 
##      -2.60       2.90      -6.80       4.90      -6.70       4.80 
##      Frost       Area 
##       4.30      -0.69

The states turned on their sides…

Not exactly the same

… but pretty close. No coincidence! (See end)
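One way to put a number on "pretty close" is to correlate the results of the two analyses. This is only a sketch, recomputing both fits so that it stands alone; the names `r.states` and `r.vars` are made up here. Signs of principal components are arbitrary, so only the magnitude of each correlation matters.

```r
# Refit both PCAs so this block is self-contained
state.pca <- prcomp(state.x77, scale. = TRUE)
state.vars.pca <- prcomp(t(scale(state.x77)))

# New PC1 (one weight per state) vs. old scores on PC1 (one score per state)
r.states <- cor(state.vars.pca$rotation[, 1], state.pca$x[, 1])

# New scores on PC1 (one per variable) vs. old PC1 (one weight per variable)
r.vars <- cor(state.vars.pca$x[, 1], state.pca$rotation[, 1])

abs(c(states = r.states, variables = r.vars))
```

The match is not exact because `prcomp` centers the columns of whatever matrix it is given, so the transposed analysis subtracts a different set of means.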

2nd principal component

A famous example

Some maps from Cavalli-Sforza, Menozzi, and Piazza (1993)

World PC1

(\(\approx 35\%\) of between-population variance)

World PC2

(\(\approx 18\%\) of between-population variance)

World PC3

(\(\approx 12\%\) of between-population variance)

PCA with multiple time series

Irish wind data

##   year month day   RPT   VAL   ROS   KIL   SHA  BIR   DUB   CLA   MUL
## 1   61     1   1 15.04 14.96 13.17  9.29 13.96 9.87 13.67 10.25 10.83
## 2   61     1   2 14.71 16.88 10.83  6.50 12.62 7.67 11.50 10.04  9.79
## 3   61     1   3 18.50 16.88 12.33 10.13 11.17 6.17 11.25  8.04  8.50
## 4   61     1   4 10.58  6.63 11.75  4.58  4.54 2.88  8.63  1.79  5.83
## 5   61     1   5 13.33 13.25 11.42  6.17 10.71 8.21 11.92  6.54 10.92
## 6   61     1   6 13.21  8.12  9.96  6.67  5.37 4.50 10.67  4.42  7.17
##     CLO   BEL   MAL                time
## 1 12.58 18.50 15.04 1961-01-01 12:00:00
## 2  9.67 17.54 13.83 1961-01-02 12:00:00
## 3  7.67 12.75 12.71 1961-01-03 12:00:00
## 4  5.88  5.46 10.88 1961-01-04 12:00:00
## 5 10.34 12.92 11.83 1961-01-05 12:00:00
## 6  7.50  8.12 13.17 1961-01-06 12:00:00

Irish wind data — one time series

Irish wind data — all the time series

PCA: \(n = 6574\), \(p=12\)

wind.pca.1 <- prcomp(wind[, 4:15])
wind.pca.1$sdev
##  [1] 15.149749  4.806761  3.848214  2.840283  2.796445  1.932717  1.809999
##  [8]  1.559231  1.408849  1.355770  1.164033  1.079990
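The standard deviations are easier to read as shares of the total variance. A small sketch, with the `sdev` values copied from the output above into a vector (the names `wind.sdev` and `var.share` are made up here) so that it runs without the wind data:

```r
# Standard deviations of the 12 components, copied from the printed output
wind.sdev <- c(15.149749, 4.806761, 3.848214, 2.840283, 2.796445, 1.932717,
               1.809999, 1.559231, 1.408849, 1.355770, 1.164033, 1.079990)

# Each eigenvalue (variance) as a fraction of the total variance
var.share <- wind.sdev^2 / sum(wind.sdev^2)

round(var.share, 3)          # share of variance along each component
round(cumsum(var.share), 3)  # cumulative share
```

The first component alone accounts for roughly three-quarters of the total variance.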

PC1: The eigenvector

plot(-wind.pca.1$rotation[, 1], ylim = c(0, 1))
text(1:12, -wind.pca.1$rotation[, 1], pos = 3, labels = colnames(wind)[4:15])

A pattern over space

PC1: The eigenvector

A function of space

PC1: The scores

A function of time

Break for in-class exercise:

Describe the first component

PCA with spatio-temporal data

Interpreting PCA results

PCA is exploratory analysis, not statistical inference

Some alternatives to PCA

Summing up

Orthogonal matrices
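The matrix \(\w\) of principal components is orthogonal: \(\w^T\w = \w\w^T = \mathbf{I}\). A quick numerical check on the states example, offered as a sketch rather than a proof:

```r
# Refit the states PCA so this block is self-contained
state.pca <- prcomp(state.x77, scale. = TRUE)
w <- state.pca$rotation              # columns are the principal components

# Both products should equal the 8 x 8 identity, up to rounding error
max(abs(crossprod(w) - diag(ncol(w))))    # w^T w - I
max(abs(tcrossprod(w) - diag(nrow(w))))   # w w^T - I
```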

PCA of \(\X\) vs. PCA of \(\X^T\)

\[\begin{eqnarray} \mathbf{u}\mathbf{\Psi}\mathbf{u}^T & = & n^{-1} \X \X^T\\ \mathbf{u}\mathbf{\Psi}\mathbf{u}^T & = & n^{-1} \S \w^T \w \S^T\\ \mathbf{u}\mathbf{\Psi}^{1/2} \mathbf{\Psi}^{1/2}\mathbf{u}^T & = & n^{-1/2} \S \w^T \w \S^T n^{-1/2}\\ (\mathbf{u}\mathbf{\Psi}^{1/2}) (\mathbf{u}\mathbf{\Psi}^{1/2})^T & = & n^{-1/2} \S \S^T n^{-1/2}\\ \mathbf{u} & = & n^{-1/2} \S \mathbf{\Psi}^{-1/2} \end{eqnarray}\]

New PC1 \(\propto\) old scores on PC1, etc.
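The relationship can be checked numerically. The sketch below works directly with eigendecompositions of \(n^{-1}\X^T\X\) and \(n^{-1}\X\X^T\) for the standardized states data, using the same \(\X\) on both sides so that the algebra applies exactly (unlike `prcomp`, which re-centers the transposed matrix). All variable names are illustrative, and only the \(p\) eigenvectors of \(\X\X^T\) with nonzero eigenvalues are compared.

```r
X <- scale(state.x77)                # centered, standardized data: n = 50, p = 8
n <- nrow(X)
p <- ncol(X)

# PCA of X: eigendecomposition of the p x p matrix X^T X / n
eig.V <- eigen(crossprod(X) / n, symmetric = TRUE)
w <- eig.V$vectors
Lambda <- eig.V$values
S <- X %*% w                         # scores

# "PCA" of X^T: eigendecomposition of the n x n matrix X X^T / n
eig.G <- eigen(tcrossprod(X) / n, symmetric = TRUE)
u <- eig.G$vectors[, 1:p]            # keep the p directions with nonzero eigenvalues
Psi <- eig.G$values[1:p]

# Predicted eigenvectors: scores rescaled column by column, n^{-1/2} S Psi^{-1/2}
u.pred <- S %*% diag(1 / sqrt(n * Psi))

all.equal(Psi, Lambda)               # the nonzero eigenvalues coincide
max(abs(abs(u) - abs(u.pred)))       # eigenvectors agree up to sign
```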

No, really, PCA doesn’t do statistical inference

Other alternatives to PCA

References

Anthony, David W. 2007. The Horse, the Wheel and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. Princeton: Princeton University Press.

Cavalli-Sforza, Luigi L. 2000. Genes, Peoples, and Languages. New York: North Point Press.

Cavalli-Sforza, Luigi L., Paolo Menozzi, and Alberto Piazza. 1993. “Demic Expansions and Human Evolution.” Science 259:639–46. https://doi.org/10.1126/science.8430313.

———. 1994. The History and Geography of Human Genes. Princeton: Princeton University Press.

Goerg, Georg M. 2013. “Forecastable Component Analysis (ForeCA).” In Proceedings of the 30th International Conference on Machine Learning [ICML 2013], edited by Sanjoy Dasgupta and David McAllester, 28:64–72. 2. http://proceedings.mlr.press/v28/goerg13.html.

Novembre, John, and Matthew Stephens. 2008. “Interpreting Principal Component Analyses of Spatial Population Genetic Variation.” Nature Genetics 40:646–49. https://doi.org/10.1038/ng.139.

Shalizi, Cosma Rohilla. n.d. Advanced Data Analysis from an Elementary Point of View. Cambridge, England: Cambridge University Press. http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV.

Stone, James V. 2004. Independent Component Analysis: A Tutorial Introduction. Cambridge, Massachusetts: MIT Press.