36-462/662, Spring 2020
27 February 2020
\[ \newcommand{\X}{\mathbf{x}} \newcommand{\w}{\mathbf{w}} \newcommand{\V}{\mathbf{v}} \newcommand{\S}{\mathbf{s}} \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \]
Dataset pre-loaded in R:
## Population Income Illiteracy Life Exp Murder HS Grad Frost Area
## Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
## Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
## Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
## Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
## California 21198 5114 1.1 71.71 10.3 62.6 20 156361
## Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
## List of 5
## $ sdev : num [1:8] 1.897 1.277 1.054 0.841 0.62 ...
## $ rotation: num [1:8, 1:8] 0.126 -0.299 0.468 -0.412 0.444 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:8] "Population" "Income" "Illiteracy" "Life Exp" ...
## .. ..$ : chr [1:8] "PC1" "PC2" "PC3" "PC4" ...
## $ center : Named num [1:8] 4246.42 4435.8 1.17 70.88 7.38 ...
## ..- attr(*, "names")= chr [1:8] "Population" "Income" "Illiteracy" "Life Exp" ...
## $ scale : Named num [1:8] 4464.49 614.47 0.61 1.34 3.69 ...
## ..- attr(*, "names")= chr [1:8] "Population" "Income" "Illiteracy" "Life Exp" ...
## $ x : num [1:50, 1:8] 3.79 -1.053 0.867 2.382 0.241 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
## .. ..$ : chr [1:8] "PC1" "PC2" "PC3" "PC4" ...
## - attr(*, "class")= chr "prcomp"
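The two listings above look like `head()` applied to the built-in `state.x77` matrix and `str()` applied to a `prcomp` fit. A minimal sketch that should reproduce them, assuming the fit is stored as `state.pca` (the name used in the plotting code later) and that the features were standardized, as the non-trivial `$scale` entry suggests:

```r
# state.x77 ships with base R (package 'datasets'), so nothing needs downloading
head(state.x77)

# PCA on centered, unit-variance features
# (assumed call; the source shows only its output)
state.pca <- prcomp(state.x77, scale. = TRUE)
str(state.pca)
```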
The weight/loading matrix \(\w\) gets called $rotation (why?):
## PC1 PC2
## Population 0.130 0.410
## Income -0.300 0.520
## Illiteracy 0.470 0.053
## Life Exp -0.410 -0.082
## Murder 0.440 0.310
## HS Grad -0.420 0.300
## Frost -0.360 -0.150
## Area -0.033 0.590
Each column is an eigenvector of \(\V\)
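The eigenvector claim can be checked numerically. The sketch below assumes the same `prcomp(state.x77, scale. = TRUE)` fit as elsewhere in these notes; because the features are standardized, \(\V\) is then the correlation matrix. It also checks that the columns are orthonormal, which is one way to approach the "why rotation?" question.

```r
state.pca <- prcomp(state.x77, scale. = TRUE)  # assumed fit
W <- state.pca$rotation

# With standardized features, V is the correlation matrix
V <- cor(state.x77)
eig <- eigen(V)

# Eigenvectors are only defined up to sign, so compare absolute values
max(abs(abs(W) - abs(eig$vectors)))              # numerically zero

# Defining property for the first column: V w_1 = lambda_1 w_1
max(abs(V %*% W[, 1] - eig$values[1] * W[, 1]))  # numerically zero

# Columns are orthonormal: t(W) %*% W is the identity matrix
max(abs(crossprod(W) - diag(8)))                 # numerically zero
```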
Break for in-class exercise
## [1] 1.90 1.30 1.10 0.84 0.62 0.55 0.38 0.34
Standard deviations along each principal component \(=\sqrt{\lambda_i}\)
If we keep \(k\) components, \[ R^2 = \frac{\sum_{i=1}^{k}{\lambda_i}}{\sum_{j=1}^{p}{\lambda_j}} \]
(Denominator \(=\tr{\V}\) — why?)
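A short sketch of the \(R^2\) calculation, again assuming the `prcomp(state.x77, scale. = TRUE)` fit. It also compares the sum of the eigenvalues with the trace of \(\V\); since the features are standardized here, every diagonal entry of \(\V\) is 1 and the trace is \(p = 8\).

```r
state.pca <- prcomp(state.x77, scale. = TRUE)  # assumed fit
lambda <- state.pca$sdev^2   # eigenvalues = variances along each PC

# R^2 from keeping k = 1, ..., p components
r2 <- cumsum(lambda) / sum(lambda)
round(r2, 3)

# Denominator: sum of eigenvalues versus trace of V
sum(lambda)
sum(diag(cor(state.x77)))
```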
plot(cumsum(state.pca$sdev^2), xlab="Number of PCs", ylab="Cumulative variance",
type="b", ylim=c(0, sum(state.pca$sdev^2)))
## PC1 PC2
## Alabama 3.80 -0.23
## Alaska -1.10 5.50
## Arizona 0.87 0.75
## Arkansas 2.40 -1.30
## California 0.24 3.50
## Colorado -2.10 0.51
## Connecticut -1.90 -0.24
## Delaware -0.42 -0.51
## Florida 1.20 1.10
## Georgia 3.30 0.11
Columns here are \(\vec{x}_i \cdot \vec{w}_1\) and \(\vec{x}_i \cdot \vec{w}_2\)
So, for instance, \[ s_{\text{Alabama}, 1} = 0.13 x_{\text{Alabama}, \text{Population}} - 0.30 x_{\text{Alabama}, \text{Income}} + \ldots - 0.033 x_{\text{Alabama}, \text{Area}} \]
(after centering and scaling the features)
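The Alabama calculation can be done by hand as a check. This sketch assumes the `prcomp(state.x77, scale. = TRUE)` fit and reuses its stored `$center` and `$scale` values to standardize the features; the names `x.std`, `s.alabama` and `S` are illustrative.

```r
state.pca <- prcomp(state.x77, scale. = TRUE)  # assumed fit

# Center and scale each feature exactly as prcomp did
x.std <- scale(state.x77, center = state.pca$center, scale = state.pca$scale)

# Score on PC1 = inner product of the standardized row with the first weight vector
s.alabama <- sum(x.std["Alabama", ] * state.pca$rotation[, "PC1"])
s.alabama
state.pca$x["Alabama", "PC1"]   # should agree with the line above

# All scores at once: standardized data times the weight matrix
S <- x.std %*% state.pca$rotation
max(abs(S - state.pca$x))       # numerically zero
```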
## Minnesota North Dakota Iowa Utah Nebraska Colorado
## -2.4 -2.4 -2.3 -2.3 -2.2 -2.1
## North Carolina Georgia South Carolina Alabama Mississippi
## 2.7 3.3 3.7 3.8 4.0
## Louisiana
## 4.2
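The listing above appears to show the six smallest and six largest scores on PC1. One way to obtain such a listing, assuming the `prcomp(state.x77, scale. = TRUE)` fit (the overall sign of a principal component is arbitrary, so the two ends could swap on another platform):

```r
state.pca <- prcomp(state.x77, scale. = TRUE)  # assumed fit

# Sort states by their projection on the first principal component
pc1 <- sort(state.pca$x[, "PC1"])

signif(head(pc1), 2)   # one extreme of PC1
signif(tail(pc1), 2)   # the other extreme
```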
size of state abbreviation \(\propto\) projection onto PC1
coordinates = state capitols, except for AK and HI
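A rough sketch of how a map like this could be drawn. Everything here is an assumption rather than the original figure code: it uses the built-in `state.center` coordinates (which R documents as approximate geographic centers, with Alaska and Hawaii placed just off the West Coast) and `state.abb` labels, and it maps the PC1 scores linearly onto positive text sizes, since a text size cannot be negative.

```r
state.pca <- prcomp(state.x77, scale. = TRUE)  # assumed fit
pc1 <- state.pca$x[, "PC1"]

# Map scores linearly onto text sizes in [0.5, 2]
# (illustrative range, not taken from the source)
cex.pc1 <- 0.5 + 1.5 * (pc1 - min(pc1)) / (max(pc1) - min(pc1))

# Empty plot frame, then one abbreviation per state
plot(state.center$x, state.center$y, type = "n",
     xlab = "Longitude", ylab = "Latitude")
text(state.center$x, state.center$y, labels = state.abb, cex = cex.pc1)
```

The rows of `state.x77` are in the same order as `state.abb` and `state.center`, so no matching step is needed.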