Statistical Computing, 36-350
Wednesday September 21, 2016
To plot a histogram of a numeric vector, use hist()
trump.lines = readLines("http://www.stat.cmu.edu/~ryantibs/statcomp-F16/data/trump.txt")
trump.words = strsplit(paste(trump.lines, collapse=" "),
                       split="[[:space:]]|[[:punct:]]")[[1]]
trump.words = tolower(trump.words[trump.words != ""])
trump.wlens = nchar(trump.words)
hist(trump.wlens)
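The word-splitting steps above can be checked on a small string without downloading anything. This is only a minimal sketch: the sentence in `toy.line` and the `toy.*` variable names are made up for illustration and are not part of the lecture data.

```r
# Toy input (hypothetical, not taken from the Trump speech file)
toy.line = "Hello, world! It's fine."

# Split on any single whitespace or punctuation character
toy.words = strsplit(toy.line, split="[[:space:]]|[[:punct:]]")[[1]]

# Adjacent delimiters (such as ", ") leave empty strings behind;
# drop them, then convert to lower case
toy.words = tolower(toy.words[toy.words != ""])

# Word lengths, ready to be passed to hist()
toy.wlens = nchar(toy.words)
toy.words
toy.wlens
```

Note that the apostrophe counts as punctuation, so "It's" is split into two words.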
Several options are available as arguments to hist(), such as col, freq, breaks, xlab, ylab, main
hist(trump.wlens, col="pink", freq=TRUE) # Frequency scale, default
hist(trump.wlens, col="pink", freq=FALSE) # Probability scale
hist(trump.wlens, col="pink", freq=FALSE, breaks=0:20,
     xlab="Word length", main="Trump word lengths")
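To see what the frequency and probability scales mean numerically, it can help to inspect the object that hist() returns. The sketch below uses simulated word lengths (`sim.wlens` is an assumption standing in for `trump.wlens`) and `plot=FALSE`, so nothing is drawn.

```r
set.seed(1)
# Simulated word lengths between 1 and 20 (an assumption, not the real data)
sim.wlens = pmin(rpois(500, lambda=4) + 1, 20)

# plot=FALSE returns the binning information without drawing the histogram
h = hist(sim.wlens, breaks=0:20, plot=FALSE)

# Frequency scale (freq=TRUE): bar heights are counts and sum to n
sum(h$counts)

# Probability scale (freq=FALSE): bar areas (height times width) sum to 1
sum(h$density * diff(h$breaks))
```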
To estimate a density from a numeric vector, use density(). This returns a list that has components x and y (among others), so we can call lines() directly on the returned object
density.est = density(trump.wlens, adjust=2) # Twice the default bw
class(density.est)
## [1] "density"
names(density.est)
## [1] "x" "y" "bw" "n" "call" "data.name"
## [7] "has.na"
hist(trump.wlens, col="pink", freq=FALSE, breaks=0:20,
     xlab="Word length", main="Trump word lengths")
lines(density.est, lwd=3)
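The effect of `adjust` and the contents of the returned object can be checked directly. This is a sketch on simulated data (`sim.wlens` is an assumption standing in for `trump.wlens`).

```r
set.seed(1)
# Simulated word lengths (an assumption, not the real data)
sim.wlens = pmin(rpois(500, lambda=4) + 1, 20)

d1 = density(sim.wlens)             # Default bandwidth
d2 = density(sim.wlens, adjust=2)   # Default bandwidth multiplied by 2

c(d1$bw, d2$bw)                     # Second entry is twice the first
length(d2$x)                        # Number of grid points where y is evaluated

# Riemann-sum check: the estimated curve should integrate to about 1
sum(d2$y) * diff(d2$x[1:2])
```

A larger bandwidth gives a smoother curve, which is often useful for discrete data like word lengths, where the default bandwidth can produce a bump at every integer.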
To add a histogram to an existing plot (say, another histogram), use hist() with add=TRUE
hist(trump.wlens, col="pink", freq=FALSE, breaks=0:20,
     xlab="Word length", main="Trump word lengths")
hist(trump.wlens + 2, col=rgb(0,0.5,0.5,0.5), # Note the use of transparency
     freq=FALSE, breaks=0:20, add=TRUE)
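When overlaying two histograms, the breaks must cover the range of both samples, otherwise hist() stops with an error. One way to arrange this is sketched below on simulated data; `x1`, `x2`, `all.breaks` and `see.thru` are hypothetical names used only for this illustration.

```r
set.seed(1)
# Two simulated samples (assumptions, not the lecture data)
x1 = rnorm(300, mean=8, sd=2)
x2 = rnorm(300, mean=11, sd=2)

# Common integer breaks spanning both samples, so that the bars line up
all.breaks = seq(floor(min(x1, x2)), ceiling(max(x1, x2)), by=1)

# The fourth argument of rgb() is alpha: 0 is invisible, 1 is fully opaque
see.thru = rgb(0, 0.5, 0.5, 0.5)

hist(x1, col="pink", freq=FALSE, breaks=all.breaks,
     xlab="Value", main="Two overlaid histograms")
hist(x2, col=see.thru, freq=FALSE, breaks=all.breaks, add=TRUE)
```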
To plot a heatmap of a numeric matrix, use image()
(mat = 1:5 %o% 6:10) # %o% gives the outer product
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6 7 8 9 10
## [2,] 12 14 16 18 20
## [3,] 18 21 24 27 30
## [4,] 24 28 32 36 40
## [5,] 30 35 40 45 50
image(mat) # Red means low, white means high (heat.colors scale)
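As a quick check of what the outer product computes, the sketch below compares `%o%` with outer() and verifies one entry by hand. The names `a` and `b` are introduced here only for illustration.

```r
a = 1:5
b = 6:10
mat = a %o% b                  # Entry (i,j) is a[i] * b[j]

identical(mat, outer(a, b))    # %o% is shorthand for outer() with multiplication
mat[2, 3] == a[2] * b[3]       # Row 2, column 3 is 2 * 8
dim(mat)                       # length(a) rows, length(b) columns
```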
The orientation of image() is to plot the heatmap in the following order, in terms of the matrix elements:
\[\begin{array}{cccc} (1,\text{ncol}) & (2, \text{ncol}) & \ldots & (\text{nrow},\text{ncol}) \\ \vdots & & & \\ (1,2) & (2,2) & \ldots & (\text{nrow}, 2) \\ (1,1) & (2,1) & \ldots & (\text{nrow}, 1) \end{array}\]
This is a 90-degree counterclockwise rotation of the “usual” printed order. Therefore, if you want the displayed heatmap to follow the usual order, you must rotate the matrix 90 degrees clockwise before passing it to image(). (Reverse the row order, then take the transpose.)
clockwise90 = function(a) { t(a[nrow(a):1, , drop=FALSE]) } # drop=FALSE keeps one-row input a matrix
image(clockwise90(mat))
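The rotation can be verified on a small non-square matrix. In this sketch, `clockwise90()` is repeated so the block runs on its own (with `drop=FALSE` added as a small safeguard, which is an assumption and not necessarily in the lecture version); `small` and `rot` are hypothetical names.

```r
# Reverse the row order, then transpose
clockwise90 = function(a) { t(a[nrow(a):1, , drop=FALSE]) }

(small = matrix(1:6, nrow=2, byrow=TRUE))   # 2 x 3 matrix
(rot = clockwise90(small))                  # 3 x 2 after rotating

dim(rot)

# After a clockwise rotation, the top row of the original
# becomes the last column of the result
identical(rot[, ncol(rot)], small[1, ])
```

Because image() itself applies a counterclockwise rotation, calling `image(clockwise90(small))` would display the cells in the same layout as the printed `small`.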
The default used here is a red-to-white color scale in image() (newer versions of R default to hcl.colors(12, "YlOrRd", rev=TRUE) instead). But the col argument can take any vector of colors. Functions gray.colors(), heat.colors(), terrain.colors(), rainbow(), etc., all return contiguous color vectors of a given length
scores.mat = as.matrix(read.table("http://www.stat.cmu.edu/~ryantibs/statcomp-F16/data/scores.dat"))
image(scores.mat) # Default was col=heat.colors(12); newer R uses hcl.colors()
image(scores.mat, col=heat.colors(20)) # More colors
image(scores.mat, col=terrain.colors(20)) # Terrain colors
image(scores.mat, col=cm.colors(20)) # Cyan-magenta colors
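The palette functions simply return character vectors of color codes, which can be inspected before passing them to image(). The sketch below uses a simulated gradient matrix (`grad` is an assumption standing in for `scores.mat`) to compare four palettes side by side.

```r
pal = heat.colors(5)   # Character vector of 5 hex color strings
length(pal)
class(pal)

# Simulated gradient matrix (an assumption, not the scores data):
# values increase from the bottom-left to the top-right of the heatmap
grad = outer(1:20, 1:20, "+")

par(mfrow=c(2,2))
image(grad, col=heat.colors(20), main="heat.colors(20)")
image(grad, col=terrain.colors(20), main="terrain.colors(20)")
image(grad, col=gray.colors(20), main="gray.colors(20)")
image(grad, col=rainbow(20), main="rainbow(20)")
```

The first color in the vector is used for the lowest values and the last color for the highest.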
To draw contour lines from a numeric matrix, use contour(); to add contours to an existing plot (such as a heatmap), use contour() with add=TRUE
contour(scores.mat)
image(scores.mat, col=terrain.colors(20))
contour(scores.mat, add=TRUE)
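The same heatmap-plus-contours idea can be tried on a surface whose shape is known in advance. This is a sketch under stated assumptions: the Gaussian bump `surf` is simulated and stands in for `scores.mat`, and `x`, `y`, `cl` are hypothetical names.

```r
# Simulated surface (an assumption): a single bump centered at the origin
x = seq(-2, 2, length.out=50)
y = seq(-2, 2, length.out=50)
surf = outer(x, y, function(x, y) exp(-(x^2 + y^2)))

# Heatmap first, then contours drawn on top of it
image(x, y, surf, col=terrain.colors(20))
contour(x, y, surf, nlevels=5, add=TRUE)

# contourLines() computes the contour curves as a list, without plotting
cl = contourLines(x, y, surf, nlevels=5)
length(cl)
unlist(lapply(cl, function(l) l$level))   # Height of each contour line
```

Supplying x and y explicitly puts the axes on the scale of the data; without them, image() and contour() both place the matrix on the unit square.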