Name:
Andrew ID:
Collaborated with:

This lab is to be done in class (and completed outside of class if need be). You can collaborate with your classmates, but you must identify their names above, and you must submit your own lab as a knitted HTML file on Canvas by Thursday at 10pm this week.

This week’s agenda: basic indexing, with a focus on matrices; some more basic plotting; vectorization; using for() loops.

Prostate cancer data set

We’re going to look at a data set on 97 men who have prostate cancer (from the book The Elements of Statistical Learning). There are 9 variables measured on these 97 men:

  1. lpsa: log PSA score
  2. lcavol: log cancer volume
  3. lweight: log prostate weight
  4. age: age of patient
  5. lbph: log of the amount of benign prostatic hyperplasia
  6. svi: seminal vesicle invasion
  7. lcp: log of capsular penetration
  8. gleason: Gleason score
  9. pgg45: percent of Gleason scores 4 or 5

To load this prostate cancer data set into your R session and store it as a matrix pros.dat, run:

pros.dat =
  as.matrix(read.table("http://www.stat.cmu.edu/~ryantibs/statcomp-S18/data/pros.dat"))
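Once a matrix like this is loaded, a few quick checks confirm its shape and show how name-based indexing works. The sketch below uses a hypothetical stand-in matrix, toy.dat, filled with random numbers so that it runs without a network connection. Its column names are copied from the list above, and the column order in the real file may differ.

```r
# Hypothetical stand-in for pros.dat: random numbers, NOT the real data.
# Only the 97 x 9 shape and the variable names come from the description above.
set.seed(1)
var.names = c("lpsa", "lcavol", "lweight", "age", "lbph",
              "svi", "lcp", "gleason", "pgg45")
toy.dat = matrix(rnorm(97 * 9), nrow = 97, ncol = 9,
                 dimnames = list(NULL, var.names))

dim(toy.dat)                        # number of rows and columns
colnames(toy.dat)                   # variable names
head(toy.dat[, "age"])              # first few entries of one column, by name
toy.dat[1:3, c("lpsa", "lcavol")]   # submatrix: rows 1 to 3, two named columns
```

The same calls should work on pros.dat itself once it has been read in.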

Basic indexing and calculations

Exploratory data analysis with plots

A bit of Boolean indexing never hurt anyone

magic.denom = c(0.19092077, 0.08803179, 1.91148819, 0.34076326, 0.00000000,
                0.25730390, 0.15441770, 6.30903678, 0.23021447)
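As a small illustration of Boolean indexing on this vector, the sketch below locates its zero entry and shows what dividing by it produces. The helper name is.zero is introduced here for illustration only, and this is not necessarily the approach the lab questions have in mind.

```r
magic.denom = c(0.19092077, 0.08803179, 1.91148819, 0.34076326, 0.00000000,
                0.25730390, 0.15441770, 6.30903678, 0.23021447)

is.zero = magic.denom == 0    # logical vector, TRUE where the entry equals 0
which(is.zero)                # position of the zero entry
magic.denom[!is.zero]         # keep only the nonzero entries
1 / magic.denom[is.zero]      # dividing by zero gives Inf in R, not an error
```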

Iterate away (lots of plots … sorry, graders!)

i = 1
var.name = "FOO"
title = paste("Histogram of", var.name)
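The scaffold above sets a single index and a placeholder name. A minimal sketch of how the same paste() pattern might be wrapped in a for() loop follows. The variable names and the titles vector are assumptions made for illustration, not part of the lab's required solution.

```r
# Hypothetical variable names, used only to illustrate the loop
var.names = c("lcavol", "lweight", "age")

# Pre-allocate a character vector to hold one title per variable
titles = character(length(var.names))

for (i in seq_along(var.names)) {
  var.name = var.names[i]
  titles[i] = paste("Histogram of", var.name)
}

titles
```

Inside such a loop, each title could be passed to a plotting call through its main argument.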

My plot is at your command (optional)