This week’s agenda: learning to master pipes and dplyr.

# Load the tidyverse!

Pipes to base R

For each of the following code blocks, which are written with pipes, write equivalent code in base R (to do the same thing).
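
As a warm-up, here is a minimal sketch of the kind of translation being asked for. The vector `x` and the variable names are made up for illustration and are not part of the exercises; the sketch assumes `magrittr` (attached with the tidyverse) is installed.

```r
library(magrittr)  # provides %>%; also attached by library(tidyverse)

x = c(1, 4, 9, 16)  # hypothetical input, for illustration only

# Pipes: each step receives the previous result as its first argument
piped = x %>% sqrt %>% sum %>% round(digits=1)

# Base R: the same computation as nested calls, read from the inside out
nested = round(sum(sqrt(x)), digits=1)

# Base R, alternative: name each intermediate result
roots = sqrt(x)
total = sum(roots)
stepwise = round(total, digits=1)

# The dot places the piped value somewhere other than the first argument
dotted = "a-b-c" %>% gsub("-", "+", .)
plain = gsub("-", "+", "a-b-c")
```

All three versions of the first computation should agree, and `dotted` should equal `plain`. Nested calls are compact but read in reverse order; named intermediates keep the top-to-bottom order of the pipe.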

# Pipes:
letters %>%
  toupper %>%
  paste(collapse="+")
## [1] "A+B+C+D+E+F+G+H+I+J+K+L+M+N+O+P+Q+R+S+T+U+V+W+X+Y+Z"
# Base R:

# Pipes:
"     Ceci n'est pas une pipe     " %>% 
  gsub("une", "un", .) %>%
  trimws
## [1] "Ceci n'est pas un pipe"
# Base R:

# Pipes:
rnorm(1000) %>% 
  hist(breaks=30, main="N(0,1) draws", col="pink", prob=TRUE) 

# Base R:

# Pipes:
rnorm(1000) %>% 
  hist(breaks=30, plot=FALSE) %>%
  `[[`("density") %>%
  max
## [1] 0.45
# Base R:

Base R to pipes

For each of the following code blocks, which are written in base R, write equivalent code with pipes (to do the same thing).
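
Going in this direction, one way to proceed is to find the innermost call and work outward, turning each enclosing function into the next step of the pipe. The sketch below uses a made-up string that is not one of the exercises, and again assumes `magrittr` is installed.

```r
library(magrittr)  # provides %>%; also attached by library(tidyverse)

# Base R: nested calls, read from the inside out
nested = paste(rev(strsplit("pipe", split="")[[1]]), collapse="")

# Pipes: the same steps, read from top to bottom
piped = "pipe" %>%
  strsplit(split="") %>%   # list holding one character vector
  `[[`(1) %>%              # extract that vector; same as x[[1]]
  rev %>%                  # reverse the characters
  paste(collapse="")       # glue them back into one string
```

The backticked `` `[[` `` is how an indexing operation can appear as a step in a pipe: `x[[1]]` is the same call as `` `[[`(x, 1) ``.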

# Base R:
paste("Your grade is", sample(c("A","B","C","D","R"), size=1))
## [1] "Your grade is R"
# Pipes:

# Base R:
state.name[which.max(state.x77[,"Illiteracy"])]
## [1] "Louisiana"
# Pipes:

str.url = ""

# Base R:
lines = readLines(str.url)
text = paste(lines, collapse=" ")
words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
wordtab = table(words)
wordtab = sort(wordtab, decreasing=TRUE)
head(wordtab, 10)
## words
##       the  and   of   to  our will    I   in have 
##  592  189  146  127  126   90   83   73   69   58
# Pipes:

# Base R:
lines = readLines(str.url)
text = paste(lines, collapse=" ")
words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
words = words[words != ""]
wordtab = table(words)
wordtab = sort(wordtab, decreasing=TRUE)
head(wordtab, 10)
## words
##  the  and   of   to  our will    I   in have    a 
##  189  146  127  126   90   83   73   69   58   51
# Pipes:

Sprints data, revisited

Below we read in a data frame sprint.w.df containing the top women’s times in the 100m sprint, as seen in previous labs. We also define a function that was used in Lab 8, to convert the Wind column to numeric values. In what follows, use dplyr and pipes to answer the following questions on sprint.w.df.
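
For orientation, here is a minimal sketch of how dplyr verbs chain together with pipes. It runs on a small made-up data frame, `toy.df`, whose column names and values are assumptions for illustration only; they are not taken from `sprint.w.df`, and the cutoff of 2 is an arbitrary example threshold.

```r
library(dplyr)  # attached by library(tidyverse) as well

# Hypothetical data, for illustration only
toy.df = data.frame(
  Name = c("A", "B", "C", "D"),
  Time = c(10.9, 10.7, 11.1, 10.8),
  Wind = c(1.5, 2.3, 0.0, -0.4),
  stringsAsFactors = FALSE
)

fastest.two = toy.df %>%
  filter(Wind <= 2) %>%    # keep rows with Wind at most 2
  arrange(Time) %>%        # sort by Time, fastest first
  select(Name, Time) %>%   # keep only these two columns
  head(2)                  # first two rows of the sorted result
```

Each verb takes a data frame as its first argument and returns a data frame, which is what lets the steps be chained with `%>%`.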

sprint.w.df = read.table(
  sep="\t", header=TRUE, quote="", stringsAsFactors=TRUE) = Vectorize(function(x) {
  x = strsplit(as.character(x), split = ",")[[1]]
  ifelse(length(x) > 1, 
         as.numeric(paste(x, collapse=".")), 

Prostate cancer data, revisited

Below we read in a data frame pros.df containing measurements on men with prostate cancer, as seen in previous labs. As before, in what follows, use dplyr and pipes to answer the following questions on pros.df.

pros.df = 