To count the number of characters in a string, use nchar(), not length(). length() counts the elements of a vector, and a single string is a vector of length 1
nchar("coffee")
## [1] 6
nchar("code monkey")
## [1] 11
length("code monkey")
## [1] 1
length(c("coffee", "code monkey"))
## [1] 2
nchar() vectorizes
You can pass a vector of strings to nchar(), and it returns the character count of each element. This is called vectorization
nchar(c("coffee", "code monkey"))
## [1] 6 11
nchar(c("Spider-Man", "does whatever", "a spider can"))
## [1] 10 13 12
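One way to see what vectorization saves you is to write the same computation as an explicit loop. The sketch below is only illustrative; the names words, counts.loop, and counts.vec are made up for this example.

```r
words = c("coffee", "code monkey")

# Explicit loop: count the characters of one element at a time
counts.loop = integer(length(words))
for (i in seq_along(words)) {
  counts.loop[i] = nchar(words[i])
}

# Vectorized call: a single call handles every element
counts.vec = nchar(words)

identical(counts.loop, counts.vec)
## [1] TRUE
```

Both versions give the same counts; the vectorized call is simply shorter and easier to read.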
Some basic examples of vectorization
c(1,2,3) + c(1,2,3)
## [1] 2 4 6
1:10 - 1 # This is an example of recycling
## [1] 0 1 2 3 4 5 6 7 8 9
1:10 * -1 # So is this
## [1] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
abs(-3:3)
## [1] 3 2 1 0 1 2 3
log(1:5)
## [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379
log(exp(1:7)) # Notice two vectorizations happening here
## [1] 1 2 3 4 5 6 7
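Recycling is not limited to a single number: when two vectors differ in length, the shorter one is repeated until it matches the longer one. A small sketch, with recycled and alternating as made-up names:

```r
# c(10, 20) is recycled to c(10, 20, 10, 20) before the addition
recycled = c(1, 2, 3, 4) + c(10, 20)
recycled
## [1] 11 22 13 24

# c(1, -1) is recycled three times, flipping every second sign
alternating = 1:6 * c(1, -1)
alternating
## [1]  1 -2  3 -4  5 -6
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.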
Grab a subsequence of characters from a string, called a substring, using substr()
phrase = "Give me a break"
substr(phrase, 1, 4)
## [1] "Give"
substr(phrase, nchar(phrase)-4, nchar(phrase))
## [1] "break"
substr(phrase, nchar(phrase)+1, nchar(phrase)+10)
## [1] ""
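The "last few characters" pattern above can be wrapped in a small helper. This is only a sketch: last.chars() is a hypothetical function written for this example, not part of base R.

```r
# last.chars() is a hypothetical helper, not a base R function.
# It returns the last n characters of each string in x.
last.chars = function(x, n) {
  len = nchar(x)
  # pmax() keeps the start position from dropping below 1
  substr(x, pmax(len - n + 1, 1), len)
}

last.chars("Give me a break", 5)
## [1] "break"
last.chars("Ford", 10) # Asking for more than is available returns the whole string
## [1] "Ford"
```

Note the + 1 in the start position: the last n characters begin at position nchar(x) - n + 1, which is why the example above uses nchar(phrase)-4 to get 5 characters.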
substr() vectorizes
Just like nchar() (and many other functions)
presidents = c("Clinton", "Bush", "Reagan", "Carter", "Ford")
substr(presidents, 1, 2) # Grab the first 2 letters from each
## [1] "Cl" "Bu" "Re" "Ca" "Fo"
substr(presidents, 1:5, 1:5) # Grab the first, 2nd, 3rd, etc.
## [1] "C" "u" "a" "t" ""
substr(presidents, 1, 1:5) # Grab the first, first 2, first 3, etc.
## [1] "C" "Bu" "Rea" "Cart" "Ford"
substr(presidents, nchar(presidents)-1, nchar(presidents)) # Grab the last 2 letters from each
## [1] "on" "sh" "an" "er" "rd"
To replace a character, or a substring, assign to substr()
phrase
## [1] "Give me a break"
substr(phrase, 1, 1) = "L"
phrase # "G" changed to "L"
## [1] "Live me a break"
substr(phrase, 1000, 1001) = "R"
phrase # Nothing happened
## [1] "Live me a break"
substr(phrase, 1, 4) = "Show"
phrase # "Live" changed to "Show"
## [1] "Show me a break"
Another example of substr() vectorizing
presidents
## [1] "Clinton" "Bush" "Reagan" "Carter" "Ford"
first.letters = substr(presidents, 1, 1)
first.letters.scrambled = sample(first.letters) # Random shuffle
substr(presidents, 1, 1) = first.letters.scrambled
presidents # sample() is random, so your result will likely differ
## [1] "Flinton" "Cush" "Ceagan" "Barter" "Rord"
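Because sample() is random, the scrambled names change from run to run. If you want the same shuffle every time, one option is to fix the random seed first. A minimal sketch, where the seed value 1 is an arbitrary choice:

```r
set.seed(1) # Any fixed seed works; 1 is arbitrary

presidents = c("Clinton", "Bush", "Reagan", "Carter", "Ford")
first.letters = substr(presidents, 1, 1)

# Shuffle the first letters and write them back in one vectorized assignment
substr(presidents, 1, 1) = sample(first.letters)
presidents
```

The exact result depends on the seed and on your R version, so no output is shown here. Whatever the shuffle, only the first letter of each name changes.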
You can only replace exactly as many letters as you specify: substr() never changes the length of the string
phrase
## [1] "Show me a break"
substr(phrase, 1, 4) = "Provide"
phrase # Only replaced the first 4 letters
## [1] "Prov me a break"
substr(phrase, nchar(phrase)-4, nchar(phrase)) = "cat"
phrase # Only replaced the first 3 letters, in the last word
## [1] "Prov me a catak"
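If you do want a replacement of a different length, assigning to substr() is not the right tool. Two possible alternatives, sketched below, are to rebuild the string with paste0() or to use sub(), which replaces the first match of a pattern. The name phrase.new is made up for this example.

```r
phrase = "Show me a break"

# Rebuild the string: new prefix, then everything after the first 4 characters
phrase.new = paste0("Provide", substr(phrase, 5, nchar(phrase)))
phrase.new
## [1] "Provide me a break"

# sub() swaps the first match of "break" for "cat", whatever the lengths
phrase.final = sub("break", "cat", phrase.new)
phrase.final
## [1] "Provide me a cat"
```

Unlike assigning to substr(), both approaches return a new string and leave phrase itself unchanged.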