Number of characters

To count the characters in a string, use nchar(), not length(). length() returns the number of elements in a vector, so it gives 1 for any single string

nchar("coffee")

## [1] 6

nchar("code monkey")

## [1] 11

length("code monkey")

## [1] 1

length(c("coffee", "code monkey"))

## [1] 2

`nchar()` vectorizes

You can pass a vector of strings to nchar(), and it returns the character count of each element. This is called vectorization

nchar(c("coffee", "code monkey"))

## [1]  6 11

nchar(c("Spider-Man", "does whatever", "a spider can"))

## [1] 10 13 12
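Because nchar() returns one count per element, its result can be passed straight into other vectorized tools. A minimal sketch, assuming a made-up vector `words` that is not part of the slides, picks out the longest string and filters by length:

```r
# Hypothetical example vector, used only for illustration
words = c("coffee", "code monkey", "tea")
lens = nchar(words)               # One count per element: 6 11 3
longest = words[which.max(lens)]  # Element with the largest count
short = words[lens <= 6]          # Logical indexing on the counts
longest                           # "code monkey"
short                             # "coffee" "tea"
```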

Reminder: vectorization

Some basic examples of vectorization

c(1,2,3) + c(1,2,3)

## [1] 2 4 6

1:10 - 1 # This is an example of recycling

##  [1] 0 1 2 3 4 5 6 7 8 9

1:10 * -1 # So is this

##  [1]  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10

abs(-3:3)

## [1] 3 2 1 0 1 2 3

log(1:5)

## [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379

log(exp(1:7)) # Notice two vectorizations happening here

## [1] 1 2 3 4 5 6 7
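Recycling also applies when the shorter vector has more than one element: it is repeated until it matches the length of the longer one. A small sketch with illustrative vectors `x` and `y` (names chosen here, not from the slides):

```r
x = 1:6
y = c(10, 20)  # Recycled as 10 20 10 20 10 20
z = x + y
z              # 11 22 13 24 15 26
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning.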

Getting a substring

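Grab a subsequence of characters from a string, called a substring, using substr(). The second and third arguments are the start and stop positions, counted from 1 and both inclusive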

phrase = "Give me a break"
substr(phrase, 1, 4)

## [1] "Give"

substr(phrase, nchar(phrase)-4, nchar(phrase))

## [1] "break"

substr(phrase, nchar(phrase)+1, nchar(phrase)+10)

## [1] ""
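The start and stop positions can fall anywhere inside the string, and a stop position past the end is simply cut off at the last character. A short sketch, using an illustrative variable `s` so that `phrase` is left untouched:

```r
s = "Give me a break"
middle = substr(s, 6, 7)     # Characters 6 and 7
tail = substr(s, 11, 100)    # Stop is past the end, so we get characters 11 to 15
middle                       # "me"
tail                         # "break"
```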

`substr()` vectorizes

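Just like nchar() (and many other functions), substr() works element by element on a vector of strings. The start and stop arguments can be vectors too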

presidents = c("Clinton", "Bush", "Reagan", "Carter", "Ford")
substr(presidents, 1, 2) # Grab the first 2 letters from each

## [1] "Cl" "Bu" "Re" "Ca" "Fo"

substr(presidents, 1:5, 1:5) # Grab the first, 2nd, 3rd, etc.

## [1] "C" "u" "a" "t" ""

substr(presidents, 1, 1:5) # Grab the first, first 2, first 3, etc.

## [1] "C"    "Bu"   "Rea"  "Cart" "Ford"

substr(presidents, nchar(presidents)-1, nchar(presidents)) # Grab the last 2 letters from each

## [1] "on" "sh" "an" "er" "rd"
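The "last few letters" pattern can be wrapped in a small function. This is only a sketch: `last.chars` and `pres` are hypothetical names introduced here, and the function assumes n is no larger than the shortest string.

```r
# Hypothetical helper: the last n characters of each string in x
last.chars = function(x, n) {
  len = nchar(x)               # Vector of lengths, one per string
  substr(x, len - n + 1, len)  # Start and stop are both vectors here
}

pres = c("Clinton", "Bush", "Reagan", "Carter", "Ford")
last.chars(pres, 2)  # "on" "sh" "an" "er" "rd"
last.chars(pres, 3)  # "ton" "ush" "gan" "ter" "ord"
```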

Replacements

To replace a character or a substring, put substr() on the left-hand side of an assignment

phrase

## [1] "Give me a break"

substr(phrase, 1, 1) = "L"
phrase # "G" changed to "L"

## [1] "Live me a break"

substr(phrase, 1000, 1001) = "R"
phrase # Nothing happened

## [1] "Live me a break"

substr(phrase, 1, 4) = "Show"
phrase # "Live" changed to "Show"

## [1] "Show me a break"
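The replacement value can itself be computed from the string. As a sketch, the following capitalizes the first letter by combining substr() on both sides of the assignment with toupper(); the variable `s` is illustrative only:

```r
s = "give me a break"
substr(s, 1, 1) = toupper(substr(s, 1, 1))  # Read the first letter, write back its uppercase form
s                                           # "Give me a break"
```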

Vectorized replacements

The replacement form of substr() vectorizes as well

presidents

## [1] "Clinton" "Bush"    "Reagan"  "Carter"  "Ford"

first.letters = substr(presidents, 1, 1)
first.letters.scrambled = sample(first.letters) # Random shuffle, so the output varies from run to run
substr(presidents, 1, 1) = first.letters.scrambled
presidents

## [1] "Flinton" "Cush"    "Ceagan"  "Barter"  "Rord"
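Since sample() is random, rerunning the shuffle will usually give a different result. One way to make it repeatable is to fix the random seed first. This sketch assumes an arbitrary seed of 1 and uses a fresh vector `pres.copy` (a name introduced here) rather than modifying `presidents` again:

```r
set.seed(1)  # Arbitrary seed, assumed here only to make the shuffle repeatable
pres.copy = c("Clinton", "Bush", "Reagan", "Carter", "Ford")
orig.letters = substr(pres.copy, 1, 1)
substr(pres.copy, 1, 1) = sample(orig.letters)
pres.copy

# The first letters are permuted, not changed: sorting recovers the same set
sort(substr(pres.copy, 1, 1))
sort(orig.letters)
```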

Some replacement quirks

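You can replace only as many characters as the range you specify. A longer replacement is truncated to fit, and a shorter one overwrites only as many characters as it contains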

phrase

## [1] "Show me a break"

substr(phrase, 1, 4) = "Provide"
phrase # Only replaced the first 4 letters

## [1] "Prov me a break"

substr(phrase, nchar(phrase)-4, nchar(phrase)) = "cat"
phrase # Only replaced the first 3 letters, in the last word

## [1] "Prov me a catak"
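When the new text has a different length from the part being replaced, one workaround is to rebuild the string instead of assigning into it. A minimal sketch using paste0() together with substr(); the variables `s` and `s.new` are illustrative:

```r
s = "Show me a break"
rest = substr(s, 5, nchar(s))  # Everything after the first 4 characters: " me a break"
s.new = paste0("Provide", rest)
s.new                          # "Provide me a break"
```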

Substrings

Statistical Computing, 36-350

Friday September 2, 2016

