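Quantifiers in a regex specify how often the preceding element may repeat: + means one or more times, * zero or more, ? zero or one, and {n,m} between n and m times ({n,} means at least n, {n} exactly n).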
grep("Oh+", c("O my gosh!", "Oh wow!", "Ohhhhh no!"), value=TRUE)
## [1] "Oh wow!" "Ohhhhh no!"
grep("Oh*", c("O my gosh!", "Oh wow!", "Ohhhhh no!"), value=TRUE)
## [1] "O my gosh!" "Oh wow!" "Ohhhhh no!"
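To see why "Oh*" matches "O my gosh!", it helps to look at the text each pattern actually matched, not just which strings were kept. The following is a minimal sketch using base R's regexpr() and regmatches(), which are not used elsewhere in these notes; strings without a match are dropped from the result.

```r
x <- c("O my gosh!", "Oh wow!", "Ohhhhh no!")

# regexpr() locates the first match in each string;
# regmatches() extracts the matched text.
plus_hits <- regmatches(x, regexpr("Oh+", x))
plus_hits
## [1] "Oh" "Ohhhhh"

# With *, zero h's are allowed, so a bare "O" is already a match
star_hits <- regmatches(x, regexpr("Oh*", x))
star_hits
## [1] "O" "Oh" "Ohhhhh"
```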
grep("colou?r", c("Americans love 3 colors: red, white, and blue",
"Brits love 3 colours: red, white, and dark blue"), value=TRUE)
## [1] "Americans love 3 colors: red, white, and blue"
## [2] "Brits love 3 colours: red, white, and dark blue"
grep("Bonds?\\?", c("Have you seen Barry Bonds? That guy can play",
"Have you seen James Bond? That guy is cool",
"Bond v United States, 529 U.S. 334 (2000)"), value=TRUE)
## [1] "Have you seen Barry Bonds? That guy can play"
## [2] "Have you seen James Bond? That guy is cool"
grep("10{1,2} ", c("10 dollars", "100 dollars", "1000 dollars"), value=TRUE)
## [1] "10 dollars" "100 dollars"
grep("10{1,2}", c("10 dollars", "100 dollars", "1000 dollars"), value=TRUE)
## [1] "10 dollars" "100 dollars" "1000 dollars"
grep("10{2,}", c("10 dollars", "100 dollars", "1000 dollars"), value=TRUE)
## [1] "100 dollars" "1000 dollars"
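Without the trailing space, "10{1,2}" still keeps "1000 dollars" because grep() only needs the pattern to match somewhere inside the string. A short sketch, again assuming regexpr() and regmatches() from base R, shows which part of each string was matched:

```r
x <- c("10 dollars", "100 dollars", "1000 dollars")

# {1,2}: a 1 followed by one or two 0s; in "1000" only "100" is consumed
one_or_two <- regmatches(x, regexpr("10{1,2}", x))
one_or_two
## [1] "10" "100" "100"

# {2,}: a 1 followed by at least two 0s; "10 dollars" has no match and is dropped
two_or_more <- regmatches(x, regexpr("10{2,}", x))
two_or_more
## [1] "100" "1000"
```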
grep("[0-9]{3}-[0-9]{4}", c("My office number is 268-1884",
"Bryan's cell phone is 353-1890",
"The police's number is 911"), value=TRUE)
## [1] "My office number is 268-1884" "Bryan's cell phone is 353-1890"
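What exactly does a quantifier apply to? This is called its scope. By default a quantifier applies only to the single character, character class, or parenthesized group immediately before it.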
grep("ha{2,}", c("haaa", "haha"), value=TRUE)
## [1] "haaa"
grep("(ha){2,}", c("haaa", "haha"), value=TRUE)
## [1] "haha"
grep("[0-9][[:alpha:]]{2}", c("2L2Q", "21YO"), value=TRUE)
## [1] "21YO"
grep("([0-9][[:alpha:]]){2}", c("2L2Q", "21YO"), value=TRUE)
## [1] "2L2Q"
grep("[0-9]{3}(-[0-9]{4})?", c("My office number is 268-1884",
"The police's number is 911",
"Wait, 911- isn't that the police?"), value=TRUE)
## [1] "My office number is 268-1884" "The police's number is 911"
## [3] "Wait, 911- isn't that the police?"
grep("[0-9]{3}-([0-9]{4})?", c("My office number is 268-1884",
"The police's number is 911",
"Wait, 911- isn't that the police?"), value=TRUE)
## [1] "My office number is 268-1884" "Wait, 911- isn't that the police?"
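The difference between the two phone-number patterns is easier to see from the matched text itself. This sketch assumes base R's regexpr() and regmatches(); the variable names are only illustrative.

```r
x <- c("My office number is 268-1884",
       "The police's number is 911",
       "Wait, 911- isn't that the police?")

# ? applies to the whole group (-[0-9]{4}), so the hyphen is optional too
group_hits <- regmatches(x, regexpr("[0-9]{3}(-[0-9]{4})?", x))
group_hits
## [1] "268-1884" "911" "911"

# ? applies only to ([0-9]{4}), so the hyphen is required
digit_hits <- regmatches(x, regexpr("[0-9]{3}-([0-9]{4})?", x))
digit_hits
## [1] "268-1884" "911-"
```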
grep("ton.*", c("ton", "tone", "ton ", "son"), value=TRUE)
## [1] "ton" "tone" "ton "
grep("(ton.)*", c("ton", "tone", "ton ", "son"), value=TRUE)
## [1] "ton" "tone" "ton " "son"
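"(ton.)*" keeps "son" because zero repetitions of the group is a valid match: the pattern matches the empty string at the start of every input. One way to check this, sketched here with the "match.length" attribute that regexpr() returns, is to look at how long each match is:

```r
x <- c("ton", "tone", "ton ", "son")

# Length of the first match of (ton.)* in each string; 0 means the match is empty
star_len <- attr(regexpr("(ton.)*", x), "match.length")
star_len
## [1] 0 4 4 0

# Requiring at least one repetition removes the empty matches
plus_hits <- grep("(ton.)+", x, value=TRUE)
plus_hits
## [1] "tone" "ton "
```

Note that "ton" itself is dropped by "(ton.)+", since the group needs one more character after "ton".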
Anchors tie a match to a position in the string: ^ matches at the start and $ at the end.
grep("^Win", c("Winning is my favorite pastime",
"We love statistics", "I hate Windows"), value=TRUE)
## [1] "Winning is my favorite pastime"
grep("[a-z]$", c("I like lasers", "I like LASERS"), value=TRUE)
## [1] "I like lasers"
grep("Do.*\\?$", c("Do you like cherries?", "Don't you like terriers?",
"Don't you know that terriers like cherries"), value=TRUE)
## [1] "Do you like cherries?" "Don't you like terriers?"
grep("^<.+>|<.+>$", c("<HTML> hi", "bye </HTML>", "a <b> c"), value=TRUE)
## [1] "<HTML> hi" "bye </HTML>"
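Using ^ and $ together forces the pattern to account for the whole string, which is a common way to validate input. As a small sketch, reusing the phone-number pattern from earlier on a made-up vector:

```r
# Illustrative inputs, not taken from the examples above
phones <- c("268-1884", "My office number is 268-1884", "268-18845")

# Unanchored: the pattern may match anywhere inside the string
loose <- grep("[0-9]{3}-[0-9]{4}", phones, value=TRUE)
loose
## [1] "268-1884" "My office number is 268-1884" "268-18845"

# Anchored at both ends: the whole string must be exactly ddd-dddd
strict <- grep("^[0-9]{3}-[0-9]{4}$", phones, value=TRUE)
strict
## [1] "268-1884"
```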