str_ functions manipulate strings.

str_clean

The goal of str_clean is to remove punctuation and/or accents from a string.

string <- "a..;éâ...íõ"

By default the function will remove punctuation and accent symbols.

str_clean(string)
#> [1] "aeaio"

But you can also remove only the accent symbols.

str_clean(string,remove_accent = TRUE,remove_punct = FALSE)
#> [1] "a..;ea...io"

You can also remove only the punctuation symbols.

str_clean(string,remove_accent = FALSE,remove_punct = TRUE)
#> [1] "aéâíõ"

Instead of dropping the punctuation symbols, you can replace each of them with another string through the argument sub_punct.

str_clean(string,remove_accent = FALSE,remove_punct = TRUE,sub_punct = "-")
#> [1] "a---éâ---íõ"

str_extract_char

The goal of str_extract_char is to extract the character at a given position in a string.

str_extract_char(string = "abcdef",char = 2)
#> [1] "b"

str_keep

The goal of str_keep is to keep only one type of character in the string.

string <- "1Aa45Z89$$%#"

By default the function will keep only letters.

str_keep(string,keep = "text")
#> [1] "AaZ"

But you can also keep only numbers.

str_keep(string,keep = "numbers")
#> [1] 14589

And there is also an option to keep special characters.

str_keep(string,keep = "special")
#> [1] "$$%#"
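
To make the three options more concrete, here is a minimal base R sketch of the same idea. The helper keep_chars is hypothetical and is not the package's implementation; it assumes that "text" means letters, "numbers" means digits, and "special" means anything that is neither alphanumeric nor whitespace. Unlike the output shown above for keep = "numbers", this sketch always returns a character vector.

```r
# Hypothetical helper (not part of the package): keep one class of characters
# by deleting everything that does not belong to it.
keep_chars <- function(x, keep = c("text", "numbers", "special")) {
  keep <- match.arg(keep)
  # Each pattern matches the characters to DROP for the chosen option.
  pattern <- switch(
    keep,
    text    = "[^[:alpha:]]",         # drop everything that is not a letter
    numbers = "[^[:digit:]]",         # drop everything that is not a digit
    special = "[[:alnum:][:space:]]"  # drop letters, digits and whitespace
  )
  gsub(pattern, "", x)
}

keep_chars("1Aa45Z89$$%#", keep = "text")
keep_chars("1Aa45Z89$$%#", keep = "numbers")
keep_chars("1Aa45Z89$$%#", keep = "special")
```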

str_select

The goal of str_select is to select part of a string, before, after or between patterns.

string <- "begin STRING1 TARGET STRING2 end"

By setting the argument before you select only the part of the string before this pattern.

str_select(string,before = "STRING2")
#> [1] "begin STRING1 TARGET"

By setting the argument after you select only the part of the string after this pattern.

str_select(string,after = "STRING1")
#> [1] "TARGET STRING2 end"

By setting both arguments, before and after, you select only the part of the string between the two patterns.

str_select(string,after = "STRING1",before = "STRING2")
#> [1] "TARGET"
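
The before/after logic can be approximated in base R as follows. This is only a sketch under stated assumptions: the helper select_between is hypothetical, it treats both patterns as fixed strings rather than regular expressions, it uses the first match of each pattern, and it trims surrounding whitespace. The package may behave differently in any of these respects.

```r
# Hypothetical helper (not part of the package): select the part of a single
# string that lies after one pattern and/or before another.
select_between <- function(string, after = NULL, before = NULL) {
  out <- string
  if (!is.null(after)) {
    # Drop everything up to and including the first match of `after`.
    m <- regexpr(after, out, fixed = TRUE)
    if (m > 0) {
      start <- as.integer(m) + attr(m, "match.length")
      out <- substring(out, start)
    }
  }
  if (!is.null(before)) {
    # Drop everything from the first match of `before` onwards.
    m <- regexpr(before, out, fixed = TRUE)
    if (m > 0) {
      out <- substring(out, 1, as.integer(m) - 1)
    }
  }
  trimws(out)
}

string <- "begin STRING1 TARGET STRING2 end"
select_between(string, before = "STRING2")
select_between(string, after = "STRING1")
select_between(string, after = "STRING1", before = "STRING2")
```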

str_to_abb

The goal of str_to_abb is to convert strings with at most n_abb characters (default = 3) to uppercase and to capitalize only the first letter of the other words, so that abbreviations stay in capitals. In the example below, the three-letter "usa" is still treated as an abbreviation with n_abb = 3.

countries <- c("France","Br","usa","italy")

str_to_abb(countries,n_abb = 3)
#> [1] "France" "BR"     "USA"    "Italy"
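
The rule can be sketched in a few lines of base R. The helper to_abb is hypothetical and is not the package's implementation; it assumes the length threshold is inclusive (as the output above suggests) and that the remaining letters of longer words are lowercased.

```r
# Hypothetical helper (not part of the package): uppercase short strings,
# capitalize only the first letter of longer ones.
to_abb <- function(x, n_abb = 3) {
  is_abb <- nchar(x) <= n_abb  # assumed inclusive threshold
  capitalized <- paste0(
    toupper(substring(x, 1, 1)),  # first letter in uppercase
    tolower(substring(x, 2))      # rest of the word in lowercase
  )
  ifelse(is_abb, toupper(x), capitalized)
}

to_abb(c("France", "Br", "usa", "italy"), n_abb = 3)
```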