stri_opts_regex {stringi}    R Documentation

Generate a List with Regex Matcher Settings

Description

A convenience function to tune the behavior of the ICU regular expression matcher, e.g., in stri_count_regex and other stringi-search-regex functions.

Usage

stri_opts_regex(case_insensitive, comments, dotall, literal, multiline,
  unix_lines, uword, error_on_unknown_escapes, ...)

Arguments

case_insensitive

logical; enable case-insensitive matching [regex flag (?i)]

comments

logical; allow white space and comments within patterns [regex flag (?x)]

dotall

logical; if set, '.' matches line terminators, otherwise matching of '.' stops at a line end [regex flag (?s)]

literal

logical; if set, treat the entire pattern as a literal string: metacharacters and escape sequences in the pattern are given no special meaning; note that in most such cases the stringi-search-fixed facilities are the better choice

multiline

logical; controls the behavior of '$' and '^'. If set, they also match at line terminators within the string; otherwise, they match only at the start and end of the input string [regex flag (?m)]

unix_lines

logical; Unix-only line endings. When this mode is enabled, only U+000A (line feed) is recognized as a line ending by '.', '$', and '^'.

uword

logical; Unicode word boundaries. If set, uses the Unicode TR 29 definition of word boundaries. Warning: Unicode word boundaries are quite different from traditional regex word boundaries [regex flag (?w)]. See http://unicode.org/reports/tr29/#Word_Boundaries

error_on_unknown_escapes

logical; whether to generate an error on unrecognized backslash escapes; if set, fail with an error on patterns that contain backslash-escaped ASCII letters without a known special meaning; otherwise, these escaped letters represent themselves

...

any other arguments to this function are purposely ignored
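
As a rough illustration of how some of the settings above change matching, the following sketch compares default behavior with dotall, multiline, and comments enabled. It assumes the stringi package is installed; the sample strings and variable names are made up for the example.

```r
library(stringi)

# dotall: by default '.' stops at a line terminator
dot_default <- stri_detect_regex("a\nb", "a.b")
dot_all     <- stri_detect_regex("a\nb", "a.b",
                                 opts_regex = stri_opts_regex(dotall = TRUE))

# multiline: '^' and '$' also match at line boundaries inside the string
lines_default <- stri_count_regex("one\ntwo", "^\\w+$")
lines_multi   <- stri_count_regex("one\ntwo", "^\\w+$",
                                  opts_regex = stri_opts_regex(multiline = TRUE))

# comments: white space and '#' comments inside the pattern are ignored
with_comments <- stri_detect_regex("ab", "a b  # matches 'ab'",
                                   opts_regex = stri_opts_regex(comments = TRUE))

print(c(dot_default, dot_all))        # FALSE TRUE
print(c(lines_default, lines_multi))  # 0 2
print(with_comments)                  # TRUE
```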

Details

Note that some regex settings may also be changed using ICU regex flags inside the pattern itself. For example, "(?i)pattern" performs a case-insensitive match of the given pattern; see the ICU User Guide entry on Regular Expressions in the References section, or stringi-search-regex.
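
A minimal sketch of the equivalence described above, assuming stringi is installed. The input vector x is invented for the example. The last call uses the ICU scoped-flag form (?i:...), which limits the flag to one group, something the list-based settings cannot express.

```r
library(stringi)

x <- c("Ala", "ALA", "ola")

# Same setting, supplied as an option list and as an inline flag
via_opts   <- stri_detect_regex(x, "ala",
                                opts_regex = stri_opts_regex(case_insensitive = TRUE))
via_inline <- stri_detect_regex(x, "(?i)ala")

# Scoped flag: only the first 'a' is matched case-insensitively
scoped <- stri_detect_regex(x, "(?i:a)la")

print(identical(via_opts, via_inline))  # TRUE
print(scoped)                           # TRUE FALSE FALSE
```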

Value

Returns a named list object; missing settings are left with default values.
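
Because the result is an ordinary named list, it can be built once and reused across several calls. The sketch below assumes stringi is installed and only checks that the settings passed in appear by name; it makes no assumption about how unset options are represented in the list.

```r
library(stringi)

# Build the settings once ...
opts <- stri_opts_regex(case_insensitive = TRUE, dotall = TRUE)

is_list   <- is.list(opts)
has_names <- all(c("case_insensitive", "dotall") %in% names(opts))

# ... and reuse them in different search functions
detected <- stri_detect_regex("A\nB", "a.b", opts_regex = opts)
counted  <- stri_count_regex("xX", "x", opts_regex = opts)

print(c(is_list, has_names))  # TRUE TRUE
print(detected)               # TRUE
print(counted)                # 2
```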

References

enum URegexpFlag: Constants for Regular Expression Match Modes – ICU4C API Documentation, http://www.icu-project.org/apiref/icu4c/uregex_8h.html

Regular Expressions – ICU User Guide, http://userguide.icu-project.org/strings/regexp

See Also

Other search_regex: stringi-search-regex, stringi-search

Examples

library(stringi)
stri_detect_regex("ala", "ALA") # FALSE: case-sensitive by default
stri_detect_regex("ala", "ALA", opts_regex=stri_opts_regex(case_insensitive=TRUE)) # TRUE
stri_detect_regex("ala", "ALA", case_insensitive=TRUE) # TRUE: equivalent, setting passed via ...
stri_detect_regex("ala", "(?i)ALA") # TRUE: equivalent, inline regex flag

[Package stringi version 1.1.6 Index]