Use Erlang regex for CouchDB Mango query string match
Regular Expressions: The regular expressions allowed here are a subset of the set found in egrep and in the AWK programming language, as defined in the book The AWK Programming Language, by A. V. Aho, B. W. Kernighan, and P. J. Weinberger. They are composed of the following characters:
- `c`: matches the non-metacharacter c.
- `\c`: matches the escape sequence or literal character c.
- `.`: matches any character.
- `^`: matches the beginning of a string.
- `$`: matches the end of a string.
- `[abc…]`: character class, which matches any of the characters abc… Character ranges are specified by a pair of characters separated by a `-`.
- `[^abc…]`: negated character class, which matches any character except abc….
- `r1 | r2`: alternation. It matches either r1 or r2.
- `r1r2`: concatenation. It matches r1 and then r2.
- `r+`: matches one or more rs.
- `r*`: matches zero or more rs.
- `r?`: matches zero or one rs.
- `(r)`: grouping. It matches r.
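The constructs listed above can be checked one by one in a small module. This is a minimal sketch, assuming the PCRE-based `re` module from current Erlang/OTP rather than the older module whose documentation is quoted here; the module name `regex_operators` and the helper `matches/2` are hypothetical names chosen for illustration.

```erlang
-module(regex_operators).
-export([matches/2, check/0]).

%% Returns true when Pattern matches somewhere in Subject.
matches(Subject, Pattern) ->
    re:run(Subject, Pattern, [{capture, none}]) =:= match.

%% Each line asserts one construct from the list; a failed
%% assertion raises a badmatch error.
check() ->
    true  = matches("cat", "a"),            % literal character
    true  = matches("a.b", "a\\.b"),        % escaped metacharacter
    false = matches("axb", "a\\.b"),
    true  = matches("axb", "a.b"),          % . matches any character
    true  = matches("abc", "^ab"),          % ^ anchors at the start
    false = matches("cab", "^ab"),
    true  = matches("cab", "ab$"),          % $ anchors at the end
    true  = matches("b7", "^[a-c][0-9]$"),  % character classes and ranges
    false = matches("b7", "^[^a-c]"),       % negated class
    true  = matches("dog", "^(cat|dog)$"),  % alternation inside a group
    true  = matches("aaa", "^a+$"),         % one or more
    true  = matches("", "^a*$"),            % zero or more
    true  = matches("color", "^colou?r$"),  % zero or one
    ok.
```

After compiling the module, calling `regex_operators:check().` in the shell should return `ok` if every construct behaves as described.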
The escape sequences allowed are the same as for Erlang strings:
- `\f`: form feed
- `\n`: newline (line feed)
- `\r`: carriage return
- `\v`: vertical tab
- `\ddd`: the octal value ddd
- `\xhh`: the hexadecimal value hh
- `\x{h…}`: the hexadecimal value h…
- `\c`: any other character literally, for example `\\` for backslash and `\"` for `"`
To make these functions easier to use in combination with the function io:get_line, which terminates the input line with a newline, the `$` character also matches a string ending with `"…\n"`. The following examples define Erlang data types:
- Atoms: `[a-z][0-9a-zA-Z_]*`
- Variables: `[A-Z_][0-9a-zA-Z_]*`
- Floats: `(\+|-)?[0-9]+\.[0-9]+((E|e)(\+|-)?[0-9]+)?`
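The three patterns can be wrapped in small predicates to see what they accept and reject. The sketch below assumes the `re` module from current Erlang/OTP; the module name `erlang_tokens` and all function names are hypothetical. Each pattern is wrapped in `^(` and `)$` so that it has to cover the whole input.

```erlang
-module(erlang_tokens).
-export([is_atom_name/1, is_variable/1, is_float_literal/1, check/0]).

%% True when Pattern matches all of Subject (anchored at both ends).
full_match(Subject, Pattern) ->
    re:run(Subject, "^(" ++ Pattern ++ ")$", [{capture, none}]) =:= match.

is_atom_name(S)     -> full_match(S, "[a-z][0-9a-zA-Z_]*").
is_variable(S)      -> full_match(S, "[A-Z_][0-9a-zA-Z_]*").
is_float_literal(S) ->
    full_match(S, "(\\+|-)?[0-9]+\\.[0-9]+((E|e)(\\+|-)?[0-9]+)?").

check() ->
    true  = is_atom_name("foo_bar1"),
    false = is_atom_name("Foo"),
    true  = is_variable("Foo"),
    true  = is_variable("_acc"),
    true  = is_float_literal("-1.5e+10"),
    false = is_float_literal("42"),
    %% A trailing newline, as left by io:get_line, does not stop $
    %% from matching.
    true  = is_float_literal("3.14\n"),
    ok.
```

The last assertion relies on the default behaviour of `$` in `re`, which also matches just before a final newline.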
Regular expressions are written as Erlang strings when used with the functions in this module. This means that any `\` or `"` characters in a regular expression string must be escaped with `\`, as they are also escape characters for the string. For example, the regular expression string for Erlang floats is: `"(\\+|-)?[0-9]+\\.[0-9]+((E|e)(\\+|-)?[0-9]+)?"`.
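The effect of the double escaping can be seen by inspecting the string the compiler actually produces. This is a small sketch, again assuming the `re` module; the module name `regex_escaping` is hypothetical.

```erlang
-module(regex_escaping).
-export([check/0]).

check() ->
    %% The source text "\\." is a two-character string:
    %% a backslash followed by a dot.
    2 = length("\\."),
    [$\\, $.] = "\\.",
    %% The regex engine therefore receives \. and matches only
    %% a literal dot.
    match   = re:run("1.5", "^[0-9]+\\.[0-9]+$", [{capture, none}]),
    nomatch = re:run("1x5", "^[0-9]+\\.[0-9]+$", [{capture, none}]),
    ok.
```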
It is not really necessary to have the escape sequences as part of the regular expression syntax, as they can always be generated directly in the string. They are included for completeness, and they can also be useful when generating regular expressions or when they are entered other than with Erlang strings.
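The two routes to the same character can be compared directly. In this sketch (assuming the `re` module; the module name `regex_newline` is hypothetical), the first pattern contains a real newline produced by the Erlang string escape, while the second contains a backslash and an `n` that the regex engine interprets itself.

```erlang
-module(regex_newline).
-export([check/0]).

check() ->
    Subject = "a\nb",
    %% "a\nb": the Erlang compiler turns \n into a newline character,
    %% so the pattern contains a literal newline.
    match = re:run(Subject, "a\nb", [{capture, none}]),
    %% "a\\nb": the pattern contains the two characters \ and n, and
    %% the regex engine resolves that escape sequence.
    match = re:run(Subject, "a\\nb", [{capture, none}]),
    ok.
```

The second form is the one that matters when a pattern arrives from outside Erlang source code, for example from user input or a file, where no Erlang string escaping has been applied.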