\hypertarget{qore_regex_qore_regex_intro}{}\doxysection{Qore Regular Expression Introduction}\label{qore_regex_qore_regex_intro} Regular expression functionality in Qore is provided by \href{http://www.pcre.org}{\texttt{ PCRE\+: Perl-\/\+Compatible Regular Expression library}}. Using this library, Qore implements regular expression pattern matching with a very simple syntax whose semantics are similar to those of \href{http://www.perl.org}{\texttt{ Perl 5}}. One difference between Qore and Perl to keep in mind is that \mbox{\hyperlink{qore_regex_qore_regex_backreferences}{backreferences}} in Qore are referenced as {\ttfamily \$1}, {\ttfamily \$2}, {\ttfamily \$3}, etc., which differs from Perl\textquotesingle{}s syntax (which uses a backslash followed by the number instead). \begin{DoxyParagraph}{Examples\+:} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# call process() if the string starts with an alphanumeric character}} \DoxyCodeLine{\textcolor{keywordflow}{if} (str =\string~ /\string^[[:alnum:]]/)} \DoxyCodeLine{ process(str);} \end{DoxyCode} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# example of using regular expressions in a switch statement}} \DoxyCodeLine{switch (str) \{} \DoxyCodeLine{ case /\string^[\string^[:alnum:]]/: \textcolor{keywordflow}{return} \textcolor{keyword}{True};} \DoxyCodeLine{ case /\string^[0-\/9]/: \textcolor{keywordflow}{return} \textcolor{keyword}{False};} \DoxyCodeLine{ default: throw \textcolor{stringliteral}{"{}ERROR"{}}, \mbox{\hyperlink{group__string__functions_ga7a74be141f814ef286046c367b21091c}{sprintf}}(\textcolor{stringliteral}{"{}invalid string \%y"{}}, str);} \DoxyCodeLine{\}} \end{DoxyCode} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# regular expression substitution + ignore case \& global options}} \DoxyCodeLine{str =\string~ s/abc/xyz/gi;} \end{DoxyCode} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# prefix all non-\/alphanumeric characters with a backslash}} \DoxyCodeLine{str =\string~ 
s/([\string^[:alnum:]])/\(\backslash\)\(\backslash\)\$1/g;} \end{DoxyCode} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# regular expression substring extraction}} \DoxyCodeLine{*list l = (str =\string~ x/(?:(\(\backslash\)w+):(\(\backslash\)w+):)(\(\backslash\)w+)/);} \end{DoxyCode} \end{DoxyParagraph} \hypertarget{qore_regex_qore_regex_operators}{}\doxysection{Qore Regular Expression Operators}\label{qore_regex_qore_regex_operators} The following is a list of operators based on regular expressions (or similar to regular expressions in the case of the transliteration operator). {\bfseries{Regular Expression Operators}} \tabulinesep=1mm \begin{longtabu}spread 0pt [c]{*{2}{|X[-1]}|} \hline {\bfseries{Operator}} &{\bfseries{Description}} \\\cline{1-2} \mbox{\hyperlink{operators_regex_match_operator}{Regular Expression Match Operator (=$\sim$)}} &Returns \mbox{\hyperlink{basic_data_types_True}{True}} if the regular expression matches a string \\\cline{1-2} \mbox{\hyperlink{operators_regex_no_match_operator}{Regular Expression No Match Operator (!$\sim$)}} &Returns \mbox{\hyperlink{basic_data_types_True}{True}} if the regular expression does not match a string \\\cline{1-2} \mbox{\hyperlink{operators_regex_subst_operator}{Regular Expression Substitution Operator}} &Substitutes text in a string based on matching a regular expression \\\cline{1-2} \mbox{\hyperlink{operators_regex_extract_operator}{Regular Expression Pattern Extraction Operator}} &Returns a list of substrings in a string based on matching patterns defined by a regular expression \\\cline{1-2} \mbox{\hyperlink{operators_transliteration_operator}{Transliteration Operator}} &Not a regular expression operator; transliterates one or more characters to other characters in a string \\\cline{1-2} \end{longtabu} See the table below for valid regular expression options.\hypertarget{qore_regex_qore_regex_options}{}\doxysubsection{Qore Regular Expression Operator 
Options}\label{qore_regex_qore_regex_options} {\bfseries{Regular Expression Options}} \tabulinesep=1mm \begin{longtabu}spread 0pt [c]{*{2}{|X[-1]}|} \hline {\bfseries{Option}} &{\bfseries{Description}} \\\cline{1-2} {\ttfamily i} &Ignores case when matching \\\cline{1-2} {\ttfamily m} &Makes start-\/of-\/line ({\ttfamily $^\wedge$}) and end-\/of-\/line ({\ttfamily \$}) match after and before, respectively, any newline in the subject string \\\cline{1-2} {\ttfamily s} &Makes a dot ({\ttfamily .}) match a newline character \\\cline{1-2} {\ttfamily x} &Ignores whitespace characters in the pattern and enables comments prefixed by {\ttfamily \#} \\\cline{1-2} {\ttfamily u} &Extends POSIX character matching to Unicode characters \\\cline{1-2} {\ttfamily g} &Makes substitutions or extractions global (only applicable with the substitution and extraction operators) \\\cline{1-2} \end{longtabu} \hypertarget{qore_regex_qore_regex_functions}{}\doxysection{Qore Regular Expression Functions}\label{qore_regex_qore_regex_functions} The following is a list of functions providing regular expression functionality where the pattern may be given at run-\/time\+: {\bfseries{Regular Expression Functions}} \tabulinesep=1mm \begin{longtabu}spread 0pt [c]{*{2}{|X[-1]}|} \hline {\bfseries{Function}} &{\bfseries{Description}} \\\cline{1-2} \mbox{\hyperlink{group__string__functions_ga7804f63df9b181e0bacb465eb912feb9}{regex()}} &Returns \mbox{\hyperlink{basic_data_types_True}{True}} if the regular expression matches a string \\\cline{1-2} \mbox{\hyperlink{group__string__functions_gad8963269da93fe1ebaa179dab8d315fa}{regex\+\_\+subst()}} &Substitutes a pattern in a string based on regular expressions and returns the new string \\\cline{1-2} \mbox{\hyperlink{group__string__functions_ga9f8c21f961daa29578dcfd596f5871ff}{regex\+\_\+extract()}} &Returns a list of substrings in a string based on matching patterns defined by a regular expression \\\cline{1-2} \end{longtabu} 
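Because these functions take the pattern as an ordinary string, the pattern can be built or loaded at run-\/time. The following is a minimal sketch only\+: it assumes that the subject string is the first argument and the pattern the second, and that the option constants {\ttfamily RE\_Caseless} and {\ttfamily RE\_Global} are available, none of which is described in this section. The variable names are hypothetical. \begin{DoxyParagraph}{Example}
\begin{DoxyCode}{0}
\DoxyCodeLine{\textcolor{comment}{\# intended run-\/time counterpart of: str =\string~ /\string^[[:alnum:]]/}}
\DoxyCodeLine{bool matched = regex(str, \textcolor{stringliteral}{"{}\string^[[:alnum:]]"{}});}
\DoxyCodeLine{\textcolor{comment}{\# intended run-\/time counterpart of: str =\string~ s/abc/xyz/gi (option constants are assumed)}}
\DoxyCodeLine{string replaced = regex\_subst(str, \textcolor{stringliteral}{"{}abc"{}}, \textcolor{stringliteral}{"{}xyz"{}}, RE\_Caseless | RE\_Global);}
\DoxyCodeLine{\textcolor{comment}{\# extract two colon-\/separated words; the result is declared as *list since there may be no match}}
\DoxyCodeLine{*list parts = regex\_extract(str, \textcolor{stringliteral}{"{}([[:alnum:]]+):([[:alnum:]]+)"{}});}
\end{DoxyCode}
\end{DoxyParagraph} The operators remain the more compact choice when the pattern is known in advance; the function forms are mainly useful when the pattern comes from configuration or user input. 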
\hypertarget{qore_regex_qore_regex_escape_patterns}{}\doxysection{Qore Regular Expression Escape Codes}\label{qore_regex_qore_regex_escape_patterns} Escape characters in the pattern string are processed by the \href{http://www.pcre.org}{\texttt{ PCRE}} library in a manner similar to the way Perl 5 handles them.\hypertarget{qore_regex_qore_regex_escape_replacement_string}{}\doxysubsection{Qore Regular Expression Replacement String Escape Codes}\label{qore_regex_qore_regex_escape_replacement_string} Regular expression substitution expressions have the following form\+: \begin{DoxyItemize} \item {\ttfamily s/}{\itshape $<$pattern$>$}{\ttfamily /}{\itshape $<$replacement$>$}{\ttfamily /} \end{DoxyItemize} The escape codes in the following table are supported in the replacement string. {\bfseries{Regular Expression Replacement String Escape Codes}} \tabulinesep=1mm \begin{longtabu}spread 0pt [c]{*{6}{|X[-1]}|} \hline {\bfseries{Escape}} &{\bfseries{ASCII}} &{\bfseries{Decimal}} &{\bfseries{Octal}} &{\bfseries{Hex}} &{\bfseries{Description}} \\\cline{1-6} \textbackslash{}a &{\ttfamily BEL} &{\ttfamily 7} &{\ttfamily 007} &{\ttfamily 07} &alarm or bell \\\cline{1-6} \textbackslash{}b &{\ttfamily BS} &{\ttfamily 8} &{\ttfamily 010} &{\ttfamily 08} &backspace \\\cline{1-6} \textbackslash{}e &{\ttfamily ESC} &{\ttfamily 27} &{\ttfamily 033} &{\ttfamily 1B} &escape character \\\cline{1-6} \textbackslash{}f &{\ttfamily FF} &{\ttfamily 12} &{\ttfamily 014} &{\ttfamily 0C} &form feed \\\cline{1-6} \textbackslash{}n &{\ttfamily LF} &{\ttfamily 10} &{\ttfamily 012} &{\ttfamily 0A} &line feed \\\cline{1-6} \textbackslash{}r &{\ttfamily CR} &{\ttfamily 13} &{\ttfamily 015} &{\ttfamily 0D} &carriage return \\\cline{1-6} \textbackslash{}t &{\ttfamily HT} &{\ttfamily 9} &{\ttfamily 011} &{\ttfamily 09} &horizontal tab \\\cline{1-6} \textbackslash{}v &{\ttfamily VT} &{\ttfamily 11} &{\ttfamily 013} &{\ttfamily 0B} &vertical tab \\\cline{1-6} \textbackslash{}\$ &{\ttfamily \$} &{\ttfamily 
36} &{\ttfamily 044} &{\ttfamily 24} &a literal dollar sign character \\\cline{1-6} \textbackslash{}\textbackslash{} &{\ttfamily \textbackslash{}} &{\ttfamily 92} &{\ttfamily 134} &{\ttfamily 5C} &a literal backslash character \\\cline{1-6} \textbackslash{}\mbox{[}0-\/7\mbox{]}\mbox{[}0-\/7\mbox{]}\mbox{[}0-\/7\mbox{]} &-\/ &-\/ &-\/ &-\/ &the ASCII character represented by the octal code \\\cline{1-6} \end{longtabu} Any other backslashes in the replacement string are copied literally to the output string.\hypertarget{qore_regex_qore_regex_backreferences}{}\doxysection{Qore Regular Expression Backreferences}\label{qore_regex_qore_regex_backreferences} \mbox{\hyperlink{namespace_qore}{Qore}} uses {\ttfamily \$}{\itshape num} for backreferences in regular expression substitution expressions. The first backreference is {\ttfamily \$1}, the second is {\ttfamily \$2}, and so on. \begin{DoxyParagraph}{Example} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# prefix all non-\/alphanumeric characters with a backslash}} \DoxyCodeLine{str =\string~ s/([\string^[:alnum:]])/\(\backslash\)\(\backslash\)\$1/g;} \end{DoxyCode} \begin{DoxyCode}{0} \DoxyCodeLine{\textcolor{comment}{\# remove parentheses from string at the beginning of the line}} \DoxyCodeLine{str =\string~ s/\string^\(\backslash\)((.*)\(\backslash\))/\$1/;} \end{DoxyCode} \end{DoxyParagraph}
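When a pattern contains several capture groups, each group is numbered by the position of its opening parenthesis, and the replacement string may use the groups in any order. The following is a minimal sketch, assuming that {\ttfamily str} holds two alphanumeric words separated by a colon. \begin{DoxyParagraph}{Example}
\begin{DoxyCode}{0}
\DoxyCodeLine{\textcolor{comment}{\# swap the two words around the colon, so that a string such as "{}key:value"{} becomes "{}value:key"{}}}
\DoxyCodeLine{str =\string~ s/([[:alnum:]]+):([[:alnum:]]+)/\$2:\$1/;}
\end{DoxyCode}
\end{DoxyParagraph}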