Escaping special characters in regex


#1

I was writing a pipeline, and in the gsub processor I wrote a regex. According to the documentation:

Allowed characters

Any Unicode characters may be used in the pattern, but certain characters are reserved and must be escaped. The standard reserved characters are:

. ? + * | { } [ ] ( ) " \

I had to escape the double quotes but not the other characters. Why?
"pattern": "\\\"\\\"(?=[a-zA-Z0-9])" worked.

I was using the Kibana Dev Tools console. Is that why I have to use three backslashes to escape the double quote? I was expecting to use two.
I am not a regex expert, but I would like some clarity on this.


(Alexander Reelsen) #2

hey,

I agree that this is somewhat confusing. The regex itself follows the Java rules for regular expressions. However, because the pattern sits inside a JSON string, it needs another round of escaping so that the regex escapes survive JSON parsing. The escaping is unpacked in two stages: once when the JSON is parsed, and once when the regex is parsed. In your pattern, JSON parsing turns each \\\" into \", which the regex engine then reads as a literal double quote. Hope that makes sense.

--Alex
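
The two stages described above can be reproduced outside Elasticsearch. The following is a minimal sketch in Python, not the code path Elasticsearch actually runs: it assumes Python's json and re modules behave like the JSON parser and the Java regex engine for this particular pattern, and the sample input string is made up for illustration.

```python
import json
import re

# Request body exactly as typed (raw string, so Python adds no escaping of its own).
raw_body = r'{"pattern": "\\\"\\\"(?=[a-zA-Z0-9])"}'

# Stage 1: JSON parsing. \\ becomes \ and \" becomes ".
pattern = json.loads(raw_body)["pattern"]
print(pattern)  # \"\"(?=[a-zA-Z0-9])

# Stage 2: regex parsing. \" is read as a literal double quote.
# The input text here is a hypothetical example.
result = re.sub(pattern, '"', 'say ""hello"" now')
print(result)  # say "hello"" now

# With only two backslashes, \\ decodes to a single backslash and the next "
# ends the JSON string early, so the body is no longer valid JSON.
try:
    json.loads(r'{"pattern": "\\"\\"(?=[a-zA-Z0-9])"}')
    two_backslashes_parse = True
except json.JSONDecodeError:
    two_backslashes_parse = False
print(two_backslashes_parse)  # False
```

This is why two backslashes are not enough: the third backslash is what keeps the double quote from terminating the JSON string.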