Can someone please help me understand how ES handles literal '/' characters in regular expressions?


(JA e) #1

I frequently need to utilize regular expressions and am having some
difficulties.

For example, say I have a full_url of
http://www.mycompany.com/pic2af45362bcd322cd/image1.jpg, where the
16-character hex string and the image number can both change.

Logstash parses out just the URI portion, so I'm searching on the string
'/pic2af45362bcd322cd/image1.jpg'.

If I were using PCRE I would write something like:
"^/pic[a-f0-9]{16}/image[0-9].jpg$"
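For reference, the PCRE-style pattern can be checked against the sample URI with a minimal Python sketch. The helper name `matches_uri` is hypothetical, and the dot before "jpg" is escaped here (unlike in the pattern above) so that it only matches a literal period.

```python
import re

# PCRE-style pattern from the post, with the "." before "jpg" escaped
# (an unescaped "." would match any single character).
URI_PATTERN = re.compile(r"^/pic[a-f0-9]{16}/image[0-9]\.jpg$")


def matches_uri(uri: str) -> bool:
    """Return True if the URI path has the expected shape (hypothetical helper)."""
    return URI_PATTERN.match(uri) is not None


print(matches_uri("/pic2af45362bcd322cd/image1.jpg"))
```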

From reading the documentation on ES/Lucene regex (queries are always
anchored, and it is not full PCRE), I think I should be able to search like so:
uri://pic[a-f0-9]{16}/image[0-9].jpg/

This does not seem to work. If I search for uri:/pic[a-f0-9]{16}/ it
works, but that query is less exact. I also tried the query in Sense and
hit the same problem; it's almost as if ES does not recognize the
forward slashes inside the regex.
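One thing that may be worth trying: since the query string syntax uses / as the regex delimiter, a literal slash inside the pattern probably has to be written as \/. The Python sketch below builds such a clause. The helper `lucene_regex_clause` is hypothetical, and whether the escape is accepted is an assumption that may depend on the ES/Lucene version in use.

```python
import json


def lucene_regex_clause(field: str, pattern: str) -> str:
    """Wrap a pattern in /.../ delimiters, escaping each literal slash as \\/."""
    escaped = pattern.replace("/", "\\/")
    return f"{field}:/{escaped}/"


# Assumption: the parser accepts "\/" as a literal slash inside the regex.
clause = lucene_regex_clause("uri", r"/pic[a-f0-9]{16}/image[0-9]\.jpg")
body = {"query": {"query_string": {"query": clause}}}

print(clause)
print(json.dumps(body))
```

Note that once the clause is serialized to JSON, each backslash is doubled, so the request body in Sense would contain \\/ rather than \/.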

How do you get ES to recognize the / in regex queries if it's a character
that doesn't have an escape? I see that you can escape it in standard DSL
queries, but apparently not in regex queries?
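For comparison, the DSL `regexp` query takes the bare pattern with no surrounding / delimiters, so the slashes should not need special treatment there. The sketch below is only an illustration: the helper `regexp_query` is hypothetical, and it assumes the uri field is indexed as a single un-analyzed term, because the pattern has to match a whole term.

```python
import json


def regexp_query(field: str, pattern: str) -> dict:
    """Build a regexp query body (hypothetical helper); no /.../ delimiters."""
    return {"query": {"regexp": {field: pattern}}}


# Assumption: "uri" is stored as one un-analyzed term, e.g. the full path.
body = regexp_query("uri", r"/pic[a-f0-9]{16}/image[0-9]\.jpg")

print(json.dumps(body, indent=2))
```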

Thanks,

Jason

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/40dd6e90-3213-40e6-8c58-5c46716acd47%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(JA e) #2

I guess I'll open a bug ticket on GitHub then and see if there are any
thoughts over there.

On Saturday, May 24, 2014 8:50:49 PM UTC-5, JA e wrote:



(system) #3