Cannot escape special characters in query using Java API

(Marek K) #1


I have a field which contains forward slashes (it's an ordinary path). I'm trying to execute this query:

QueryBuilders.termQuery("pageId", QueryParser.escape("/my/field/val"))

and I cannot get any results. When I search for 'val' only, I get the proper results. Any ideas why this is happening? Without escaping it doesn't return any results either.

QueryParser.escape escapes the string properly (it contains "\/" for every slash), but when the request goes to Elasticsearch it's double escaped ("\\/").
Here is the log:

[2015-07-10 01:53:00,063][WARN ][] [Aaa AA] [index_name][4] took[420.8micros], took_millis[0], types[page], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"term":{"pageId":"\\/path\\/and\\/testestest"}}}], extra_source[],
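One possible reading of this log line, sketched below: the source is printed as JSON, and JSON itself writes a single backslash as "\\". Under that reading the term is not escaped twice; it simply contains the literal backslashes that QueryParser.escape added. The class JsonEscapeSketch and its jsonEscape helper are hypothetical, written only for illustration, and are not the serializer Elasticsearch uses.

```java
// Minimal JSON string escaper for illustration only (hypothetical helper,
// not the serializer Elasticsearch uses). It handles just backslash and quote.
class JsonEscapeSketch {

    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\\' || c == '"') {
                out.append('\\'); // JSON requires these two characters to be escaped
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below is the string \/path\/and\/testestest,
        // i.e. a path with one backslash in front of every slash.
        String luceneEscaped = "\\/path\\/and\\/testestest";
        // Prints \\/path\\/and\\/testestest, matching the log.
        System.out.println(jsonEscape(luceneEscaped));
    }
}
```

If this reading is right, the term query looks for a term that really starts with a backslash, which cannot match a path that was indexed without one.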

I have also noticed that it works when I'm using queryString:

QueryBuilders.queryString("pageId:" + QueryParser.escape("/my/field/val"))

but I would rather not use it that way and type everything by hand.

I have also checked whether it works from the console (using cURL):

curl -XGET 'http://localhost:9200/index/_search?q=pageId:\-path\/test'

(Jörg Prante) #2

What is the mapping definition and the analyzer for the field pageId?

When sending termQuery, the field is queried without analyzing. No escape needed. This is for exact match.

queryString is Lucene-based and needs escaping. It analyzes by best effort.
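To illustrate the difference, here is a rough sketch in plain Java. It assumes the default analyzer splits a path on the slashes and lowercases the pieces; the analyze method is a simplified stand-in written for this example, not the real Lucene analyzer, and the class name TermMatchSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for index-time analysis; NOT the real Lucene analyzer.
class TermMatchSketch {

    // Assumption: split on non-alphanumeric characters and lowercase each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{Alnum}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // A term query compares the given term with the indexed tokens as-is.
    static boolean termMatches(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("/my/field/val");
        System.out.println(indexed);                               // [my, field, val]
        System.out.println(termMatches(indexed, "val"));           // true
        System.out.println(termMatches(indexed, "/my/field/val")); // false
    }
}
```

Under these assumptions the whole path never exists as a single term in the index, so an exact match on it finds nothing, while 'val' alone does match.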

(Marek K) #3

Hi jprant, thank you for the answer.

I didn't add any mapping definition or analyzer for the field pageId.
It's the default one.
So even though escaping is not needed, I still get zero results.

Do I always have to add a mapping and analyzer? I assume I should do that when creating the index; I haven't added one separately.
The index was created when I was indexing content.

(Jörg Prante) #4

You are indexing paths, like a/b/c. For this, you need to set up an analyzer. The default one cannot understand paths.

(Marek K) #5

Solution for the issue:

  1. Add mapping to index for pageId. Setting "index": "not_analyzed" in pageId mapping is enough.
  2. Use filters to query pageId.
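
For anyone landing here later, the two steps could look roughly like the request bodies built below. This is only a sketch: the type name "page" is taken from the log in the first post, the class and method names are hypothetical, and the string concatenation assumes the path contains no quotes or backslashes.

```java
// Builds the two JSON request bodies as plain strings (illustrative only).
class PageIdMappingSketch {

    // Step 1: mapping body for the "page" type (type name taken from the log).
    // "not_analyzed" keeps the whole path as a single term in the index.
    static String mappingJson() {
        return "{\"page\":{\"properties\":{\"pageId\":"
             + "{\"type\":\"string\",\"index\":\"not_analyzed\"}}}}";
    }

    // Step 2: search body using a term filter; the path is passed unescaped.
    // Assumes pageId contains no quotes or backslashes.
    static String filterJson(String pageId) {
        return "{\"query\":{\"filtered\":{\"filter\":"
             + "{\"term\":{\"pageId\":\"" + pageId + "\"}}}}}";
    }

    public static void main(String[] args) {
        System.out.println(mappingJson());
        System.out.println(filterJson("/my/field/val"));
    }
}
```

Note that QueryParser.escape is not used here, since a term filter does not go through the query string parser. The mapping has to be in place before the documents are indexed, so an index that was created automatically would likely need to be recreated and reindexed.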
