Kibana search based on Lucene Query Parser Syntax

axelfelix · June 29, 2016, 2:53pm

Hi all,

I'm trying to better understand how a Lucene search behaves when it is run through Kibana.

I'm aware of analyzers, tokenizers, filters, and so on.

An example:

I have a field "test" and I am looking for the value "toto-tata.test.ok".

If I search for test:toto-tata.test.ok, I don't get the expected result.
If I search for test:"toto-tata.test.ok", I get the expected result.

I know that in Lucene, "toto-tata.test.ok" is a phrase.

So what does toto-tata.test.ok without quotes represent for Lucene? How does Kibana search the field "test" when the search value is not quoted?

For information, my _all meta-field is disabled. I have a default field, but it is not the "test" field.

Thanks in advance.
Alex

Joe_Fleming · June 29, 2016, 11:21pm

Great question. I've seen this happen before as well, but I never really understood what was happening, so now I had a real reason to look it up.

When you put something like that in the query bar in Kibana, it passes the contents through to Elasticsearch as the query parameter of a query_string query. Here's a quick example from the relevant part of the request sent to Elasticsearch when using clientip:"186.187.11.181" in the query bar:

"query": {
    "bool": {
        "must": [{
            "query_string": {
                "query": "clientip:\"186.187.11.181\"",
                "analyze_wildcard": true
            }
        }, {
            "range": {
                "@timestamp": {
                    "gte": 1467240398565,
                    "lte": 1467241298565,
                    "format": "epoch_millis"
                }
            }
        }],
        "must_not": []
    }
},

Drop the quotes and the query becomes:

"query": {
    "bool": {
        "must": [{
            "query_string": {
                "query": "clientip:186.187.11.181",
                "analyze_wildcard": true
            }
        }, {
            "range": {
                "@timestamp": {
                    "gte": 1467240403856,
                    "lte": 1467241303857,
                    "format": "epoch_millis"
                }
            }
        }],
        "must_not": []
    }
},
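To reproduce requests like these outside Kibana, here is a minimal Python sketch that wraps the query bar text in the same structure. The function name build_request_query and its parameters are made up for illustration, and the shape is copied only from the fragments above, not from Kibana's source.

```python
import json


def build_request_query(query_bar_text, gte_ms, lte_ms):
    """Wrap query-bar text the way the request fragments above show it."""
    return {
        "bool": {
            "must": [
                {
                    "query_string": {
                        # The query bar text is passed through unchanged.
                        "query": query_bar_text,
                        "analyze_wildcard": True,
                    }
                },
                {
                    "range": {
                        "@timestamp": {
                            "gte": gte_ms,
                            "lte": lte_ms,
                            "format": "epoch_millis",
                        }
                    }
                },
            ],
            "must_not": [],
        }
    }


quoted = build_request_query('clientip:"186.187.11.181"', 1467240398565, 1467241298565)
unquoted = build_request_query("clientip:186.187.11.181", 1467240398565, 1467241298565)

# json.dumps adds the backslashes before the inner quotes, as in the first request.
print(json.dumps({"query": quoted}, indent=4))
```

With the same time range, the two bodies differ only in the query string itself, so any difference in results has to come from how Elasticsearch parses and analyzes that string.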

I'm still not entirely sure how the two queries differ, but perhaps it has to do with the "analyze_wildcard": true setting. According to the docs, "by setting analyze_wildcard to true, an attempt will be made to analyze wildcarded words before searching the term list for matching terms." So it's possible that the analyzer is changing the query, and thus affecting the results.

That's not really an answer, I know, but that's as far as I've been able to get with it. The query string syntax section of the docs just links back to itself. I'd recommend asking in the Elasticsearch section of the forums; someone over there likely has a better understanding of the query syntax.

Joe_Fleming · June 29, 2016, 11:24pm

You know, I bet it has to do with the - in the query. The dash is a reserved character, so if you don't quote the value or escape that character, it's probably not querying like you think it is. I'm not sure what the - means in the query syntax though...

EDIT: Ah, the - apparently negates a single token, which perhaps means that query is looking for toto*.test.ok, where the * is anything but tata?
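If quoting isn't an option, escaping is the other route. Below is a rough Python sketch of a helper that backslash-escapes the characters the query_string docs list as reserved. The name escape_query_string_value is hypothetical, and I haven't verified that the escaped form then matches the way the quoted phrase does, since that still depends on how the field is analyzed.

```python
# Characters the query_string docs list as reserved. "&&" and "||" are
# covered by escaping each "&" and "|" individually.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string_value(value):
    """Backslash-escape reserved characters in a value (hypothetical helper)."""
    # "<" and ">" cannot be escaped at all, so they are dropped here.
    kept = (ch for ch in value if ch not in "<>")
    return "".join("\\" + ch if ch in RESERVED else ch for ch in kept)


escaped = escape_query_string_value("toto-tata.test.ok")
print("test:" + escaped)  # test:toto\-tata.test.ok
```

Only the value is escaped here; the field name and its colon are added afterwards so that the colon keeps its meaning as the field separator.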

axelfelix · June 30, 2016, 10:20am

Hi,

Thanks a lot for your answer. I think you're right, I need to post in the Elasticsearch section.

Regarding the "-" character, I ran some tests and, as far as I can tell, it only negates a single token when there is a space before the "-".
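To make that observation concrete, here is a toy Python model of the rule as I understand it from my tests. It is not the real Lucene parser, and the function name negated_terms is invented for this illustration.

```python
def negated_terms(query):
    """Toy model: a "-" negates a term only at the start of a
    whitespace-separated token. This is not the real Lucene parser."""
    return [tok[1:] for tok in query.split() if tok.startswith("-") and len(tok) > 1]


print(negated_terms("test:toto -tata"))         # ['tata']
print(negated_terms("test:toto-tata.test.ok"))  # []
```

Under this model the "-" inside toto-tata.test.ok is just part of the term text and is not treated as a NOT operator.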

Alex
