Search based on Lucene Query Parser Syntax

Hi all,

I made a post in the Kibana section, "Kibana search based on Lucene Query Parser Syntax", but I think the Elasticsearch section is more appropriate.

I'm trying to better understand how a Lucene search behaves when run through Kibana.

I'm aware of analyzers, tokenizers, filters ...

An example:

I have a field "test" and I am looking for the value "toto-tata.test.ok".

  • If I search for test:toto-tata.test.ok, I don't get the expected result.
  • If I search for test:"toto-tata.test.ok", I get the expected result.

I know that in Lucene, "toto-tata.test.ok" is a phrase.

So what does toto-tata.test.ok without quotes represent for Lucene? How does Kibana search the field "test" when the search value is not quoted?

For information, my _all metafield is disabled. I have a default field, and it is not the "test" field.

Thanks in advance.
Alex

Hey,

if you want to understand how Elasticsearch/Lucene tokenize your field values and split them into single terms, you should take a look at the analyze API, which shows you the terms that are stored in the inverted index. This helps a lot in understanding how search works (especially in combination with synonyms and specific analyzers).
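As a rough sketch of what such a call could look like, the Python snippet below only builds the request for the index-level `_analyze` endpoint and does not send it. The base URL `http://localhost:9200` and the index name `my-index` are assumptions made for illustration; the field name and the text are the ones from the question.

```python
import json
import urllib.request

def build_analyze_request(base_url, index, field, text):
    """Build (but do not send) a request to the index-level _analyze API.

    Passing "field" asks Elasticsearch to use the analyzer configured
    for that field in the index mapping.
    """
    body = {"field": field, "text": text}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical cluster address and index name; adjust to your setup.
req = build_analyze_request(
    "http://localhost:9200", "my-index", "test", "toto-tata.test.ok"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending it with `urllib.request.urlopen(req)` against a real cluster should return the list of tokens produced for that field.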

--Alex

Hi, thanks for your answer,

but as explained in my post:

I'm aware of analyzers, tokenizers, filters ...

So I already know how my terms are stored and which tokens are associated with those terms.

The question is: I know that when I make a request on a specific field, like test:"toto-tata.test.ok", the value "toto-tata.test.ok" is recognized as a phrase by Lucene. But how is it interpreted if I remove the quotes?

Thanks in advance.

Hey,

a bool query is created, with one clause per term, see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax
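One way to check what gets built for each variant is the validate API with explain enabled, which returns the Lucene query that Elasticsearch produced. Here is a minimal Python sketch of the request bodies; the index name `my-index` and the default field `message` are hypothetical and do not come from this thread.

```python
import json

# Hypothetical names for illustration; replace with your own index and default field.
INDEX = "my-index"
DEFAULT_FIELD = "message"

# Path of the validate API; explain=true asks Elasticsearch to describe
# the Lucene query it built from the query string.
VALIDATE_PATH = f"/{INDEX}/_validate/query?explain=true"

def query_string_body(query):
    """Wrap a Lucene-syntax string in a query_string query."""
    return {
        "query": {
            "query_string": {"query": query, "default_field": DEFAULT_FIELD}
        }
    }

# The three variants discussed in this thread.
variants = [
    'test:toto-tata.test.ok',
    'test:"toto-tata.test.ok"',
    'test:(toto-tata.test.ok)',
]

for q in variants:
    print("POST", VALIDATE_PATH)
    print(json.dumps(query_string_body(q), indent=2))
```

Comparing the explanation returned for each body should show whether a variant ends up as a phrase query or as separate term clauses.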

Is that what you needed to know or is there another question hidden?

--Alex

Hey,

I think I found the answer in your link:

As mentioned in Query String Query, the default_field is searched for the search terms, but it is possible to specify other fields in the query syntax:

  • where the status field contains active
        status:active
  • where the title field contains quick or brown. If you omit the OR operator the default operator will be used
        title:(quick OR brown)
        title:(quick brown)
  • where the author field contains the exact phrase "john smith"
        author:"John Smith"

So if I understand correctly, when I search for author:"John Smith", I am looking for documents that contain exactly the tokens [John, Smith].

But if I search for author:John Smith, I am looking for documents that contain the token John OR Smith.

Do you agree with this conclusion?

Thanks in advance,
Alex

EDIT: I ran some tests, but it doesn't work like that for my example.

When I try test:"toto-tata.test.ok", I get the same result as with test:(toto-tata.test.ok), but a totally different result from test:toto-tata.test.ok
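For reference, in the Lucene query syntax a field prefix applies only to the single clause that directly follows it (a term, a quoted phrase, or a parenthesized group). The toy Python sketch below illustrates just that scoping rule. It is not the Lucene parser, it ignores operators and escaping, and the default field name `message` is a hypothetical placeholder.

```python
import re

# Toy illustration only: an optional "field:" prefix followed by a quoted
# phrase, a parenthesized group, or a run of non-whitespace characters.
CLAUSE = re.compile(r'(?:(\w+):)?("[^"]*"|\([^)]*\)|\S+)')

def scope_clauses(query, default_field):
    """Return (field, clause) pairs, falling back to default_field
    for clauses that carry no explicit field prefix."""
    return [
        (field or default_field, body)
        for field, body in CLAUSE.findall(query)
    ]

# "message" is a hypothetical default field.
for q in ['author:John Smith',
          'author:"John Smith"',
          'author:(John Smith)',
          'test:toto-tata.test.ok']:
    print(q, "->", scope_clauses(q, "message"))
```

In this sketch, author:John Smith yields one clause on author and one on the default field, while test:toto-tata.test.ok stays a single clause on test. How that single clause is then split into tokens, and how those tokens are combined, depends on the analyzer of the field and on the Elasticsearch version, which this sketch does not model.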