I'm new to Elasticsearch, and I would simply like to write a query that returns only the entries that have exactly all of the supplied values (an AND, not an OR).
A phrase query is typically used when you want to match the token "Area" followed by the token "A" anywhere in a long analyzed string, so "Foo Area A Bar" will match too.
I think your requirement is an "exact match", which is achieved with a term query. For an exact match there is no need to compute a score, so you can either sort on some field (_doc if no specific sort order is required) or wrap the query in a constant_score query to avoid the scoring overhead.
It is best to profile your queries by adding "profile": true to the request, so you can see what the breakdown looks like.
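To make the term query suggestion concrete, here is a rough sketch in Python that only builds the request body as a dict. The field name "areas" is hypothetical and assumed to be mapped as a keyword field; adjust it to your mapping. Each value gets its own term clause, so a document has to contain all of them (AND).

```python
import json


def exact_all_values_query(field, values):
    """Build a search body matching documents whose `field` holds every value."""
    # One term clause per value: clauses in a bool filter are ANDed together.
    term_clauses = [{"term": {field: value}} for value in values]
    return {
        "query": {
            # constant_score skips relevance scoring for the wrapped filter.
            "constant_score": {"filter": {"bool": {"filter": term_clauses}}}
        }
    }


# "areas" is a hypothetical keyword field, used here for illustration only.
body = exact_all_values_query("areas", ["Area A", "Area C"])
print(json.dumps(body, indent=2))
```

Note that this matches documents containing all the supplied values; it does not by itself exclude documents that also hold additional values.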
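As a small illustration of the profiling tip, the sketch below adds the top-level "profile" flag to an existing request body. The helper name with_profile and the field "areas" are made up for this example.

```python
import copy
import json


def with_profile(body):
    """Return a copy of a search body with profiling switched on."""
    profiled = copy.deepcopy(body)  # leave the caller's dict untouched
    profiled["profile"] = True      # serialized as "profile": true
    return profiled


# Hypothetical query on a hypothetical keyword field "areas".
search_body = {"query": {"bool": {"filter": [{"term": {"areas": "Area A"}}]}}}
profiled_body = with_profile(search_body)
print(json.dumps(profiled_body, indent=2))
```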
Thanks, I now understand the difference between the term and match_phrase queries.
But when you say:
For exact match there is no need to compute score. So you sort on some field (_doc if no specific sorting required) or wrap it in a constant score query to avoid scoring overhead.
I don't understand what you mean here. The "max_score" is indeed part of the output when I run the DSL query in Kibana. Is it not relevant at all in my case?
@Nicolas_Rey
The score is required if you want Elasticsearch to determine how well each document matches. Elasticsearch sorts documents by score so that the most relevant documents show up at the top. The score is important if you care about:
Partial matches of your query, for example a document field that has "Area" but not "A".
What else is in the field besides what you searched for, for example "Foo Area A bar".
There are other scenarios too.
If you use your match_phrase query and search for "Area A" and "Area C", the user 2 document will show up at the top because it matches better than the user 1 document.
You can experiment by running both queries on your data, using "explain": true to understand why one document is preferred over another.
I'm not sure why Kibana is still showing a score after you added the sort. It should be null, or in the case of a constant_score query it should be 1 for all matches.
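For that experiment, something like the following could be used to build both request bodies with "explain" enabled. This is only a sketch: the field names "areas" (analyzed text) and "areas.keyword" are guesses, and combining the phrases with should clauses is an assumption about the original query, which is not shown in this thread.

```python
import json

VALUES = ["Area A", "Area C"]

# Scored variant: match_phrase on a hypothetical analyzed text field "areas".
phrase_body = {
    "explain": True,
    "query": {
        "bool": {"should": [{"match_phrase": {"areas": v}} for v in VALUES]}
    },
}

# Exact variant: term clauses on a hypothetical keyword sub-field.
term_body = {
    "explain": True,
    "query": {
        "bool": {"filter": [{"term": {"areas.keyword": v}} for v in VALUES]}
    },
}

for request_body in (phrase_body, term_body):
    print(json.dumps(request_body, indent=2))
```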
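For reference, the sort variant might look like the sketch below, again with a hypothetical keyword field "areas". Sorting on _doc returns hits in index order, so no relevance ranking is needed.

```python
import json

# Hypothetical keyword field "areas"; both terms must match (AND).
sorted_body = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"areas": "Area A"}},
                {"term": {"areas": "Area C"}},
            ]
        }
    },
    # Sort by index order instead of by relevance score.
    "sort": ["_doc"],
}
print(json.dumps(sorted_body, indent=2))
```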
If I do that (put the term queries in the filter clause instead of the must clause), will I get exactly the same output (i.e. the same documents will be returned)?
If you don't need the score, why waste CPU cycles computing it? For a single query it is not going to save you a lot, but if this query is called many times, it adds up.
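To show the must versus filter comparison side by side, this sketch builds both variants from the same term clauses (field name "areas" is hypothetical). Both clause types require every term to match, so they select the same documents; the difference is that must contributes to the score while filter does not.

```python
import json


def bool_terms_query(clause, field, values):
    """Build a bool query that puts one term clause per value under `clause`."""
    return {"query": {"bool": {clause: [{"term": {field: v}} for v in values]}}}


values = ["Area A", "Area C"]
scored_body = bool_terms_query("must", "areas", values)      # scores are computed
unscored_body = bool_terms_query("filter", "areas", values)  # no scoring

# The matching conditions are identical; only the clause type differs.
same_conditions = (
    scored_body["query"]["bool"]["must"]
    == unscored_body["query"]["bool"]["filter"]
)
print(same_conditions)
print(json.dumps(unscored_body, indent=2))
```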