Hi,
Is there a difference, in terms of performance, between exists query/filter and missing query/filter?
Thanks,
Israel
In general, Elasticsearch prefers positive queries, which it can execute more efficiently thanks to its inverted index. So exists queries perform better than missing queries, just as Elasticsearch can find all documents that contain foobar more quickly than all documents that do NOT contain it. Also see https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-exists-query.html#_literal_missing_literal_query: you should replace your usage of missing queries with exists queries in must_not clauses.
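As a rough illustration of that replacement, here is a minimal sketch that builds the query body as a plain Python dict and prints it as JSON. The helper name missing_as_must_not_exists and the field name "example" are made up for this example; only the bool, must_not and exists keys come from the query DSL.

```python
import json


def missing_as_must_not_exists(field):
    """Hypothetical helper: express "field is missing" as must_not + exists."""
    return {"bool": {"must_not": [{"exists": {"field": field}}]}}


# Query body that could stand in for {"missing": {"field": "example"}}.
query = missing_as_must_not_exists("example")
print(json.dumps(query))
```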
thanks!!
My common use case here is "match if missing or false" (or maybe "match if missing or 0"), looking for the opposite of {"example": true}. This is quite easy to write as
{"bool": {"should": [{"missing": {"field": "example"}}, {"term": {"example": false}}]}}
(maybe with a minimum_should_match if other conditions are there as well), but that's now deprecated. The alternative with must_not and exists seems to require nested bools to achieve and is much more complicated to express. Or am I missing something?
You aren't missing anything. The query language is designed to mirror the underlying implementation so that you don't get any surprises, and it can be verbose because of that. I wasn't around when that decision was made, but I expect it was made because a more complete abstraction could very easily hide performance issues, hence the less abstract DSL.
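For reference, the nested form being discussed ("missing or false" without the missing query) might look like the sketch below. It only builds the query body as a Python dict; the helper name missing_or_false is hypothetical, and the field name "example" is the one used earlier in the thread.

```python
import json


def missing_or_false(field):
    """Hypothetical helper: match documents where `field` is absent or false."""
    return {
        "bool": {
            "should": [
                # "field is missing", written as must_not + exists
                {"bool": {"must_not": [{"exists": {"field": field}}]}},
                # "field is false"
                {"term": {field: False}},
            ]
        }
    }


query = missing_or_false("example")
print(json.dumps(query, indent=2))
```

As noted earlier in the thread, a minimum_should_match may also be wanted when the outer bool carries other conditions.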
Could you do {"bool": {"must_not": [{"term": {"example": true}}]}}?
I could indeed, thank you. It gives exactly the same result in the case I am looking at. I had kind of assumed that for a field to be not-equal to a value, it also had to exist.
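To make that last point concrete, here is a toy in-memory model of the must_not/term query run against three sample documents. It is only a sketch of the matching semantics, not how Elasticsearch executes queries; the matches function and the sample documents are invented for this illustration.

```python
def matches(doc, query):
    """Toy matcher for `term` and `bool`/`must_not` only (illustrative, not Elasticsearch)."""
    (kind, body), = query.items()
    if kind == "term":
        (field, value), = body.items()
        # A term query can only match a document that actually has the field.
        return field in doc and doc[field] == value
    if kind == "bool":
        # must_not: the document matches only if none of the clauses match.
        return not any(matches(doc, q) for q in body.get("must_not", []))
    raise ValueError("unsupported query type: " + kind)


# Made-up sample documents: true, false, and field absent.
docs = [
    {"id": 1, "example": True},
    {"id": 2, "example": False},
    {"id": 3},
]

query = {"bool": {"must_not": [{"term": {"example": True}}]}}
matched_ids = [d["id"] for d in docs if matches(d, query)]
print(matched_ids)
```

Under this model the documents with example set to false and with no example field both match, because must_not only excludes documents where the term clause matches. The shortcut relies on the field being boolean: for a field with more than two possible values, must_not on a single term would also match every other value.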