Exists vs missing performance


Is there a difference, in terms of performance, between exists query/filter and missing query/filter?



In general, Elasticsearch prefers positive queries, which it can execute more efficiently thanks to its inverted index. So exists queries perform better than missing queries, just as Elasticsearch can find all documents that contain foobar more quickly than all documents that do NOT contain it. Also see https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-exists-query.html#_literal_missing_literal_query: you should replace your usage of missing queries with exists queries inside must_not clauses.
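As a rough illustration of that replacement, here is a minimal sketch that builds the must_not/exists form as a plain Python dict. The helper name missing_as_must_not_exists is made up for this example and is not part of any Elasticsearch client; the field name "example" is just a placeholder.

```python
import json


def missing_as_must_not_exists(field):
    """Build a bool query matching documents that have no value for `field`.

    Hypothetical helper for illustration only; it just returns the query
    body that would stand in for {"missing": {"field": field}}.
    """
    return {"bool": {"must_not": [{"exists": {"field": field}}]}}


query = missing_as_must_not_exists("example")
print(json.dumps(query))
```

The resulting dict can be serialized and sent as a query body the same way the missing query was.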


My common use case here is "match if missing or false" (or maybe "match if missing or 0"), looking for the opposite of {"example": true}. This is quite easy to write as
{"bool": {"should": [{"missing": {"field": "example"}}, {"term": {"example": false}}]}}
(maybe with a minimum_should_match if other conditions are there as well), but the missing query is now deprecated. The alternative with must_not and exists seems to require nested bool queries and is much more complicated to express. Or am I missing something?
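For reference, the nested form described above might look like the following sketch, written as a Python dict so it can be printed as JSON. The field name "example" comes from the question; the explicit minimum_should_match of 1 is an assumption added here to make the "either clause" intent obvious when other clauses are present.

```python
import json

# "match if missing or false" without the deprecated missing query:
# the missing half becomes a nested bool with must_not + exists.
missing_or_false = {
    "bool": {
        "should": [
            {"bool": {"must_not": [{"exists": {"field": "example"}}]}},
            {"term": {"example": False}},
        ],
        # Assumption: require at least one should clause to match.
        "minimum_should_match": 1,
    }
}

print(json.dumps(missing_or_false, indent=2))
```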

You aren't missing anything. The query language is designed to mirror the
underlying implementation so that you don't get any surprises, and it can be
verbose as a result. I wasn't around when that decision was made, but I
expect a more complete abstraction could very easily hide performance
issues, hence the less abstract DSL.

Could you do {"bool": {"must_not": [{"term": {"example": true}}]}}?

I could indeed, thank you. It gives exactly the same result in the case I am looking at. I had more or less assumed that for a field to count as not equal to a value, it also had to exist.
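To see why the two forms agree here, a toy in-memory model may help. This is only a sketch of the matching logic under the assumption that the field is a single-valued boolean that is either true, false, or absent; it does not talk to Elasticsearch, and the helper functions (term, exists, must_not, should) are made up for this illustration.

```python
# Three sample documents: true, false, and field missing.
docs = [
    {"id": 1, "example": True},
    {"id": 2, "example": False},
    {"id": 3},
]


def term(field, value):
    # A term clause only matches documents that have the field with that value.
    return lambda doc: field in doc and doc[field] == value


def exists(field):
    return lambda doc: field in doc


def must_not(pred):
    # must_not keeps every document the inner clause does NOT match,
    # including documents where the field is absent.
    return lambda doc: not pred(doc)


def should(*preds):
    return lambda doc: any(p(doc) for p in preds)


def matching_ids(pred):
    return [d["id"] for d in docs if pred(d)]


simple = must_not(term("example", True))
verbose = should(must_not(exists("example")), term("example", False))

print(matching_ids(simple))   # [2, 3]
print(matching_ids(verbose))  # [2, 3]
```

Note that the equivalence depends on the field having only those two values: for "missing or 0" on a numeric field, must_not on a single term would also let through documents holding any other number.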