Query with track_total_hits:true faster than track_total_hits:false

Hey Adrien :wave: ,

:wave: :slight_smile:

You are correct, it wouldn't help in the current state.

Is there anything I could optimize on the query to improve the behaviour or give Lucene/ES any hints about it?

Not much... The only idea that comes to mind is trying to move some SHOULD clauses to MUST clauses if they match all documents... which may not apply to you.
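To make that suggestion concrete, here is a minimal sketch of the rewrite as plain Python dicts for a search request body. The field names (`title`, `status`, `body`) and the helper `promote_should_to_must` are hypothetical, and it assumes the `status` clause really does match every document in the index, which only you can verify for your data.

```python
import copy

def promote_should_to_must(body, index):
    """Return a copy of the search body with one SHOULD clause moved to MUST.

    Hypothetical helper: assumes `must` and `should` are lists and that the
    clause at `index` matches every document, so the hit set does not change.
    """
    new_body = copy.deepcopy(body)
    bool_query = new_body["query"]["bool"]
    clause = bool_query["should"].pop(index)
    bool_query.setdefault("must", []).append(clause)
    if not bool_query["should"]:
        # Drop the now-empty list so the request stays tidy.
        del bool_query["should"]
    return new_body

original = {
    "track_total_hits": False,
    "query": {
        "bool": {
            "must": [{"match": {"title": "lucene"}}],
            "should": [
                {"term": {"status": "published"}},  # assumed to match all docs
                {"match": {"body": "pruning"}},
            ],
        }
    },
}

rewritten = promote_should_to_must(original, 0)
```

If the query sets `minimum_should_match`, moving a clause out of `should` changes how that setting is evaluated, so this rewrite would need to be checked against it first.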

Rereading your post, you wrote that the query is not great at dynamically pruning hits. However, it's not only not great but actually slower than tracking total hits, and that's what surprises me most. I'd be totally fine with equal performance in this case.

You are right, it's very disappointing. We've been working hard on fixing it for pure disjunctions (we used to have a similar issue where enabling hit counting could make these queries faster, typically when there were many high-frequency clauses). We should look into these queries as well. Typically the problem is that we do more work for dynamic pruning, but that work doesn't actually save us from evaluating docs.

We should look deeper into these queries that mix SHOULD/MUST clauses now. In case you're interested in giving it a try, a starting point would be to make WANDScorer accept required clauses in addition to optional clauses and find a way to make it work with them.
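To illustrate the "more work but no savings" point, here is a toy sketch of score-bound pruning in Python. It is not Lucene's WANDScorer and makes no claim about how Lucene implements it. The function name, the data layout, and the clause names are all made up for illustration. A doc is skipped only when the sum of the per-clause maximum scores for the clauses it matches cannot beat the current k-th best score.

```python
import heapq

def top_k_with_pruning(docs, max_scores, k):
    """Toy top-k collection with an upper-bound check before scoring.

    docs: list of (doc_id, {clause_name: score}) for the clauses each doc matches.
    max_scores: {clause_name: upper bound on that clause's score}.
    Returns (top hits as (score, doc_id) sorted best first, number of docs scored).
    """
    heap = []   # min-heap holding the current top k as (score, doc_id)
    scored = 0  # how many docs were fully scored
    for doc_id, clause_scores in docs:
        # Extra bookkeeping that pruning requires: compute the best possible score.
        bound = sum(max_scores[c] for c in clause_scores)
        if len(heap) == k and bound <= heap[0][0]:
            continue  # cannot enter the top k, so skip full scoring
        scored += 1
        score = sum(clause_scores.values())
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), scored

# Tight bounds: after the first doc, every other doc is skipped.
docs = [(1, {"a": 2.0, "b": 1.0}), (2, {"a": 0.5}),
        (3, {"b": 0.2}), (4, {"a": 1.9, "b": 0.9})]
hits, scored = top_k_with_pruning(docs, {"a": 2.0, "b": 1.0}, k=1)

# Add a clause "r" that matches every doc and has a loose upper bound:
# the bound check still runs for each doc, but nothing gets skipped.
docs_r = [(d, {**s, "r": 0.1}) for d, s in docs]
hits_r, scored_r = top_k_with_pruning(docs_r, {"a": 2.0, "b": 1.0, "r": 5.0}, k=1)
```

In the second run the bound is computed for every doc and every doc is still scored, which is the kind of overhead without payoff described above.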
