Filter by current score

Assume I have a query that is quite slow because its bool query uses many should clauses (e.g. location-based decay functions) and runs over many documents. To speed it up, I would like to reformulate the query hierarchically: cheap clauses, like a match in the should clause, are executed on all documents, while more expensive clauses, like 2D decay functions, are executed only on the top results (by score) returned by the cheaper ones. So it works like a pyramid: at each level, the next level is executed only on the top n or top x% of results (by score) from the current level.

How would I accomplish something like this? I read about the deprecated filter: {limit: {value: 10}} query, but it didn't work for me and I don't want to start using deprecated features. The terminate_after parameter in the search API is interesting but seems to work only at the top level of the whole query. There is also the min_score parameter, but that would require knowing the range of my scores at query construction time, which in most cases is not possible. I'd also like to avoid scripts if possible, since on many production ES installations this feature is disabled due to security concerns.

Does anyone know how I would do that? Thanks!

Hi,

maybe the Rescoring functionality can help you here. You can place all "cheap" query conditions (like filters etc.) in the main query and then re-arrange the top-N results based on a more expensive second query that is only executed on the results of the first. It is also possible to execute multiple rescores in sequence, although I haven't tried that one yet.
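
For illustration, here is a rough sketch of what such a request body could look like, built as a plain Python dict so it can be sent with any client. The field names `title` and `location`, the helper `build_rescore_search`, the `2km` scale and the weights are all made up for the example, so adjust them to your mapping. The cheap match runs on all documents and the decay function is only evaluated inside the rescore window.

```python
import json


def build_rescore_search(text, lat, lon, window_size=100):
    """Hypothetical helper: cheap query on all docs, expensive decay on the top hits only."""
    # Level 1: cheap clause, evaluated for every matching document.
    # "title" is an assumed field name.
    cheap_query = {"match": {"title": text}}

    # Level 2: expensive geo decay, evaluated only inside the rescore window.
    # "location" is an assumed geo_point field; the scale is an arbitrary choice.
    expensive_query = {
        "function_score": {
            "query": {"match_all": {}},
            "functions": [
                {
                    "gauss": {
                        "location": {
                            "origin": {"lat": lat, "lon": lon},
                            "scale": "2km",
                        }
                    }
                }
            ],
        }
    }

    return {
        "query": cheap_query,
        "rescore": {
            # Number of top hits (per shard) that get rescored.
            "window_size": window_size,
            "query": {
                "rescore_query": expensive_query,
                # Example weights for combining the original and rescore scores.
                "query_weight": 1.0,
                "rescore_query_weight": 2.0,
            },
        },
    }


body = build_rescore_search("pizza", 52.52, 13.40)
print(json.dumps(body, indent=2))
```

Note that window_size is a fixed count applied on each shard, so this covers the "top n" case but not "top x%". For more than two pyramid levels, "rescore" can also be given as a list of such objects that are applied in order.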

Really cool, that is exactly what I was looking for, and it speeds up my query dramatically. Thanks! How could I miss that in the docs? D'oh!
