Performance of sorting on nested field

My use case required me to return the documents that have nested objects first, sorted on a nested field, followed by everything else scored normally. I wanted to solve this with two sort values: the nested sort first and _score second. Unfortunately this turned out to be extremely slow for the reasons mentioned above (my searches returned millions of results, and only a handful of those documents contained nested fields).

My solution was to perform two distinct searches at the application level (still in a way that was completely transparent to the caller):

  • first filter for documents containing the nested object you want to search on, and sort them: this is going to be fast since you are ensuring that the nested documents are actually there;
  • then, if necessary, perform a second search for the rest of the documents and return them after the first set.
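
To make the two steps above concrete, here is a rough sketch in Python of how the application-level logic could look. It is not the code I actually ran: the field names (`offers`, `offers.price`), the helper names, and the `search` callable are all made up for illustration, and the request bodies assume the Elasticsearch 2.x query DSL (`bool` with `filter`, and `nested_path` in the sort), so adjust them to your version and mapping.

```python
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]
SearchFn = Callable[[Query], Dict[str, Any]]

# Hypothetical mapping: "offers" is a nested field, sorted on "offers.price".
NESTED_PATH = "offers"
SORT_FIELD = "offers.price"


def has_nested() -> Query:
    """Clause matching only documents that have the nested sort field."""
    return {"nested": {"path": NESTED_PATH,
                       "query": {"exists": {"field": SORT_FIELD}}}}


def nested_phase(user_query: Query, offset: int, size: int) -> Query:
    """Step 1: only documents with nested objects, sorted on the nested field."""
    return {
        "from": offset,
        "size": size,
        "query": {"bool": {"must": [user_query], "filter": [has_nested()]}},
        "sort": [{SORT_FIELD: {"order": "asc", "nested_path": NESTED_PATH}},
                 "_score"],
    }


def rest_phase(user_query: Query, offset: int, size: int) -> Query:
    """Step 2: everything else, in the default _score order."""
    return {
        "from": offset,
        "size": size,
        "query": {"bool": {"must": [user_query], "must_not": [has_nested()]}},
    }


def total_hits(response: Dict[str, Any]) -> int:
    """hits.total is a plain number in older versions, an object in newer ones."""
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def two_phase_search(search: SearchFn, user_query: Query,
                     offset: int = 0, size: int = 10) -> List[Dict[str, Any]]:
    """Return one page of hits: nested-sorted documents first, then the rest."""
    first = search(nested_phase(user_query, offset, size))
    hits = list(first["hits"]["hits"])
    if len(hits) >= size:
        return hits  # the page is full, no second search needed

    # Skip the part of the offset already covered by the nested documents.
    rest_offset = max(0, offset - total_hits(first))
    second = search(rest_phase(user_query, rest_offset, size - len(hits)))
    return hits + list(second["hits"]["hits"])
```

Here `search` is whatever function sends a request body to your cluster and returns the parsed response, so the caller only ever sees `two_phase_search` and does not need to know that two requests may be involved. The second request is skipped whenever the first one already fills the page.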

IMPORTANT
After we deployed this system, memory consumption skyrocketed: the only solution was to move to Elasticsearch 2.1 (where the memory issue is completely solved). I have the impression that the memory issues were caused by nested sorting, but I did not have the resources to investigate this thoroughly. Just a warning.
