Paging for multisearch (from & size are not quite doing it)


I have a question regarding using a multisearch across many indexes with from & size parameters for the purposes of paging.

I have many indexes that I want to combine into a multisearch. The problem I'm having is that "from" & "size" appear to be applied to each index individually & then the results from each index are combined. As a result, pages after the first contain suboptimal results if the score distributions or the numbers of results differ significantly between indexes. From googling, this is known & expected behaviour & there are good performance-related reasons for it not to be changed.
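To make the problem concrete, here is a small in-memory simulation (no Elasticsearch involved). The index names, document ids and scores are made up for illustration, and `per_index_page` is a hypothetical helper that mimics applying "from" & "size" to each index before combining.

```python
# Made-up data: each index holds (doc_id, score) pairs, already sorted by score.
HITS_BY_INDEX = {
    "books": [("b1", 9.0), ("b2", 8.0), ("b3", 7.0), ("b4", 6.0)],
    "films": [("f1", 3.0), ("f2", 2.0), ("f3", 1.0), ("f4", 0.5)],
}


def per_index_page(hits_by_index, from_, size):
    """Slice every index with the same from/size, then combine by score."""
    combined = []
    for hits in hits_by_index.values():
        combined.extend(hits[from_:from_ + size])
    return sorted(combined, key=lambda hit: hit[1], reverse=True)


page_1 = per_index_page(HITS_BY_INDEX, from_=0, size=2)
page_2 = per_index_page(HITS_BY_INDEX, from_=2, size=2)

# The weakest hit on page 1 scores lower than the best hit on page 2,
# which is the "suboptimal pages" effect described above.
worst_on_page_1 = min(score for _, score in page_1)
best_on_page_2 = max(score for _, score in page_2)
print(page_1, page_2, worst_on_page_1 < best_on_page_2)
```

With these numbers, page 1 mixes strong "books" hits with weak "films" hits, while better "books" hits are pushed to page 2.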

Is there an obvious solution to this that I've missed?

One idea is that I could combine all of the indexes into one big index (which is how I originally had it), but that seemed to give me worse search performance than separating them. The things in each index are also structurally different (but obviously not so structurally different that they can't be used in a multisearch & the results combined in a meaningful way). Also, which indexes get combined in a search is determined by the clients of the service I'm building. Nevertheless, I could trivially (LOL) replace the use of multiple indexes with a single index and use a keyword field to discriminate between the different types of things, & make that part of the search.
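For the single-index variant, the request body might look roughly like the sketch below. The field names `doc_type` (the keyword discriminator) and `title` (the searched text field) are assumptions for illustration, as is the helper `build_search_body`; only the `bool` / `match` / `terms` query structure is standard query DSL.

```python
def build_search_body(text, doc_types, from_, size):
    """Build a search body for one combined index.

    `doc_type` and `title` are hypothetical field names; substitute
    whatever the real mapping uses.
    """
    return {
        "from": from_,
        "size": size,
        "query": {
            "bool": {
                # Scored part of the query.
                "must": [{"match": {"title": text}}],
                # Non-scoring restriction to the types the client asked for.
                "filter": [{"terms": {"doc_type": list(doc_types)}}],
            }
        },
    }


body = build_search_body("dune", ["book", "film"], from_=20, size=10)
print(body)
```

Because this is a single search, "from" & "size" would apply to the combined, globally sorted result list rather than to each type separately.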

Another idea is to use function score queries to help with the paging, but this only works if the score can be manipulated after it is calculated by Elasticsearch. This would work by not using "from", but instead indicating the maximum score the page is interested in, & the size of the result set it wants. The function score query would take the Elasticsearch score for each result &, if it is larger than the maximum required, reset it to 0, thus pushing it down to the bottom of the search results & exposing the results that are close to, but less than, the maximum required score at the top of the list for each index's results. I'd still end up with "size" * (the number of indexes) results, but I could post filter that to take the top "size" & note the lowest score in order to use as the max score in the next page.
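The post-filtering step of this idea could be sketched as follows. This only models the client-side part in plain Python: the per-index result lists stand in for whatever each search returns, and `next_page` is a hypothetical helper. One caveat the sketch makes visible: with a strict "less than the maximum" rule, results that tie with the cursor score would be skipped on the next page.

```python
def next_page(results_by_index, max_score, size):
    """Take the top `size` hits scoring below `max_score` across all indexes.

    Returns the page plus the lowest score on it, to be used as the
    max score for the following page (None when there are no more hits).
    """
    candidates = [
        (score, index, doc_id)
        for index, hits in results_by_index.items()
        for doc_id, score in hits
        if max_score is None or score < max_score
    ]
    candidates.sort(key=lambda c: c[0], reverse=True)
    page = candidates[:size]
    lowest = page[-1][0] if page else None
    return page, lowest


# Made-up results for two indexes.
results = {
    "books": [("b1", 9.0), ("b2", 8.0), ("b3", 7.0)],
    "films": [("f1", 8.5), ("f2", 3.0)],
}

first, cursor = next_page(results, max_score=None, size=2)
second, cursor_2 = next_page(results, max_score=cursor, size=2)
print(first, cursor, second, cursor_2)
```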

Is that possible? Sane?

Yet another idea is that my service returns details about the "from" & "size" for each index I want to include in the multisearch. I'd have to calculate these for each index based on the current "from" for that index and the number of items in the combined page that came from that index. This is obviously doable, but imposes more work on the clients of the service I'm building.
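The bookkeeping for this per-index "from" approach might look something like the sketch below. It is an in-memory model only: the lists stand in for each index's score-sorted hits, and `merge_page` is a hypothetical helper. Each index is asked for `size` hits starting at its own offset, the top `size` overall form the page, and each index's offset advances by the number of its hits that were actually used.

```python
def merge_page(hits_by_index, offsets, size):
    """Build one combined page and return the updated per-index offsets."""
    candidates = []
    for index, hits in hits_by_index.items():
        start = offsets.get(index, 0)
        # Stand-in for a search on `index` with from=start, size=size.
        for doc_id, score in hits[start:start + size]:
            candidates.append((score, index, doc_id))
    candidates.sort(key=lambda c: c[0], reverse=True)
    page = candidates[:size]

    new_offsets = dict(offsets)
    for _, index, _ in page:
        new_offsets[index] = new_offsets.get(index, 0) + 1
    return page, new_offsets


# Made-up, score-sorted hits per index.
hits = {
    "books": [("b1", 9.0), ("b2", 8.0), ("b3", 7.0), ("b4", 6.0)],
    "films": [("f1", 8.5), ("f2", 3.0)],
}

page_a, offsets_a = merge_page(hits, {}, size=2)
page_b, offsets_b = merge_page(hits, offsets_a, size=2)
print(page_a, offsets_a, page_b, offsets_b)
```

The returned offsets are the extra state the clients would need to send back with the next request.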

Any help will be greatly appreciated. Even if you tell me I'm talking nonsense.
