Spreading expensive aggregations over a multi search, is it faster?


(dandric) #1

Hi!

I was wondering if there is any benefit in spreading aggregations into their individual searches as part of a multi search as opposed to bundling them up as part of a single query? More precisely, will all the queries and their aggregations try to run in parallel when part of a multi search?

So instead of Search(filter, aggs1, aggs2, ... aggsN) we have MultiSearch[(filter, aggs1), (filter, aggs2), ... (filter, aggsN)].
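For illustration, here is a minimal sketch of how the second form could be built as a request body for the `_msearch` endpoint, which takes newline-delimited JSON (a header line, then a body line, per search). The helper name `build_msearch_body`, the index name, and the field names are hypothetical. The `filtered` query and `"size": 0` are assumptions chosen to match a 1.x cluster.

```python
import json


def build_msearch_body(index, filter_clause, aggs):
    """Build an NDJSON _msearch body with one search per aggregation.

    Every sub-search repeats the same filter and carries exactly one
    aggregation, i.e. MultiSearch[(filter, aggs1), ..., (filter, aggsN)].
    """
    lines = []
    for name, agg in aggs.items():
        # Header line: which index this sub-search targets.
        lines.append(json.dumps({"index": index}))
        # Body line: the shared filter plus a single aggregation.
        # "size": 0 skips returning hits, since only aggregations are wanted.
        lines.append(json.dumps({
            "size": 0,
            "query": {"filtered": {"filter": filter_clause}},
            "aggs": {name: agg},
        }))
    # The _msearch body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical filter and aggregations, for illustration only.
example_body = build_msearch_body(
    "logs",
    {"term": {"status": "error"}},
    {
        "by_host": {"terms": {"field": "host"}},
        "avg_latency": {"avg": {"field": "latency_ms"}},
    },
)
```

The resulting string would be sent as the body of a single `_msearch` request. The single-query form would instead send one body with all N entries under `aggs`.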


(Mark Walkom) #2

I can't see why it would be, to be honest. You'd need to "merge" the results manually in your app code to get the same thing, which seems like more work.

Yep.

What makes you ask this?


(dandric) #3

We have some pretty gnarly query times right now for generating our aggregations. Individual aggregations are taking multiple seconds to finish.

So say we have a query that spits out some aggregations. Let's say it took ~40s to compute 6 aggregations. I'm investigating whether it would be faster to run 6 filtered queries with 1 aggregation each in a multi search, instead of a single filtered query with 6 aggregations.

The real question is, does a filtered query with a bunch of aggregations run those aggregations in parallel anyways? If so then I lose any performance benefit I would have had with the multi search.

Our prod ES cluster is v1.4.4

As for merging the results, computationally it is trivial when compared to the total time to compute a single aggregation.
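As a rough sketch of that merge step: assuming each sub-search carries one uniquely named aggregation, the per-search `aggregations` objects in the multi search response can be folded into one dictionary. The function name `merge_msearch_aggs` is hypothetical. The input is assumed to be the already-parsed JSON response, whose `responses` array holds one entry per sub-search.

```python
def merge_msearch_aggs(msearch_response):
    """Combine the aggregations of every sub-search into a single dict.

    Assumes aggregation names are unique across sub-searches; a repeated
    name would silently overwrite the earlier result.
    """
    merged = {}
    for position, response in enumerate(msearch_response["responses"]):
        # A failed sub-search is reported inline rather than failing the
        # whole request, so check each entry individually.
        if "error" in response:
            raise RuntimeError(
                "sub-search %d failed: %s" % (position, response["error"])
            )
        merged.update(response.get("aggregations", {}))
    return merged
```

The merged dictionary then has the same shape as the `aggregations` section of the equivalent single search, so downstream code should not need to care which strategy produced it.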

Some early numbers I've been generating do show the multi search being faster, but I want to confirm the expected behaviour of a regular search in case the numbers I'm seeing are bogus due to caching, server load, networking issues, etc.
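One way to make such a comparison less sensitive to server load is to interleave the two strategies over several rounds and compare medians rather than single runs. The sketch below is only an assumption about how that could be done: `compare_latency` is a hypothetical helper, and the callables passed in stand for whatever client code issues the single search and the multi search. Interleaving does not remove caching effects, so discarding a warm-up round may also be worthwhile.

```python
import statistics
import time


def compare_latency(strategies, rounds=5, timer=time.perf_counter):
    """Time each strategy once per round, interleaved, and return medians.

    strategies maps a label to a zero-argument callable that performs
    the request. Alternating the strategies within each round spreads
    transient load across both instead of penalising just one of them.
    """
    samples = {name: [] for name in strategies}
    for _ in range(rounds):
        for name, run in strategies.items():
            start = timer()
            run()
            samples[name].append(timer() - start)
    # The median is less affected by a single slow outlier than the mean.
    return {name: statistics.median(times) for name, times in samples.items()}
```

It would be called as `compare_latency({"single": run_single, "multi": run_multi})`, where both callables are placeholders for the application's own request code.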


(Mark Walkom) #4

Upgrading will improve performance; that's a pretty easy win if you can get it.

