I was wondering if there is any benefit in splitting aggregations into individual searches sent as a multi search, as opposed to bundling them all into a single query. More precisely, will all the queries and their aggregations run in parallel when they are part of a multi search?
So instead of Search(filter, aggs1, aggs2, ... aggsN) we have MultiSearch[(filter, aggs1), (filter, aggs2), ... (filter, aggsN)].
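To make the two shapes concrete, here is a minimal Python sketch that builds both request bodies. The index name, field names, and helper names are made up for illustration; the `filtered` query form is the one used by ES 1.x, and the multi search payload follows the `_msearch` convention of one header line plus one body line per search, each newline-terminated.

```python
import json


def single_search_body(filter_clause, aggs):
    """Body for one search: a filtered query plus the given aggregations."""
    return {
        "size": 0,  # only the aggregations are wanted, not the hits
        "query": {"filtered": {"filter": filter_clause}},
        "aggs": dict(aggs),
    }


def msearch_payload(index, filter_clause, aggs):
    """Newline-delimited _msearch payload: one search per aggregation."""
    lines = []
    for name, agg in aggs.items():
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(single_search_body(filter_clause, {name: agg})))
    return "\n".join(lines) + "\n"  # payload must end with a newline


# Hypothetical filter and aggregations, for illustration only.
example_filter = {"term": {"status": "active"}}
example_aggs = {
    "by_category": {"terms": {"field": "category"}},
    "avg_price": {"avg": {"field": "price"}},
}

bundled = single_search_body(example_filter, example_aggs)
split = msearch_payload("my_index", example_filter, example_aggs)
```

`bundled` would go to `_search` as one request, while `split` would go to `_msearch` and repeats the same filter once per aggregation.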
We have some pretty gnarly query times right now for generating our aggregations. Individual aggregations are taking multiple seconds to finish.
So say we have a query that spits out some aggregations. Let's say it took ~40s to compute 6 aggregations. I'm investigating whether it is faster to replace the single filtered query carrying 6 aggregations with 6 filtered queries carrying 1 aggregation each, sent together in a multi search.
The real question is, does a filtered query with a bunch of aggregations run those aggregations in parallel anyways? If so then I lose any performance benefit I would have had with the multi search.
Our prod ES cluster is v1.4.4
As for merging the results, computationally it is trivial when compared to the total time to compute a single aggregation.
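For reference, the merge step can be as small as the sketch below. It assumes the multi search response has the usual shape (a top-level `responses` list, with each successful entry carrying an `aggregations` object and each failed entry carrying an `error` key) and that every aggregation has a unique name. The function name is my own.

```python
def merge_msearch_aggs(msearch_response):
    """Combine the aggregations of every sub-response into one dict."""
    merged = {}
    for position, item in enumerate(msearch_response.get("responses", [])):
        if "error" in item:
            # One failed sub-search should not be silently dropped.
            raise RuntimeError(
                "sub-search %d failed: %s" % (position, item["error"])
            )
        merged.update(item.get("aggregations", {}))
    return merged
```

This is a single pass over N small dictionaries, which is why its cost is negligible next to the multi-second aggregations themselves.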
Some early numbers I've been generating do show the multi search being faster, but I want to confirm the expected behaviour of a regular search in case the numbers I'm seeing are bogus due to caching, server load, networking issues, etc.
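One way to make the comparison less sensitive to that kind of noise is to interleave the two variants over several rounds and compare medians rather than single runs. A rough sketch, assuming each variant is wrapped in a zero-argument callable that sends the request (the callables and the function name are placeholders, not part of any client library):

```python
import statistics
import time


def compare_timings(variants, rounds=5):
    """Time each callable once per round, interleaved; return median seconds."""
    samples = {name: [] for name in variants}
    for _ in range(rounds):
        # Alternating the variants spreads cache warm-up and load spikes
        # across both, instead of favouring whichever runs second.
        for name, run in variants.items():
            start = time.perf_counter()
            run()
            samples[name].append(time.perf_counter() - start)
    return {name: statistics.median(times) for name, times in samples.items()}
```

This only measures wall-clock time on the client, so it still includes network latency. The `took` field in each search response gives the server-side time in milliseconds and is worth recording alongside it.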