Performance of terms aggregation based on the number of indices/shards

(Pradeep Reddy) #1

I understand that a terms aggregation(with some sorting) request would get fired to each shard. Let's say If I specify size to 100k, Coordinating node will get 100k buckets from each shard and reduce,sort.

If the search request is on a wildcard indices (daily indices, let's say 30 indices), will the terms aggregation request fetch 100k buckets from each index+shard to the coordinating node and then reduce,sort? If that is the case, would decreasing the number of indices,shards decrease the memory overhead on the coordinating node?

My requirement is to aggregate on a large number of buckets with sorting+pagination. Consdiering that < 6.x doens't have pagination, I would have to specify a large page size and paginate on the client side.

If I use composite aggregation in 6.x, would that help when we have to do sorting also in addition to pagination? ES will internally have to compute values for all buckets probably because it has to sort? so there may not be any performance advantage by paginating I believe?

One solution is to change the way I store the data, is there any other solution to solve the issue of high number of buckets + pagination + sorting?


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.