Result Sets of Aggregation Partitioning

How is partitioning in aggregations achieved? How is the data selected for each partition, and what guarantees can be made about the ordering of results within each partition and across partitions?

How can partitioning be used to paginate results if I want to guarantee some ordering of the aggregation results?

Are you talking about the partitioning feature provided in the 'include' clause of the 'terms' aggregation?

Yeah.

The value of each term (e.g. the ipaddress 146.204.187.221) is hashed, and we then take the modulo of the number of partitions you picked (e.g. 20). This means all terms are assigned evenly across the 20 partitions. We use the same technique to route documents evenly to shards based on their doc IDs.
For each aggregation request you make, you pick one of your partitions, e.g. partition 7 of 20. This means every shard performs its analysis on the same subset of terms.
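For concreteness, here is a rough sketch of what such a request could look like using the terms aggregation's `include.partition` / `include.num_partitions` options. The index name, field name, and URL below are made up for illustration:

```python
import requests

# Hypothetical index and field names; adjust to your own mapping and cluster.
ES_URL = "http://localhost:9200/my-index/_search"

query = {
    "size": 0,  # only the aggregation results are needed, not hits
    "aggs": {
        "ips": {
            "terms": {
                "field": "ipaddress",
                # Analyse only the terms whose hash falls into partition 7 of 20.
                "include": {"partition": 7, "num_partitions": 20},
                # Must be large enough to hold the expected number of terms in a
                # single partition (roughly total_terms / num_partitions).
                "size": 10000
            }
        }
    }
}

response = requests.post(ES_URL, json=query)
for bucket in response.json()["aggregations"]["ips"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```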

The terms within one request (e.g. partition 1 of 20) will be ordered by whatever criteria you pick.
There is no guarantee of order across partitions, because partitions are, by design, a randomised division of the data.
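As a hedged illustration, the sort criteria are expressed with the terms aggregation's "order" parameter, here pointing at a metric sub-aggregation. This fragment could replace the "aggs" section of the earlier request; "user_id" and "access_date" are hypothetical field names:

```python
# Drop-in replacement for the "aggs" section of the earlier request.
aggs = {
    "users": {
        "terms": {
            "field": "user_id",
            "include": {"partition": 1, "num_partitions": 20},
            "size": 10000,
            # Buckets in the response are sorted by the sub-aggregation below,
            # but only relative to the other terms in partition 1.
            "order": {"last_access": "asc"}
        },
        "aggs": {
            "last_access": {"max": {"field": "access_date"}}
        }
    }
}
```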

You can't. "Some ordering" could be expanded to include something as tricky as user IDs sorted by their last logged access date. Given very large numbers of user IDs and a distributed index, it is impossible to compute such a total ordering without resorting to map/reduce-style streaming of masses of data across the network, creating temporary data files, and so on.
Using term partitioning, however, you can reduce the amount of data streamed, but you have to live with in-partition ordering only. This is adequate for a scenario like finding user IDs that have expired, but it will not solve all problems. Another option for complex scenarios like "last access date" is to opt for an entity-centric index based around the user.
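To make the "expired users" scenario concrete, here is a hedged sketch that walks every partition in turn and keeps only the users whose most recent access predates a cutoff. The index name, field names, and 90-day threshold are all assumptions for illustration:

```python
import time

import requests

ES_URL = "http://localhost:9200/user-activity/_search"  # hypothetical index
NUM_PARTITIONS = 20
# Illustrative threshold: users with no access in the last 90 days are "expired".
CUTOFF_MILLIS = (time.time() - 90 * 24 * 3600) * 1000

expired_users = []

for partition in range(NUM_PARTITIONS):
    query = {
        "size": 0,
        "aggs": {
            "users": {
                "terms": {
                    "field": "user_id",  # hypothetical field names
                    "include": {"partition": partition,
                                "num_partitions": NUM_PARTITIONS},
                    "size": 10000
                },
                "aggs": {
                    "last_access": {"max": {"field": "access_date"}}
                }
            }
        }
    }
    resp = requests.post(ES_URL, json=query).json()
    for bucket in resp["aggregations"]["users"]["buckets"]:
        # The max aggregation on a date field reports epoch milliseconds in "value".
        if bucket["last_access"]["value"] < CUTOFF_MILLIS:
            expired_users.append(bucket["key"])

print(f"Found {len(expired_users)} users with no access in the last 90 days")
```

Each request touches only one partition's worth of terms, so memory use per request stays bounded, at the cost of one search round trip per partition.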


Thanks so much. This gives me a lot of clarity.
