Aggregation Pagination: Composite vs. Include Partition

Hi,

At my company we use a terms aggregation with the include option (partition / num_partitions) to paginate the results.
I noticed that in newer versions you have a composite aggregation which also allows pagination.
Which do you think is a better solution?
Is there a performance difference between the methods?

Thanks,

Gary

The composite aggregation relies on ordering buckets by a key that is derived from values in a single document. As a simple example, you could bucket by IPAddress, or by the domain name extracted from a referrerUrl field. Each of these buckets can have sub-aggregations, e.g. the max of a date field (to know the last time you saw that key), but you can't order the top-level buckets by this child agg's "max date" value, because it is derived from multiple docs.

The terms aggregation can order by the values of child aggregations, e.g. reverse-sort on the max date value for an IP address to find IP addresses that haven't been active in a while. The downside is that if the number of unique top-level buckets (IP addresses in this case) is large, you may have to use partitioning to ensure results are accurate within each arbitrary subset of the data.
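To make the partitioned approach concrete, here is a minimal sketch in Python that only builds the request bodies (it does not talk to a cluster). The field names `ip` and `timestamp`, the aggregation names `by_ip` and `last_seen`, and the choice of 20 partitions are assumptions for illustration; the `include` / `partition` / `num_partitions` and `order` keys are the terms aggregation options discussed above.

```python
def partitioned_terms_body(partition, num_partitions, size=1000):
    """Build a search body for one partition of a terms aggregation.

    Buckets are ordered by a child max aggregation (oldest last-seen first).
    Field and aggregation names here are hypothetical.
    """
    return {
        "size": 0,  # no hits needed, only aggregation results
        "aggs": {
            "by_ip": {
                "terms": {
                    "field": "ip",
                    # Only terms that hash into this partition are considered.
                    "include": {
                        "partition": partition,
                        "num_partitions": num_partitions,
                    },
                    "size": size,
                    # Order top-level buckets by the child aggregation's value.
                    "order": {"last_seen": "asc"},
                },
                "aggs": {"last_seen": {"max": {"field": "timestamp"}}},
            }
        },
    }


# One request body per partition; each would be sent as a separate search.
bodies = [partitioned_terms_body(p, 20) for p in range(20)]
```

Note that the ordering applies within each partition only, so merging partitions into a single global ordering would have to happen on the client side.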

I am referring to the ability to paginate a composite aggregation that has a single terms source, using the after parameter:
"If the number of composite buckets is too high (or unknown) to be returned in a single response it is possible to split the retrieval in multiple requests."
See:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html#_after
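For reference, a rough sketch in Python of what that pagination loop can look like. The `search` argument is a hypothetical callable that takes a request body and returns the parsed JSON response (for example, a thin wrapper around whatever client you use); the field name `ip` and the aggregation name `by_ip` are also assumptions. The `after` request parameter and the `after_key` response field are the ones described in the linked docs.

```python
def composite_body(after=None, page_size=1000):
    """Build a composite aggregation body with a single terms source."""
    composite = {
        "size": page_size,
        "sources": [{"ip": {"terms": {"field": "ip"}}}],
    }
    if after is not None:
        # Resume after the last key returned by the previous page.
        composite["after"] = after
    return {"size": 0, "aggs": {"by_ip": {"composite": composite}}}


def iter_composite_buckets(search, page_size=1000):
    """Yield every bucket, following after_key until the pages run out.

    `search` is a hypothetical callable: request body in, response dict out.
    """
    after = None
    while True:
        response = search(composite_body(after, page_size))
        agg = response["aggregations"]["by_ip"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        after = agg.get("after_key")
        if after is None:
            return
```

Each page depends on the key returned by the previous one, so the requests run sequentially, whereas the partitions of a terms aggregation are independent of each other.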

Ah OK - so you've no need for ordering by child aggs etc. I'd expect Composite to be faster for this use case then.
