Composite Aggregation and Sorting with a non-source field


Using Elasticsearch, I would like to construct a query that buckets my results but also allows for pagination. Based on this requirement, it seems that I need to use a composite aggregation.

However, I would also like to sort the results by a field that is not one of the sources. It doesn’t seem like composite aggregation currently supports this: if you want to sort by a field, it has to be one of the “sources” in the Elasticsearch query. Is this correct?

Here’s an example of what I’m trying to do:

I have a giant list of messages from different people, each timestamped with the time it was sent. I would like to bucket the messages by person and have the list of people sorted by the time of their last message. If John Smith sent me the most recent message, then his bucket would appear at the top of the list. The next person in the list would be whoever sent the next most recent message after John Smith.

In this example, the composite aggregation source would be the name of the person.

I believe that I can’t use a terms aggregation because it doesn’t support pagination. Is there a way to make this work with composite aggregation?

The composite aggregation is a way to paginate over all buckets of a complex aggregation. It doesn't allow a custom sort. If you want a list of messages collapsed by user and sorted by most recent, you can use field collapsing:

GET my_index/_search
{
    "collapse": {
        "field": "user"
    },
    "sort": [ { "timestamp": "desc" } ]
}

The query will return the most recent message for each user, and you can paginate over the results using the from parameter.
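
To make the pagination concrete, here is a minimal sketch of how the request body for each page could be built. The helper name collapsed_page and the page size of 10 are assumptions for illustration only; the field names user and timestamp come from the example above, and from and size are the standard search parameters. The sketch only builds the body and does not send anything to a cluster.

```python
import json


def collapsed_page(page, page_size=10, collapse_field="user", sort_field="timestamp"):
    """Build a search body for one page of collapsed results.

    Hypothetical helper: `page` is 0-based, and the field names default
    to the ones used in the example request above.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        # Offset into the collapsed, sorted result list.
        "from": page * page_size,
        "size": page_size,
        # Keep only the top hit per distinct value of the collapse field.
        "collapse": {"field": collapse_field},
        # Most recent message first, so each user appears at the
        # position of their latest message.
        "sort": [{sort_field: "desc"}],
    }


if __name__ == "__main__":
    # Body for the third page (users 21 to 30 by most recent message).
    print(json.dumps(collapsed_page(2), indent=2))
```

The printed body can be sent as the payload of GET my_index/_search. Keep in mind that from plus size is capped by the index.max_result_window setting (10,000 by default), so this approach suits browsing the first pages rather than walking through every user.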
