Hello,
I have a requirement to return aggregations over multiple fields and present them in the UI ordered by time, descending.
I wrote the query below which seems to work.
The only problem I noticed is that sorting is performed over the returned buckets only, rather than over all the data I have in the database.
I would like to always get the newest buckets of what I have stored in the database, regardless of the page size limit I use.
Is there an option to do that?
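For readers following along, here is a rough sketch of the kind of request being described. It is only an assumption about the shape of the query: the helper name `build_composite_request`, the aggregation names `by_fields` and `sort_page`, and the `timestamp` field are made up for illustration, while `f1`, `f2` and `max_timestamp` are the names used later in this thread. It shows why the sort is per page: `bucket_sort` only reorders the buckets the composite aggregation has already selected for that page.

```python
def build_composite_request(filter_query, page_size, after_key=None):
    """Build a search body that pages over (f1, f2) buckets.

    Hypothetical sketch: field and aggregation names are assumptions.
    """
    composite = {
        "size": page_size,
        "sources": [
            {"f1": {"terms": {"field": "f1"}}},
            {"f2": {"terms": {"field": "f2"}}},
        ],
    }
    if after_key is not None:
        # Continue from the last bucket key of the previous page.
        composite["after"] = after_key

    return {
        "size": 0,  # no hits, aggregations only
        "query": filter_query,
        "aggs": {
            "by_fields": {
                "composite": composite,
                "aggs": {
                    "max_timestamp": {"max": {"field": "timestamp"}},
                    # Sorts only the buckets of the current page,
                    # not all buckets that exist in the index.
                    "sort_page": {
                        "bucket_sort": {
                            "sort": [{"max_timestamp": {"order": "desc"}}]
                        }
                    },
                },
            }
        },
    }
```

The composite aggregation picks each page by bucket key order, so the newest bucket can land on any page.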
However, this will not work for me because I need to perform the aggregation with a filter query (not all events in the database participate in the aggregation). Also, this is a beta feature that has not been officially released.
I still wonder - is there no way to sort the aggregations before they are returned by the query?
I searched all over and read the documentation and forums, and still could not find a way to do that.
My filter is dynamic and changes each time I calculate the aggregations, because it comes from the UI (a user selects some filters and presses "search").
Do you suggest performing a "one-time" transform for every UI request to view the aggregated data (by leaving the optional 'frequency' and 'sync' parameters empty)?
Is that the only way in Elasticsearch to sort composite aggregations (or terms aggregations) by 'max_timestamp', as it appears in my original query in this thread?
If I do such a one-time transform, is the transform API going to be faster than doing the same in my application?
Regardless, I fear that iterating over the entire source index and building the aggregated index from it would take more than a few seconds, which would not be a good web user experience.
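To make the "one-time" transform idea concrete, here is a minimal sketch of what the request bodies might look like, assuming a pivot transform grouped by `f1` and `f2` with a `max` on a `timestamp` field. The helper names, the index names passed in and the `timestamp` field are illustrative assumptions, and the endpoint path for creating and starting the transform depends on the Elasticsearch version in use.

```python
def build_batch_transform(source_index, dest_index, filter_query):
    """Body for a batch (one-time) pivot transform.

    Hypothetical sketch: index names and the 'timestamp' field are assumptions.
    """
    return {
        # The UI filter goes into the source query, so only matching
        # events take part in the aggregation.
        "source": {"index": source_index, "query": filter_query},
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": {
                "f1": {"terms": {"field": "f1"}},
                "f2": {"terms": {"field": "f2"}},
            },
            "aggregations": {
                "max_timestamp": {"max": {"field": "timestamp"}},
            },
        },
        # 'frequency' and 'sync' are left out on purpose: without 'sync'
        # the transform runs once over the source and then stops.
    }


def build_dest_search(page_size, offset=0):
    """Search body for the destination index, newest buckets first."""
    return {
        "from": offset,
        "size": page_size,
        "sort": [{"max_timestamp": {"order": "desc"}}],
    }
```

Once the destination index exists, each bucket is an ordinary document, so sorting by `max_timestamp` covers all buckets rather than one page. The cost is the full pass over the source index that the transform has to make for every new filter.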
The only problem here is the limitation to a single aggregated field.
Adding another field to the aggregation creates a hierarchy of buckets, and therefore I cannot sort by timestamp.
But if I need to aggregate by two fields, f1+f2, I guess I can just create another field named "aggregateBy" during indexing, which is a combination of f1+f2, and then perform the aggregation on the "aggregateBy" field.
Does this sound like a good approach?
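A small sketch of the approach described above, under some assumptions: the combined key is built by joining `f1` and `f2` with a separator before indexing, `aggregateBy` is mapped as a keyword field, and the per-bucket time comes from a `timestamp` field. The helper names and the separator are illustrative only.

```python
def with_aggregate_key(doc, sep="|"):
    """Return a copy of the document with the combined 'aggregateBy' key.

    Assumes 'sep' never occurs inside f1 or f2; otherwise two different
    (f1, f2) pairs could produce the same key.
    """
    enriched = dict(doc)
    enriched["aggregateBy"] = f"{doc['f1']}{sep}{doc['f2']}"
    return enriched


def build_terms_request(filter_query, page_size):
    """Terms aggregation on the combined key, newest buckets first."""
    return {
        "size": 0,  # no hits, aggregations only
        "query": filter_query,
        "aggs": {
            "by_key": {
                "terms": {
                    "field": "aggregateBy",
                    "size": page_size,
                    # Order buckets by the metric sub-aggregation below.
                    "order": {"max_timestamp": "desc"},
                },
                "aggs": {
                    "max_timestamp": {"max": {"field": "timestamp"}},
                },
            }
        },
    }
```

For example, `with_aggregate_key({"f1": "a", "f2": "b"})` adds `"aggregateBy": "a|b"` to the document. Because there is a single flat level of buckets, the `order` on `max_timestamp` applies across all of them.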
Thanks Mark, I will look into it.
What about performance in that case? Is using a script going to affect performance?
In that respect, is it better to prepare the aggregation field while indexing?
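For comparison, this is roughly what the script-based variant could look like, assuming the suggestion was a terms aggregation keyed by a Painless script that joins `f1` and `f2` at query time. The helper name, the separator and the `timestamp` field are assumptions, and the script assumes both fields are keyword fields with a value in every document.

```python
def build_scripted_terms_request(filter_query, page_size, sep="|"):
    """Terms aggregation whose key is computed by a script at query time.

    Hypothetical sketch: assumes f1 and f2 are keyword fields present
    in every matching document.
    """
    script_source = f"doc['f1'].value + '{sep}' + doc['f2'].value"
    return {
        "size": 0,  # no hits, aggregations only
        "query": filter_query,
        "aggs": {
            "by_key": {
                "terms": {
                    # The key is evaluated per matching document at search
                    # time instead of being read from an indexed field.
                    "script": {"lang": "painless", "source": script_source},
                    "size": page_size,
                    "order": {"max_timestamp": "desc"},
                },
                "aggs": {
                    "max_timestamp": {"max": {"field": "timestamp"}},
                },
            }
        },
    }
```

The script runs for every document that matches the filter, so it generally costs more than aggregating on a field that was prepared at index time. How much more depends on the data volume and is worth measuring.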