In a continuous transform, when a new document is added to the source index and the transform has to update one of its own documents, the aggregation calculations appear to be restarted from scratch. As a result, if documents previously selected by the transform are no longer present in the source index, they are no longer taken into account in the aggregation calculations and their information is lost.
Is there a way to tell the transform to update the aggregation fields based only on the newly retrieved documents, keeping the previously calculated information intact?
For instance, in a value_count aggregation, the previously calculated count would simply be incremented based on the new documents alone.
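For concreteness, this is a minimal sketch of the kind of continuous transform I mean, written with the elasticsearch-py client (the index, field, and transform names are made up for illustration):

```python
# Hypothetical continuous transform: group events by user and count them.
# On each checkpoint, the count for every changed user is recomputed from
# whatever documents currently exist in the source index.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.transform.put_transform(
    transform_id="events-per-user",              # hypothetical name
    source={"index": "events"},                  # hypothetical source index
    dest={"index": "events-per-user"},
    pivot={
        "group_by": {"user_id": {"terms": {"field": "user_id"}}},
        "aggregations": {
            "event_count": {"value_count": {"field": "event_id"}}
        },
    },
    sync={"time": {"field": "@timestamp", "delay": "60s"}},
)
```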
Thank you for your feedback. Updating is currently not possible, but this is on our list for the future. However, the aggregation is not restarted from scratch; only the changed entities are updated. Still, transform requires that the source data does not get deleted.
If you are looking for compaction, rollup might be the better tool for you.
To give you a bit more background: updating is simple for count, min, and max, a bit more complicated for avg, and very complex for e.g. cardinality or percentiles. For anything that requires scripts, we would need a user-supplied merge/combine method. In other words, this is harder than it seems; the sketch below illustrates the difference.
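Here is a toy Python sketch (plain Python, not transform internals) of the intermediate state an updatable aggregation would need. Count, min, and max merge directly; avg only merges if you also carry sum and count; cardinality would need a mergeable sketch such as HyperLogLog, which is omitted here:

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Mergeable partial state for count/min/max/avg over one entity."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "Partial") -> "Partial":
        # Merging two partials never needs the original documents --
        # that is what makes these aggregations updatable in principle.
        return Partial(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def avg(self) -> float:
        # avg is recoverable from (sum, count), but not from avg alone.
        return self.total / self.count if self.count else 0.0


old, new = Partial(), Partial()
for v in [1.0, 5.0]:
    old.add(v)       # previously checkpointed state
new.add(9.0)         # state from the new documents only
print(old.merge(new).avg)  # 5.0 -- no raw documents needed
```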
Still, there are use cases like yours where updating would be beneficial. Another use case might be performance related: for large amounts of data, an update should be more performant than the rewrite we do at the moment.