Can you elaborate on that? Maybe it's just a problem with your configuration, or maybe it has been fixed in the meantime.
Regarding the query: yes, that's correct. Transform has to re-query all data up to the checkpoint, but if you look at the other parts of the query, you should see more filters. For example, Transform narrows the query so that only certain terms are updated. Say only user E, but not D, has changed something on their profile. When transform runs, it first queries for changes (you should see this query, too) and then recomputes the pivot for E but not D.
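To make the two steps concrete, here is a minimal sketch in plain Python. It is not Transform's actual implementation and does not use any Elasticsearch API; the event layout (`user`, `ts`, `value`), the function names and the `sum` pivot are all assumptions chosen for illustration.

```python
from collections import defaultdict

def changed_users(events, last_checkpoint, new_checkpoint):
    """Step 1: find the users that have events in (last_checkpoint, new_checkpoint]."""
    return {e["user"] for e in events if last_checkpoint < e["ts"] <= new_checkpoint}

def recompute_pivot(events, users, new_checkpoint):
    """Step 2: recompute the pivot (here: a sum) for the changed users only,
    but over ALL of their data up to the new checkpoint."""
    totals = defaultdict(int)
    for e in events:
        if e["user"] in users and e["ts"] <= new_checkpoint:
            totals[e["user"]] += e["value"]
    return dict(totals)

# Hypothetical source data: D has no new events, E has one after ts=5.
events = [
    {"user": "D", "ts": 1, "value": 10},
    {"user": "E", "ts": 2, "value": 5},
    {"user": "E", "ts": 7, "value": 3},
]

dest = {"D": 10, "E": 5}                      # destination state at checkpoint ts=5
changed = changed_users(events, 5, 10)        # only E changed
dest.update(recompute_pivot(events, changed, 10))
print(dest)
```

The document for D is left untouched, while E is recomputed from all of E's data, which is why the query still covers everything up to the checkpoint.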
You might now ask: why doesn't it just take the new values and update the doc?
To explain the challenge: Transform is a generic tool and supports a lot of different aggregations. To illustrate the "Update instead of Re-compute" problem:
- min/max/sum are easy
- for average we could store sum and count to make it update-able
- to update a median you need a histogram; fortunately we have that now (histogram datatype)
- for cardinality we have to store the sketch, e.g. the HyperLogLog data structure; we do not have such a data type yet
- for scripted metric we need the user to write the update method
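The list above boils down to what extra state has to be stored to update a result without re-reading old data. The following is a rough Python sketch of that idea, not how Transform or Elasticsearch implements it; the class names, the fixed-width buckets and the bucket-midpoint median are assumptions made for illustration.

```python
from dataclasses import dataclass, field

def update_max(current, value):
    """max is easy: the stored result itself is enough state."""
    return value if current is None or value > current else current

@dataclass
class AvgState:
    """average needs sum and count stored, not just the last average."""
    total: float = 0.0
    count: int = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else None

@dataclass
class HistogramMedian:
    """median needs a histogram; the result is only approximate (bucket midpoint)."""
    bucket_width: float
    counts: dict = field(default_factory=dict)

    def update(self, value):
        bucket = int(value // self.bucket_width)
        self.counts[bucket] = self.counts.get(bucket, 0) + 1

    def result(self):
        n = sum(self.counts.values())
        if n == 0:
            return None
        seen = 0
        for bucket in sorted(self.counts):
            seen += self.counts[bucket]
            if seen >= n / 2:
                return (bucket + 0.5) * self.bucket_width

avg = AvgState()
for v in (2, 4):
    avg.update(v)
avg.update(6)          # new value arrives later, no old data needed

med = HistogramMedian(bucket_width=1.0)
for v in (1, 2, 3, 10):
    med.update(v)
```

Cardinality and scripted metric are left out on purpose: the first would need a stored sketch such as HyperLogLog, and the second would need an update method supplied by the user.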
This doesn't mean we don't want to support updates at all; we plan to implement this in the future. Given the challenges explained above, we likely won't make every aggregation/data type update-able, or at least we will add support step by step over time.