`frequency` defines how often the transform checks for new data and, after a failure, how quickly it retries. This setting only controls scheduling; it has no impact on how the data is transformed. In other words, using different frequencies does not lead to different data.
`sync` and its sub-setting `delay` do not impact how data is transformed either, assuming you set them up correctly. `delay` defines the ingest delay; it answers the question "When is it safe to query for data?". The time stored in the timestamp field can lag behind indexing by different amounts. For example, if you feed the timestamp from an external system, you might batch data and send it every 5 minutes. In that case `delay` must be 5 minutes plus whatever it takes to transfer the data over the network and index it in Elasticsearch.
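As a rough illustration of that arithmetic, the sketch below derives a `delay` from a batch interval plus a safety margin and places it in a transform body. The index names, the `timestamp` field name, the 60-second margin and the `frequency` value are assumptions made up for this example; only the general shape (`frequency`, `sync.time.field`, `sync.time.delay`) follows the transform API.

```python
from datetime import timedelta

# Assumed values for illustration; measure your own pipeline.
batch_interval = timedelta(minutes=5)       # external system sends a batch every 5 minutes
transfer_and_index = timedelta(seconds=60)  # assumed margin for network transfer + indexing

delay = batch_interval + transfer_and_index
delay_seconds = int(delay.total_seconds())

# Partial body for a continuous transform (hypothetical index and field names).
transform_body = {
    "source": {"index": "my-source-index"},
    "dest": {"index": "my-dest-index"},
    "frequency": "1m",  # scheduling only: how often to check for new data
    "sync": {
        "time": {
            "field": "timestamp",          # timestamp field used for syncing
            "delay": f"{delay_seconds}s",  # "when is it safe to query for data?"
        }
    },
    # The pivot (group_by and aggregations) is omitted in this sketch.
}

print(transform_body["sync"]["time"]["delay"])  # 360s
```

The margin is the part you have to estimate yourself; if in doubt, a larger `delay` is the safer choice, at the cost of results appearing later.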
A continuous transform works in two steps:
1. identify the data points that need to be updated
2. re-create the data points that need to be updated
If you configure `sync` with a `delay` that is too low, step 1 might miss data points that need to be updated.
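To make the failure mode concrete, here is a simplified model of step 1. It is only a sketch of the idea, not how Elasticsearch implements it: the `Doc` class, the `changed_keys` function and all the numbers are hypothetical, and times are plain clock hours.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    key: str
    timestamp: float      # value of the timestamp field (clock hours)
    searchable_at: float  # when the document actually becomes searchable


def changed_keys(docs, last_run, now, delay):
    """Step 1: keys with visible documents in [last_run - delay, now - delay)."""
    lower, upper = last_run - delay, now - delay
    return {
        d.key
        for d in docs
        if d.searchable_at <= now          # documents not yet indexed are invisible
        and lower <= d.timestamp < upper
    }


# "b" has timestamp 9.9 but only becomes searchable at 10.05 (ingest delay).
docs = [Doc("a", 9.2, 9.25), Doc("b", 9.9, 10.05)]

# delay = 0: the run at 10 cannot see "b" yet, and the run at 11 only
# looks at [10, 11), so "b" is never detected.
too_low = changed_keys(docs, 9, 10, 0) | changed_keys(docs, 10, 11, 0)

# delay = 0.25: the run at 11 looks at [9.75, 10.75) and picks up "b".
enough = changed_keys(docs, 9, 10, 0.25) | changed_keys(docs, 10, 11, 0.25)

print(too_low, enough)
```

With the delay at zero only `a` is ever detected; with a delay larger than the ingest lag both `a` and `b` are.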
Regarding your example: assume the transform runs at 10 and ran the last time at 9. It then does the following:
- query the `source` between 9 and 10 and, for example, identify that `b` has changed (but the other entities have not)
- query the `source` up to 10, filtered by `b`, and update the documents for `b`

Note that:
- step 2 queries all data for the changed entities, not only the data between 9 and 10
- `b` can be a list of terms, but also ranges, depending on what you are grouping by
- if the query runs at 10, it does not query "lower than 10" but "lower than 10 - delay"; accordingly, the range in step 1 is `9 - delay <= x < 10 - delay`
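The run at 10 can be sketched as follows. This is a toy model under stated assumptions, not the actual implementation: `run_checkpoint`, the in-memory `source` list, the `dest` dict and the use of a sum as the aggregation are all hypothetical.

```python
def run_checkpoint(source, dest, last_run, now, delay):
    """Simulate one checkpoint; returns the set of keys that were updated."""
    lower, upper = last_run - delay, now - delay

    # Step 1: which keys have documents in [last_run - delay, now - delay)?
    changed = {d["key"] for d in source if lower <= d["timestamp"] < upper}

    # Step 2: re-create the destination document for each changed key from
    # all of its data below the upper bound, not only the latest window.
    for key in changed:
        values = [
            d["value"]
            for d in source
            if d["key"] == key and d["timestamp"] < upper
        ]
        dest[key] = sum(values)  # assumed aggregation: a simple sum
    return changed


source = [
    {"key": "a", "timestamp": 8.5, "value": 1},
    {"key": "b", "timestamp": 8.5, "value": 2},
    {"key": "b", "timestamp": 9.5, "value": 3},  # new since the run at 9
]
dest = {"a": 1, "b": 2}  # destination state after the run at 9

changed = run_checkpoint(source, dest, last_run=9, now=10, delay=0.25)
print(changed, dest)
```

Here only `b` falls into the window `8.75 <= x < 9.75`, so only the document for `b` is rewritten, using both of its source documents; the document for `a` is left untouched.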
So again, neither `sync` nor `frequency` affects how data is transformed, only how and when it is updated. The transformation is defined in your `group_by`. If you are not getting the expected results, you might have a problem with how `sync` is set up or a problem in your data.
I think I can help you better if you post your config and explain what you expect.