Transforms on data older than a specific timestamp: checkpoints are not created

Continuous mode is configured on a date field, and the timestamp in that field is used for checkpointing. In your example, time_upper_bound_millis = 1644238440000, which translates to 2022-02-07 12:54:00 UTC, is the upper time bound of the checkpoint. If you push data with a timestamp before that time, the transform is not able to query for it. That's by design.

Note the difference between timestamp_millis and time_upper_bound_millis. timestamp_millis is the time at which the checkpoint was created, while time_upper_bound_millis is calculated by taking the system time and deducting delay. In this case the value is additionally rounded down to a bucket boundary, because you use a date_histogram in your transform configuration.
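To make the arithmetic concrete, here is a rough sketch of that calculation in Python. It is not the actual transform code: the function name is made up, the system time and the 60s delay and 1m interval are assumed values chosen so the result matches your checkpoint, and it only covers fixed intervals (calendar intervals would need real date arithmetic).

```python
from typing import Optional


def time_upper_bound_millis(now_millis: int,
                            delay_millis: int,
                            bucket_millis: Optional[int] = None) -> int:
    """Illustrative only: system time minus delay, rounded down to a bucket boundary."""
    upper = now_millis - delay_millis
    if bucket_millis:
        # Round down to the start of the date_histogram bucket (fixed intervals only).
        upper -= upper % bucket_millis
    return upper


# Assumed inputs: a system time of 1644238527000, delay = 60s, fixed_interval = 1m.
bound = time_upper_bound_millis(1644238527000, 60_000, 60_000)
print(bound)
```

Documents whose sync field is older than the bound of a checkpoint that has already completed fall outside the range the next checkpoint queries, which is why late data gets skipped.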

To work around your problem you have 2 options:

  • increase delay: if you increase delay to e.g. 6d, the transform won't miss any data whose timestamp is up to 6 days in the past when it arrives; however, it also won't process any data newer than now - 6d until that delay has passed.
  • use an ingest timestamp for sync: if you add another timestamp field to your source data that contains the date when the data was ingested/indexed in Elasticsearch, you can configure sync to use the ingest timestamp while still pivoting on another timestamp field.
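
The second option could look roughly like the sketch below, written as Python dicts that serialize to the JSON request bodies. The field name event.ingested, the @timestamp pivot field, the 1m interval and the 60s delay are assumptions for illustration, so adapt them to your own mapping and transform. The first body is an ingest pipeline that stamps each document with the ingest time; the second shows the relevant parts of a transform that syncs on that field but still pivots on the original timestamp.

```python
import json

# Hypothetical field names; replace with the ones in your own index.
INGEST_FIELD = "event.ingested"
EVENT_FIELD = "@timestamp"

# Ingest pipeline body: a set processor copies the ingest time into each document.
ingest_pipeline = {
    "description": "Record when each document was ingested",
    "processors": [
        {"set": {"field": INGEST_FIELD, "value": "{{_ingest.timestamp}}"}}
    ],
}

# Relevant parts of the transform config: sync on the ingest time,
# pivot (date_histogram) on the event time.
transform_config = {
    "sync": {"time": {"field": INGEST_FIELD, "delay": "60s"}},
    "pivot": {
        "group_by": {
            "by_minute": {
                "date_histogram": {"field": EVENT_FIELD, "fixed_interval": "1m"}
            }
        },
        "aggregations": {
            "doc_count": {"value_count": {"field": EVENT_FIELD}}
        },
    },
}

print(json.dumps(ingest_pipeline, indent=2))
print(json.dumps(transform_config, indent=2))
```

With this setup, a document with an old event timestamp still gets a fresh ingest timestamp, so it lands inside the range of the next checkpoint.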