I have a transform that identifies anomalies and triggers alerts in one of my indices. The expected behavior is as follows.
- Every minute, aggregate the last 5 minutes of data and check for alerts.
- Send records to relevant indexes.
Problematic current behavior
- Every 5 minutes, aggregate the last 5 minutes of data and check for alerts.
- Send alerts to relevant indexes.
Cause of the problem
The transform rule has a fixed time interval of 5 minutes for group_by and the sync delay is 60s (1 min).
This will capture events for:
- XX:00 - XX:05
- XX:05 - XX:10
- XX:10 - XX:15
This interval can be changed to 1 minute, but then each check would only cover the records of the last one minute.
With the 5-minute interval, alerts are triggered only once every five minutes, even if the event happens in the first minute of the interval.
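To make the delay concrete, here is a small Python sketch of the timing described above. It is a simplified model, not the transform's actual scheduling logic: it assumes fixed 5-minute buckets aligned to the epoch, ignores how often the transform checks for changes, and the function names are made up for illustration.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_bounds(ts, interval=timedelta(minutes=5)):
    """Return (start, end) of the fixed-interval bucket that contains ts."""
    offset = (ts - EPOCH) % interval  # time elapsed since the bucket started
    start = ts - offset
    return start, start + interval


def earliest_visible(ts, interval=timedelta(minutes=5), delay=timedelta(seconds=60)):
    """Simplified model: a bucket is written only once it is complete
    and the sync delay has passed."""
    _, end = bucket_bounds(ts, interval)
    return end + delay


if __name__ == "__main__":
    event = datetime(2024, 1, 1, 10, 1)  # event in the first minute of a bucket
    print(bucket_bounds(event))
    print(earliest_visible(event))
```

Under these assumptions, an event at 10:01 falls into the 10:00 - 10:05 bucket and cannot show up in the destination index before 10:06.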
It would be highly appreciated if you could suggest a solution for this problem.
By default, a transform creates a bucket only after the bucket is complete; that is, it waits 5 minutes if your date_histogram is configured with a 5-minute interval. This improves performance, because the transform does not need to update documents. You can change this with the setting align_checkpoints. It defaults to true and can be set to false. This tells the transform to process incomplete buckets, at the price of more updates and therefore some performance penalty. You can find this setting in the docs.
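As a rough sketch of where this setting goes, the Python snippet below builds a transform body with align_checkpoints set to false. The index names, the timestamp field, the group_by name and the aggregation are placeholders, not taken from your configuration; the resulting JSON would be sent as the body of the create transform request (PUT _transform/<transform_id>).

```python
import json


def build_transform_body(source_index, dest_index, time_field,
                         align_checkpoints=False, sync_delay="60s"):
    """Build a continuous pivot transform body (all names are placeholders)."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "frequency": "1m",  # how often the transform checks for new data
        "sync": {"time": {"field": time_field, "delay": sync_delay}},
        "pivot": {
            "group_by": {
                "window": {
                    "date_histogram": {"field": time_field, "fixed_interval": "5m"}
                }
            },
            # Placeholder aggregation; replace with the real alert logic.
            "aggregations": {"event_count": {"value_count": {"field": time_field}}},
        },
        # false = also process buckets that are not complete yet
        "settings": {"align_checkpoints": align_checkpoints},
    }


if __name__ == "__main__":
    body = build_transform_body("my-events", "my-alerts", "@timestamp")
    print(json.dumps(body, indent=2))
```

With this change the 5-minute bucket is updated on each checkpoint instead of being written once at the end, which is the extra update cost mentioned above.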
Note that you will still have a waiting time of at least 1 minute if your sync delay is set to 60s, because the transform only queries for data that is at least 1 minute old. This setting compensates for ingest delays and for data arriving out of order. If you know that data with your configured timestamp is guaranteed to reach Elasticsearch sooner, you can decrease this setting to further reduce the time to trigger the alert.
An even better approach, which compensates for any problem on the data ingestion side, is to use an ingest timestamp, as explained here. By using an ingest timestamp you can decrease the sync delay to e.g. 5s. (You can't decrease it to 0s, because the refresh interval of a Lucene index is 1s by default, so I think 2s should be the minimum.)
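A minimal sketch of the ingest timestamp approach, again in Python and with placeholder names: an ingest pipeline with a set processor stamps each document with the time it was ingested, and the transform's sync section then uses that field with a short delay. The field name event.ingested and the helper names are assumptions for illustration.

```python
import json


def ingest_timestamp_pipeline(field="event.ingested"):
    """Ingest pipeline body that stores the ingest time in `field`."""
    return {
        "description": "Add an ingest timestamp to every document",
        "processors": [
            {"set": {"field": field, "value": "{{{_ingest.timestamp}}}"}}
        ],
    }


def sync_section(field="event.ingested", delay="5s"):
    """Transform `sync` section that syncs on the ingest timestamp."""
    return {"time": {"field": field, "delay": delay}}


if __name__ == "__main__":
    print(json.dumps(ingest_timestamp_pipeline(), indent=2))
    print(json.dumps(sync_section(), indent=2))
```

The pipeline body would be registered with the ingest pipeline API and applied to the source index, and the sync section would replace the one in the transform configuration. The date_histogram in group_by can keep using the original event timestamp.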
Thank you for this great explanation. Highly appreciate the support.