Alerts not Triggering Email Notifications Despite Anomalies Detected in ML Jobs

Hi Elasticsearch Community,

I’ve created several Single Metric Machine Learning (ML) jobs, each configured to monitor the sum of bytes for a specific IP address. Due to the large volume of data, I’ve fine-tuned each job, and the bucket span varies between 3 and 5 minutes across jobs, depending on the data characteristics.

The ML jobs are running smoothly, and anomalies are being detected as expected—these are clearly visible in the Anomaly Detection charts. The issue arises with my alerting setup: I’ve configured anomaly detection alerts for each of these ML jobs, set to trigger email notifications when an anomaly is detected.

However, I’m receiving significantly fewer email alerts than expected. While anomalies are displayed correctly in the Single Metric Viewer, not all are translating into email notifications. Interestingly, the alerts themselves are marked as successful in Kibana, but the email notifications only go out sporadically.

Could this be an issue with how the alerting condition is set up (e.g., severity threshold, lookback range, or anomaly score)? I would greatly appreciate guidance on what might be causing this disparity and how to ensure that all relevant anomalies trigger email alerts.


Hi @catarina,

The “Notify” setting is a critical factor in the number of notifications you’re going to receive. By default, notification actions are scheduled only on status change. If an anomaly satisfies the criteria, you receive a notification about it once; if there is another anomaly in the following bucket, you won’t receive a second notification, because the alert status remains “Active”.

Changing the value to “Every time alert is active” means that notifications are sent on every rule check interval while there are anomalies in your data. The main drawback is that you receive duplicate notifications about the same anomaly for the entire duration of the bucket span. For example, with a bucket span of 15 minutes and a check interval of 1 minute, you’re going to receive 15 notifications about the same anomaly.
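To make the difference between the two settings concrete, here is a rough Python sketch that counts notifications for a sequence of buckets. It is only a simplified model, not how Kibana implements alerting: the function names and mode strings are made up for illustration, it assumes the bucket span is a whole multiple of the check interval, and it ignores recovery notifications and the lookback interval.

```python
def expand_to_checks(bucket_is_anomalous, bucket_span_min, check_interval_min):
    """Turn per-bucket anomaly flags into per-check alert states.

    Assumes (for illustration) that every rule check falling inside an
    anomalous bucket sees the alert as active.
    """
    checks_per_bucket = bucket_span_min // check_interval_min
    states = []
    for anomalous in bucket_is_anomalous:
        states.extend([anomalous] * checks_per_bucket)
    return states


def count_notifications(active_by_check, mode):
    """Count notifications for a sequence of per-check alert states.

    mode is a hypothetical label, not a Kibana setting name:
      "on_status_change"  -> notify only when the alert becomes active
      "every_time_active" -> notify on every check while the alert is active
    """
    if mode not in ("on_status_change", "every_time_active"):
        raise ValueError(f"unknown mode: {mode}")

    sent = 0
    previously_active = False
    for active in active_by_check:
        if mode == "on_status_change":
            # Only the transition from inactive to active sends a notification.
            if active and not previously_active:
                sent += 1
        elif active:
            # Every check that finds the alert active sends a notification.
            sent += 1
        previously_active = active
    return sent


# Five buckets; the second and third are consecutive anomalies.
buckets = [False, True, True, False, True]
checks = expand_to_checks(buckets, bucket_span_min=15, check_interval_min=1)

print(count_notifications(checks, "on_status_change"))   # 2
print(count_notifications(checks, "every_time_active"))  # 45
```

In this model, three anomalous buckets produce only two emails with the default setting, because the two consecutive anomalies share a single “Active” period. That would be consistent with seeing more anomalies in the Single Metric Viewer than emails in your inbox.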

You can learn more about the anomaly detection alerting rule in our blog post and the documentation page.

Hope it helps.
Dima