From the screenshot snippet, it's unclear what the previous behavior in the data was. However, the anomalies are preceded by a long period of 0 values, which were typical, and in the interval of interest the difference between the actual value of 0 and the lower bound of the confidence interval is apparently not large enough to motivate a higher score.
You can use one of the *_low anomaly detection functions if you are interested in drops (see the sketch below these suggestions).
Consider setting up an alert with a specific rule if you are interested in being notified, e.g. when you receive 0 values after 7 am.
You can also set the anomaly score threshold to a lower value, like 17, to trigger an alarm for such behavior.
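As a minimal sketch of the first suggestion (the job ID, bucket span, and time field below are hypothetical; adjust them to your data), a detector using the `low_count` function only scores unusually low event counts:

```
PUT _ml/anomaly_detectors/error-count-drops
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "low_count",
        "detector_description": "Unusually low event count"
      }
    ]
  },
  "data_description": {
    "time_field": "@timestamp"
  }
}
```

You would then attach a datafeed pointing at the index you are monitoring and alert on these results with a lower score threshold, as mentioned above.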
Thanks for your answers. I read the link, but I am still confused by this topic.
I had a production problem, and I need to prevent it from happening again because no alerts were fired. The same day we had a lot of errors, as you can see in the next picture:
I understand that was a multi-bucket impact (still a bit confusing), but what happened with the single-bucket impact, or with the actual and typical values? And why was the score <1 (what does <1 mean)?
I am sorry you feel confused about the scores. Here are some pointers to help you:
As a rule of thumb, an anomaly detector needs about 3 weeks of data to build a probabilistic model that describes the data. In your case, it appears that the data ingest started on 2023-07-11 and the anomalous behavior is observed only 6 days later. Therefore, you need to let the job run for a bit longer before trying to understand the "usual" score numbers that the anomaly detector would assign. Before seeing enough evidence (data), the anomaly detector would be reluctant to assign high anomaly scores, since the "typical" value is derived from an insufficient number of observations.
Anomaly detectors make sense on complex data with multiple seasonalities (e.g. different behavior by hour of the day, day of the week, month, etc.), trends, and so on. If you have data where you expect ~0 most of the time and want to be alerted when you get anything >100, then a simple alert rule may be more helpful (see the sketch after these pointers).
There are many resources available online that dive deep into how anomalies are identified and scored.
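For the simple-alert-rule case, a minimal sketch using a watch (the index pattern, schedule, and threshold here are hypothetical; a Kibana ES query rule would work just as well) could look like this:

```
PUT _watcher/watch/error-count-threshold
{
  "trigger": { "schedule": { "interval": "5m" } },
  "input": {
    "search": {
      "request": {
        "indices": ["my-error-logs-*"],
        "body": {
          "size": 0,
          "query": { "range": { "@timestamp": { "gte": "now-5m" } } }
        }
      }
    }
  },
  "condition": {
    "compare": { "ctx.payload.hits.total": { "gt": 100 } }
  },
  "actions": {
    "notify": {
      "logging": { "text": "More than 100 errors in the last 5 minutes" }
    }
  }
}
```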
As you can see in the image above, my data has different behaviors.
I receive errors every day, but I only want an alert when the amount of errors is anomalous.
Well, that's interesting... Can you please let us know what version of ES you are using? In the first screenshot, there is an annotation marked as [1] around 2023-07-02. Can you please tell me what this annotation is? It seems that before this annotation, the model score behaved as you would expect.
Also, for the last spike, can you please share a screenshot of the anomaly score explanation (available if you click on the arrow in the anomaly table)?
You are basing how high you think the anomaly score should be on how different the value is from "typical" (the median prediction). The value is often very different from typical, so by itself this is not necessarily a good indicator that something is anomalous, because the data has a long positive tail.
Our score is trying to say whether these events are historically unusual, and looking at the history of the metric, this event does not look particularly unusual. However, if you continue to have problems, one option is to revert to a model snapshot and exclude the problem periods.
Perhaps this metric is not a good indicator of problems. Do you really have problems like this all the time (based on the screenshot)? Perhaps the metric you are analyzing for anomalies does not correlate well with problems in your system. Some metrics, like KPIs, are better at detecting problems; others are useful for root cause analysis but are too unpredictable for reliable anomaly detection and alerting.
If you like, you can create an alert based on how much the value differs from typical, if you think this better detects the events you care about. A watch on the anomaly detection results would allow you to do this.
There's nothing that says you have to alert at a certain level, so you can create a rule that alerts at a lower score if that doesn't give you too many false positives.
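A minimal sketch of such a watch (the job ID, score threshold, and schedule are hypothetical; adjust them to your job and to how noisy a lower threshold turns out to be) could query the anomaly results index directly:

```
PUT _watcher/watch/ml-low-score-alert
{
  "trigger": { "schedule": { "interval": "15m" } },
  "input": {
    "search": {
      "request": {
        "indices": [".ml-anomalies-*"],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                { "term": { "job_id": "my_error_count_job" } },
                { "term": { "result_type": "record" } },
                { "range": { "record_score": { "gte": 17 } } },
                { "range": { "timestamp": { "gte": "now-30m" } } }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": { "ctx.payload.hits.total": { "gt": 0 } }
  },
  "actions": {
    "notify": {
      "logging": { "text": "Anomaly records with score >= 17 found" }
    }
  }
}
```

Instead of filtering on `record_score`, you could also look at the `actual` and `typical` fields of the record results if the difference from typical is what you care about.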