Alerting based on anomaly duration

AmS · May 20, 2022, 3:30pm

Hello,

I'm using Kibana machine learning for anomaly detection.
I'm also using Watcher to send an email when the ML job detects an anomaly.
I don't want to send an email for each anomaly, only for continuous anomalies. For example, I want to send an email when an anomaly is detected in 3 consecutive buckets.
Below is a screenshot of my ML anomaly detection:

I would like my watch to notify only on EV3, since an anomaly is detected there for more than 3 buckets. EV1 and EV2 are not interesting to me since they last only one bucket.
For reference, my watch is based on this example:

github.com

elastic/examples/blob/master/Alerting/Sample Watches/ml_examples/default_ml_watch.json

{
    "trigger": {
      "schedule": {
        "interval": "82s"
      }
    },
    "input": {
      "search": {
        "request": {
          "search_type": "query_then_fetch",
          "indices": [
            ".ml-anomalies-*"
          ],
          "rest_total_hits_as_int": true,
          "body": {
            "size": 0,
            "query": {
              "bool": {
                "filter": [
                  {

This file has been truncated.

Is it possible to limit anomaly alerting based on the number of consecutive anomalous buckets?

Thanks,

Regards
Amine

richcollier · May 21, 2022, 10:39am

Probably two options for you:

1. Create a different ML job with a bucket_span 3x what it is now. If you're really only interested in whether something is anomalous over a longer period of time, then that period should be your bucket_span.
2. Change the watch itself. The most important part of the watch is this section:

                  {
                    "range": {
                      "timestamp": {
                        "gte": "now-30m"
                      }
                    }
                  },

This section defines how far back in time the Watch looks for anomalies in the results index (.ml-anomalies-*). When looking for the "last anomaly written", this range should always be 2x the bucket_span, because of the way results are timestamped and the delay/lag behind real time. The example watch you show here assumes the bucket_span is 15m. If you want to look back for 3 consecutive anomalies, you're going to have to look farther back in time (4 bucket_spans' worth: the normal 2, and then 2 more). In fact, more of the query section of the Watch will probably need to change in order to match only 3 adjacent buckets, or the condition (or action) of the Watch will have to implement that logic of matching 3 adjacent buckets.

So, before exploring option 2 (which is more work): is option 1 something you'd consider? By the way, if you do choose option 1, the range part of the watch shown above will also need to be adjusted so that it looks back 2x the new bucket_span.
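To make the look-back arithmetic concrete, here is a minimal Python sketch. The function name `lookback_range` and its parameters are hypothetical (they are not part of Watcher), and the rule "one extra bucket_span per additional consecutive bucket" is an assumption generalized from the "normal 2 and then 2 more" figure above.

```python
def lookback_range(bucket_span_minutes: int, consecutive_buckets: int = 1) -> str:
    """Return a date-math lower bound (e.g. 'now-30m') for the watch's range filter.

    Assumption: 2 bucket spans are needed to see the latest result, and each
    additional consecutive bucket to match adds one more span.
    """
    if bucket_span_minutes <= 0 or consecutive_buckets < 1:
        raise ValueError("bucket_span_minutes must be > 0 and consecutive_buckets >= 1")
    spans = 2 + (consecutive_buckets - 1)
    return f"now-{spans * bucket_span_minutes}m"


# 15m job, latest anomaly only: the value used in the sample watch.
print(lookback_range(15))
# 15m job, 3 consecutive buckets: 4 bucket spans.
print(lookback_range(15, consecutive_buckets=3))
# Option 1: a 45m job, latest anomaly only.
print(lookback_range(45))
```

The returned string would go into the "gte" field of the range filter on "timestamp".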

AmS · May 24, 2022, 3:20pm

Hello,
Thanks,

I already tried option 1; setting the bucket_span to 45 minutes rather than 15m does not resolve the situation.

I will try option 2. But maybe I did not explain my problem very well:

I configured an ML job with bucket_span = 15m
Each day I send a notification via Watcher with the anomalies of the last 24 hours
I chose to notify on anomalies whose score is greater than 40.
=> The problem is that I get notifications for anomalies that are raised only because values were higher for a few minutes, less than the bucket span.

I want to be notified only about anomalies that persist for at least 45 or 60 minutes.

Regards
Amine

richcollier · May 25, 2022, 5:36pm

Thanks for the information, but I think your requirements now raise some additional questions:

1. If your job has a 15m bucket_span and you run your Watch once per day, then there are 96 bucket_spans per day. So, what if there are multiple anomalous intervals of 3 or more consecutive buckets for an entity? For example, entity EV3 was anomalous for an hour from 9:00-10:00, for 45 minutes from 12:00-12:45, and for one hour and fifteen minutes from 22:00-23:15. Are these 3 separate alerts, or do you just want to know that EV3 was anomalous today?
2. Does it matter if the anomaly score is drastically different within a run of 3 or more consecutive buckets? For example, if EV3 was anomalous for 45 minutes from 12:00-12:45 but the anomaly scores were 90, 30, and 85, this would be 3 consecutive buckets, but one of the buckets is below your threshold of 40. Would this be an alert in your book?
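To illustrate what "multiple anomalous intervals" means here, the following Python sketch groups the start times of anomalous buckets into runs of adjacent buckets. It is only a model of the question, not Watcher code: `consecutive_runs` and `bucket_times` are hypothetical helpers, and the timestamps are the EV3 example above plus one made-up isolated bucket at 15:00. Which buckets count as anomalous (for instance, whether a bucket scoring 30 is included) is decided before calling it.

```python
from datetime import datetime, timedelta

SPAN = timedelta(minutes=15)  # assumed bucket_span of the job


def bucket_times(start, count, span=SPAN):
    """Start times of `count` adjacent buckets beginning at `start`."""
    return [start + i * span for i in range(count)]


def consecutive_runs(bucket_starts, span=SPAN, min_buckets=3):
    """Return (first_bucket_start, bucket_count) for every run of adjacent
    anomalous buckets that is at least `min_buckets` long."""
    runs = []
    run_start, count, previous = None, 0, None
    for ts in sorted(set(bucket_starts)):
        if previous is not None and ts - previous == span:
            count += 1  # bucket directly follows the previous one
        else:
            if count >= min_buckets:
                runs.append((run_start, count))
            run_start, count = ts, 1  # start a new run
        previous = ts
    if count >= min_buckets:
        runs.append((run_start, count))
    return runs


day = datetime(2022, 5, 20)
ev3 = (
    bucket_times(day.replace(hour=9), 4)      # 9:00-10:00
    + bucket_times(day.replace(hour=12), 3)   # 12:00-12:45
    + bucket_times(day.replace(hour=15), 1)   # isolated single bucket
    + bucket_times(day.replace(hour=22), 5)   # 22:00-23:15
)
runs = consecutive_runs(ev3)
print(len(runs), "qualifying runs; anomalous today:", bool(runs))
```

Reporting each element of `runs` would give 3 separate alerts, while `bool(runs)` answers only "was EV3 anomalous today".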

AmS · June 2, 2022, 8:22am

Hello
Thanks for your interest in my use case. Below are my answers to your questions:

1. I just need to know whether EV3 was anomalous today. It does not matter whether there are one or two runs of "at least 3 anomalous buckets". In my case, I want to be notified when at least 45 consecutive minutes of anomalies occurred during the day.
2. It does not matter if the score is different. In my case I only need continuous anomalies.

I would like to suppress notifications for short-duration anomalies, which are acceptable in my case. These short-duration anomalies may have a high score, but since they don't last long, the score doesn't matter in this case.

Regards

richcollier · June 3, 2022, 12:55pm

Well, there are probably many ways this could be solved, but here's one approach (see example):

1. Run once per day and look over the last 24 hours (the range in the example needs to be modified here because the example uses old data, not live data).
2. Filter your query by job_id and result_type:record.
3. Do a terms aggregation on the partition field.
4. Do a date_histogram sub-aggregation with an interval that matches the bucket_span of the job (the example shown had a 1m bucket span due to the data set being used in order to have consecutive anomalous buckets, so you would need to change to 15m).
5. Use the moving_fn aggregation to invoke a 3 bucket sliding window sum of the record_score.
6. Use a bucket_selector aggregation to eliminate any individual values where the record_score is below some arbitrary value (I chose 40).
7. The condition script loops through and finds if any 3 bucket sliding window sum of the record_score is greater than some arbitrary value (I chose 120).
8. In the actions section, gather up all of the partitions that violated the threshold and print them with the latest timestamp at which they violated (obviously use your preferred action method).
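The selection logic in these steps can be modeled outside Elasticsearch. The Python sketch below is a rough model under stated assumptions, not the linked watch: `partitions_to_alert` is a hypothetical function, each partition is assumed to have one score per bucket in time order (0 for buckets without anomalies), and the window is assumed to include the current bucket. In the real moving_fn aggregation, window alignment depends on its settings.

```python
def partitions_to_alert(scores_by_partition, window=3, bucket_min=40.0, window_sum_min=120.0):
    """Return {partition: latest timestamp at which the window condition held}.

    scores_by_partition maps a partition name to a time-ordered list of
    (timestamp, record_score) pairs, one pair per bucket_span.
    """
    alerts = {}
    for partition, series in scores_by_partition.items():
        for i in range(window - 1, len(series)):
            ts, score = series[i]
            # Sliding sum over the current bucket and the (window - 1) before it.
            window_sum = sum(s for _, s in series[i - window + 1 : i + 1])
            # Drop buckets whose own score is too low, then test the window sum.
            if score >= bucket_min and window_sum > window_sum_min:
                alerts[partition] = ts  # later matches overwrite earlier ones
    return alerts


# Made-up scores for illustration only.
sample = {
    "AAL": [("12:29", 0.0), ("12:30", 50.0), ("12:31", 60.0), ("12:32", 70.0), ("12:33", 0.0)],
    "XYZ": [("12:29", 0.0), ("12:30", 95.0), ("12:31", 0.0), ("12:32", 0.0), ("12:33", 0.0)],
}
for name, ts in partitions_to_alert(sample).items():
    print(f"{name} had 3 anomalies in a row at {ts}")
```

In this model, a single high-scoring bucket such as XYZ's never triggers, because its window sum stays below the threshold. Note that a window sum above the threshold does not by itself prove that all three buckets were anomalous (scores of 90, 0, 90 would also pass), so a stricter variant could require every bucket in the window to reach `bucket_min`.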

An example output is:

          Anomalies:
          ==========
          AAL had 3 anomalies in a row at 2021-02-10T12:32:00.000Z
          AWE had 3 anomalies in a row at 2021-02-10T19:19:00.000Z
          AMX had 3 anomalies in a row at 2021-02-10T22:10:00.000Z

system · July 1, 2022, 12:55pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
