Machine Learning module is triggering alerts when there is no anomaly

This all makes a ton of sense, and we'll create a watch that focuses on result_type:record, as opposed to the automatically generated watch, which I gather focuses on result_type:bucket.

I'm still not understanding the anomaly scoring, though.

I understand that a record anomaly is created (and scored) for each anomalous entity identified by a detector. But when you say "the bucket-level score is the aggregation of all anomalies in that bucket," does that mean it simply adds together the anomaly scores of all the anomalies?

If I go back to my April 27th 2019 10:00 example, I see two record anomalies, each with an anomaly_score of 93.994. But the bucket anomaly for the same hour and day has a max severity of 59. How do two 93.994s become a 59?
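(For anyone comparing the two result types side by side, a query along these lines returns both the bucket and the record documents for a given hour; this is just a sketch, using the job name and hour from the example above:)

GET .ml-anomalies-*/_search
{
  "size": 10,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "job_id": "sonicwall-anomalies"
          }
        },
        {
          "terms": {
            "result_type": [
              "bucket",
              "record"
            ]
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2019-04-27T10:00:00Z",
              "lt": "2019-04-27T11:00:00Z"
            }
          }
        }
      ]
    }
  }
}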

Thanks again for all of your help.

The bucket-level score isn't a simple sum() or max() of the constituent individual anomalies in that bucket. It is more like a "joint probability" (the likelihood of those things being unusual together).

It can be more intuitive to think of it as: "given all the anomalies (and their individual scores) that exist in that bucket, how unusual was that time bucket?" So it is not just the number of anomalies that matters, but also their severity. Therefore, a small number of massively anomalous hosts should get a higher bucket score than a larger number of very minor anomalies.
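A toy illustration of the idea (a sketch of the intuition only, not the actual implementation): suppose two records in a bucket each occur with probability p = 0.0001. The chance of seeing both together is on the order of p × p = 0.00000001, and it is that joint quantity, renormalized onto the 0-100 scale against the job's history of whole buckets, that drives the bucket score. Record scores are renormalized against the history of individual records, so the two scales don't line up one-to-one, and two records scoring 93.994 can legitimately sit inside a bucket scoring 59.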

Got it. Thanks very much, Rich.


Hey Rich - does this look like the query logic for a watch that will accomplish the goal of triggering on record-level anomalies? Thanks!

{
  "trigger": {
    "schedule": {
      "interval": "106s"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": [
          ".ml-anomalies-*"
        ],
        "types": [],
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "job_id": "sonicwall-anomalies"
                  }
                },
                {
                  "range": {
                    "timestamp": {
                      "gte": "now-120m"
                    }
                  }
                },
                {
                  "terms": {
                    "result_type": [
                      "record"
                    ]
                  }
                }
              ]
            }
          },
          "aggs": {
            "bucket_results": {
              "filter": {
                "range": {
                  "anomaly_score": {
                    "gte": 90
                  }
                }
              },
              "aggs": {
                "top_bucket_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "anomaly_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "job_id",
                        "result_type",
                        "timestamp",
                        "anomaly_score",
                        "is_interim"
                      ]
                    },
                    "size": 1,
                    "script_fields": {
                      "start": {
                        "script": {
                          "lang": "painless",
                          "inline": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].date.getMillis()-((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC).toString()+\":00.000Z\"",
                          "params": {
                            "padding": 10
                          }
                        }
                      },
                      "end": {
                        "script": {
                          "lang": "painless",
                          "inline": "LocalDateTime.ofEpochSecond((doc[\"timestamp\"].date.getMillis()+((doc[\"bucket_span\"].value * 1000)\n * params.padding)) / 1000, 0, ZoneOffset.UTC).toString()+\":00.000Z\"",
                          "params": {
                            "padding": 10
                          }
                        }
                      },
                      "timestamp_epoch": {
                        "script": {
                          "lang": "painless",
                          "inline": "doc[\"timestamp\"].date.getMillis()/1000"
                        }
                      },
                      "timestamp_iso8601": {
                        "script": {
                          "lang": "painless",
                          "inline": "doc[\"timestamp\"].date"
                        }
                      },
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"anomaly_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            },
            "influencer_results": {
              "filter": {
                "range": {
                  "influencer_score": {
                    "gte": 3
                  }
                }
              },
              "aggs": {
                "top_influencer_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "influencer_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "result_type",
                        "timestamp",
                        "influencer_field_name",
                        "influencer_field_value",
                        "influencer_score",
                        "isInterim"
                      ]
                    },
                    "size": 3,
                    "script_fields": {
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"influencer_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            },
            "record_results": {
              "filter": {
                "range": {
                  "record_score": {
                    "gte": 3
                  }
                }
              },
              "aggs": {
                "top_record_hits": {
                  "top_hits": {
                    "sort": [
                      {
                        "record_score": {
                          "order": "desc"
                        }
                      }
                    ],
                    "_source": {
                      "includes": [
                        "result_type",
                        "timestamp",
                        "record_score",
                        "is_interim",
                        "function",
                        "field_name",
                        "by_field_value",
                        "over_field_value",
                        "partition_field_value"
                      ]
                    },
                    "size": 3,
                    "script_fields": {
                      "score": {
                        "script": {
                          "lang": "painless",
                          "inline": "Math.round(doc[\"record_score\"].value)"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.aggregations.bucket_results.doc_count": {
        "gt": 0
      }
    }
  },

Seems like it - but you'll still need a way to manage the multiple results that you get (there are likely to be more than one anomaly record returned by each query).
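For example, a minimal sketch of an actions block that iterates over the returned record hits with a Mustache section (the action name and message format here are placeholders, and fields like by_field_value depend on how the job is configured):

"actions": {
  "log_records": {
    "logging": {
      "text": "Anomalous records:{{#ctx.payload.aggregations.record_results.top_record_hits.hits.hits}} [{{_source.timestamp}}] {{_source.by_field_value}} score={{fields.score.0}}{{/ctx.payload.aggregations.record_results.top_record_hits.hits.hits}}"
    }
  }
}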

I assume that you'll do something similar to the example I showed in:

Great, thanks Rich!
