Multi-Bucket Scoring Machine Learning

Tom_Veasey · June 19, 2019, 9:14am

We're working on a more detailed blog post about the multi-bucket feature, but I'll include some details here. We look at the difference between our predictions and the observed values over an extended period: a sliding window of the past twelve buckets. We learn a distribution for this feature and then compute anomalousness from this distribution, as we do for single bucket features. We respect sidedness when we compute how unusual this feature value is, so if you've selected high_count the values have to be high on average over the sliding window (this doesn't mean they are high in the most recent bucket).
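To make the windowing concrete, here is a minimal sketch of such a feature. It assumes a plain unweighted mean of the residuals over the last twelve buckets; the actual aggregation used by the product isn't spelled out here, and the function name `multi_bucket_feature` is purely illustrative.

```python
from collections import deque


def multi_bucket_feature(observed, predicted, window=12):
    """Return the windowed mean residual for each bucket.

    Assumption: a plain unweighted mean of (observed - predicted) over the
    last `window` buckets; the real feature may be computed differently.
    """
    residuals = deque(maxlen=window)  # keeps only the most recent buckets
    features = []
    for obs, pred in zip(observed, predicted):
        residuals.append(obs - pred)
        features.append(sum(residuals) / len(residuals))
    return features


# Every bucket sits a little above its prediction, so the windowed mean stays
# positive even though no single bucket deviates by much.
observed = [11.0, 12.0, 11.5, 12.5]
predicted = [10.0, 10.0, 10.0, 10.0]
features = multi_bucket_feature(observed, predicted)
```

For a high-sided detector such as high_count, only a positive windowed value would be a candidate for being unusual under this sketch.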

The impact factor is a sliding scale from "exclusively due to single bucket" when it's -5 to "exclusively due to multi-bucket" when it's 5. When the factor is 0 the contributions are roughly equal. In the UI the cross is displayed, i.e. the multi-bucket effect is considered high, when this factor is greater than 2.
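As a rough illustration of how to read the scale, the sketch below maps an impact factor to a description and applies the "greater than 2" rule for the cross. The function names and the intermediate "mostly" labels are my own shorthand, not anything the product exposes.

```python
def describe_impact(impact):
    """Describe where an impact factor falls on the -5..5 scale."""
    if not -5.0 <= impact <= 5.0:
        raise ValueError("impact factor is expected to lie in [-5, 5]")
    if impact == -5.0:
        return "exclusively single bucket"
    if impact == 5.0:
        return "exclusively multi-bucket"
    if impact < 0:
        return "mostly single bucket"  # illustrative label
    if impact > 0:
        return "mostly multi-bucket"  # illustrative label
    return "roughly equal contributions"


def shows_cross(impact):
    """True when the UI would display the cross (factor greater than 2)."""
    return impact > 2.0
```

Note that a factor of exactly 2 does not display the cross under this reading, since the condition is strictly greater than 2.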

It is worth mentioning that we have never rate limited anomaly scores, and repeated anomalies can occur without multi-bucket as well. We have always seen this as a function of the alerting layer on top of the raw results, i.e. don't send an alert if the score is less than or equal to that of the previous bucket, or even if the severity is less than or equal. However, I realise multi-bucket has made it more likely that one will get an extended period of anomalies.
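A minimal sketch of that kind of alerting-layer rule might look like the following. The `min_score` threshold and the function name are assumptions added for illustration; the only rule taken from the text above is "don't alert if the score is less than or equal to the previous bucket's".

```python
def buckets_to_alert(scores, min_score=75.0):
    """Return indices of buckets that should trigger an alert.

    A bucket alerts only if its score reaches `min_score` (hypothetical
    threshold) and is strictly higher than the previous bucket's score.
    """
    alerts = []
    previous = 0.0  # treat the bucket before the first as not anomalous
    for index, score in enumerate(scores):
        if score >= min_score and score > previous:
            alerts.append(index)
        previous = score
    return alerts


# An extended anomaly: the repeated 80 is suppressed, the rise to 85 alerts,
# and so does the jump back up to 90 after the dip to 60.
alerts = buckets_to_alert([10.0, 80.0, 80.0, 85.0, 60.0, 90.0])
```

The same comparison could be done on severity bands instead of raw scores, which suppresses more aggressively.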

In our testing we found that including the multi-bucket feature was useful because it allowed us to deal better with misconfigured bucket lengths and to detect important events which were missed altogether without it. However, we have had some feedback that it is currently rather sensitive. I've made a couple of changes aimed at reducing sensitivity, which will be available in 7.3; see this and this commit.

We intended that the impact factor could essentially be used to filter out the multi-bucket results, as a proxy for disabling the feature if the user wanted. In any case, we've considered adding more advanced configuration options in the past and this could be a good candidate.
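For example, a client-side filter along these lines could drop results dominated by the multi-bucket effect. This is only a sketch: it assumes each record is a dict carrying a `multi_bucket_impact` value, that a missing value means a purely single bucket result, and that 2 (the UI's cross threshold) is a sensible cut-off.

```python
def drop_multi_bucket_records(records, max_impact=2.0):
    """Keep only records whose impact factor is at most `max_impact`.

    Assumption: records without a `multi_bucket_impact` key are treated as
    purely single bucket (-5.0) and are therefore kept.
    """
    return [
        record
        for record in records
        if record.get("multi_bucket_impact", -5.0) <= max_impact
    ]


records = [
    {"record_score": 80.0, "multi_bucket_impact": 4.0},
    {"record_score": 60.0, "multi_bucket_impact": -5.0},
    {"record_score": 50.0},
]
kept = drop_multi_bucket_records(records)
```

Lowering `max_impact` filters more aggressively; setting it to 5 keeps everything.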
