Alert by influencer on ML

I have a monitor that tracks 6 message queues that come into my system. I record the stats in an index capturing if the queue has been idle at all and what volume of data has been going through it.

If I create an ML job with a query that monitors the stats on those queues alone, and set the queue names as influencers, would it alert me if one of the individual queues showed an anomaly over the score listed in the watch? Or would the overall score need to breach before an alert was raised?

I created the watch just by starting the datafeed and telling Elastic to create the watch for me, so it may be that I need to change that default behaviour to get it alerting the way I want. When I've tried to tweak the thresholds down, though, it seems that it isn't alerting on individual queues, only when the total goes over. So I've been setting up targeted ML jobs on a queue-by-queue basis for those that I really need to observe.

What I'd really like, though, is a blanket job that monitors all the queues with the queue name as an influencer, with the ML accepting that not all queues are the same, and then flags anomalies that occur at the queue level.

If anyone has any advice on how to go about this, it would be much appreciated.

Kind regards and thanks in advance
Ant

A few things here:

  • You need to be clear as to whether or not it is better to split the analysis by queue name so that you're getting independent analysis per queue. Without knowing more about your data, I cannot recommend if splitting the analysis per queue is better or not (I suspect it is).
  • Splitting is a different concept than influencers. An entity is identified by ML as an influencer if it has contributed significantly to the existence of the anomaly. This notion of deciding influential entities is completely independent of whether or not the job is split. An entity can be deemed influential on an anomaly only if an anomaly happens in the first place. If there is no anomaly detected, there is no need to figure out if there is an influencer.
  • It is recommended that fields used to split the analysis always also be declared as influencers. However, it is okay to have fields declared as influencers that are not used as splits in the analysis.
  • Watches can search (and therefore alert on) the ML anomaly results at different levels of abstraction (record, influencer, and bucket). See more information in this blog: https://www.elastic.co/blog/machine-learning-anomaly-scoring-elasticsearch-how-it-works. The "default" watch that ML creates for you from the UI only queries at the bucket level, which would explain why you only see alerts when the overall score goes over. If you want something different, you need to modify that watch accordingly.
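
To make the split and the record-level query a bit more concrete, here is a rough sketch in Python that only builds the two JSON bodies and prints them. It is not taken from your setup: the field names `volume` and `queue_name`, the job id `queue-stats`, the `mean` function, the 15 minute bucket span, and the score threshold are all assumptions you would swap for your own. The first body is the kind of thing you would send to `PUT _ml/anomaly_detectors/<job_id>`; the second is a search against the `.ml-anomalies-*` results indices that could replace the bucket-level search in the watch input.

```python
import json


def build_split_job(metric_field="volume", split_field="queue_name", bucket_span="15m"):
    """Job body with one independent model per queue (split via partition_field_name).

    metric_field and split_field are placeholder names, not fields from a real index.
    """
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [
                {
                    "function": "mean",
                    "field_name": metric_field,
                    # The split: each distinct queue name gets its own baseline.
                    "partition_field_name": split_field,
                }
            ],
            # The split field is also declared as an influencer, as recommended above.
            "influencers": [split_field],
        },
        "data_description": {"time_field": "@timestamp"},
    }


def build_record_query(job_id, min_score=75, lookback="now-30m"):
    """Search body for record-level results, i.e. anomalies per queue
    rather than the aggregated bucket score the default watch looks at."""
    return {
        "size": 100,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"job_id": job_id}},
                    {"term": {"result_type": "record"}},
                    {"range": {"record_score": {"gte": min_score}}},
                    {"range": {"timestamp": {"gte": lookback}}},
                ]
            }
        },
        # partition_field_value carries the queue name for a split job.
        "_source": ["timestamp", "partition_field_value", "record_score"],
    }


if __name__ == "__main__":
    print(json.dumps(build_split_job(), indent=2))
    print(json.dumps(build_record_query("queue-stats"), indent=2))
```

If you would rather alert at the influencer level, the same query shape should work with `result_type` set to `influencer` and the range filter on `influencer_score` instead of `record_score`.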

@richcollier You may be fast becoming my #1 elastic team member!

In my case, the 6 queues I want to monitor each have different behaviour patterns, and data will ebb and flow on each at different rates at different times of day. From what you're saying, splitting is almost certainly the way I want to go, so I'll set up some jobs with that to gain a better understanding of it.

I will take a look at the blog a bit later, but from skimming over it, there look to be some good insights there. I thought that I might need to play about with the watches that were set up. When I first opened the default one it was a bit overwhelming, but I will look into the documentation and see what I can find. If you have any advice on a good place to start with that, it would be appreciated.

Thank you again!
Ant

Shameless self-promotion, but my book "Machine Learning with the Elastic Stack" is pretty much the definitive guide to all aspects of the Elastic ML functionality.
