Unexpected zero value

I'm monitoring HTTP status codes over time. Somehow the job built a model for code "415". At some point in the past, we had some spikes of 415s, but they disappeared, as seen in this picture:

Zoomed in on what changed the model:

Somehow, after a long time, a value of zero was judged to be outside the norm and is now constantly raising alarms.

How can I prevent machine learning from learning bad stuff, or simply make it ignore it? There is the scheduling calendar, but most of what I need to ignore is in the past.

I created some rules to skip results when the actual value is below 1. I'll see if that works.
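For reference, a rule like that could look roughly like the sketch below. It only builds the detector JSON as a Python dict and does not call Elasticsearch. The field name `status_code` and the `count` function are assumptions for illustration; the real job may use different ones.

```python
import json

# Custom rule: skip the result whenever the actual value in the bucket is below 1.
# "skip_result", "conditions", "applies_to" and "operator" are the rule keys
# used by ML custom rules; the threshold of 1 comes from the description above.
rule = {
    "actions": ["skip_result"],
    "conditions": [
        {"applies_to": "actual", "operator": "lt", "value": 1}
    ],
}

# Hypothetical detector: count of documents split by status code.
# "status_code" is an assumed field name, not taken from the real job.
detector = {
    "function": "count",
    "partition_field_name": "status_code",
    "custom_rules": [rule],
}

print(json.dumps(detector, indent=2))
```

Unlike a rule scoped to a filter list, a condition on the actual value still lets a new spike of 415s produce results, since only near-zero buckets are skipped.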

Steve

Hi Steve,

There are a variety of ways to accomplish what you want:

  1. Exclude any unwanted status codes from the ML job's datafeed. Create a saved search in Kibana that filters out what you don't want and then use that as the basis of your ML job.
  2. Use a custom rule to exclude the status codes you don't want to analyze - via scope and filter lists.
  3. Use the non_zero_count function so that only non-zero counts are modeled (buckets with a zero count are ignored).
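As a rough illustration of option 3, the sketch below builds an `analysis_config` with a `non_zero_count` detector as a plain Python dict. The field name `status_code` and the 15-minute bucket span are assumptions, not values from the actual job.

```python
import json

def build_non_zero_count_detector(partition_field):
    """Return a detector that models counts per value of partition_field,
    ignoring buckets in which the count is zero."""
    return {
        "function": "non_zero_count",
        "partition_field_name": partition_field,
        "detector_description": f"non-zero count by {partition_field}",
    }

# "status_code" and "15m" are hypothetical values chosen for the example.
analysis_config = {
    "bucket_span": "15m",
    "detectors": [build_non_zero_count_detector("status_code")],
}

print(json.dumps(analysis_config, indent=2))
```

With this function, a long stretch of zero 415s is treated as a gap rather than as a run of zero values, so it is neither learned nor flagged.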

I think in your situation, the easiest is probably #2 - create a filter list:

Then add a Custom Rule to ignore (skip) results created when the status code is in that list:
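The same two steps can also be expressed as API request bodies. The sketch below only builds them as Python dicts and sends nothing. The filter ID `ignored_status_codes` and the field name `status_code` are made-up names for illustration, and the scoped field is assumed to be the detector's by, over, or partition field.

```python
import json

def build_filter_request(filter_id, codes):
    """Build the path and body for creating an ML filter list.
    Filter items are strings, so the codes are converted."""
    path = f"/_ml/filters/{filter_id}"
    body = {
        "description": "Status codes to leave out of the results",
        "items": [str(code) for code in codes],
    }
    return path, body

def build_scoped_skip_rule(field, filter_id):
    """Build a custom rule that skips results when `field` has a value
    that appears in the given filter list."""
    return {
        "actions": ["skip_result"],
        "scope": {field: {"filter_id": filter_id, "filter_type": "include"}},
    }

# Hypothetical names: adjust to the real job and field.
filter_path, filter_body = build_filter_request("ignored_status_codes", [415])
detector = {
    "function": "count",
    "partition_field_name": "status_code",
    "custom_rules": [build_scoped_skip_rule("status_code", "ignored_status_codes")],
}

print("PUT", filter_path)
print(json.dumps(filter_body, indent=2))
print(json.dumps(detector, indent=2))
```

Because the rule is scoped to the filter list, codes can later be added to or removed from the list without changing the rule itself.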

I kind of like your suggestions, although if we receive a spike of 415 errors again, we won't be alerted to that condition. The model made itself inaccurate over time, and it wasn't an issue until 1 month later.

We have a few other conditions where the model learned "bad behavior" over time, and it seems like we can't correct it. After an event has occurred is when we should be able to say "ignore this in the model, that's not normal".

Not to be pedantic, but remember that ML doesn't know what is "good" or "bad" behavior - it just knows what is "normal" or "abnormal". If you had a steady rate of 415s, then that was normal. When it dropped to zero, that was "abnormal" (given the past behavior seen). ML's models were then shown that zero occurrences of 415s were actually quite normal, and they learned that as it was reinforced for 7+ weeks. When the 415s recurred, the act of them recurring was, again, abnormal.

While you currently cannot easily point-and-click on past time regions in the UI and ignore them, there are other ways in which you can "erase" windows of time:

  1. If in the future: Define calendar events for the times to ignore (this is available in both the UI and API) - https://www.elastic.co/blog/scheduled-events-and-the-amorous-anomaly-elasticsearch-machine-learning
  2. If in the past: Revert the job to a previous snapshot of the model, taken before the time window you want to ignore (https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-revert-snapshot.html)
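For the snapshot route, one possible way to choose which snapshot to revert to is sketched below. It works on a list of snapshot dicts shaped like the model snapshots API response (`snapshot_id` plus `latest_record_time_stamp` in epoch milliseconds) and builds the revert request without sending it. The job ID, snapshot IDs, and dates are invented for the example.

```python
from datetime import datetime, timezone

def to_ms(dt):
    """Convert an aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)

def pick_snapshot_before(snapshots, cutoff):
    """Return the newest snapshot whose latest processed record is
    older than `cutoff`, or None if there is no such snapshot."""
    cutoff_ms = to_ms(cutoff)
    candidates = [s for s in snapshots if s["latest_record_time_stamp"] < cutoff_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["latest_record_time_stamp"])

def build_revert_request(job_id, snapshot_id):
    """Build the path and body for reverting a job to a snapshot.
    delete_intervening_results drops results newer than the snapshot."""
    path = f"/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_revert"
    return path, {"delete_intervening_results": True}

# Invented sample data standing in for the snapshot list of a job.
snapshots = [
    {"snapshot_id": "snap-a", "latest_record_time_stamp": to_ms(datetime(2019, 1, 1, tzinfo=timezone.utc))},
    {"snapshot_id": "snap-b", "latest_record_time_stamp": to_ms(datetime(2019, 2, 1, tzinfo=timezone.utc))},
    {"snapshot_id": "snap-c", "latest_record_time_stamp": to_ms(datetime(2019, 3, 1, tzinfo=timezone.utc))},
]

# Start of the window to forget (hypothetical date).
window_start = datetime(2019, 2, 15, tzinfo=timezone.utc)
chosen = pick_snapshot_before(snapshots, window_start)
revert_path, revert_body = build_revert_request("http-status-codes", chosen["snapshot_id"])

print("POST", revert_path, revert_body)
```

The job has to be closed before it can be reverted, and the datafeed then needs to be restarted from the snapshot's point in time so the data after it is re-analyzed.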

Misuse of words... good->normal and bad->abnormal.

#1 is difficult, since it's hard to predict what is going to be abnormal.
I think I can work with #2. I'll give it a shot and see what that gives in the future.

Would just be nice to use the UI to select the window of the abnormal event and have it magically ignored from the model.

Thanks for your help.

Steve
