I'm monitoring HTTP status codes over time. Somehow the model built something for code "415". At some point in the past we had some spikes of 415s, but they disappeared, as seen in this picture:
After a long time, a value of zero was somehow judged to be outside the norm, and the job is now constantly alerting.
How can I prevent machine learning from learning bad behavior, or simply make it ignore it? There is the scheduling calendar, but most of what I need to ignore is in the past.
I created some rules to skip results when the actual value is below 1. We'll see if that works.
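For reference, a rule like that could be expressed in the job's detector config roughly as follows. This is a minimal sketch, assuming a `count` detector partitioned on a field called `status_code` (a hypothetical name; substitute whatever your index uses). It only builds the JSON; it does not call Elasticsearch.

```python
import json

# Hypothetical field name: replace with the status-code field in your index.
STATUS_FIELD = "status_code"


def skip_below_one_detector(field=STATUS_FIELD):
    """Build a count detector whose rule skips results when actual < 1."""
    return {
        "function": "count",
        "partition_field_name": field,
        "custom_rules": [
            {
                # skip_result suppresses the anomaly record for matching buckets.
                "actions": ["skip_result"],
                "conditions": [
                    {"applies_to": "actual", "operator": "lt", "value": 1.0}
                ],
            }
        ],
    }


detector = skip_below_one_detector()
print(json.dumps(detector, indent=2))
```

Note that `skip_result` on its own only hides the result; the model is still updated with the value unless `skip_model_update` is also listed in `actions`.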
There are a variety of ways to accomplish what you want:
1. Exclude the status code from the ML job's datafeed. Create a saved search in Kibana that filters out what you don't want, then use that as the basis of your ML job.
2. Use a custom rule to exclude the status codes you don't want to analyze, via scope and filter lists.
3. Use the non_zero_count function so that only non-zero values are counted (and modeled).
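As a rough illustration of options 1 and 3, here is what the corresponding config fragments might look like if you define the job through the API instead of a Kibana saved search. The field name `status_code` is an assumption, and the helper names are made up for this sketch; it only builds the JSON.

```python
import json

# Hypothetical field name: replace with the status-code field in your index.
STATUS_FIELD = "status_code"


def datafeed_query_excluding(codes, field=STATUS_FIELD):
    """Option 1: a datafeed query that drops documents with the given codes."""
    return {"bool": {"must_not": [{"terms": {field: list(codes)}}]}}


def non_zero_count_detector(field=STATUS_FIELD):
    """Option 3: a detector that ignores buckets where the count is zero."""
    return {"function": "non_zero_count", "partition_field_name": field}


query = datafeed_query_excluding([415])
detector = non_zero_count_detector()
print(json.dumps({"query": query, "detector": detector}, indent=2))
```

The trade-off is the one you would expect: with option 1 the job never sees 415s at all, while option 3 still models them but treats an empty bucket as missing data and not as a zero.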
I think the easiest in your situation is probably #2: create a filter list.
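If you prefer the API over the UI, a filter list plus a scoped rule might look something like this. It is a sketch under assumptions: the filter ID `ignored_status_codes` and the field name `status_code` are hypothetical, and the scoped field has to be a by, over, or partition field of the detector. The filter body would be sent with `PUT _ml/filters/<filter_id>`; nothing here talks to a cluster.

```python
import json

# Hypothetical names: adjust to your own job and index.
STATUS_FIELD = "status_code"
FILTER_ID = "ignored_status_codes"


def filter_body(codes):
    """Body for PUT _ml/filters/<filter_id>; filter items are strings."""
    return {
        "description": "Status codes to skip in results",
        "items": [str(code) for code in codes],
    }


def scoped_detector(filter_id=FILTER_ID, field=STATUS_FIELD):
    """Count detector whose rule skips results for values in the filter."""
    return {
        "function": "count",
        "partition_field_name": field,
        "custom_rules": [
            {
                "actions": ["skip_result"],
                "scope": {
                    field: {"filter_id": filter_id, "filter_type": "include"}
                },
            }
        ],
    }


body = filter_body([415])
detector = scoped_detector()
print(json.dumps({"filter": body, "detector": detector}, indent=2))
```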
I kind of like your suggestions, although if we receive a spike of 415 errors again, we won't be alerted on that condition. The model made itself inaccurate over time, and it wasn't an issue until 1 month later.
We have a few other conditions where the model learned "bad behavior" over time, and it seems like we can't correct it. After an event has occurred is when we should be able to say "ignore this in the model, that's not normal".
Not to be pedantic, but remember that ML doesn't know what is "good" or "bad" behavior; it only knows what is "normal" or "abnormal". If you had a steady rate of 415s, then that was normal. When the rate dropped to zero, that was "abnormal" (given the past behavior seen). The models were then shown that zero occurrences of 415s were actually quite normal, and they learned that as it was reinforced for 7+ weeks. When the 415s recurred, the recurrence itself was again abnormal.
While you currently cannot easily point-and-click on past time regions in the UI and ignore them, there are other ways in which you can "erase" windows of time:
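One approach that fits this description, offered as an assumption and not necessarily what was meant here, is to revert the job to a model snapshot taken before the unwanted period and discard the results produced since. The sketch below only assembles the request for the revert model snapshot API; the job ID and snapshot ID are hypothetical placeholders, and the job has to be closed before reverting.

```python
import json


def revert_snapshot_request(job_id, snapshot_id):
    """Build the method, path and body for reverting to a model snapshot."""
    path = (
        f"/_ml/anomaly_detectors/{job_id}"
        f"/model_snapshots/{snapshot_id}/_revert"
    )
    # Also delete the results created after the snapshot was taken.
    body = {"delete_intervening_results": True}
    return "POST", path, body


# Hypothetical IDs, for illustration only.
method, path, body = revert_snapshot_request("http_status_codes", "my-snapshot-id")
print(method, path)
print(json.dumps(body))
```

After the revert, the datafeed would be restarted from the snapshot's timestamp so the data is replayed through the restored model.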