What kind of fine-tuning is available for Anomaly Detection? I'll give an example:
Suppose we have a time series where we expect a certain number of events (e.g. 1000-5000) per day.
In our evaluation, if we have a stream of values [..., 1000, 0, 0, 0, ...], the detection will flag the first 0 as a warning, but it then treats the following zeroes as less severe and seems to accept the 0s as "normal".
Is this behaviour configurable in any way? Or is it just the nature of the unsupervised ML algorithm?
It's true that the nature of unsupervised learning means the ML makes these judgements without human intervention. That said, you do have control over several aspects:

- If you don't want counts of zero included in the modeling, choose the non_zero_count based functions over the count ones.
- For numerical fields, you can run "one-sided" analyses (for example, choose max if you don't care about anomalies on the low side of things).
- You can also adjust the bucket_span if you want to "smooth out" the data before it is modeled.
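As a rough sketch, a job configuration using those options might look like the following. The job name, field name, and bucket span here are purely illustrative; adjust them to your own data:

```
PUT _ml/anomaly_detectors/event_rate_example
{
  "analysis_config": {
    "bucket_span": "1h",
    "detectors": [
      {
        "detector_description": "event rate; non_zero_count skips empty buckets instead of modeling them as 0",
        "function": "non_zero_count"
      },
      {
        "detector_description": "one-sided: max only flags unusually high values of a numeric field",
        "function": "max",
        "field_name": "response_time"
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
```

Note the trade-off for your specific scenario: non_zero_count will stop the model from learning that zeroes are "normal", but it also means a sudden run of zeroes is treated as a gap in the data rather than as an anomaly, so it may not be the right choice if the drop to zero is exactly what you want alerted on.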