I have been collecting data for machine learning for more than 2 weeks, and I accidentally ruined the data collection for one interval, so there is a huge jump in value for that interval.
Anomaly detection is now being affected by this highly abnormal value, and if I run a forecast it will take the faulty data into account as well.
Is there any way to exclude this particular interval from machine learning?
I tried creating a calendar that covers this interval, but it has no effect, maybe because the interval is in the past?
Below are screenshots from the Single Metric Viewer.
Anomaly detection uses online learning, so we constantly update the model to reflect the data we have seen. That is why a calendar event added afterwards does not undo what the model has already learned from the faulty interval.
We store snapshots of the model along the way, and it is possible to revert to a previous model snapshot. From 7.9, there is a UI for model snapshot management, which will create a calendar event to skip the faulty period. Prior to that, there are APIs that support restoring model state.
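For the API route, here is a rough sketch of the requests involved, assuming the 7.x `_ml` REST paths as I understand them (check the docs for your version). The job ID and snapshot ID are made-up placeholders, and the script only prints the requests in Kibana Dev Tools console format; it does not call a cluster.

```python
import json

def revert_requests(job_id, snapshot_id, delete_intervening_results=True):
    """Build (method, path, body) tuples for reverting a job to a snapshot.

    job_id and snapshot_id are placeholders supplied by the caller.
    """
    base = f"_ml/anomaly_detectors/{job_id}"
    return [
        # 1. List snapshots to pick one taken before the faulty interval.
        ("GET", f"{base}/model_snapshots", None),
        # 2. The job has to be closed before its model can be reverted.
        ("POST", f"{base}/_close", None),
        # 3. Revert; optionally delete results produced after the snapshot.
        (
            "POST",
            f"{base}/model_snapshots/{snapshot_id}/_revert",
            {"delete_intervening_results": delete_intervening_results},
        ),
    ]

def as_console(requests):
    """Render the requests in Kibana Dev Tools console syntax."""
    chunks = []
    for method, path, body in requests:
        text = f"{method} {path}"
        if body is not None:
            text += "\n" + json.dumps(body, indent=2)
        chunks.append(text)
    return "\n\n".join(chunks)

# Hypothetical IDs, for illustration only.
example = revert_requests("my-job", "1598918400")
print(as_console(example))
```

If the job has a running datafeed, it would need to be stopped before closing the job.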
Alternatively, you can clone the job and run it again over the same data. Please remember to set up a calendar event for the faulty interval beforehand, so the cloned job skips it.
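As an illustration of setting up the calendar event before starting the cloned job, here is a minimal sketch that builds the calendar requests, again assuming the 7.x `_ml/calendars` paths. The calendar ID, job ID, description, and the time range are invented for the example; replace them with your own faulty interval. Nothing is sent to a cluster.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(dt.timestamp() * 1000)

def calendar_requests(calendar_id, job_id, start, end, description):
    """Build (method, path, body) tuples that create a calendar,
    attach a job to it, and add one event covering [start, end)."""
    if end <= start:
        raise ValueError("end must be after start")
    return [
        ("PUT", f"_ml/calendars/{calendar_id}", None),
        ("PUT", f"_ml/calendars/{calendar_id}/jobs/{job_id}", None),
        (
            "POST",
            f"_ml/calendars/{calendar_id}/events",
            {
                "events": [
                    {
                        "description": description,
                        "start_time": to_epoch_ms(start),
                        "end_time": to_epoch_ms(end),
                    }
                ]
            },
        ),
    ]

# Hypothetical one-hour faulty interval, for illustration only.
start = datetime(2020, 9, 1, 10, 0, tzinfo=timezone.utc)
end = datetime(2020, 9, 1, 11, 0, tzinfo=timezone.utc)
for method, path, body in calendar_requests(
    "faulty-data", "my-job-clone", start, end, "bad collection interval"
):
    print(method, path, body if body is not None else "")
```

The order matters here: the calendar and its event should exist before the cloned job's datafeed is started, so the interval is skipped while the data is processed.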