As far as I can tell, calendars reference ML jobs by name or group name. Is there a way to add an 'All Jobs' flag to the calendar? There are windows when all our jobs will see anomalous data due to rolling OS patching. Rather than trying to keep up with new jobs and make sure they are in the calendar, it would be nice to have a * in the jobs list.
According to the docs, anomalies are generated during applicable events, but the anomaly score is set to zero. Does that mean the model is still being updated (polluted) with the anomalous data? If so, is the accepted way to deal with this stopping the datafeed during those time periods?
That's a nice suggestion - we can add that as an Enhancement request. The workaround, obviously, is to have an "all jobs" job Group name and apply the calendar to that - but you'd have to remember (or enforce) that new jobs also join that Group.
To answer your second question: the model does not "learn" during the time window covered by the calendar event, so it is not being "polluted".
I'll probably set up a script that runs once a day and uses the API to add a 'default' group to any job that does not have a group defined. Hopefully we'll see the enhancement in a release before too long.
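For anyone wanting to try the same thing, here is a rough sketch of what such a daily script could look like. It is only an illustration under a few assumptions: the cluster is reachable at `http://localhost:9200` without authentication, the group is called `default` (a made-up name), and the ML endpoints live under `_ml/` (older releases used the `_xpack/ml/` prefix instead). The helper names `jobs_without_groups`, `_request` and `add_default_group` are mine, not part of any client library.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster, no authentication
DEFAULT_GROUP = "default"         # hypothetical group name the calendar is applied to


def jobs_without_groups(jobs):
    """Return the job_id of every job whose 'groups' list is missing or empty."""
    return [job["job_id"] for job in jobs if not job.get("groups")]


def _request(method, path, body=None):
    """Send a JSON request to Elasticsearch and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def add_default_group():
    """Add DEFAULT_GROUP to every anomaly detection job that has no group."""
    # The get-jobs response lists job configurations under the "jobs" key.
    jobs = _request("GET", "/_ml/anomaly_detectors")["jobs"]
    for job_id in jobs_without_groups(jobs):
        # The update-job API accepts a "groups" array in the request body.
        _request(
            "POST",
            f"/_ml/anomaly_detectors/{job_id}/_update",
            {"groups": [DEFAULT_GROUP]},
        )
        print(f"added group {DEFAULT_GROUP!r} to job {job_id}")
```

Calling `add_default_group()` from a daily cron job would then keep ungrouped jobs covered, provided the calendar itself lists the `default` group in its job list. Jobs that already belong to some other group are left alone here, so they would still need to be handled separately.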