This job appears to look at a list of process create events to determine whether a process is new or existed previously. The issue I think we are having is that it alerts us on a lot of processes that existed previously, because the service/process has been running for an extended period of time (i.e. the server hasn't been restarted for a while).
I think that to resolve this, the timeframe the job compares the data against should be extended, but I'm not sure how to adjust it.
So my question is:
Is the model snapshot retention days setting (default 10) basically saying it will compare against the previous 10 days? I read the articles but am having a difficult time understanding them. In my case, if those processes aren't restarted every 10 days, then the job is going to alert on a "new" process when it actually isn't new. I would like to expand it to 30 days if that is what that value controls.
Luckily we can add exclusions in the machine learning job section, but I figured there must be a better way. I'm still confused about the timeframe it is using. Is it comparing against all documents in the time range you select when starting the job, or only against the 10 model snapshots?
I cloned the job, made some changes to the retention days, and also started putting in a bunch of exclusions. I guess that is all we can do, but I wish there was a way it could monitor running processes and not just when they started according to the sysmon logs.
Changing the model snapshot retention days will not alter how data is modelled or the anomalies found.
Model snapshots are a "point in time" copy of the model. Consider them a bit like a backup. A snapshot is stored periodically to disk and is used when the job is restarted, when it moves to another node after a node failure, or when the model is reverted. Changing the model snapshot retention days changes the length of time for which old versions of the model are kept as backups.
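For reference, here is a minimal sketch of what changing that setting looks like as a request to the update job API. It only builds the request and does not send anything. The job ID `rare-process-clone` and the helper name `build_retention_update` are made up for illustration; the setting name `model_snapshot_retention_days` is the one discussed above. As explained, this only affects how long old snapshots are kept, not the results.

```python
import json
from typing import Tuple


def build_retention_update(job_id: str, retention_days: int) -> Tuple[str, str, str]:
    """Return (method, path, body) for an update-job request.

    This is a hypothetical helper: it only assembles the request so it
    can be inspected or pasted into a console; nothing is sent.
    """
    if retention_days < 0:
        raise ValueError("retention_days must not be negative")
    path = f"/_ml/anomaly_detectors/{job_id}/_update"
    # Only the snapshot retention period changes; the detectors and the
    # model itself are untouched by this setting.
    body = json.dumps({"model_snapshot_retention_days": retention_days})
    return "POST", path, body


# "rare-process-clone" is an assumed job ID, not one from this thread.
method, path, body = build_retention_update("rare-process-clone", 30)
print(method, path)
print(body)
```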
The job analyses data from a starting point, going forwards in time. You can see the value of earliest_record_timestamp in the ML UI by expanding the row in the Job List and looking at the Counts tab. I'm not familiar with this particular job and am not sure whether running processes are being logged or only process starts, so I cannot comment on the results. However, more information on how rare processes are modelled is given in the Elastic blog post "Detecting rare and unusual processes with OOTB machine learning".
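If you prefer to check this outside the UI, the same counts are returned by the get job statistics API under `data_counts`. The sketch below works on a hand-written sample of such a response rather than a live cluster, so the job ID and the timestamp values are invented for illustration, and `analysis_window` is a hypothetical helper name. It converts the epoch-millisecond timestamps into dates so you can see how far back the job has actually looked.

```python
from datetime import datetime, timezone

# Illustrative sample shaped like a job stats response; the job ID and
# the timestamps are made up and do not come from a real cluster.
sample_stats = {
    "count": 1,
    "jobs": [
        {
            "job_id": "rare-process-clone",
            "data_counts": {
                "earliest_record_timestamp": 1700000000000,
                "latest_record_timestamp": 1702592000000,
            },
        }
    ],
}


def analysis_window(stats: dict, job_id: str):
    """Return (earliest, latest) record times for a job as UTC datetimes."""
    for job in stats["jobs"]:
        if job["job_id"] == job_id:
            counts = job["data_counts"]
            # The timestamps are epoch milliseconds, so divide by 1000.
            earliest = datetime.fromtimestamp(
                counts["earliest_record_timestamp"] / 1000, tz=timezone.utc
            )
            latest = datetime.fromtimestamp(
                counts["latest_record_timestamp"] / 1000, tz=timezone.utc
            )
            return earliest, latest
    raise KeyError(job_id)


earliest, latest = analysis_window(sample_stats, "rare-process-clone")
print("earliest record:", earliest.isoformat())
print("days of data analysed:", (latest - earliest).days)
```

The span between the two timestamps is the history the model has seen, which is the timeframe that matters here, not the snapshot retention period.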
Hope this helps answer part of your question, concerning whether or not changing model snapshot retention days will change the results. Unfortunately, it does not.