Export model

Assume the following scenario:
I have the resources to retain only 60 days of data.
My job has been running for more than 200 days.
Today I found out that the job learnt from an anomalous period, and I want to remove its effect on the model through the following steps:

  1. Create a calendar event covering the time range of the anomaly, so that the new job skips that data while learning
  2. Clone the existing job and run the cloned job

The problem with the above approach is that the cloned job is trained only on the last 60 days of data, whereas my old job's model had learnt from 200 days of data. So is there a way to take a snapshot of the model at any point in time (in this case, I want the snapshot of the model from just before the anomaly)? If yes, can I import this model into the cloned job and start feeding it more data?

You can use the ML Model Snapshots API to revert the job to a model snapshot that was saved before your anomaly occurred, and you could pass the delete_intervening_results flag to delete the results recorded after that snapshot, including the anomaly.

After this, you could start the datafeed again, but choose the start time to be after the anomaly.
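The two steps above could be sketched roughly as follows. This is only an illustration: it builds the REST requests without sending them, the job, snapshot and datafeed names in the usage note are hypothetical, and it assumes the snapshot list came from the get model snapshots API with a latest_record_time_stamp field in epoch milliseconds. Note that the job has to be closed before it can be reverted.

```python
from typing import Optional


def pick_snapshot_before(snapshots: list, anomaly_start_ms: int) -> Optional[dict]:
    """Return the newest snapshot whose latest record predates the anomaly.

    Assumes each snapshot dict carries "snapshot_id" and
    "latest_record_time_stamp" (epoch milliseconds).
    """
    candidates = [
        s for s in snapshots if s["latest_record_time_stamp"] < anomaly_start_ms
    ]
    return max(candidates, key=lambda s: s["latest_record_time_stamp"], default=None)


def build_revert_request(job_id: str, snapshot_id: str) -> tuple:
    """Build (method, path, body) for reverting a closed job to a snapshot."""
    path = f"/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_revert"
    # delete_intervening_results removes results newer than the snapshot.
    return ("POST", path, {"delete_intervening_results": True})


def build_start_datafeed_request(datafeed_id: str, start_ms: int) -> tuple:
    """Build (method, path, body) for restarting the datafeed after the anomaly."""
    path = f"/_ml/datafeeds/{datafeed_id}/_start"
    return ("POST", path, {"start": str(start_ms)})
```

For example, with a hypothetical job called my-job, you would pick the snapshot with pick_snapshot_before, send the request from build_revert_request("my-job", snapshot["snapshot_id"]), reopen the job, and then send build_start_datafeed_request with a start time just after the anomaly ended.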

Is it possible to modify the "model_snapshot_retention_days" of an existing job?

Yes - it is a settable parameter in the create job API call:

model_snapshot_retention_days
(long) The time in days that model snapshots are retained for the job. Older snapshots are deleted. The default value is 1, which means snapshots are retained for one day (twenty-four hours).
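As a rough sketch of where this parameter sits in a create job request, the helper below builds the request without sending it. The job name, detector, bucket span and time field are placeholder assumptions for illustration, not values from this thread.

```python
def build_create_job_request(job_id: str, retention_days: int) -> tuple:
    """Build (method, path, body) for creating a job with a snapshot retention setting."""
    body = {
        # Placeholder analysis settings; replace with the real job's detectors.
        "analysis_config": {
            "bucket_span": "15m",
            "detectors": [{"function": "count"}],
        },
        "data_description": {"time_field": "timestamp"},
        # Snapshots older than this many days are deleted.
        "model_snapshot_retention_days": retention_days,
    }
    return ("PUT", f"/_ml/anomaly_detectors/{job_id}", body)
```

Setting retention_days to at least the length of time you might want to go back (for example 60) keeps older snapshots available for a later revert.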

If I use the create jobs API for an existing job, doesn't it hit a resource_already_exists_exception? I need to know whether this setting can be modified on an already existing job.

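Correct, you cannot PUT a new configuration to an existing named job, so the create job API would need a new name (job_id) for the job. To change the setting on the existing job instead, the update job API (POST _ml/anomaly_detectors/<job_id>/_update) accepts model_snapshot_retention_days.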
