Export model

Bharath_Kumar_R · June 20, 2019, 12:57pm

Assume the following scenario:
I have the resources to keep only 60 days of data.
My job has been running for more than 200 days.
Today I found out that we captured an anomaly, and we want to remove it from the model through the following steps:

Create a calendar event covering the time range of the anomaly's occurrence, so that the new job skips this data instead of learning from it.
Clone the existing job and run the cloned job
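
The calendar step above could be sketched roughly as follows. This only builds the request for the add-events-to-calendar API and does not send it. The `_ml` path prefix is an assumption (7.x style; 6.x releases used `_xpack/ml`), and the calendar ID, description, and dates in the usage line are hypothetical placeholders.

```python
from datetime import datetime, timezone


def to_epoch_ms(iso_utc: str) -> int:
    """Convert a naive ISO-8601 timestamp, interpreted as UTC, to epoch milliseconds."""
    dt = datetime.fromisoformat(iso_utc).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def build_calendar_event_request(calendar_id: str, description: str,
                                 start_iso: str, end_iso: str) -> dict:
    """Build (but do not send) a request that adds one scheduled event to a calendar."""
    start_ms = to_epoch_ms(start_iso)
    end_ms = to_epoch_ms(end_iso)
    if end_ms <= start_ms:
        raise ValueError("end time must be after start time")
    return {
        "method": "POST",
        # Assumption: 7.x-style path; older clusters used /_xpack/ml/...
        "path": f"/_ml/calendars/{calendar_id}/events",
        "body": {
            "events": [
                {
                    "description": description,
                    "start_time": start_ms,
                    "end_time": end_ms,
                }
            ]
        },
    }


# Hypothetical calendar ID and anomaly window, for illustration only.
request = build_calendar_event_request(
    "anomaly-exclusions", "known bad data",
    "2019-06-10T00:00:00", "2019-06-11T00:00:00",
)
```

The calendar also has to be associated with the cloned job (for example via the calendar's `job_ids`) before the event has any effect on that job.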

The problem with the above approach is that the cloned job is trained only on the last 60 days of data, whereas my old job's model learned from 200 days of data. Is there a way to take a snapshot of the model at a given point in time (in this case, just before the anomaly)? If so, can I import this model into the cloned job and continue feeding it data?

richcollier · June 20, 2019, 3:07pm

You can use the ML model snapshots API to revert the job to a snapshot that was saved before your anomaly occurred, and you can pass the delete_intervening_results flag to delete the results generated after that snapshot, including the anomaly.

After this, you could start the datafeed again, but choose the start time to be after the anomaly.
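
As a rough sketch of that sequence (not an official recipe), the helpers below pick the newest snapshot whose data ends before the anomaly and list the REST calls in order. Nothing is sent to a cluster. The `_ml` path prefix (7.x style), the job and datafeed IDs, and the sample snapshot values are assumptions for illustration; the snapshot list would normally come from the get model snapshots API. The job is closed before the revert because the revert API requires a closed job.

```python
from typing import Any, Dict, List, Optional, Tuple

Request = Tuple[str, str, Optional[Dict[str, Any]]]


def pick_snapshot_before(snapshots: List[Dict[str, Any]],
                         anomaly_start_ms: int) -> Optional[Dict[str, Any]]:
    """Return the newest snapshot whose latest record precedes the anomaly, or None."""
    candidates = [
        s for s in snapshots
        if s.get("latest_record_time_stamp") is not None
        and s["latest_record_time_stamp"] < anomaly_start_ms
    ]
    return max(candidates, key=lambda s: s["latest_record_time_stamp"], default=None)


def build_revert_plan(job_id: str, datafeed_id: str, snapshot_id: str,
                      restart_from_iso: str) -> List[Request]:
    """List (method, path, body) calls: stop feed, close, revert, reopen, restart feed."""
    job = f"/_ml/anomaly_detectors/{job_id}"   # assumption: 7.x-style prefix
    feed = f"/_ml/datafeeds/{datafeed_id}"
    return [
        ("POST", f"{feed}/_stop", None),
        ("POST", f"{job}/_close", None),
        ("POST", f"{job}/model_snapshots/{snapshot_id}/_revert",
         {"delete_intervening_results": True}),
        ("POST", f"{job}/_open", None),
        # Restart the datafeed from a point after the anomaly.
        ("POST", f"{feed}/_start", {"start": restart_from_iso}),
    ]


# Hypothetical snapshot metadata (timestamps in epoch milliseconds).
snapshots = [
    {"snapshot_id": "1560000000", "latest_record_time_stamp": 1560000000000},
    {"snapshot_id": "1560100000", "latest_record_time_stamp": 1560100000000},
    {"snapshot_id": "1560200000", "latest_record_time_stamp": 1560200000000},
]
chosen = pick_snapshot_before(snapshots, anomaly_start_ms=1560124800000)
plan = build_revert_plan("my-job", "datafeed-my-job",
                         chosen["snapshot_id"], "2019-06-12T00:00:00Z")
```

This only works if a snapshot from before the anomaly still exists, which depends on how long snapshots are retained for the job.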

Bharath_Kumar_R · June 21, 2019, 10:33am

Is it possible to modify the "model_snapshot_retention_days" of an existing job?

richcollier · June 21, 2019, 12:13pm

Yes - it is a settable parameter in the create job API call:

model_snapshot_retention_days
(long) The time in days that model snapshots are retained for the job. Older snapshots are deleted. The default value is 1, which means snapshots are retained for one day (twenty-four hours).

Bharath_Kumar_R · June 21, 2019, 12:20pm

If I use the create job API for an existing job, doesn't it hit a resource_already_exists_exception? I need to know whether an already existing job can be modified.

richcollier · June 21, 2019, 12:36pm

Correct, you cannot PUT a new configuration to an existing named job. You'll need a new name (job_id) for the job.
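
That said, the documented update job API (a POST to the job's `_update` endpoint, as opposed to the PUT used for creation) lists model_snapshot_retention_days among the properties that can be changed on an existing job, so it may be worth trying before cloning. A minimal sketch that only builds such a request, assuming the 7.x-style `_ml` path prefix and a hypothetical job ID:

```python
def build_update_retention_request(job_id: str, retention_days: int) -> dict:
    """Build (but do not send) an update-job request changing snapshot retention."""
    # Sanity check added for this sketch; the server applies its own validation.
    if not isinstance(retention_days, int) or retention_days < 0:
        raise ValueError("retention_days must be a non-negative integer")
    return {
        "method": "POST",
        # Assumption: 7.x-style path; older clusters used /_xpack/ml/...
        "path": f"/_ml/anomaly_detectors/{job_id}/_update",
        "body": {"model_snapshot_retention_days": retention_days},
    }


# Hypothetical job ID; 60 days chosen to match the data retention in this thread.
update_request = build_update_retention_request("my-job", 60)
```

A longer retention only helps with future incidents: snapshots that have already been deleted cannot be recovered by changing the setting.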
