Preserving ML "Education" and Jobs

We have multiple clusters running autonomously and we've created ML jobs within a Dev environment. I am admittedly a real newbie with ML and AI, but my understanding is that the algorithms are continuously updated based on the analyzed data for better accuracy. I see the blue areas showing upper and lower expectations, so I believe it is learning more as time progresses, correct?

If that's the case, how can I migrate my "experienced" algorithms from the Dev environment to a Production one without restarting the training? In other words, where do the algorithms live? Along with that, I would hope the process would allow me to back up/archive existing, "educated" jobs so I don't have to start over in the event of a rebuild. The JSON for existing jobs has parameters and configuration info, but I don't see any algorithm information and certainly no history of what it has already learned.

I know I'm really new at ML so if anyone has suggestions, please let me know. Thank you!

Hi Michael,

I see the blue areas showing upper and lower expectations, so I believe it is learning more as time progresses, correct?

Correct. The model learns online, so it keeps adapting as new data arrives and time moves forward.
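If you want to see that concretely, the blue bands in the UI come from the job's model plot results. Here is a rough sketch (not an official recipe) of pulling those bounds out of the results index to watch them tighten over time. It assumes model plot is enabled for the job, a local unauthenticated cluster, and the standard `.ml-anomalies-*` result indices; exact field names can vary by version.

```python
import requests

ES = "http://localhost:9200"  # assumption: local, unauthenticated cluster
JOB_ID = "my-dev-job"         # hypothetical job id

# Model plot documents carry the expected-value bounds the UI draws in blue.
query = {
    "size": 10,
    "sort": [{"timestamp": "asc"}],
    "query": {"bool": {"filter": [
        {"term": {"job_id": JOB_ID}},
        {"term": {"result_type": "model_plot"}},
    ]}},
}

resp = requests.get(f"{ES}/.ml-anomalies-*/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    doc = hit["_source"]
    # The band between model_lower and model_upper generally tightens as the
    # model sees more data, i.e. as it "learns".
    print(doc["timestamp"], doc.get("model_lower"), doc.get("model_upper"))
```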

If that's the case, how can I migrate my "experienced" algorithms from the Dev environment to a Production one without restarting the training? In other words, where do the algorithms live?

ML jobs store their state in an index within the cluster. Unfortunately, migrating jobs to a different cluster is not possible; you will have to start over. Hopefully, it won't take long to process the historic data and catch up with real-time analysis.
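In practice that means re-creating the job from its config in the Production cluster and replaying the historic data through its datafeed. Below is a minimal sketch of what that could look like over the REST API, assuming a recent version where the ML endpoints live under `/_ml` (older releases used `/_xpack/ml`); the cluster URL, job id, datafeed id, and index pattern are placeholders.

```python
import json
import requests

PROD = "http://prod-cluster:9200"  # assumption: URL of the Production cluster
JOB_ID = "my-job"                  # hypothetical job id
FEED_ID = "datafeed-" + JOB_ID     # hypothetical datafeed id

# 1. Re-create the job in Production from the config JSON exported from Dev
#    (GET /_ml/anomaly_detectors/<job_id> on the Dev cluster). Strip read-only
#    fields such as job_id and create_time from the exported document first.
with open("dev_job_config.json") as f:
    job_config = json.load(f)
requests.put(f"{PROD}/_ml/anomaly_detectors/{JOB_ID}", json=job_config)

# 2. Create a datafeed pointing at the Production copy of the source data.
feed_config = {"job_id": JOB_ID, "indices": ["my-metrics-*"]}
requests.put(f"{PROD}/_ml/datafeeds/{FEED_ID}", json=feed_config)

# 3. Open the job and start the datafeed from the beginning of the historic
#    data; the model catches up on back data and then continues in real time.
requests.post(f"{PROD}/_ml/anomaly_detectors/{JOB_ID}/_open")
requests.post(f"{PROD}/_ml/datafeeds/{FEED_ID}/_start",
              json={"start": "2017-01-01T00:00:00Z"})
```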

Along with that, I would hope the process would allow me to backup/archive existing, "educated" jobs so I don't have to start over in the event of a rebuild. The JSON for existing jobs has parameters and configuration info, but I don't see any algorithm information and certainly no history of what it has already learned.

You will notice that a job may have a model_snapshot_id, which links the job to its current model state. We have a set of APIs that let you manage past model snapshots, including the ability to revert to a previous one. Model snapshots are taken periodically, based on the job config parameter background_persist_interval. Also, the parameter model_snapshot_retention_days dictates how old a snapshot can be before it is automatically removed.
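For illustration, here is a rough sketch of how those snapshot APIs could be used, again assuming the `/_ml` endpoint prefix of recent versions; the job id, snapshot id, and the interval/retention values are just placeholders.

```python
import requests

ES = "http://localhost:9200"  # assumption: local, unauthenticated cluster
JOB_ID = "my-job"             # hypothetical job id

# List the model snapshots the job has persisted so far.
snaps = requests.get(
    f"{ES}/_ml/anomaly_detectors/{JOB_ID}/model_snapshots").json()
for s in snaps.get("model_snapshots", []):
    print(s["snapshot_id"], s["timestamp"])

# Revert the job to an earlier snapshot (the job must be closed first).
SNAPSHOT_ID = "1575402237"    # hypothetical snapshot id
requests.post(f"{ES}/_ml/anomaly_detectors/{JOB_ID}/_close")
requests.post(
    f"{ES}/_ml/anomaly_detectors/{JOB_ID}/model_snapshots/{SNAPSHOT_ID}/_revert",
    json={"delete_intervening_results": True},
)

# Tune how often snapshots are taken and how long they are kept.
requests.post(
    f"{ES}/_ml/anomaly_detectors/{JOB_ID}/_update",
    json={"background_persist_interval": "3h",
          "model_snapshot_retention_days": 10},
)
```

Note that reverting requires the job to be closed, and delete_intervening_results removes the results generated after the snapshot so they can be recalculated when the job runs again.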

You can read more about job config options here.

You can read more about the model snapshot management APIs here.

dmitri,

Thank you very much for the detailed answer! That helps a lot. I'll work on tuning the model snapshots to help with the backup strategy.

Thanks again!
