ML - deleting old indexes in Elasticsearch

Hello,
I cannot find any relevant information in the docs, so I decided to ask here.

Situation:
We have a real-time ML job running on specific indexes.
We are using daily indexes and our oldest index is two years old.
What happens if we delete all indexes from 2017? Will our ML model bounds become less accurate?
Are the old indexes still needed for ML jobs that are already set up?

Hello Thomas,

I assume that by "delete all indexes from 2017" you mean the indices that contain the raw data the ML job is analyzing. If so, then yes, you are perfectly safe to delete those indices, because the ML job's model does not need to look at that data again. ML analyzes data in chronological order and reads it only once. The cumulative model of behavior for that data is stored in the .ml-state index.

The ONLY downside of deleting the old data is that any newly created jobs can only learn from the historical data that you still have. But if this data is very old (as you say, from 2017), it is less useful for that purpose anyway.
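
If it helps, here is a rough sketch of how you might pick out the daily indices that fall before a cutoff date, so you can review the list before deleting anything. It assumes your daily indices are named with a prefix plus a `YYYY.MM.DD` suffix (the `logs-` prefix and the `indices_before` helper are made up for illustration; adjust them to your own naming). It only builds the list of names; the actual deletion would still be done separately, for example with the Delete index API.

```python
from datetime import date, datetime


def indices_before(index_names, cutoff, prefix="logs-", date_format="%Y.%m.%d"):
    """Return the daily indices whose date suffix is strictly before `cutoff`.

    `prefix` and `date_format` are assumptions about the naming scheme,
    not something Elasticsearch enforces.
    """
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            # Skips everything else, including ML's own indices such as .ml-state.
            continue
        try:
            day = datetime.strptime(name[len(prefix):], date_format).date()
        except ValueError:
            # The suffix is not a date in the expected format, so leave it alone.
            continue
        if day < cutoff:
            selected.append(name)
    return sorted(selected)


# Hypothetical index names, for illustration only.
names = [
    "logs-2017.12.31",
    "logs-2018.01.01",
    ".ml-state",
    "logs-2017.01.15",
    "logs-notadate",
]
old = indices_before(names, date(2018, 1, 1))
print(old)  # ['logs-2017.01.15', 'logs-2017.12.31']
```

Matching on an explicit prefix means the `.ml-state` index, which holds the model, can never end up in the list by accident.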


This is exactly what I want to know. Thank you :wink:

