We have an index in ES 2.4 that holds more than 120 million documents, each with its own ttl value, and we insert up to 700k documents into the same index every day.
We are now migrating to ES 5.6 and are planning to run delete by query on this index from a cron job, maybe once a day, which would delete close to 500k documents in every run.
My question is: will deleting 500k or more docs at a time by calling the delete-by-query API from this cron job affect ES performance, since it would run on a live index that is still serving indexing and search requests? Also, are there any recommendations other than a cron job that calls the delete-by-query API to delete the documents?
Are you only inserting new documents, or are you also updating existing ones? If you are not performing any updates, you can index your documents into time-based indices based on when they expire. Then, once a day (if you have daily indices), you can simply delete the indices that have exceeded their ttl, which is a lot faster and more efficient than running delete by query. That would have no measurable impact on performance, while delete by query most likely will.
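To illustrate the expiry-based index idea, here is a minimal sketch of the naming and selection logic only. The `docs-expiring-` prefix, the daily `YYYY.MM.DD` suffix, and both function names are assumptions made up for this example; each index returned by `expired_indices` would then be removed with a plain delete-index request.

```python
from datetime import date, datetime

# Hypothetical naming scheme: one index per expiry day.
PREFIX = "docs-expiring-"
DAY_FORMAT = "%Y.%m.%d"


def index_for_expiry(expires_on, prefix=PREFIX):
    """Return the name of the daily index a document expiring on this day goes into."""
    return prefix + expires_on.strftime(DAY_FORMAT)


def expired_indices(index_names, today, prefix=PREFIX):
    """Return the indices whose expiry day is strictly before `today`."""
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of our expiry-based indices
        try:
            day = datetime.strptime(name[len(prefix):], DAY_FORMAT).date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < today:
            expired.append(name)
    return expired
```

The index for the current day is deliberately kept, since some of its documents may not have expired yet.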
Along with inserting, we also update existing documents. Also, every document has a different ttl value.
Then time-based indices will not work, so you will likely not be able to get away from the overhead and performance impact of delete-by-query.
How would delete-by-query affect performance compared to the ttl mechanism we have in older ES versions?
I believe older ES versions ran a check roughly every minute to find documents whose ttl had expired and delete them.
Also, how often should we run the delete-by-query cron job?
It is indeed the same process, so the impact may be similar. As for how often to run it, I would recommend that you test. It may be better to delete little by little rather than in one big job once a day.
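As a rough sketch of what a small, throttled delete-by-query call could look like from a cron script: since ES 5.x no longer has the `_ttl` field, this assumes each document stores its own expiry timestamp in a hypothetical date field named `expires_at`. The index name `docs`, the `localhost` URL, and the throttle values are placeholders to tune by testing, not recommendations.

```python
import json
import urllib.parse
import urllib.request


def build_delete_request(base_url, index, expiry_field="expires_at",
                         requests_per_second=500, scroll_size=1000):
    """Build the URL and body for a throttled _delete_by_query call."""
    params = urllib.parse.urlencode({
        "conflicts": "proceed",            # skip docs updated while the job runs
        "requests_per_second": requests_per_second,  # throttle the delete rate
        "scroll_size": scroll_size,        # docs fetched per scroll batch
        "wait_for_completion": "false",    # run as a background task
    })
    url = f"{base_url}/{index}/_delete_by_query?{params}"
    # Match every document whose (assumed) expiry timestamp is in the past.
    body = {"query": {"range": {expiry_field: {"lt": "now"}}}}
    return url, body


def run_delete(url, body):
    """Send the request and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url, body = build_delete_request("http://localhost:9200", "docs")
    print(run_delete(url, body))
```

Running this every hour or so instead of once a day keeps each run small, and lowering `requests_per_second` trades a longer run for less pressure on live indexing and search.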
Thank you. Do you have any recommendations for mandatory steps or checks that should be done before and after we run the delete-by-query cron job on the huge live index?
No, I am not aware of any special checks required.