Best way to reindex an ES index that is accepting updates

I have an ES index that continuously receives updates. Apart from inserts and deletes, it handles a lot of update operations. Over time I see significant disk I/O performance issues, so I am planning to reindex my cluster every 6 months or so.

I checked out the Reindex API, but it seems to copy from a point-in-time snapshot of the data taken when the reindex starts. What about the data that keeps streaming in while it runs? What's the best way to handle that?

Also, when you say the Reindex API copies data from one index to another, is it only the data that is copied? Would the on-disk fragmentation carry over to the new index?
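
One possible way to handle writes that arrive during a reindex is to run a second, smaller "catch-up" pass afterwards. The sketch below only builds the request body for `POST _reindex`. It assumes every document carries a last-modified timestamp, and the field name `updated_at`, the index names, and the helper name are all hypothetical. Setting `version_type` to `external` on the destination asks Elasticsearch to overwrite only documents whose source version is newer, and `conflicts: proceed` keeps the task running past version conflicts. Note that a pass like this would not propagate deletes.

```python
def build_reindex_body(source_index, dest_index, since=None,
                       timestamp_field="updated_at"):
    """Build a request body for POST _reindex.

    `timestamp_field` is an assumed last-modified field; the thread does
    not say whether the documents have one.
    """
    source = {"index": source_index}
    if since is not None:
        # Catch-up pass: only copy documents modified at or after `since`.
        source["query"] = {"range": {timestamp_field: {"gte": since}}}
    return {
        # Do not abort the whole task on version conflicts.
        "conflicts": "proceed",
        "source": source,
        # Keep source versions; older copies in the destination are replaced.
        "dest": {"index": dest_index, "version_type": "external"},
    }


# Full copy first, then a catch-up pass from the time the full copy started.
full_pass = build_reindex_body("items_v1", "items_v2")
catch_up = build_reindex_body("items_v1", "items_v2",
                              since="2024-01-01T00:00:00Z")
```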

How does your application read from Elasticsearch?

Not sure what you mean there.

  • We have a Python client that reads through an index alias, so once the reindex completes I plan to just switch the alias.

  • When you restore a snapshot in Elasticsearch, the deleted documents seem to carry over, which means my disk I/O performance stays the same after a snapshot and restore. Will the same thing happen if I reindex?
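
The alias switch mentioned above could look roughly like this. It is a minimal sketch that only builds the body for `POST _aliases`; the alias and index names and the helper name are hypothetical. Both actions sit in a single request because the aliases API applies them atomically, so readers should never see the alias pointing at no index or at both.

```python
def build_alias_swap(alias, old_index, new_index):
    """Build a POST _aliases body that moves `alias` from old to new index."""
    return {
        "actions": [
            # Detach the alias from the index being retired.
            {"remove": {"index": old_index, "alias": alias}},
            # Attach it to the freshly reindexed copy in the same request.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap = build_alias_swap("items", "items_v1", "items_v2")
```

With the official Python client this body would typically go to `indices.update_aliases`; how it is passed (a `body` argument or an `actions` argument) depends on the client version, so check the one you have installed.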
