I am currently reindexing four months of daily indices into new indices. Is there a way to do this so that, if an error occurs while reindexing a document, I can resume from that point instead of restarting the whole process?
The best way is to split the process up along some natural boundaries. If you hit an error, you can restart just the part that failed. If the documents don't change, it should be safe to simply recopy the parts you've already copied. It sounds like you are already splitting the process per index. It might also be useful to split each index per hour of the day, based on a time field. That way you have fewer documents to redo if there is a problem, and each chunk finishes a little faster. You can do that by adding a range query to the source of the reindex request.
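As a rough sketch of that idea, the Python below builds one `_reindex` request body per hour of a day and keeps track of the chunks that failed so only those need to be rerun. The field name `@timestamp`, the helper names `hourly_reindex_bodies` and `run_chunks`, and the `submit` callback are all assumptions for illustration; `submit` stands in for whatever client call you use to POST the body to `_reindex`.

```python
from datetime import datetime, timedelta, timezone


def hourly_reindex_bodies(source_index, dest_index, day, time_field="@timestamp"):
    """Yield one _reindex request body per hour of `day` (treated as UTC).

    `time_field` is an assumed field name; use the date field in your mapping.
    """
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    for hour in range(24):
        gte = start + timedelta(hours=hour)
        lt = gte + timedelta(hours=1)
        yield {
            "source": {
                "index": source_index,
                # Half-open range [gte, lt) so adjacent hours never overlap.
                "query": {
                    "range": {
                        time_field: {"gte": gte.isoformat(), "lt": lt.isoformat()}
                    }
                },
            },
            "dest": {"index": dest_index},
        }


def run_chunks(bodies, submit):
    """Call submit(body) for each chunk and return the bodies that raised.

    `submit` is a hypothetical callback that sends the body to _reindex.
    """
    failed = []
    for body in bodies:
        try:
            submit(body)
        except Exception:
            failed.append(body)  # keep the chunk so it can be retried later
    return failed
```

Calling `run_chunks` again with the returned list retries only the failed hours. Because reindex keeps each document's `_id` by default, rerunning a chunk overwrites the copies already written rather than duplicating them, which matches the "safe to recopy" point above as long as the source documents don't change.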
Thanks for the response. Just to confirm: splitting the reindexing on an hourly basis would require a for loop. Is there any other way to parallelize this?