Reindexing with sliced scrolls vs. using custom ranges

Hi guys!

I'm reindexing a lot of data. To improve performance and get more flexible error handling, I've divided every source index into parts (ranges of a field's values). I then parallelize the process by reindexing several ranges at a time. This also has the benefit that the ranges are isolated, so if one range fails, I can investigate it and restart just that range.
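To make the setup concrete, here is a minimal Python sketch of the range-based approach, assuming a numeric field. The helper names and the `submit` callable are hypothetical: `submit` stands for whatever sends one body to `POST _reindex` (for example through an HTTP client) and raises on failure. Only the request body shape (`source.index`, `source.query`, `dest.index`) comes from the Reindex API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_range(lo, hi, parts):
    """Split the half-open interval [lo, hi) into `parts` contiguous sub-ranges."""
    step, extra = divmod(hi - lo, parts)
    bounds = []
    start = lo
    for i in range(parts):
        # The first `extra` sub-ranges get one additional value each.
        end = start + step + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def reindex_body(source, dest, field, start, end):
    """Build a _reindex request body limited to start <= field < end."""
    return {
        "source": {
            "index": source,
            "query": {"range": {field: {"gte": start, "lt": end}}},
        },
        "dest": {"index": dest},
    }


def run_ranges(bodies, submit, workers=4):
    """Call submit(body) for each body in parallel and return the failures.

    `submit` is a hypothetical callable supplied by the caller; it is
    expected to raise an exception when its range fails.
    """
    failed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(submit, body): body for body in bodies}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as exc:
                # Keep the failed body so that range can be inspected and retried alone.
                failed.append((futures[future], exc))
    return failed
```

Each failed body is returned together with its exception, so a single range can be investigated and passed to `run_ranges` again without touching the ranges that succeeded.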

In Elasticsearch 5.1 we have reindexing with sliced scrolls. I have two questions:

  1. How does slicing compare with the approach above (manually specifying ranges) in terms of performance?
  2. How does slicing compare with the approach above (manually specifying ranges) in terms of error handling? That is, if there is a problem with a single slice, can I simply restart that slice while the rest of the documents are still being reindexed?
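
For comparison, this is roughly what I understand the sliced equivalent to look like when the slices are specified manually: one `_reindex` request per slice, each carrying a `slice` object with `id` and `max` in its `source`. This is only a sketch based on my reading of the docs. The function name and index names are placeholders, and whether re-submitting one of these bodies is a safe way to retry a single slice is exactly what I'm asking.

```python
def slice_bodies(source, dest, max_slices):
    """Build one _reindex request body per manual slice (ids 0..max_slices-1)."""
    return [
        {
            # Assumption: manual slicing via "slice": {"id": ..., "max": ...}
            "source": {"index": source, "slice": {"id": i, "max": max_slices}},
            "dest": {"index": dest},
        }
        for i in range(max_slices)
    ]


# Placeholder index names, purely for illustration.
bodies = slice_bodies("source-index", "dest-index", 3)
```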

Thanks in advance!

Haris
