Remote reindex of large index using manual slicing


#1

Hello,

We are trying to do a remote reindex from a source index on 5.1.1 to a destination index on 5.3.2. The source index has ~6 billion docs, so we want to slice manually to parallelize the work. I read through the reindex documentation and am a bit confused about the scroll context. It says that manual slicing uses scrolls, but I don't see a way to set the scroll context keep-alive time. I tried adding "scroll=5m" to the URL, but got an exception saying the parameter is unrecognized for the _reindex endpoint. The documentation's example simply creates 2 slices and seems to execute both at once, but we cannot do that: we can only submit a few slice IDs at a time, otherwise we get socket timeout exceptions (I have tried increasing the timeout, with no luck).

Basically, we want to slice manually (into, say, 100 slices), execute 10-20 slice IDs at a time, and keep submitting batches of 10-20 until all 100 slices are done. However, each batch of 10-20 slices will take hours to complete, and I am not sure how to keep the scroll context alive for that long using the reindex API.
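
For illustration, the batching plan described above could be sketched roughly as follows. This is a minimal sketch, not a confirmed solution: the host, the index names, and the helpers `slice_body`, `batches`, and `submit` are all hypothetical, and it assumes that a `slice` object under `source` is accepted for a remote reindex on these versions, which should be verified. Each slice is submitted with `wait_for_completion=false`, so the HTTP call returns a task ID instead of staying open, which may sidestep the client-side socket timeout. It does not by itself answer the scroll keep-alive question.

```python
import json
from urllib import request


def slice_body(slice_id, max_slices, remote_host, source_index, dest_index):
    """Build one _reindex request body restricted to a single manual slice."""
    return {
        "source": {
            "remote": {"host": remote_host},  # hypothetical remote host
            "index": source_index,
            # Assumption: manual slicing is accepted for a remote source.
            "slice": {"id": slice_id, "max": max_slices},
        },
        "dest": {"index": dest_index},
    }


def batches(max_slices, batch_size):
    """Yield lists of slice IDs, batch_size at a time, covering 0..max_slices-1."""
    for start in range(0, max_slices, batch_size):
        yield list(range(start, min(start + batch_size, max_slices)))


def submit(dest_url, body):
    """POST one slice as a background task and return the parsed JSON response."""
    req = request.Request(
        dest_url + "/_reindex?wait_for_completion=false",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A driver would loop over `batches(100, 20)`, call `submit` for each slice ID in the batch, and poll the returned task IDs through the Tasks API until they finish before starting the next batch.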

