I'm reindexing a 9.2TB index (~2bn documents) from a v2-created index (restored onto a v5.6 cluster) into a v5-created 20-shard index (on the same cluster). The Elasticsearch cluster consists of 9 nodes, each with 32 GiB RAM, 8 cores and a 4TB SSD.
It's taken about 12 days so far and seems to have slowed right down. My netdata dashboard shows that CPU is not being taxed at all, disk utilisation is up (as expected) and RAM is in high use.
Reindex batch size was 10,000, and everything runs through a groovy script that converts the v5-incompatible document IDs to SHA256 hashes of themselves. The indexing rate on the new index is being reported as ~300-400/sec, and the query rate on the source index is <50/sec.
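For anyone unsure what the ID conversion amounts to, here is a rough Python equivalent. It is not the actual groovy script: the helper name `hashed_id`, the UTF-8 encoding and the hex-digest output are all assumptions made for illustration.

```python
import hashlib


def hashed_id(original_id: str) -> str:
    """Return the SHA-256 hex digest of a document ID.

    Hypothetical helper: assumes the ID is hashed as UTF-8 text and that
    the new ID is the 64-character hex digest.
    """
    return hashlib.sha256(original_id.encode("utf-8")).hexdigest()


# Example with a made-up legacy ID; the result is always 64 hex characters.
print(hashed_id("some/legacy id"))
```

The hash itself is cheap, which fits with the CPU not being taxed.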
index.refresh_interval is set to -1 (although I only did that today after some more rooting around).
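For reference, that setting goes through the index settings endpoint (`PUT /<index>/_settings`). The sketch below only builds the request and does not send it; the host `http://localhost:9200` and the index name `new_index` are placeholders, not my real values.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index_name, interval):
    """Build (but do not send) a PUT request that sets index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder host and index name; "-1" disables periodic refreshes.
req = refresh_interval_request("http://localhost:9200", "new_index", "-1")
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# Passing req to urllib.request.urlopen would actually send it.
```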
I'm a little bit worried. I've got another reindex process to run on a 5-shard index that has ~3.2bn documents in it, although it's only 2.3TB of data.
This is all taking much longer than expected.
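To put numbers on that, here is a back-of-envelope estimate in Python. It assumes the currently reported 300-400 docs/sec held for an entire run and would apply equally to the second index, neither of which is certain; the function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def days_to_reindex(doc_count, docs_per_sec):
    """Days needed to index doc_count documents at a constant rate."""
    return doc_count / docs_per_sec / SECONDS_PER_DAY


# Document counts from the post: ~2bn (current job), ~3.2bn (next job).
for rate in (300, 400):
    current = days_to_reindex(2_000_000_000, rate)
    upcoming = days_to_reindex(3_200_000_000, rate)
    print(f"{rate}/sec: ~{current:.0f} days for 2bn docs, ~{upcoming:.0f} days for 3.2bn docs")
```

At those rates a full pass over ~2bn documents works out to roughly two months or more, so the present rate is much slower than I had planned for.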
My question is: if I roll another node into the cluster, will it adversely affect the reindex process? I'm assuming that as soon as another node becomes available, ES will start rebalancing the shards. If that conflicts with the reindex process I'll be extremely distraught!
Please advise, if possible (!).