We often add many data nodes at once (say 10).
If I don't tweak the default settings, there are only 2 recovery tasks running at any given moment, and rebalancing takes forever.
If I raise "cluster.routing.allocation.cluster_concurrent_rebalance" to 10, I get 10 tasks moving shards at once, which shortens the rebalance time significantly.
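For reference, this is roughly the settings change I apply (just a sketch; the value 10 is simply what I picked, and the host/port are placeholders):

```
# Sketch: raise the number of concurrent rebalance tasks to 10
# via the cluster settings API (transient, so it resets on a full cluster restart)
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.cluster_concurrent_rebalance": 10
  }
}
'
```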
But I often see shards being moved off one of the 10 new nodes even though its shard count is still significantly lower than that of the existing old nodes (say there are 30 of those).
This creates an issue where leaving that setting at 10 causes the cluster to never finish recovery/rebalancing, even after 24 hours. I'm pretty sure the reason is that shards are being moved back and forth between the new and old nodes, instead of only from old nodes to new nodes.
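For context, this is how I've been watching the relocations to reach that conclusion (a sketch; exact columns and output may vary by version, host/port are placeholders):

```
# How many shards are currently being relocated
curl -s "localhost:9200/_cluster/health?filter_path=relocating_shards"

# Which shards are moving, and between which source/target nodes
curl -s "localhost:9200/_cat/recovery?v&active_only=true&h=index,shard,source_node,target_node,stage"
```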
I know the recommendation is not to tweak that value, but moving 2 shards at a time is too slow.
Has anybody figured out a good approach to adding more data nodes without running into this weird behavior?
Thanks.