We are on version 7.15.
Still experiencing issues during recovery. The cluster will often move shards from a new node back to the old nodes, even though the new node(s) still have far fewer shards (and much more free space).
I did play with a few parameters attempting to speed up shard relocation:
"cluster.routing.allocation.balance.threshold":
"cluster.routing.allocation.cluster_concurrent_rebalance":
"cluster.routing.allocation.node_concurrent_recoveries":
I even turned them all off (set them back to null, i.e. the defaults) and still see the same behavior (the new data node being the source of shard movement).
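For completeness, this is roughly how I've been setting and then resetting those values via the cluster settings API (the numeric values shown here are just illustrative, not the exact ones I tried):

```shell
# Apply tuned values (illustrative numbers, not my exact settings)
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.balance.threshold": 2.0,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 4,
    "cluster.routing.allocation.node_concurrent_recoveries": 4
  }
}'

# Setting each key to null removes the override and restores the default
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.balance.threshold": null,
    "cluster.routing.allocation.cluster_concurrent_rebalance": null,
    "cluster.routing.allocation.node_concurrent_recoveries": null
  }
}'
```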
Am I missing something important here?
We had the same issue when adding a node to the exclusion list: shards would be moved onto the node being excluded.
The data did eventually all get moved out, so we could replace the node.
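For context, the exclusion was applied roughly like this ("node-to-replace" is a placeholder for the actual node name):

```shell
# Exclude a node from shard allocation so its shards drain off;
# "node-to-replace" is a placeholder, not the real node name
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "node-to-replace"
  }
}'
```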
Now the same thing is happening with newly added nodes. Am I supposed to configure other parameters just to achieve a simple data node replacement?
Any help would be appreciated.