We are on version 7.15.
Still experiencing issues during recovery. The cluster will often move shards from a new node back to the old nodes, even though the new node(s) still have far fewer shards (and much more free space).
I did play with a few parameters attempting to speed up shard relocation:
"cluster.routing.allocation.balance.threshold":
"cluster.routing.allocation.cluster_concurrent_rebalance":
"cluster.routing.allocation.node_concurrent_recoveries":
I even turned them all off (set them back to null, i.e. the defaults) and still see the same behavior (the new data node being the source of shard movement).
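For completeness, this is roughly how I've been setting and then resetting those values via the cluster settings API (the numeric values shown here are just illustrative, not the exact ones I tried):

```shell
# Apply tuned values (illustrative numbers, not my exact settings)
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.balance.threshold": 2.0,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 4,
    "cluster.routing.allocation.node_concurrent_recoveries": 4
  }
}'

# Setting each key to null removes the override and restores the default
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.balance.threshold": null,
    "cluster.routing.allocation.cluster_concurrent_rebalance": null,
    "cluster.routing.allocation.node_concurrent_recoveries": null
  }
}'
```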
Am I missing something important here?
We had the same issue when adding a node to the exclusion list: shards would be moved onto the node being excluded.
The data did eventually all get moved out, so we could replace the node.
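For context, the exclusion was applied roughly like this ("node-to-replace" is a placeholder for the actual node name):

```shell
# Exclude a node from shard allocation so its shards drain off;
# "node-to-replace" is a placeholder, not the real node name
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "node-to-replace"
  }
}'
```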
Now the same thing is happening with newly added nodes. Am I supposed to configure other parameters just to achieve a simple data node replacement?
Any help would be appreciated.