Is the guidance for cluster.routing.allocation.* settings still accurate?

BenB196 · January 7, 2025, 9:22pm

Hi All,

Does anyone know whether the guidance for the cluster.routing.allocation.* settings (e.g. cluster.routing.allocation.node_concurrent_recoveries and cluster.routing.allocation.node_initial_primaries_recoveries) is still accurate?

I ask because of this phrase:

Increasing this setting may cause shard movements to have a performance impact on other activity in your cluster, but may not make shard movements complete noticeably sooner. We do not recommend adjusting this setting from its default of X.

I was recently doing a rolling restart of a large cluster (each hot node has ~1.6k shards), and each node was taking ~1 hour on average to recover. I then adjusted the settings in two steps:

cluster.routing.allocation.node_concurrent_recoveries: 2 -> 4, then 4 -> 6
cluster.routing.allocation.node_initial_primaries_recoveries: 4 -> 8, then 8 -> 10
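
For anyone wanting to try the same thing, here is a minimal sketch of the request bodies involved, assuming the changes are applied dynamically through the cluster update settings API (PUT _cluster/settings) under "persistent". The helper name build_settings_body is hypothetical and only builds the JSON; it does not contact a cluster. The values are the second-step values from above.

```python
import json

# Second-step values from the post; the defaults are 2 and 4 respectively.
RECOVERY_SETTINGS = {
    "cluster.routing.allocation.node_concurrent_recoveries": 6,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 10,
}


def build_settings_body(settings, reset=False):
    """Build a JSON-serializable body for PUT _cluster/settings.

    Hypothetical helper for illustration. With reset=True every value
    becomes None (JSON null), which asks Elasticsearch to remove the
    override and fall back to the default.
    """
    values = {name: (None if reset else value) for name, value in settings.items()}
    return {"persistent": values}


if __name__ == "__main__":
    # Body to raise the limits before the rolling restart.
    print(json.dumps(build_settings_body(RECOVERY_SETTINGS), indent=2))
    # Body to restore the defaults once the restart is finished.
    print(json.dumps(build_settings_body(RECOVERY_SETTINGS, reset=True), indent=2))
```

Sending the reset body afterwards keeps the higher limits from lingering once the restart is done.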

I noticed a fairly "linear" increase in recovery speed, with recovery time per node dropping from ~1h to ~30m, then from ~30m to ~20m.
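
As a quick sanity check on the "linear" claim, here is the arithmetic as a small sketch. It assumes the approximate timings above are exact and pairs them with node_concurrent_recoveries only; which of the two settings actually drove the speedup is not something these numbers can show.

```python
# (node_concurrent_recoveries, approximate recovery time in minutes)
observations = [(2, 60), (4, 30), (6, 20)]


def concurrency_time_products(obs):
    """If recovery time is inversely proportional to concurrency,
    concurrency * time stays constant across observations."""
    return [concurrency * minutes for concurrency, minutes in obs]


def speedups(obs):
    """Speedup of each observation relative to the first (baseline) one."""
    baseline_minutes = obs[0][1]
    return [baseline_minutes / minutes for _, minutes in obs]


print(concurrency_time_products(observations))  # [120, 120, 120]
print(speedups(observations))                   # [1.0, 2.0, 3.0]
```

The constant product means that 2x and 3x the concurrency lined up with 2x and 3x the recovery speed, at least within the precision of these rough timings.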

So I'm curious: with all of the recent improvements to Elasticsearch, is this guidance still accurate? Does anyone else adjust these settings?

For context, I'm currently on 8.16.2.
