Hi Team,
I'm trying to better understand how shard relocations triggered by imbalances work.
I assume that, with good network bandwidth and good disk read and write speeds, one might benefit from a higher value for concurrent rebalance?
Is Elasticsearch smart enough to not "overshoot" the target host's available storage when concurrent rebalance is set to a high number?
What I mean by overshoot: let's assume we have two hosts with same-size disks that are currently imbalanced, host A with 10GB free and host B with 5GB free. Say concurrent rebalance is set to 10 and each shard is 1GB in size. If 10 shards move concurrently from host B to host A, host A would end up with 0GB free and host B with 15GB free. Wouldn't that make the imbalance worse?
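To make the arithmetic concrete, here is a rough sketch of the scenario I have in mind. It is plain Python bookkeeping, not how Elasticsearch actually decides anything, and `free_after_moves` is just a name I made up for illustration:

```python
def free_after_moves(free_a_gb, free_b_gb, shard_gb, shards_moved):
    """Free space on hosts A and B after moving shards from B to A.

    Naive model: every shard has the same size and nothing else changes.
    """
    moved_gb = shard_gb * shards_moved
    return free_a_gb - moved_gb, free_b_gb + moved_gb


# Compare a few relocation counts for the 10GB / 5GB example above.
for n in (0, 2, 3, 10):
    a, b = free_after_moves(10, 5, 1, n)
    print(f"{n:>2} shards moved: A free={a}GB, B free={b}GB, gap={abs(a - b)}GB")
```

In this toy model, moving 2 or 3 shards shrinks the gap from 5GB to 1GB, while moving all 10 at once widens it to 15GB. That is the "overshoot" I am asking about.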
In general, what should we consider when choosing a value for cluster_concurrent_rebalance?
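For reference, this is roughly the change I am considering. It is only a sketch that builds the request body I would send to `PUT _cluster/settings`; I am assuming the full setting name is `cluster.routing.allocation.cluster_concurrent_rebalance`, and the value 4 is an arbitrary placeholder, not a recommendation:

```python
import json

# Body for PUT _cluster/settings. The value 4 is an arbitrary example,
# chosen only to show the shape of the request.
settings_body = {
    "persistent": {
        "cluster.routing.allocation.cluster_concurrent_rebalance": 4,
    }
}

print(json.dumps(settings_body, indent=2))
```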
PS: More generally, for the shard allocation settings that we are allowed to tinker with, the docs do a great job of describing what behaviour each setting changes. What would be extra helpful is some description of the implications of decreasing or increasing a particular setting: the effect on system resources, implicit Elasticsearch behaviours, index management, etc. E.g., I sort of understand what node concurrent incoming recoveries does and that it defaults to 2. But how do I decide whether I should use a different value, and if so, what that value should be? If the docs listed some explicit items for consideration, it would help users decide instead of speculating.
Cheers,