Rebalance primary shards, 2022

nisow95612 · April 21, 2022, 12:57pm

Hello,
I, too, would like my primary shards to be balanced across the cluster.

I often reduce the number of replicas on older indices to optimize disk space.
When the cluster is reasonably idle, shards will recover from just-marked-for-deletion copies.
But quite often this means hours of indices being relocated to nodes that just deleted them.
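
For readers unfamiliar with the operation described above, here is a minimal sketch of the request that lowers the replica count. It only builds the method, path and body for the update index settings API (`PUT /<index>/_settings`). The helper name and the `logs-2021.*` pattern are hypothetical, and sending the request is left to whatever HTTP client you already use.

```python
import json


def replica_settings_request(index_pattern, replicas):
    """Build (method, path, body) for lowering index.number_of_replicas.

    `index_pattern` is a hypothetical example value; nothing is sent here.
    """
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index_pattern}/_settings", body


if __name__ == "__main__":
    method, path, body = replica_settings_request("logs-2021.*", 1)
    print(method, path)
    print(json.dumps(body))
```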

Search engine landed me here: rebalance-primary-shards/14470
Best explanation of problems is here: how-to-rebalance-primary-shards-on-elastic-cluster/176060/4

The requirement for rebalancing was also asked about in primary-shard-rebalancing/6173 and cluster-reroute-automatically-reroute-and-rebalance-all-primary-shards/161757

Is this the proper way to inject solutions into older topics?

nisow95612 · April 21, 2022, 12:58pm

The primary purpose of this topic was to link the existing topics to the GitHub issue. There is no solution yet:

github.com/elastic/elasticsearch

Primary shard balancing

opened 05:09PM - 25 Apr 19 UTC

aeftef

high hanging fruit :Distributed/Allocation Team:Distributed

# Primary shard balancing

There are use cases where we should have a mechanism to balance primary shards across all the nodes, so that the number of primaries is uniformly distributed.

## Problem description

There are situations where primary shards are unevenly distributed across the nodes. For instance, during a rolling restart the last node restarted won't have any primary shards, because the other nodes assume the primary role while that node is down.

This issue has popped up on other occasions, and the usual answer was that the primary/replica role is not an issue because the workload a primary or a replica assumes is similar. But there are important use cases where this does not apply. For instance, in an index-heavy scenario where indexing must be implemented as a scripted upsert, the execution of the upsert logic falls on the primary shards, and replicas just have to insert the result. In these cases unbalanced primaries exert a bigger workload on the nodes hosting them. This can even overload the cluster, because the bottleneck becomes the capacity of the nodes hosting primaries and not the sum of the cluster nodes.

[Related thread in official forum](https://discuss.elastic.co/t/how-to-rebalance-primary-shards-on-elastic-cluster/176060/3)

## Workarounds

There are some workarounds for these situations, but they are not efficient:

1. Once cluster primaries are unbalanced we could use the [Cluster reroute API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html) to try to balance them, swapping, in a "reroute transaction", a replica with a primary. In order to do this, we first need to have more nodes in the cluster than replicas, because a shard cannot be rerouted to a node where a copy of that shard already exists.

   > As an example, consider a simplified scenario with 3 nodes, 2 shards, 3 replicas, where rerouting is not possible:
   > Node 0: Shard 0 (primary), Shard 1 (primary)
   > Node 1: Shard 0 (replica), Shard 1 (replica)
   > Node 2: Shard 0 (replica), Shard 1 (replica)
   > *Rerouting is not possible: we cannot swap Shard 0 (primary) on Node 0 with Shard 0 (replica) on Node 1.*
   > Compare this with a scenario like 3 nodes, 3 shards, 2 replicas, where rerouting is possible:
   > Node 0: Shard 0 (primary), Shard 1 (primary)
   > Node 1: Shard 0 (replica), Shard 2 (primary)
   > Node 2: Shard 1 (replica), Shard 2 (replica)

   But even when rerouting is possible, it means that shard data has to be moved from one node to another (I/O and network...). Also, there is no automated way to detect which primaries are unbalanced and which shards can be swapped, and to execute that rerouting in small chunks so as not to overload node resources. But implementing a utility or script that does that is feasible (see possible solutions).

2. Simply "throw a bag of hardware" at the problem, having enough hardware to support an unbalanced scenario. Then we can limit the number of shards per node (https://www.elastic.co/guide/en/elasticsearch/reference/6.7/allocation-total-shards.html), both primaries + replicas, so they are distributed between nodes. This approach is unnecessarily expensive, and impractical at certain scale levels.

## Possible solutions? (all of them imply new features to be implemented)

So, let's consider possible solutions (take into account that I don't know Elasticsearch internals):

1. Enhance the Cluster reroute API so that you can "reroute" the role of a shard: let's say we reroute a primary shard from one node to another node that hosts a replica shard. The data is not moved between the nodes, but the replica shard is elected as primary and the primary becomes a replica. If the reroute API had this functionality, it would be possible to develop a script that detects primary shard imbalance and reroutes primary shard roles accordingly.
2. Modify the cluster shard allocation protocol so that primaries are automatically balanced when shards are assigned to the nodes. This could be active by default, or optional (configured through some new cluster setting under cluster.routing...).
3. Any other ideas?
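
The detection script that the issue calls feasible could look roughly like the sketch below. It is an illustration under stated assumptions, not an existing tool:

- The input is shaped like the JSON returned by `GET _cat/shards?format=json&h=index,shard,prirep,state,node`.
- The sample data, node names and function names are hypothetical.
- The greedy strategy is my own simplification.

It counts started primaries per node and proposes `move` commands for the Cluster reroute API. A move is proposed only towards a node that holds no copy of that shard, which is the restriction described in workaround 1. Each proposed move still copies shard data, and the regular shard balancer may relocate other shards afterwards.

```python
import json
from collections import Counter, defaultdict

# Hypothetical sample: the "3 nodes, 3 shards" scenario from the issue.
SHARDS = [
    {"index": "logs", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-0"},
    {"index": "logs", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-1"},
    {"index": "logs", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-0"},
    {"index": "logs", "shard": "1", "prirep": "r", "state": "STARTED", "node": "node-2"},
    {"index": "logs", "shard": "2", "prirep": "p", "state": "STARTED", "node": "node-1"},
    {"index": "logs", "shard": "2", "prirep": "r", "state": "STARTED", "node": "node-2"},
]


def primaries_per_node(shards):
    """Count started primaries on every node that holds a started shard."""
    started = [s for s in shards if s["state"] == "STARTED"]
    counts = Counter({s["node"]: 0 for s in started})
    counts.update(s["node"] for s in started if s["prirep"] == "p")
    return dict(counts)


def propose_moves(shards):
    """Greedily propose reroute `move` commands that even out primaries.

    A primary moves only to a node without a copy of the same shard, and
    only if the source holds at least two more primaries than the target.
    """
    started = [s for s in shards if s["state"] == "STARTED"]
    nodes = sorted({s["node"] for s in started})
    holders = defaultdict(set)  # (index, shard) -> nodes holding a copy
    for s in started:
        holders[(s["index"], s["shard"])].add(s["node"])
    primaries = {(s["index"], s["shard"]): s["node"]
                 for s in started if s["prirep"] == "p"}
    counts = Counter({n: 0 for n in nodes})
    counts.update(primaries.values())

    commands = []
    while True:
        move = None
        # Try primaries on the most loaded nodes first.
        for key, src in sorted(primaries.items(),
                               key=lambda kv: (-counts[kv[1]], kv[0])):
            for dst in sorted(nodes, key=lambda n: (counts[n], n)):
                if dst not in holders[key] and counts[src] - counts[dst] >= 2:
                    move = (key, src, dst)
                    break
            if move:
                break
        if move is None:
            return commands
        key, src, dst = move
        holders[key].discard(src)
        holders[key].add(dst)
        primaries[key] = dst
        counts[src] -= 1
        counts[dst] += 1
        commands.append({"move": {"index": key[0], "shard": int(key[1]),
                                  "from_node": src, "to_node": dst}})


if __name__ == "__main__":
    print(primaries_per_node(SHARDS))
    # Body that could be sent to POST /_cluster/reroute, ideally in small batches.
    print(json.dumps({"commands": propose_moves(SHARDS)}, indent=2))
```

In the first scenario from the issue, where every node already holds a copy of every shard, `propose_moves` returns an empty list because no legal target exists.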

nisow95612 · April 26, 2022, 8:12am

If it is currently not possible to demote a primary into a replica, would it be possible to bring up the replacement replica before failing the primary? Would that avoid the degradation of redundancy? I mean that it would work the way ES does an index clone: hardlinking the segments instead of copying them.

system · May 24, 2022, 8:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Related topics

Topic	Category	Replies	Views	Activity
How to rebalance primary shards on elastic cluster	Elasticsearch	5	13115	May 23, 2019
Primary shard rebalancing	Elasticsearch	5	371	July 6, 2017
Rebalance primary shards	Elasticsearch	4	3130	July 6, 2017
Homogeneous distribution of primary shards	Elasticsearch	4	725	July 6, 2017
Distributing primary shards?	Elasticsearch	8	9160	December 30, 2016