Cluster shard imbalance

Hello. Thanks in advance for any help
I have a three-node cluster with a different number of shards on each node. It was previously running Elasticsearch 8.4.3 and everything was nearly fine, but after upgrading to 8.11.1 things look worse.

Now, entering _cat/allocation?v here is what I see:

shards disk.indices disk.used disk.avail disk.total disk.percent node
   591         19tb      20tb      4.1tb     24.2tb           82 es1
   652       20.9tb    21.9tb      2.2tb     24.2tb           90 es2
   601       19.4tb    20.4tb      3.7tb     24.2tb           84 es3

es2 has significantly more shards than es1 and es3, and less available disk space.

The cluster health status is green.

Here are my allocation settings:

  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "allow_rebalance": "indices_all_active",
          "cluster_concurrent_rebalance": "5",
          "node_concurrent_recoveries": "5"
        }
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "200mb"
      }
    }
  }

I see no issues in the logs on es2, and no other health problems apart from the low available disk space on es2. Any ideas on how to balance the overall shard count across my nodes?


In version 8.11, shard balancing changed slightly. You can find references to this in the following documentation:

In your link I can only see new settings available under "Shard balancing heuristics settings", such as cluster.routing.allocation.balance.disk_usage and cluster.routing.allocation.balance.write_load; the other parameters have the same default values as on 8.4.3.
If I want the overall shard count on each node to stay close, how can I manipulate those settings?


The cluster.routing.allocation.balance.shard setting defines the weight factor for the total number of shards allocated to each node. By increasing this value, Elasticsearch will try to equalize the total number of shards across all nodes.
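To illustrate the idea (a deliberately simplified sketch, not the actual Elasticsearch implementation — the real 8.11 balancer also weighs write load and other factors), the balancer combines weighted differences from the cluster average, and raising the shard factor makes shard-count differences dominate the decision:

```python
# Simplified sketch of how balancing weight factors might combine.
# Default-like values: balance.shard ~ 0.45, balance.disk_usage ~ 2e-11.
# This is an illustration, NOT the exact Elasticsearch weight function.

def node_weight(shard_count, avg_shard_count,
                disk_used_bytes, avg_disk_used_bytes,
                balance_shard=0.45, balance_disk=2e-11):
    """Higher weight = node is more loaded than average, so the
    balancer would prefer to move shards away from it."""
    shard_term = balance_shard * (shard_count - avg_shard_count)
    disk_term = balance_disk * (disk_used_bytes - avg_disk_used_bytes)
    return shard_term + disk_term

# Using the numbers from the _cat/allocation output above
# (cluster average: ~614.7 shards, ~20.8 TB used per node):
heavy = node_weight(652, 614.7, 21.9e12, 20.8e12)   # es2
light = node_weight(591, 614.7, 20.0e12, 20.8e12)   # es1

print(heavy > 0 > light)  # es2 is above average, es1 below

# Increasing balance_shard amplifies the shard-count contribution:
heavy_boosted = node_weight(652, 614.7, 21.9e12, 20.8e12, balance_shard=0.9)
print(heavy_boosted > heavy)
```

The point is only that the factors are relative weights: what matters is the ratio between them, not their absolute values.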

However, please note that adjusting these settings should be done with caution. The default values are generally good for most use cases, and changing them might improve the current balance but could potentially cause problems in the future if the cluster or workload changes.
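If you do decide to experiment, the setting is dynamic and can be applied through the cluster settings API; something like the following (the value 0.9 here is purely illustrative, the default is 0.45):

```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.shard": "0.9"
  }
}
```

Setting it back to null removes the override and restores the default, which makes it easy to revert if the change does not behave as expected.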


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.