Elastic Cluster Balancing

Elk_huh · January 19, 2024, 3:48pm

ELK stack 8.11.

How do I get my cluster to balance by available disk space? One node keeps hitting the watermark while the other three nodes have 2 TB available.

Here are the Cluster settings

{
  "persistent": {
    "cluster": {
      "routing": {
        "rebalance": {
          "enable": "all"
        },
        "allocation": {
          "allow_rebalance": "indices_all_active",
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "threshold_enabled": "true",
            "watermark": {
              "low": "200gb",
              "flood_stage": "10gb",
              "high": "100gb"
            }
          },
          "balance": {
            "index": "0.55f",
            "shard": "0.45f"
          }
        }
      }
    },

Name	Alerts	Status	Roles	Shards	CPU Usage	Load Average	JVM Heap	Disk Free Space
Coordinating	Clear	Online	N/A	0	21%	0.79	50%	180.3 GB
Node1	Clear	Online	N/A	328	24%	1.53	54%	1.9 TB
Node2	Clear	Online	N/A	353	22%	1.94	35%	2.8 TB
Node3	Clear	Online	N/A	340	23%	1.74	59%	1.9 TB
Node4	Clear	Online	N/A	292	19%	1.67	29%	1.2 TB


leandrojmp · January 19, 2024, 4:23pm

Elasticsearch will try to balance the shards by the number of shards. It will take the watermark levels and the shard size into consideration, but it is not possible to balance based on free disk space.

What is your average shard size? Do you have many small shards?

Also, which node is hitting the watermark? All your nodes have more than 1 TB of free space and your low watermark is set to 200 GB.

Can you provide a little more context?

Elk_huh · January 23, 2024, 4:01pm

The average shard size is 50 GB, and yes, we also have small shards. Node 4 is the one hitting the watermark. I recently changed the watermark from 85% (which is 1 TB) to 200 GB to give myself breathing room, and we also added more disk space. But as you can see, Node 4 has less free space than the rest, and on this trend it will do what it did before all the modifications: Node 4 has the fewest shards, so Elasticsearch will naturally put more shards on it to even out the shard count, and it will hit the watermark again.
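To make the numbers in this thread concrete, here is a rough sketch that compares each data node's free space with the absolute watermarks from the posted settings. The free-space figures are rounded from the monitoring table (1 TB is treated as 1000 GB, which is an approximation), and the helper names are made up for illustration.

```python
# Free disk space per data node in GB, rounded from the monitoring table
# (assumption: 1 TB is treated as 1000 GB).
FREE_GB = {"Node1": 1900, "Node2": 2800, "Node3": 1900, "Node4": 1200}

# Absolute watermarks from the posted cluster settings, as minimum free GB.
WATERMARKS_GB = {"low": 200, "high": 100, "flood_stage": 10}


def breached_watermarks(free_gb, watermarks_gb=WATERMARKS_GB):
    """Return the names of the watermarks that a node with free_gb free has crossed."""
    return [name for name, limit in watermarks_gb.items() if free_gb < limit]


def headroom_gb(free_gb, watermark="low", watermarks_gb=WATERMARKS_GB):
    """Return how much free space remains before the given watermark is reached."""
    return free_gb - watermarks_gb[watermark]


if __name__ == "__main__":
    for node, free in FREE_GB.items():
        print(node, breached_watermarks(free), headroom_gb(free))
```

With these figures no node has crossed a watermark yet, but Node4 has the least headroom before the 200 GB low watermark.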

intrepid1 · January 23, 2024, 5:18pm

Hi there,

We have a 12 node cluster and in order to spread the data as evenly as possible across all nodes, we use ILM to give us an optimum shard size and each index has 12 primary shards. This means that each data node has an equal amount of data.

In the past we had indexes with fewer primaries. This meant that some of the data nodes naturally had more data on them, because certain indexes had fewer primaries than there were nodes. Some nodes would hit the 85% threshold whilst others wouldn't. When we moved to ILM, all indexes were set to have 12 primaries. Now all nodes use an almost identical amount of disk space.
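The arithmetic behind this can be sketched roughly as follows. It is a deliberately simplified model, not how the Elasticsearch allocator works: it looks at a single index, ignores replicas, assumes every shard is the same size, and spreads shards as evenly as possible. The 50 GB shard size is borrowed from earlier in the thread purely as an example, and the function names are hypothetical.

```python
def shards_per_node(primaries, nodes):
    """Spread `primaries` equally sized shards over `nodes` as evenly as possible."""
    base, extra = divmod(primaries, nodes)
    # The first `extra` nodes each hold one more shard than the others.
    return [base + 1 if n < extra else base for n in range(nodes)]


def data_gap_gb(primaries, nodes, shard_size_gb):
    """Difference in stored data between the fullest and emptiest node, in GB."""
    counts = shards_per_node(primaries, nodes)
    return (max(counts) - min(counts)) * shard_size_gb


if __name__ == "__main__":
    # 12 primaries on 12 nodes: one shard per node, no gap.
    print(shards_per_node(12, 12), data_gap_gb(12, 12, 50))
    # 5 primaries on 12 nodes: seven nodes hold nothing from this index.
    print(shards_per_node(5, 12), data_gap_gb(5, 12, 50))
```

When the primary count is a multiple of the node count, the gap for that index is zero; otherwise some nodes carry one extra shard's worth of data.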

You might want to run a GET _cluster/allocation/explain. This may not return anything but it could.
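Called without a request body, the allocation explain API reports on an unassigned shard, so on a green cluster it has nothing to explain. To ask about a specific shard that is already assigned, you pass `index`, `shard` and `primary` in the body. Below is a minimal sketch using only the Python standard library. The base URL and the index name are placeholders, and authentication and TLS are left out, so a secured 8.x cluster would need credentials added.

```python
import json
import urllib.request


def build_explain_request(base_url, index, shard, primary=True):
    """Build a POST request for _cluster/allocation/explain for one shard."""
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain_shard(base_url, index, shard, primary=True):
    """Send the request and return the parsed JSON response (needs a reachable cluster)."""
    request = build_explain_request(base_url, index, shard, primary)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # "my-index" and the URL are hypothetical; nothing is sent here.
    req = build_explain_request("http://localhost:9200", "my-index", 0)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

The response describes why the shard is on its current node and whether it could move or rebalance elsewhere, which can show whether the disk watermark decider is involved.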

It could be that the data is perfectly balanced in the eyes of Elasticsearch, especially if your cluster is green.

I hope that helps.

system · February 20, 2024, 5:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
