How do I get my cluster to balance by available disk space? One node keeps hitting the watermark while the other 3 nodes have 2 TB available.
Here are the cluster settings:
{
  "persistent": {
    "cluster": {
      "routing": {
        "rebalance": {
          "enable": "all"
        },
        "allocation": {
          "allow_rebalance": "indices_all_active",
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "threshold_enabled": "true",
            "watermark": {
              "low": "200gb",
              "flood_stage": "10gb",
              "high": "100gb"
            }
          },
          "balance": {
            "index": "0.55f",
            "shard": "0.45f"
          }
        }
      }
    }
  }
}
Elasticsearch will try to balance the shards by the number of shards; it takes the watermark levels and the shard size into consideration, but it is not possible to balance based on free disk space.
What is your average shard size? Do you have many small shards?
Also, which node is hitting the watermark? All your nodes have more than 1 TB of free space and your low watermark is set to 200 GB.
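If it is not obvious which node that is, per-node shard counts and free disk space can be listed with the _cat/allocation API; a minimal example (the column list is just a selection of the available headers):

GET _cat/allocation?v&h=node,shards,disk.indices,disk.used,disk.avail,disk.percent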
The average shard size is 50 GB, and yes, we also have small shards. Node 4 is hitting the watermark. I recently raised the watermark from 85% (which is 1 TB) to 200 GB to give me breathing room, and we also added more disk space, but as you can see node 4 has less space than the rest. At this trend it will do what it did before all the modifications: node 4 will hit the watermark, since it has the fewest shards on it, so Elasticsearch will naturally put more shards on node 4 to even out the shard count.
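For reference, a watermark change like that is applied through the cluster settings API; a minimal sketch using the values from the settings above:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "200gb",
    "cluster.routing.allocation.disk.watermark.high": "100gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb"
  }
}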
We have a 12-node cluster, and in order to spread the data as evenly as possible across all nodes, we use ILM to give us an optimum shard size, and each index has 12 primary shards. This means that each data node has an equal amount of data.
In the past we had indexes with fewer primaries. This meant that some of the data nodes naturally had more data on them, because certain indexes had fewer primaries. Some nodes would hit the 85% threshold whilst others wouldn't. When we moved to ILM, all indexes were set to have 12 primaries. Now all nodes use an almost identical amount of disk space.
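For illustration, a minimal sketch of an index template that pins 12 primary shards and attaches an ILM policy; the template name, index pattern, alias, and policy name here are hypothetical placeholders:

PUT _index_template/my-logs-template
{
  "index_patterns": ["my-logs-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 12,
      "index.number_of_replicas": 1,
      "index.lifecycle.name": "my-rollover-policy",
      "index.lifecycle.rollover_alias": "my-logs"
    }
  }
}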
You might want to run GET _cluster/allocation/explain. It may not return anything useful, but it could.
It could be that the data is perfectly balanced in the eyes of Elasticsearch, especially if your cluster is green.
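To ask about one specific shard, the request can also take a body; a minimal sketch (the index name and shard number are placeholders):

GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}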