Rebalancing data between disks

Hi, can you give me some tips on how to trigger a rebalance so that data is distributed evenly in proportion to the size of each disk?


node            shards disk.indices disk.used disk.avail disk.total disk.percent
es_data_hdd_1_2    470        1.8tb     1.8tb    298.5gb      2.1tb           86
es_data_hdd_4_2    510        2.8tb     2.8tb      3.6tb      6.5tb           43
es_data_ssd_5_1    338          2tb       2tb      1.4tb      3.4tb           58
es_data_hdd_4_1    510        2.8tb     2.8tb      3.7tb      6.5tb           43
es_data_hdd_7_2    510        2.5tb     2.5tb      3.9tb      6.5tb           39
es_data_ssd_2_3    338        1.7tb     1.7tb      1.7tb      3.4tb           48
es_data_hdd_3_1    479        1.8tb     1.8tb    294.5gb      2.1tb           86
es_data_ssd_1_3    339        1.8tb     1.8tb      1.6tb      3.4tb           53
es_data_ssd_3_2    339        1.6tb     1.6tb      1.8tb      3.4tb           47
es_data_ssd_1_2    339        1.8tb     1.8tb      1.6tb      3.4tb           52
es_data_ssd_4_2    339        1.9tb     1.9tb      1.4tb      3.4tb           57
es_data_hdd_7_1    510        2.6tb     2.6tb      3.8tb      6.5tb           40
es_data_ssd_2_1    339        1.7tb     1.7tb      1.7tb      3.4tb           50
es_data_ssd_2_2    339        1.5tb     1.5tb      1.9tb      3.4tb           44
es_data_ssd_5_2    338        1.5tb     1.5tb      1.9tb      3.4tb           44
es_data_hdd_1_3    478        1.8tb     1.8tb      308gb      2.1tb           86
es_data_ssd_3_1    339        1.9tb     1.9tb      1.5tb      3.4tb           56
es_data_ssd_4_3    338        2.1tb     2.1tb      1.3tb      3.4tb           62
es_data_ssd_5_3    338          2tb       2tb      1.4tb      3.4tb           58
es_data_hdd_3_2    477        1.8tb     1.8tb      303gb      2.1tb           86
es_data_ssd_3_3    339        1.7tb     1.7tb      1.6tb      3.4tb           51
es_data_hdd_4_3    510        2.6tb     2.6tb      3.8tb      6.5tb           41
es_data_ssd_4_1    339        1.9tb     1.9tb      1.5tb      3.4tb           55
es_data_hdd_2_1    477        1.8tb     1.8tb    325.1gb      2.1tb           85
es_data_hdd_3_3    475        1.8tb     1.8tb      301gb      2.1tb           86
es_data_hdd_1_1    478        1.8tb     1.8tb    302.6gb      2.1tb           86
es_data_hdd_7_3    510        2.7tb     2.7tb      3.7tb      6.5tb           42
es_data_ssd_1_1    339        1.8tb     1.8tb      1.6tb      3.4tb           52
es_data_hdd_2_2    488        1.8tb     1.8tb      327gb      2.1tb           85
es_data_hdd_2_3    476        1.8tb     1.8tb      311gb      2.1tb           86

I've already applied the settings below, but they didn't balance the disk usage:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all",
    "cluster.routing.rebalance.enable": "all"
  }
}

What value should I change this setting to?

"cluster.routing.allocation.balance.disk_usage
(float, Dynamic) Defines the weight factor for balancing shards according to their predicted disk size in bytes. Defaults to 2e-11f. Raising this value increases the tendency of Elasticsearch to equalize the total disk usage across nodes ahead of the other balancing variables."

Which version are you on?

Rebalancing based on disk_usage only works on 8.6+. Also, it takes into consideration the size of the shards, not the size of the disk.
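To confirm whether the cluster is on 8.6 or later, the root endpoint reports the version. This is a minimal sketch in the same console syntax as the request above; the field to look at in the response is version.number.

```console
# Returns cluster info; check "version.number" in the response
GET /
```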

You also have disks of different sizes, so I don't think an equal distribution is possible in this situation.

Your es_data_ssd_* nodes all have disks of the same size. It looks like you are using them with the data_hot role, and you can see that they are already balanced (338 or 339 shards each).

But your es_data_hdd_* nodes have a mix of 2.1 TB disks and disks about three times larger (6.5 TB). If they all have the same data role, an equal distribution will not be possible.
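One way to check which roles each node has alongside its disk size is the cat nodes API. This is only a sketch; the column selection and sort order are my own choice for illustration.

```console
# Lists each node with its roles, total disk and disk usage, sorted by name
GET _cat/nodes?v&h=name,node.role,disk.total,disk.used_percent&s=name
```

Nodes that show the same role letters in node.role are in the same pool for balancing purposes, whatever their disk sizes.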

Elasticsearch balances on the number of shards, and from 8.6 it also takes the size of each shard into account, but the size of the disk is only taken into consideration once the watermark thresholds are hit.
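For reference, the default low and high disk watermarks are 85% and 90%, so the 2.1 TB nodes in the table (85-86% used) are sitting right at the low watermark. The sketch below shows how the current values could be inspected, and how the disk_usage weight and watermarks could be set on 8.6+. The weight of 4e-11 is purely an illustrative value (double the documented default of 2e-11), not a recommendation, and the watermark values shown are just the defaults.

```console
# Show the effective settings, including defaults, as flat keys
GET _cluster/settings?include_defaults=true&flat_settings=true

# Illustrative values only: 4e-11 is an assumed example weight,
# 85% and 90% are the default low and high watermarks
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.disk_usage": 4e-11,
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
```

Even with a higher weight, the balancer still works with shard sizes rather than disk capacity, so this will not make the 2.1 TB and 6.5 TB nodes converge on the same disk.percent.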


Ah, OK. I saw in https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java

that this feature was introduced in 8.6. So, as I understand it, rebalancing based on disk size should only start once a disk hits the watermark. Yes, all of the HDD disks are used for the warm phase. Recently I merged 9 disks into 3 larger ones.
