Practical issues related to disk usage

mloine · May 29, 2026, 2:14am

Hello, I am encountering a situation now:
Version: 6.8.1
Number of nodes: 20, of which 15 have 500 GB disks and 5 have 2000 GB disks
Right now the 500 GB disks have to fill up far enough to trigger the disk watermark settings. Until then, each 2000 GB node holds only as much data as a 500 GB node. How can we make the cluster use the 2000 GB nodes preferentially?

I need disk usage to be balanced as a percentage of each node's capacity. Is there any way to achieve this?

leandrojmp · May 29, 2026, 4:46am

Elasticsearch balances data by shard count, not by disk capacity. It basically assumes that all nodes have disks of the same size.
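To put numbers on this for the cluster in the question, here is a rough back-of-the-envelope sketch. It assumes every node ends up holding the same amount of data (an even shard spread with similarly sized shards) and uses the default low disk watermark of 85%. The function name is made up for illustration.

```python
# Scenario from the question: 15 nodes with 500 GB disks, 5 nodes with 2000 GB disks.
SMALL_GB = 500
BIG_GB = 2000
LOW_WATERMARK_PCT = 85  # default for cluster.routing.allocation.disk.watermark.low


def usage_when_small_nodes_reach_watermark():
    """Assumes each node holds the same amount of data (simplification)."""
    # Data per node at the moment the 500 GB disks reach the low watermark.
    per_node_gb = SMALL_GB * LOW_WATERMARK_PCT / 100
    # How full a 2000 GB disk is when it holds that same amount of data.
    big_node_pct = per_node_gb * 100 / BIG_GB
    return per_node_gb, big_node_pct


per_node_gb, big_node_pct = usage_when_small_nodes_reach_watermark()
print(f"data per node: {per_node_gb:.0f} GB")
print(f"2000 GB nodes are {big_node_pct:.2f}% full")
```

Under those assumptions the 2000 GB nodes are only about a fifth full when the 500 GB nodes reach the watermark, which matches what is described in the question.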

When you have different hardware profiles, even if the difference is just the disk size, you should work with data tiering or use custom attributes to do shard allocation filtering.

Version 6.8.X does not have any native way to do data tiering (hot, warm, cold), so your only option would be to use custom attributes to do shard allocation filtering.

Basically you would need to create a custom attribute in your elasticsearch.yml to group your 2000 GB nodes and 500 GB nodes, something like node.attr.disk_size: big on the large nodes and node.attr.disk_size: small on the small ones.

Then you would need to edit your templates to use this attribute as described in the example in the documentation linked.
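As a minimal sketch of what such a template could look like, the snippet below only builds and prints the request body, it does not talk to a cluster. The template name and index pattern are hypothetical placeholders. The setting key follows the index.routing.allocation.require.<attribute> pattern, with disk_size being the custom attribute suggested above.

```python
import json

TEMPLATE_NAME = "logs-big-disks"  # hypothetical template name
INDEX_PATTERN = "logs-*"          # hypothetical index pattern


def allocation_template(index_pattern, attribute, value):
    """Build a legacy (_template) index template body that pins
    matching indices to nodes carrying node.attr.<attribute>: <value>."""
    return {
        "index_patterns": [index_pattern],
        "settings": {
            f"index.routing.allocation.require.{attribute}": value,
        },
    }


body = allocation_template(INDEX_PATTERN, "disk_size", "big")
print(f"PUT _template/{TEMPLATE_NAME}")
print(json.dumps(body, indent=2))
```

The printed body can be pasted into the Kibana console or sent with any HTTP client. New indices matching the pattern would then only be allocated to the nodes tagged as big.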

If you want to move the data from the 2000 GB nodes to the 500 GB nodes, you would need to use an ILM policy to move the data after some time.
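A policy like that could look roughly like the sketch below, which again only prints the request body. The policy name and the 7d age are hypothetical values, and it assumes the disk_size attribute from above and that ILM is available in your 6.8 install.

```python
import json

POLICY_NAME = "move-to-small-disks"  # hypothetical policy name


def move_to_small_nodes_policy(min_age):
    """Build an ILM policy whose warm phase requires nodes with
    node.attr.disk_size: small once the index is older than min_age."""
    return {
        "policy": {
            "phases": {
                "warm": {
                    "min_age": min_age,
                    "actions": {
                        "allocate": {"require": {"disk_size": "small"}},
                    },
                }
            }
        }
    }


policy = move_to_small_nodes_policy("7d")  # "7d" is an arbitrary example age
print(f"PUT _ilm/policy/{POLICY_NAME}")
print(json.dumps(policy, indent=2))
```

The indices also need to reference the policy through the index.lifecycle.name setting, for example in the same template that sets the allocation filter.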

But your main problem here is that you are using an ancient version. 6.8.1 was released 7 years ago and is not supported anymore, and it may even be hard to get help on any issue because people may not remember how it works, since a lot has changed.

You should plan an upgrade as soon as possible, but given how old it is, it may be easier to spin up a new cluster on version 9.X.

DavidTurner · May 30, 2026, 8:42am

This is true, 6.8 is irresponsibly old these days, but upgrading won't fix this AFAIK. See e.g. these docs:

IMPORTANT: Elasticsearch assumes nodes within a data tier share the same hardware profile (such as CPU, RAM, disk capacity).

You could split those 2000G nodes up into 4x500G nodes, assuming they also have 4x the RAM and CPU of the smaller nodes. Or else make them a different data tier as Leandro suggests. But there's nothing automatic to help you here, this simply isn't something Elasticsearch is designed to handle.
