Disk usage difference between data nodes

Hello,
we have a pretty big cluster (26 data nodes, almost 100TB), and I have a question about the disk usage distribution across data nodes.
I know that Elasticsearch balances nodes by the number of shards rather than by disk usage, so when shard sizes vary widely, disk usage ends up very different from node to node.

Is there any setting that can be changed to even out disk usage between nodes, based on disk usage rather than shard count?
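
To put numbers on the imbalance, here is a minimal Python sketch that summarises per-node disk usage. It assumes rows shaped like the JSON returned by `GET _cat/allocation?format=json&bytes=b`; the helper name `disk_usage_summary`, the node names, and the figures are made up for illustration and are not from our cluster.

```python
# Rows shaped like the JSON from `GET _cat/allocation?format=json&bytes=b`.
# Node names and byte counts are invented sample data (assumption).
SAMPLE_ROWS = [
    {"node": "data-01", "shards": "120", "disk.used": "4200000000000", "disk.total": "5000000000000"},
    {"node": "data-02", "shards": "120", "disk.used": "2100000000000", "disk.total": "5000000000000"},
    {"node": "data-03", "shards": "121", "disk.used": "3000000000000", "disk.total": "5000000000000"},
    # Unassigned shards show up as a row without disk figures.
    {"node": "UNASSIGNED", "shards": "2", "disk.used": None, "disk.total": None},
]


def disk_usage_summary(rows):
    """Return the used fraction per node and the gap between fullest and emptiest node."""
    usage = {}
    for row in rows:
        if not row.get("disk.total"):
            continue  # skip the UNASSIGNED row, which has no disk numbers
        usage[row["node"]] = int(row["disk.used"]) / int(row["disk.total"])
    fullest = max(usage, key=usage.get)
    emptiest = min(usage, key=usage.get)
    return {
        "usage": usage,
        "fullest": fullest,
        "emptiest": emptiest,
        "spread": usage[fullest] - usage[emptiest],
    }


summary = disk_usage_summary(SAMPLE_ROWS)
print(summary["fullest"], summary["emptiest"], round(summary["spread"], 2))
```

In this sample the shard counts are almost identical (120, 120, 121), yet the fullest node uses roughly twice the disk of the emptiest one, which is the pattern described above.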

Thanks,
Lior

I would love to know your deployment configuration; we have a similar requirement to build a 50TB ES cluster.

Hey @porscheme ,
can you be more specific?
I'll try to expand as much as possible, hopefully it will answer your question.

Our cluster runs on EC2 instances of type i3en.2xlarge. Elasticsearch is installed from RPM and configured with the maximum heap (31GB) on each data node, and each data node has 5TB of disk space.

Lior

  • Our data size at source is 50 TB. Besides accounting for organic growth of the data, how much storage should we allocate for ES overhead?
  • We want to use 10 Azure VMs of the Ls32 SKU (256 GB RAM, 32 CPUs, 4 x 2 TB NVMe premium SSD disks), with 700 shards of 75 GB each. Is this a good setup?
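
As a rough sanity check of the figures in the second bullet, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: it assumes decimal units (1 TB = 1000 GB), treats the 700 shards as primaries, and ignores filesystem and ES overhead. The function name `capacity_check` and the `replicas` parameter are hypothetical, since the replica count is not stated here.

```python
GB_PER_TB = 1000  # decimal units (assumption)


def capacity_check(shards, shard_size_gb, nodes, disks_per_node, disk_size_tb, replicas=0):
    """Compare total shard data against the raw disk capacity of the cluster."""
    data_tb = shards * shard_size_gb * (1 + replicas) / GB_PER_TB
    raw_tb = nodes * disks_per_node * disk_size_tb
    return {"data_tb": data_tb, "raw_tb": raw_tb, "fill_ratio": data_tb / raw_tb}


# 700 shards of 75 GB on 10 VMs with 4 x 2 TB disks each.
primaries_only = capacity_check(700, 75, 10, 4, 2)
with_one_replica = capacity_check(700, 75, 10, 4, 2, replicas=1)

print(primaries_only)    # 52.5 TB of data on 80 TB of raw disk
print(with_one_replica)  # 105.0 TB of data on 80 TB of raw disk
```

Under these assumptions the primaries alone fill about two thirds of the raw disk, and adding one replica of every shard would not fit on 80 TB at all.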

Hey @porscheme,

First of all, I wouldn't call myself an expert,
but from my experience (maintaining this cluster for over 3.5 years) I can tell that the maximum heap size is 31GB, so I would give each node at most 64GB of RAM.
Also, the shard size should be close to the heap size, so I believe it is better to stay around 30-40GB per shard rather than 75GB.
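
To see what that shard size means for the shard count, here is a small Python sketch. It assumes 50 TB of data, decimal units (1 TB = 1000 GB), and primaries only; `shards_needed` is just a hypothetical helper name.

```python
import math


def shards_needed(data_tb, shard_size_gb):
    """Number of shards required to hold data_tb at the given shard size."""
    return math.ceil(data_tb * 1000 / shard_size_gb)


for size_gb in (30, 40, 75):
    print(f"{size_gb} GB shards -> {shards_needed(50, size_gb)} shards")
```

So going from 75GB shards to 30-40GB shards roughly doubles the number of shards for the same data, which is worth keeping in mind when planning the node count.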

If someone else has other insights/suggestions, I would like to hear them, but this is my opinion.

Lior

Hello,

Please review my similar topic: Elasticsearch cluster uneven distribution of data

Regards,
Dan