I was wondering if there is an option to include shard size in the shard allocation decision process.
The disk-based shard allocation settings and the shards-per-node limit work, and most of our shards are 50G in size, but sometimes a few 2-3G shards get into the mix. When this happens, ES only looks at the number of shards per node, not whether the total shard size per node is roughly equal.
Is this something ES can take into account? Or will I need to write my own rebalancing logic for this kind of behaviour?
Well yes, I'm trying to keep as much data in ES as possible, but when disk usage across nodes of a specific type (hot/warm/cold) is not evenly distributed, I waste a lot of disk space.
I guess I'll need to write a rebalancer based on shard sizes which runs every once in a while.
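Something like this rough sketch, maybe (assuming the 7.x elasticsearch-py client; the host, the 100GB threshold, and the move-the-smallest-shard heuristic are all placeholders I made up):

```python
# Rough sketch only: a cron-style rebalancer that moves one shard at a
# time from the fullest node to the emptiest one.
from collections import defaultdict
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Total on-disk shard size per node, from _cat/shards.
shards = es.cat.shards(format="json", bytes="b", h="index,shard,prirep,store,node")
totals = defaultdict(int)
per_node = defaultdict(list)
for s in shards:
    if not s.get("store") or not s.get("node"):
        continue  # skip unassigned shards
    size = int(s["store"])
    totals[s["node"]] += size
    per_node[s["node"]].append((size, s))

fullest = max(totals, key=totals.get)
emptiest = min(totals, key=totals.get)

# Only act when the imbalance exceeds an arbitrary 100GB threshold.
if totals[fullest] - totals[emptiest] > 100 * 1024**3:
    size, shard = min(per_node[fullest], key=lambda t: t[0])
    # The allocation deciders can still veto this move.
    es.cluster.reroute(body={"commands": [{"move": {
        "index": shard["index"],
        "shard": int(shard["shard"]),
        "from_node": fullest,
        "to_node": emptiest,
    }}]})
```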
Since you have quite large disks you might like to consider configuring the disk watermarks differently. The high watermark default of 90% means that Elasticsearch tries to keep 500GB free on each 5TB disk. That's not totally silly, since there is some belief that filesystem performance drops once disks get too full, but if you would rather run closer to the wire then that's your call.
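For example, something like this raises the high watermark to 95%, leaving ~250GB free on a 5TB disk instead of ~500GB (sketched with the Python client; pick whatever number suits you):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Example value only: raise the high watermark from the default 90% to 95%.
es.cluster.put_settings(body={
    "persistent": {
        "cluster.routing.allocation.disk.watermark.high": "95%"
    }
})
```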
Elasticsearch aims to keep nodes under the high watermark but will only move shards between nodes when necessary. This means that "evening out" disk usage is deliberately avoided, but it doesn't mean that any space is wasted, even if your shards have rather different sizes.
Yes, I know; I already changed the high watermark, but as you can see in the example, at least 8 shards (of 50G each) could be added to elkdatac001 if total shard size were included in the rebalancing.
The first one has a lot of large shards and, as you can see, ES does not allocate any new indices to this node, but the disk numbers are still a problem.
That is why I think it is strange that total disk size is not taken into account in the relocation logic, but it is also strange that I'm the only person running into this problem.
That in itself isn't a good reason to move shards around. Moving a shard is an expensive operation; it's not worth doing simply for the sake of tidiness.
This is what I'm not understanding. I see that the numbers aren't equal, but I don't see why this is a problem. How would your life be better if the numbers were closer together? Is there some operational issue that this unevenness is causing?
You're not the only person to experience this confusion, and we recently expanded the docs on this subject for that reason. Disk space absolutely is taken into account when relocating shards, but that doesn't imply we aim for equal disk usage across nodes. That goal is expensive and unnecessary.
After thinking about it for a while, I'm just going to try to reconfigure our own host monitoring. Since Elasticsearch indeed has its own disk usage checks, the hosts will probably eventually become evenly distributed.
The problem we see is that on busy days the Apache Filebeat logs grow a lot, so we need to have some buffer, but we want it as small as possible.
But I'll start by removing our own disk checks. Thanks again for the feedback!
I see - that is a better reason for relocating shards.
The usual solution is to set the gap between the low and high watermarks to be larger than the typical size of the day's indices on each node, and the gap between the high and flood-stage watermarks to be large enough to allow time to mitigate any overage before disks fill up. This largely works in practice, but it's not completely ideal.
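For example, if each node takes on roughly 200GB of new indices per day, you might switch to absolute byte values so the gaps are explicit (numbers are illustrative; the three watermarks must all be percentages or all be byte values):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Illustrative numbers: absolute watermarks are minimum *free* space, so
# this leaves a 200GB low->high gap (bigger than a typical day's indices)
# and a 200GB high->flood-stage gap to give you time to react.
es.cluster.put_settings(body={
    "persistent": {
        "cluster.routing.allocation.disk.watermark.low": "500gb",
        "cluster.routing.allocation.disk.watermark.high": "300gb",
        "cluster.routing.allocation.disk.watermark.flood_stage": "100gb",
    }
})
```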
Well, I basically have (had) two problems: the monitoring system (which I disabled for the ES data mount, because the watermark system works well for this) and the busy days.
Because I use ILM, the watermark system is not really helping me. When the hot nodes are full, I need to manually change the ILM config to make sure the indices get allocated to warm or cold nodes. But that is something I will look at in the future. For now a bigger buffer will do.
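For reference, the kind of ILM change I mean would be something like this (the policy name, ages, and the `data` node attribute are just examples, not our real config):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Hypothetical policy: roll over on the hot nodes, then move indices to
# nodes tagged node.attr.data: warm after 2 days instead of waiting for
# the hot tier to fill up. All names and ages are made-up examples.
es.ilm.put_lifecycle(policy="filebeat-example-policy", body={
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_size": "50gb", "max_age": "1d"}
                }
            },
            "warm": {
                "min_age": "2d",
                "actions": {
                    "allocate": {"require": {"data": "warm"}}
                }
            }
        }
    }
})
```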
I see, in which case it sounds like you might be looking for #47764. Please feel free to leave a comment (even just a +1) to let us know you'd like us to work on it.