I have 5 data nodes in my cluster, all of them splitting their data across two disks. There should be plenty of space for my data, but because Elasticsearch allocates shards to disks based on shard count rather than size (and I understand you can't know how big a shard will be while it's still being written to), I end up with situations where one disk gets really full whilst the other is only half full.
This triggers high watermark alerts and shards start bouncing between nodes.
My question is:
Is it possible to move shards between disks to more evenly balance the load?
One of the following would be great, in order of preference:
1. Get Elasticsearch to consider bouncing shards between disks before it starts offloading them to other nodes.
2. Have an endpoint that reallocates shards between disks on-node to even out the load.
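For anyone trying to spot this kind of imbalance, the per-path disk stats are visible in the response to `GET _nodes/stats/fs`. Here's a small sketch that computes per-disk fill percentage from such a response; the sample payload below is hypothetical and abridged (real responses carry many more fields, and node names and paths are placeholders):

```python
# Hypothetical, abridged shape of a GET _nodes/stats/fs response; the
# fs.data array lists one entry per configured data path.
sample = {
    "nodes": {
        "node-1": {
            "name": "data-node-1",
            "fs": {
                "data": [
                    {"path": "/disk1/data",
                     "total_in_bytes": 1000 * 2**30,
                     "available_in_bytes": 90 * 2**30},   # very full disk
                    {"path": "/disk2/data",
                     "total_in_bytes": 1000 * 2**30,
                     "available_in_bytes": 450 * 2**30},  # half-full disk
                ]
            }
        }
    }
}

def path_usage(stats):
    """Yield (node_name, path, pct_used) for every data path in the stats."""
    for node in stats["nodes"].values():
        for p in node["fs"]["data"]:
            used = p["total_in_bytes"] - p["available_in_bytes"]
            yield node["name"], p["path"], 100.0 * used / p["total_in_bytes"]

for name, path, pct in path_usage(sample):
    print(f"{name} {path}: {pct:.0f}% used")
```

With numbers like the ones described in this thread, the two paths on the same node come out around 91% and 55% used, even though the node as a whole has plenty of headroom.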
This is not something you should worry about according to the docs:
NOTE: It is normal for nodes to temporarily exceed the high watermark from time to time.
Older versions did tend to make a lot of noise about this in the logs despite it being a normal occurrence in a cluster. Recent versions are much quieter.
I can see why one might expect this to be possible, but it isn't a thing today. It would be surprisingly complicated to implement. It's more usual to combine all the volumes together using LVM or RAID or similar, or else run one node per data path.
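To make the "one node per data path" alternative concrete: instead of listing both disks in one node's `path.data`, you run two node processes on the host, each owning one disk, so the allocator balances between them like any other pair of nodes. A hypothetical pair of configs (names and paths are placeholders):

```yaml
# node-a.yml -- first node on the host, owns disk 1
node.name: data-node-1a
path.data: /disk1/elasticsearch

# node-b.yml -- second node on the host, owns disk 2
node.name: data-node-1b
path.data: /disk2/elasticsearch
```

The trade-off is extra heap and process overhead per node, which is why combining the volumes at the OS level (LVM/RAID) is the other common option.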
So the problem I have is that it's not making the most of my disks.
For example: on one data node, one disk was 91% full while the other was only 55%. This actually caused me some downtime, but that's a longer and more complicated story.
There's more than enough total disk space for my cluster, but because of the way shards are allocated to disks, in practice there isn't, and I can't control that in any way.
As long as there were other, less-full nodes this should have been OK: Elasticsearch should have moved some of those shards elsewhere. But yes, if you're tight on space you would do better to combine your storage into a single filesystem on each node.
It's not even that I'm tight on space. I think the problem is exacerbated by the fact that I have a few indices with wildly different shard sizes (3 MB to 90 GB).
In the most recent case, it seems Elasticsearch decided to schedule all the small shards on one disk and all the large ones on the other.
It seems that, with my data, this problem will always occur if I have more than one entry in path.data.
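The arithmetic behind this is easy to see. As a toy illustration (this is not Elasticsearch's actual allocator, just a sketch of the count-vs-size mismatch), alternate a stream of mixed-size shards across two disks so each disk ends up with the same shard *count*:

```python
# Illustration only: balancing shards across disks by count can leave the
# disks wildly unevenly filled when shard sizes differ as much as the
# 3 MB - 90 GB range described above. Sizes below are made up.
shard_sizes_gb = [90, 0.003, 90, 0.003, 90, 0.003]  # big/small interleaved

placed_gb = {"disk1": 0.0, "disk2": 0.0}  # bytes placed so far, in GB
counts = {"disk1": 0, "disk2": 0}

for i, size in enumerate(shard_sizes_gb):
    disk = "disk1" if i % 2 == 0 else "disk2"  # alternate purely by count
    placed_gb[disk] += size
    counts[disk] += 1

print(counts)     # 3 shards on each disk: "balanced" by count
print(placed_gb)  # ~270 GB vs ~0.009 GB: wildly unbalanced by size
```

Both disks hold exactly three shards, yet one carries roughly 270 GB and the other under 10 MB, which is the same shape of problem reported in this thread.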