I have 5 data nodes in my cluster, all of them splitting their data across two disks. There should be plenty of space for my data, but because Elasticsearch allocates shards to disks based on shard count rather than size (and I understand you can't know how big a shard will be while it's still being written to), I end up with situations where one disk gets really full whilst the other is only half full.
This triggers high watermark alerts and shards start bouncing between nodes.
My question is:
Is it possible to move shards between disks to more evenly balance the load?
One of the following would be great, in order of preference:
1. Get Elasticsearch to consider bouncing shards between disks before it starts offloading them to other nodes.
2. Have an endpoint that reallocates shards between disks on-node to even out the load.
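For anyone trying to spot this kind of imbalance, the per-path disk stats are visible in the response to `GET _nodes/stats/fs`. Here's a small sketch that computes per-disk fill percentage from such a response; the sample payload below is hypothetical and abridged (real responses carry many more fields, and node names and paths are placeholders):

```python
# Hypothetical, abridged shape of a GET _nodes/stats/fs response; the
# fs.data array lists one entry per configured data path.
sample = {
    "nodes": {
        "node-1": {
            "name": "data-node-1",
            "fs": {
                "data": [
                    {"path": "/disk1/data",
                     "total_in_bytes": 1000 * 2**30,
                     "available_in_bytes": 90 * 2**30},   # very full disk
                    {"path": "/disk2/data",
                     "total_in_bytes": 1000 * 2**30,
                     "available_in_bytes": 450 * 2**30},  # half-full disk
                ]
            }
        }
    }
}

def path_usage(stats):
    """Yield (node_name, path, pct_used) for every data path in the stats."""
    for node in stats["nodes"].values():
        for p in node["fs"]["data"]:
            used = p["total_in_bytes"] - p["available_in_bytes"]
            yield node["name"], p["path"], 100.0 * used / p["total_in_bytes"]

for name, path, pct in path_usage(sample):
    print(f"{name} {path}: {pct:.0f}% used")
```

With numbers like the ones described in this thread, the two paths on the same node come out around 91% and 55% used, even though the node as a whole has plenty of headroom.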
This is not something you should worry about according to the docs:
NOTE: It is normal for nodes to temporarily exceed the high watermark from time to time.
Older versions did tend to make a lot of noise about this in the logs despite it being a normal occurrence in a cluster. Recent versions are much quieter.
I can see why one might expect this to be possible, but it isn't a thing today. It would be surprisingly complicated to implement. It's more usual to combine all the volumes together using LVM or RAID or similar, or else run one node per data path.
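To make the "one node per data path" alternative concrete: instead of listing both disks in one node's `path.data`, you run two node processes on the host, each owning one disk, so the allocator balances between them like any other pair of nodes. A hypothetical pair of configs (names and paths are placeholders):

```yaml
# node-a.yml -- first node on the host, owns disk 1
node.name: data-node-1a
path.data: /disk1/elasticsearch

# node-b.yml -- second node on the host, owns disk 2
node.name: data-node-1b
path.data: /disk2/elasticsearch
```

The trade-off is extra heap and process overhead per node, which is why combining the volumes at the OS level (LVM/RAID) is the other common option.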
So the problem I have is that it's not making the most of my disks.
For example: on one data node, one disk was 91% full while the other was only 55%. This actually caused me some downtime, but that's a longer and more complicated story.
There's more than enough total disk space for my cluster, but because of the way shards are allocated to disks, in practice there isn't, and I can't control that in any way.
As long as there were other, less-full nodes this should have been OK: Elasticsearch should have moved some of those shards elsewhere. But yes, if you're tight on space you would do better to combine your storage into a single filesystem on each node.
It's not even that I'm tight on space. I think the problem is exacerbated by the fact that I have a few indices with wildly different shard sizes (3 MB to 90 GB).
In the most recent case, it seems Elasticsearch decided to schedule all the small shards on one disk and all the large ones on the other.
It seems that, with my data, this problem will always occur if I have more than one entry in path.data.
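The arithmetic behind this is easy to see. As a toy illustration (this is not Elasticsearch's actual allocator, just a sketch of the count-vs-size mismatch), alternate a stream of mixed-size shards across two disks so each disk ends up with the same shard *count*:

```python
# Illustration only: balancing shards across disks by count can leave the
# disks wildly unevenly filled when shard sizes differ as much as the
# 3 MB - 90 GB range described above. Sizes below are made up.
shard_sizes_gb = [90, 0.003, 90, 0.003, 90, 0.003]  # big/small interleaved

placed_gb = {"disk1": 0.0, "disk2": 0.0}  # bytes placed so far, in GB
counts = {"disk1": 0, "disk2": 0}

for i, size in enumerate(shard_sizes_gb):
    disk = "disk1" if i % 2 == 0 else "disk2"  # alternate purely by count
    placed_gb[disk] += size
    counts[disk] += 1

print(counts)     # 3 shards on each disk: "balanced" by count
print(placed_gb)  # ~270 GB vs ~0.009 GB: wildly unbalanced by size
```

Both disks hold exactly three shards, yet one carries roughly 270 GB and the other under 10 MB, which is the same shape of problem reported in this thread.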