Watermark and shard allocation

nevermind · June 16, 2017, 11:05pm

We have a fleet of approximately 200 nodes running ES 1.7.4.
The low watermark is set to 80% and the high watermark to 85%.
Around 5-8% of the fleet has disk usage between 80% and 85%, and about 5% of nodes have disk usage of around 20-30%. When a node with ~79-83% disk usage goes down and then comes back up, its shards are not allocated back to it; instead they get spread across the rest of the fleet. The current delayed allocation timeout is 10 minutes.
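
For reference, here is a rough sketch of how these thresholds would be expressed as settings. It assumes the standard setting names `cluster.routing.allocation.disk.watermark.low` / `.high` and `index.unassigned.node_left.delayed_timeout`; the payloads are illustrative and not copied from our cluster, and the choice of `persistent` over `transient` is an assumption as well.

```python
import json

# Body for PUT /_cluster/settings (illustrative; "persistent" is an assumption).
cluster_settings = {
    "persistent": {
        "cluster.routing.allocation.disk.watermark.low": "80%",
        "cluster.routing.allocation.disk.watermark.high": "85%",
    }
}

# Body for PUT /_all/_settings: delayed allocation is an index-level setting.
index_settings = {
    "settings": {
        "index.unassigned.node_left.delayed_timeout": "10m",
    }
}

print(json.dumps(cluster_settings, indent=2))
print(json.dumps(index_settings, indent=2))
```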

So it looks like when a node comes back, ES tries to assign the biggest shard from that node back to it, but it adds that shard's size to the node's current disk usage. The sum exceeds the low watermark threshold, so nothing gets assigned back to the node.
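
To make the arithmetic concrete, here is a minimal sketch of the check I suspect is happening. This is not the actual ES code; the function name, the strict `>` comparison, and the numbers (1000 GB disk, 790 GB used, 50 GB shard) are all hypothetical and only model the behavior described above.

```python
def exceeds_low_watermark(used_gb, total_gb, shard_gb,
                          low_watermark_pct=80.0,
                          shard_already_on_disk=False):
    """Return True if the projected disk usage is above the low watermark.

    Hypothetical model: if the shard's files are still on the node's disk,
    they are already included in used_gb and should not be added again.
    """
    projected_gb = used_gb if shard_already_on_disk else used_gb + shard_gb
    return 100.0 * projected_gb / total_gb > low_watermark_pct


# Node at 79% usage; the 50 GB shard is still on disk after the restart.
# Double counting: (790 + 50) / 1000 = 84%, above the 80% low watermark.
print(exceeds_low_watermark(790, 1000, 50))
# Counting the shard once: 790 / 1000 = 79%, below the low watermark.
print(exceeds_low_watermark(790, 1000, 50, shard_already_on_disk=True))
```

With these made-up numbers the first call reports the node as over the low watermark and the second does not, which matches what we see: the node is rejected even though the shard data never left its disk.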

Is this expected behavior? I would have thought ES shouldn't add a shard's size to the disk usage when that shard is already counted in the disk usage, should it?

Thanks!

vb3 · June 19, 2017, 7:52pm

I wonder if this is somehow related to "Replica shards stuck in Initialization phase"?

@warkolm can you comment on this? It seems like a bug in 1.7.4.

Thank you.

system · July 17, 2017, 7:52pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
