index.blocks.read_only_allow_delete becomes true even if the watermark is reached on only one node in the cluster

I have multiple data nodes in my cluster with uneven disk capacities. While data is continuously being indexed, some nodes reach the 90% watermark (other nodes in the cluster still have sufficient free disk space), causing the cluster to go into a blocked state. That is, index.blocks.read_only_allow_delete is set to true for all the indices, and hence no more data gets indexed.
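For reference, the block is visible in the index settings; something like this (just one way to check) lists the setting for every index:

```
GET _all/_settings/index.blocks.read_only_allow_delete
```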
How can I avoid this and maximise disk usage?

Thank you

This is by design. If a node hits cluster.routing.allocation.disk.watermark.flood_stage (95% by default, not 90%) then Elasticsearch must stop writing to all the indices that have a shard copy on that node. It cannot carry on writing to the shard copies on other nodes and just avoid the ones on the full node, because all copies of a shard must contain the same data.
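If it helps to see which node is the problem, per-node disk usage can be checked with the cat allocation API, for example:

```
GET _cat/allocation?v
```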

You can avoid this by giving Elasticsearch more space so it does not need to protect itself from a full disk by blocking indexing. You can also postpone the problem by increasing the flood_stage watermark.
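As an illustration (the 97% value is only an example, not a recommendation), raising the flood_stage watermark is a transient cluster settings update:

```
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}
```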

Thank you @DavidTurner, but my question is: why were the shards not moved away to a node that has free disk when the node reached cluster.routing.allocation.disk.watermark.high?

Normally that would be the case, but if disk usage grows too quickly then there might not be time to relocate enough shards away before hitting the flood_stage watermark.
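One way to give relocation more headroom (these values are purely illustrative, assuming percentage-based watermarks) is to keep a wider gap between the high and flood_stage watermarks:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "80%",
    "cluster.routing.allocation.disk.watermark.high": "85%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}
```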

Ohhh, that makes sense, thank you @DavidTurner.
Can you please help me with the two questions below:

  1. Will shard relocation still happen even if the flood_stage watermark has been hit?

  2. Is there a way to automatically clear index.blocks.read_only_allow_delete once disk usage drops back below the watermark levels?

1. Yes, I think so.

2. No, but there is an open feature request for this.
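In the meantime, once disk space has been freed up, the block can be cleared manually, for example across all indices:

```
PUT _all/_settings
{
  "index.blocks.read_only_allow_delete": null
}
```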

Thank you @DavidTurner. It was very helpful.
