Elasticsearch cluster goes down with 1 node capacity reached, resolutions

charvi23 · February 5, 2024, 3:17pm

I have a 2 node cluster in which 1 node has reached max disk capacity (96%). The two nodes have different disk capacities. Elasticsearch returns the following error:

TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block

As per my current understanding, ES works this way: if any node reaches the flood-stage watermark, the complete cluster (including all nodes) stops ingesting new data.

So what are the possible solutions other than deleting existing data? Can I create a new data mount point and add it to the node's elasticsearch.yml? Will that resolve it? Or is there any other solution?

Currently my replication factor is 1, with 2 nodes in the cluster.

Any help here would be appreciated.

leandrojmp · February 5, 2024, 3:45pm

Not exactly. When a node reaches the flood stage watermark, Elasticsearch marks every index that has at least one shard on that node as read-only.

But in your case, since you have just 2 nodes and have replicas, every index has a shard copy on both nodes, which means that all your indices will be marked as read-only.
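To make the rule above concrete, here is a minimal Python sketch of it. It does not talk to Elasticsearch; it only models the logic. The thresholds are the default watermarks (85%, 90%, 95%), which a cluster may override, and the node names, index names, and shard placements are hypothetical.

```python
# Default disk watermarks (percent of disk used); a cluster may override them.
LOW, HIGH, FLOOD_STAGE = 85.0, 90.0, 95.0


def watermark_stage(used_percent):
    """Return the highest watermark that the given disk usage has exceeded."""
    if used_percent > FLOOD_STAGE:
        return "flood_stage"
    if used_percent > HIGH:
        return "high"
    if used_percent > LOW:
        return "low"
    return "ok"


def read_only_indices(shard_locations, disk_used_percent):
    """Indices with at least one shard copy on a node past the flood stage.

    shard_locations maps index name -> set of node names holding a copy.
    disk_used_percent maps node name -> percent of disk used.
    """
    flooded = {node for node, used in disk_used_percent.items()
               if watermark_stage(used) == "flood_stage"}
    return sorted(index for index, nodes in shard_locations.items()
                  if nodes & flooded)


# Hypothetical cluster: node-1 is at 96%, node-2 has plenty of room.
disk = {"node-1": 96.0, "node-2": 40.0}

# With 1 replica on 2 nodes, every index has a copy on both nodes.
with_replicas = {"logs": {"node-1", "node-2"}, "metrics": {"node-1", "node-2"}}
print(read_only_indices(with_replicas, disk))     # ['logs', 'metrics']

# With 0 replicas, only indices whose single copy sits on node-1 are affected.
without_replicas = {"logs": {"node-1"}, "metrics": {"node-2"}}
print(read_only_indices(without_replicas, disk))  # ['logs']
```

In this toy model, dropping replicas limits the block to the indices that actually live on the full node.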

I would say that in your case the easiest solution is to remove all replicas. A 2 node cluster does not have any resilience, so having replicas does not make much difference.

You can remove the replicas using the following request:

PUT /*/_settings
{
    "index" : {
        "number_of_replicas" : 0
    }
}
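If you would rather script this than use the console, the same request can be sketched with Python's standard library. This is only an illustration: the base URL `http://localhost:9200` is an assumption, authentication and TLS are left out, and the helper name `build_remove_replicas_request` is made up for this example. The request is built but not sent.

```python
import json
import urllib.request


def build_remove_replicas_request(base_url, index_pattern="*"):
    """Build (but do not send) the PUT request that sets number_of_replicas to 0."""
    body = {"index": {"number_of_replicas": 0}}
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# base_url is an assumption; adjust host, port, and authentication for your cluster.
request = build_remove_replicas_request("http://localhost:9200")
print(request.get_method(), request.full_url)

# To actually send it against a reachable cluster:
# urllib.request.urlopen(request)
```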

charvi23 · February 5, 2024, 4:03pm

Thanks for replying, @leandrojmp. Could you please explain why a 2 node cluster is not resilient? If 1 node goes down, will the second not be able to support it?

Just trying to understand things a little better from my perspective.

leandrojmp · February 5, 2024, 4:07pm

No. Basically, Elasticsearch needs a quorum to elect a master node, and with just 1 node up you will not have that quorum.

You end up with 2 scenarios.

If the node that goes down is not the current master, your cluster will still work because the master is up.
If the node that goes down is the current master, your cluster will not work until the node gets back online.
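The two scenarios can be sketched as simple majority arithmetic. This is a toy model, not Elasticsearch code. It assumes Elasticsearch 7.x or later, where, as far as I know, a cluster with an even number of master-eligible nodes leaves one of them out of the voting configuration, so a 2 node cluster effectively has a single voter. The node names and the `has_quorum` helper are hypothetical.

```python
def has_quorum(voting_config, live_nodes):
    """True if more than half of the voting configuration is still alive."""
    alive_voters = len(set(voting_config) & set(live_nodes))
    return alive_voters > len(voting_config) // 2


# Two nodes, assuming only node-a is in the voting configuration.
two_node_config = {"node-a"}
print(has_quorum(two_node_config, {"node-a"}))  # True: node-b went down
print(has_quorum(two_node_config, {"node-b"}))  # False: node-a went down

# If both nodes were voters, losing either one would lose the majority.
print(has_quorum({"node-a", "node-b"}, {"node-a"}))  # False

# Three master-eligible nodes: any single node can fail.
three_node_config = {"node-a", "node-b", "node-c"}
print(all(has_quorum(three_node_config, three_node_config - {down})
          for down in three_node_config))  # True
```

Either way, with 2 nodes there is always at least one node whose loss stops the cluster, which is why a third master-eligible node is needed for resilience.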

You can read more here.

charvi23 · February 5, 2024, 4:27pm

Thank you!! This helps. So if I change the replication factor to 0, what will happen to the current data? And the future data that gets ingested on the nodes?
