How to avoid sending data to a node close to its disk space limit?


My Elasticsearch cluster has 4 nodes, but one of them is close to its disk space limit.

I cannot add new nodes to my cluster due to budget constraints. So, I have to use space remaining on the other nodes.

What would you recommend?

Should I stop the Elasticsearch service on the node close to its limit?

Can I route the data to the other nodes with some setting?

Should I mark the shards on that node as "read-only"?

Thanks!

By default, Elasticsearch stops allocating new shards to a node once its disk is more than 85% full, relocates shards away from a node that is more than 90% full, and marks indices with a shard on the node as read-only once it passes 95% full. These thresholds are configurable:
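For example, the three disk watermarks can be adjusted through the cluster settings API. The values below are just the defaults restated, to show the shape of the request, not a recommendation:

```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}
```

The percentages are evaluated per node against that node's own disk, so a smaller node trips the same watermark at a smaller absolute amount of data.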

Thank you for your answer

The documentation for cluster.routing.allocation.disk.watermark.flood_stage
mentions that the whole index will be put in read-only mode if one shard exceeds the limit.

Is there another way to handle this scenario? One of my nodes has much less disk space, and I don't want it to become a bottleneck. Still, I want to use its resources for my analysis.

This is the disk space available on my nodes:

  • node 1: 14 TB
  • node 2: 5 TB
  • node 3: 14 TB
  • node 4: 14 TB

I already set the index to 7 primary shards, so that nodes 1, 3, and 4 each host two shards and node 2 hosts only one. Still, node 2 is close to its limit.
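For reference, a sketch of that index setup (the index name is hypothetical):

```json
PUT /my-index
{
  "settings": {
    "index.number_of_shards": 7
  }
}
```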

What would you recommend then?

Unfortunately not. It doesn't really make sense to mark a single shard as read-only. If Elasticsearch cannot write to one shard in an index, it cannot write to the index.

Rely on `cluster.routing.allocation.disk.watermark.high` to move shards away from the full node before it reaches `cluster.routing.allocation.disk.watermark.flood_stage`. Since the watermarks are percentages of each node's own disk, node 2 will start shedding shards at 90% of its 5 TB, well before any index goes read-only.
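To keep an eye on how close each node is to the watermarks, you can check per-node disk usage with the cat allocation API, e.g.:

```json
GET _cat/allocation?v=true&h=node,shards,disk.percent,disk.avail,disk.total
```

This lists, for every data node, how many shards it holds and how full its disk is, which makes it easy to see whether relocation away from node 2 is actually happening.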
