Datastreams losing indices

Hi,

I'm seeing some strange behavior with my ELK stack.
For the past 2-3 days, some data streams have lost indices: their storage size dropped to 0 docs and 0 B. The affected indices are scattered randomly across the timeline.
New indices also no longer seem to be written to.

Any idea what could be going wrong?

P.S.: At the moment, the Elasticsearch disk usage is at 90% (100 GB left) on the hot node and 85% (73 GB left) on the warm node.

I would start by looking at the Elasticsearch logs. They are often pretty verbose, especially at the DEBUG log level.

Can you look at the logs and maybe grep by the index names to see what Elasticsearch reported?
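If grep is not at hand, the same filtering can be sketched in a few lines of Python. This is only an illustration: the function name `matching_lines`, the index name `my-index-name`, and the sample lines are placeholders, not real Elasticsearch output.

```python
from typing import Iterable, Iterator


def matching_lines(lines: Iterable[str], index_name: str) -> Iterator[str]:
    """Yield the lines that mention index_name (a plain substring match)."""
    for line in lines:
        if index_name in line:
            yield line.rstrip("\n")


# Placeholder lines standing in for a log file; they are NOT real
# Elasticsearch log output, only text to demonstrate the filter.
sample_lines = [
    "placeholder line mentioning my-index-name\n",
    "placeholder line about something else\n",
    "another placeholder line mentioning my-index-name\n",
]

hits = list(matching_lines(sample_lines, "my-index-name"))
for hit in hits:
    print(hit)
```

In practice you would pass an open file object instead of `sample_lines`; the location of the log file depends on how Elasticsearch was installed.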

After investigating, it seems that when a node reaches the 90% watermark, Elasticsearch tries to free space by moving some indices to another node of the same tier (hot/warm/cold), and it stops writing/indexing new entries.
But in my case, I only have 1 hot node and 1 warm node, without replicas.

I managed to delete some old indices to free up space, and Elasticsearch started rebalancing the indices.
As a longer-term solution, I guess I have to increase the size of my data disks.

Is it possible to change those watermarks, to 95% for example? With the current one I lose nearly 100 GB of disk space on the hot node.

You can indeed update watermarks following the docs: Fix watermark errors | Elasticsearch Guide [8.15] | Elastic

This should be the setting you're looking for: Cluster-level shard allocation and routing settings | Elasticsearch Guide [8.15] | Elastic

cluster.routing.allocation.disk.watermark.high
(Dynamic) Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can alternatively be set to a ratio value, e.g., 0.9. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}

However, it's still highly recommended to increase the number of nodes or the storage size. At some point you'll reach this watermark again, and there will be no way out of it.
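To put rough numbers on that trade-off, here is a small Python sketch using the hot-node figures reported earlier in the thread (90% used, about 100 GB free). The helper names are made up for illustration, and the total disk size is only an estimate derived from those two figures.

```python
def total_capacity_gb(free_gb: float, used_pct: float) -> float:
    """Estimate total disk size from the free space and the percentage used."""
    return free_gb * 100 / (100 - used_pct)


def reserved_gb(total_gb: float, high_watermark_pct: float) -> float:
    """Space left above a percentage-based high watermark."""
    return total_gb * (100 - high_watermark_pct) / 100


# Hot node as reported in the thread: 90% used, about 100 GB free.
hot_total = total_capacity_gb(free_gb=100, used_pct=90)
reserved_at_90 = reserved_gb(hot_total, 90)
reserved_at_95 = reserved_gb(hot_total, 95)
gain = reserved_at_90 - reserved_at_95

print(f"estimated hot node size: {hot_total:.0f} GB")
print(f"kept free at a 90% high watermark: {reserved_at_90:.0f} GB")
print(f"kept free at a 95% high watermark: {reserved_at_95:.0f} GB")
print(f"extra usable space: {gain:.0f} GB")
```

Under these assumptions, moving the high watermark from 90% to 95% frees up roughly half of the reserved space, not all of it. Keep in mind that the flood-stage watermark (cluster.routing.allocation.disk.watermark.flood_stage) also defaults to 95%, so a high watermark of 95% leaves no margin between the two.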
