ILM index rollover error

Hi everyone,
Hope you will be able to help me once again.
I have an Elasticsearch cluster with an ILM policy to manage index rollover.
The rollover is based on the index size.
Everything was running well: the coordinator node load-balances to another master-eligible node
when the index size limit is reached OR when the disk is full.
But I'm facing a "strange" behavior that I don't understand.
When the index size limit is reached AND the disk is full at the same time, the following error is logged:

policy [ilm-traffics-logs-policy] for index
[idx-aggregated-logs-000001] failed on step
[{"phase":"hot","action":"rollover",
"name":"check-rollover-ready"}]. Moving to ERROR step
java.lang.IllegalArgumentException: setting [index.lifecycle.rollover_alias]
for index [idx-aggregated-logs-000001] is empty or not defined

Documents are no longer stored in Elasticsearch, because the index was set to read-only mode.

[b9891m.prv] flood stage disk watermark [95%] exceeded on
[seMsQpW-QrylKpA0AUZvZA][b9891m.prv]
[/appli/elasticsearch/data_elasticsearch/nodes/0] free: 1mb[0%],
all indices on this node will be marked read-only

Does someone understand this and know how to fix it?
Thanks for your help!

BR,
KS

Elasticsearch has built-in protection against filling up the disk, as this could cause corruption and data loss. If you get too close to the limit, indices will be made read-only. You therefore probably need to adjust your parameters so you roll over and move data off the node before this level is reached.
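
For reference, something like the following shows how to check per-node disk usage and the configured watermarks, and how to clear the read-only block once space has been freed. This is only a sketch using the standard cluster and index settings APIs; the index pattern is an example and should match your own indices.

# Show disk usage per node (disk.used / disk.avail / disk.percent)
GET _cat/allocation?v

# Show cluster settings, including the defaults for
# cluster.routing.allocation.disk.watermark.low / .high / .flood_stage (flood stage defaults to 95%)
GET _cluster/settings?include_defaults=true

# Once disk space has been freed, remove the read-only block that the flood-stage watermark applied
# (on recent versions the block is also released automatically once usage drops below the high watermark)
PUT idx-aggregated-logs-*/_settings
{
  "index.blocks.read_only_allow_delete": null
}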

Sorry for my question, but which parameters need to be adjusted to avoid my error?

Your settings for rollover in the ILM policy.
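
You can check the current definition with, for example, the ILM API (the policy name below is taken from the error message you posted):

# Retrieve the policy that manages idx-aggregated-logs-000001
GET _ilm/policy/ilm-traffics-logs-policy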

Here are the details of my ILM policy:

{
  "ilm-traffics-logs-policy": {
    "version": 1,
    "modified_date": "2022-10-05T13:48:01.632Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_size": "10mb",
              "max_age": "365d"
            }
          }
        },
        "delete": {
          "min_age": "3d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
    "in_use_by": {
      "indices": [
        "idx-aggregated-logs-000001"
      ],
      "data_streams": [],
      "composable_templates": [
        "ilm-traffics-logs-template"
      ]
    }
  }
}

I don't see which ones to adjust to avoid the behavior I currently have... :cry:
If I increase the index max_size value, the problem will only occur later, but it will still occur.

10MB is very small. This should typically be at least a few GB. 365 days is also a very long period. You typically set this as a fraction of your total retention period. If you want to keep data for only 3 days, you should set this to 1 day. If you want to keep data in the cluster for 365 days, a value of 30 days may be more appropriate.
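
As a sketch only (the 50gb and 1d values below are hypothetical examples; the right max_size depends on your shard count, disk size and ingest rate, and the delete phase is copied from the policy you posted), something along these lines would make the index roll over well before the disk fills up:

# Hypothetical adjusted values: roll over at 50gb or after 1 day, keep the existing 3-day delete phase
PUT _ilm/policy/ilm-traffics-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "3d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}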

How much disk space does the node have? How much data are you ingesting per day?

Hello Christian,
Thank you for taking the time to explain this to me. I appreciate it :slight_smile: !
The settings I posted are for my sandbox environment.
I set the max_size to 10mb just to reproduce the bug faster.
In reality, the index max_size is 50GB and the disk capacity is 400GB.
The cluster ingests about 3,000,000 docs per day.
