Index Rollover fails with index.lifecycle.rollover_alias does not point to index

I am using ELK version 7.5 and have set up an ILM policy for a fluent-bit index.
Fluent Bit is deployed via a Helm chart with the following config:

backend:
  type: es
  es:
    type: _doc
    logstash_format: "Off"
    logstash_prefix: ~
    index: fluent-bit-write

The ILM policy is as follows:

PUT _ilm/policy/fluent-bit
{
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": "24h",
                            "max_size": "250gb"
                        }
                    }
                },
                "warm": {
                    "actions": {
                        "allocate": {
                            "require": {
                                "box_type": "warm"
                            }
                        },
                        "forcemerge": {
                            "max_num_segments": "1"
                        },
                        "shrink": {
                            "number_of_shards": "1"
                        }
                    }
                },
                "delete": {
                    "min_age": "14d",
                    "actions": {
                        "delete": {}
                    }
                }
            }
        }
}

The fluent-bit template is as follows:

PUT _template/fluent-bit
{
        "index_patterns": ["fluent-bit-*"],
        "settings": {
            "index": {
                "lifecycle.name": "fluent-bit",
                "lifecycle.rollover_alias": "fluent-bit-write",
                "number_of_shards": 4
            }
        }
}

The first index was created as follows (the URL-encoded name decodes to the date-math expression <fluent-bit-{now/d}-1>):

PUT %3Cfluent-bit-%7Bnow%2Fd%7D-1%3E
{
        "aliases": {
            "fluent-bit-write": {}
        }
}

The first couple of indices went through the ILM cycle just fine. Recently, however, I have noticed that when the rollover condition is met, a new index is created, the new index's rollover_alias is set to fluent-bit-write, and the previous index's rollover_alias is set to none. The previous index then fails on its rollover attempt with "index.lifecycle.rollover_alias does not point to index".

The error makes sense, given that the write alias no longer points at the previous index. What I don't understand is why whatever rollover step runs under the hood is happening after the rollover_alias has already been switched, and I'm not sure what else needs to be done to address this. I have the exact same setup on a smaller cluster with far less shard activity, so I am wondering whether the general load on this cluster is causing some steps to stall.
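
The state described above can be checked with requests along these lines. This is only a sketch, assuming the alias and index pattern from the template above; the index name in the last request is a hypothetical placeholder, not one of the actual indices.

# Which index (or indices) the write alias currently points at
GET _alias/fluent-bit-write

# Current ILM phase, action, and step for each index, plus any step error
GET fluent-bit-*/_ilm/explain

# The lifecycle settings actually applied to each index
GET fluent-bit-*/_settings/index.lifecycle*

# Retry the failed step on an index stuck in the ERROR step
# (placeholder index name; substitute the index that reported the error)
POST fluent-bit-2020.01.01-000002/_ilm/retry

Comparing the alias output with the index.lifecycle.rollover_alias setting on each index should show whether the alias moved before or after the failing step ran. Retrying only helps once the alias and the setting agree again.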
