Rollover Failure - Unable to auto set lifecycle.rollover_alias after rollover - Moving to ERROR step

Getting a "Moving to ERROR step" error when rolling over from an alias once the "max_age" parameter is met.

Steps to reproduce

  1. Create an Index Template
{
      "name": "temp-test_idx_template",
      "index_template": {
        "index_patterns": [
          "temp-test*"
        ],
        "template": {
          "settings": {
            "index": {
              "lifecycle": {
                "name": "temp-test_policy"
              },
              "analysis": {
                "analyzer": {
                  "domain_name_analyzer": {
                    "filter": "lowercase",
                    "type": "custom",
                    "tokenizer": "domain_name_tokenizer"
                  }
                },
                "tokenizer": {
                  "domain_name_tokenizer": {
                    "type": "char_group",
                    "tokenize_on_chars": [
                      ".",
                      "$"
                    ]
                  }
                }
              },
              "number_of_shards": "1",
              "number_of_replicas": "1"
            }
          },
          "mappings": {
            "properties": {
              "field_1": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              },
              "field_2": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              }
            }
          }
        },
        "composed_of": []
      }
    }
  2. Create the ILM policy temp-test_policy
{
  "temp-test_policy": {
    "version": 12,
    "modified_date": "2023-06-16T12:37:31.561Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "set_priority": {
              "priority": 26
            },
            "rollover": {
              "max_primary_shard_size": "50gb",
              "max_age": "1m"
            }
          }
        },
        "delete": {
          "min_age": "10h",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
    "in_use_by": {
      "indices": [
        "temp-test-000001",
        "temp-test-000002"
      ],
      "data_streams": [],
      "composable_templates": [
        "temp-test_idx_template"
      ]
    }
  }
}
  3. Create the initial index with its write alias
PUT temp-test-000001
{
    "settings": {
      "index.lifecycle.name": "temp-test_policy",
      "index.lifecycle.rollover_alias": "temp-test"
    },
    "aliases": {
      "temp-test":{
        "is_write_index": true
      }
    }
  }
  4. Ingest some sample data
  5. After ingestion, the ILM "max_age" condition was met and a new alias "temp-test-000002" got created, but I started receiving errors in the logs:
{"@timestamp":"2023-06-16T14:01:52.471Z", "log.level":"ERROR", "message":"policy [temp-test_policy] for index [temp-test-000002] failed on step [{\"phase\":\"hot\",\"action\":\"rollover\",\"name\":\"check-rollover-ready\"}]. Moving to ERROR step", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-2][trigger_engine_scheduler][T#1]","log.logger":"org.elasticsearch.xpack.ilm.IndexLifecycleRunner","elasticsearch.cluster.uuid":"ibThozLaQAmiKpvog933SA","elasticsearch.node.id":"rw4LuAM8S-2uuETb14g0hw","elasticsearch.node.name":"elasticsearch-master-2","elasticsearch.cluster.name":"elasticsearch","error.type":"java.lang.IllegalArgumentException","error.message":"setting [index.lifecycle.rollover_alias] for index [temp-test-000002] is empty or not defined","error.stack_trace":"java.lang.IllegalArgumentException: setting [index.lifecycle.rollover_alias] for index [temp-test-000002] is empty or not defined\n\tat org.elasticsearch.xcore@8.7.0/org.elasticsearch.xpack.core.ilm.WaitForRolloverReadyStep.evaluateCondition(WaitForRolloverReadyStep.java:107)\n\tat org.elasticsearch.ilm@8.7.0/org.elasticsearch.xpack.ilm.IndexLifecycleRunner.runPeriodicStep(IndexLifecycleRunner.java:233)\n\tat org.elasticsearch.ilm@8.7.0/org.elasticsearch.xpack.ilm.IndexLifecycleService.triggerPolicies(IndexLifecycleService.java:427)\n\tat org.elasticsearch.ilm@8.7.0/org.elasticsearch.xpack.ilm.IndexLifecycleService.triggered(IndexLifecycleService.java:355)\n\tat org.elasticsearch.xcore@8.7.0/org.elasticsearch.xpack.core.scheduler.SchedulerEngine.notifyListeners(SchedulerEngine.java:185)\n\tat org.elasticsearch.xcore@8.7.0/org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.run(SchedulerEngine.java:219)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577)\n\tat java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)\n\tat 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1589)\n"}
{"@timestamp":"2023-06-16T14:11:52.470Z", "log.level": "INFO", "message":"policy [temp-test_policy] for index [temp-test-000002] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [32]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch-master-2][trigger_engine_scheduler][T#1]","log.logger":"org.elasticsearch.xpack.ilm.IndexLifecycleRunner","elasticsearch.cluster.uuid":"ibThozLaQAmiKpvog933SA","elasticsearch.node.id":"rw4LuAM8S-2uuETb14g0hw","elasticsearch.node.name":"elasticsearch-master-2","elasticsearch.cluster.name":"elasticsearch"}

  6. Now verify the settings of temp-test-000001 and temp-test-000002. After verifying, I see that "rollover_alias": "temp-test" is missing for temp-test-000002.
GET temp-test-000001/_settings

{
  "temp-test-000001": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "temp-test_policy",
          "rollover_alias": "temp-test",
          "indexing_complete": "true"
        },
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "temp-test-000001",
        "creation_date": "1686919464827",
        "analysis": {
          "analyzer": {
            "domain_name_analyzer": {
              "filter": "lowercase",
              "type": "custom",
              "tokenizer": "domain_name_tokenizer"
            }
          },
          "tokenizer": {
            "domain_name_tokenizer": {
              "type": "char_group",
              "tokenize_on_chars": [
                ".",
                "$"
              ]
            }
          }
        },
        "priority": "26",
        "number_of_replicas": "1",
        "uuid": "t_AJ7kEiSG6dzBxR-ckzPQ",
        "version": {
          "created": "8070099"
        }
      }
    }
  }
}

GET temp-test-000002/_settings
{
  "temp-test-000002": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "temp-test_policy"
        },
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "temp-test-000002",
        "creation_date": "1686922884639",
        "analysis": {
          "analyzer": {
            "domain_name_analyzer": {
              "filter": "lowercase",
              "type": "custom",
              "tokenizer": "domain_name_tokenizer"
            }
          },
          "tokenizer": {
            "domain_name_tokenizer": {
              "type": "char_group",
              "tokenize_on_chars": [
                ".",
                "$"
              ]
            }
          }
        },
        "priority": "26",
        "number_of_replicas": "1",
        "uuid": "oN7F7GeDSLaxUNfGTizirg",
        "version": {
          "created": "8070099"
        }
      }
    }
  }
}

Right now, I end up creating duplicate index templates.
Please help me out with your valuable inputs. Thank you.

First, testing with a max_age of 1m: 1 minute is probably not going to yield valid results. ILM runs as a background process that checks rollover conditions on a polling interval (indices.lifecycle.poll_interval, 10 minutes by default), and it is not intended to run on such short time frames, so that may or may not have something to do with this. You may be encountering a race condition of some sort; for example, you might be trying to roll over 000002 again before it is ready.

Clean up, set it to 10 minutes or so, and wait to see if that works.

How exactly are you ingesting the data? What are you using? I assume you are writing to the write alias?

To be precise, a new index temp-test-000002 was created, not a new alias. A new alias was not created, and that is the issue: there should be a new write alias set to 000002, and the old alias should no longer be a write alias for 000001.
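The alias state described above can be checked mechanically. The following is a minimal Python sketch, assuming you already have the parsed JSON body of GET _alias as a dict; the helper name write_indices and the sample data are hypothetical and only illustrate the state expected after a successful rollover.

```python
from typing import Dict, List


def write_indices(alias_response: Dict[str, dict]) -> Dict[str, List[str]]:
    """Map each alias to the indices flagged as its write index."""
    result: Dict[str, List[str]] = {}
    for index_name, body in alias_response.items():
        for alias, props in body.get("aliases", {}).items():
            # Register the alias even if no index is its write index.
            result.setdefault(alias, [])
            if props.get("is_write_index") is True:
                result[alias].append(index_name)
    return result


# Illustrative (not captured) response for the expected post-rollover state.
expected_after_rollover = {
    "temp-test-000001": {"aliases": {"temp-test": {"is_write_index": False}}},
    "temp-test-000002": {"aliases": {"temp-test": {"is_write_index": True}}},
}

print(write_indices(expected_after_rollover))
```

An alias that maps to an empty list, or to more than one index, would be worth investigating before looking at ILM itself.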

There is another way to test this...

Get your test up and running and then simply force a rollover

POST my-write-alias/_rollover

You can also use the _ilm/explain endpoint to see what is going on

GET temp-test-000001/_ilm/explain
GET temp-test-000002/_ilm/explain
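If you have many indices, scanning the explain output by hand gets tedious. Here is a rough Python sketch that picks out indices that are in the ERROR step or are being retried. It assumes the parsed JSON body of an _ilm/explain call; the helper name stuck_indices and the trimmed sample response are hypothetical.

```python
from typing import Dict, List


def stuck_indices(explain_response: Dict[str, dict]) -> List[str]:
    """List indices that are in the ERROR step or have retried a failed step."""
    stuck = []
    for name, info in explain_response.get("indices", {}).items():
        in_error = info.get("step") == "ERROR"
        retrying = info.get("failed_step_retry_count", 0) > 0
        if in_error or retrying:
            # Prefer the failed step name when ILM reports one.
            step = info.get("failed_step", info.get("step"))
            stuck.append(f"{name}: {info.get('action')}/{step}")
    return stuck


# Trimmed, illustrative response; real responses carry many more fields.
sample = {
    "indices": {
        "idx-000001": {"action": "complete", "step": "complete"},
        "idx-000002": {
            "action": "rollover",
            "step": "check-rollover-ready",
            "failed_step_retry_count": 4,
        },
    }
}

print(stuck_indices(sample))
```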

Thank you @stephenb for responding.
I have increased the max_age to 15m. It still didn't work.

"error.message":"setting [index.lifecycle.rollover_alias] for index [temp-test-000002] is empty or not defined"

GET temp-test-000001/_ilm/explain

{
  "indices": {
    "temp-test-000001": {
      "index": "temp-test-000001",
      "managed": true,
      "policy": "temp-test_policy",
      "index_creation_date_millis": 1687227995363,
      "time_since_index_creation": "34.31m",
      "lifecycle_date_millis": 1687229376263,
      "age": "11.3m",
      "phase": "hot",
      "phase_time_millis": 1687227995415,
      "action": "complete",
      "action_time_millis": 1687229376463,
      "step": "complete",
      "step_time_millis": 1687229376463,
      "phase_execution": {
        "policy": "temp-test_policy",
        "phase_definition": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_primary_shard_size": "50gb",
              "max_age": "15m"
            },
            "set_priority": {
              "priority": 26
            }
          }
        },
        "version": 13,
        "modified_date_in_millis": 1687228106981
      }
    }
  }
}

GET temp-test-000002/_ilm/explain

{
  "indices": {
    "temp-test-000002": {
      "index": "temp-test-000002",
      "managed": true,
      "policy": "temp-test_policy",
      "index_creation_date_millis": 1687229376421,
      "time_since_index_creation": "12.82m",
      "lifecycle_date_millis": 1687229376421,
      "age": "12.82m",
      "phase": "hot",
      "phase_time_millis": 1687229456292,
      "action": "rollover",
      "action_time_millis": 1687229376463,
      "step": "check-rollover-ready",
      "step_time_millis": 1687229456292,
      "is_auto_retryable_error": true,
      "failed_step_retry_count": 4,
      "phase_execution": {
        "policy": "temp-test_policy",
        "phase_definition": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_primary_shard_size": "50gb",
              "max_age": "15m"
            },
            "set_priority": {
              "priority": 26
            }
          }
        },
        "version": 13,
        "modified_date_in_millis": 1687228106981
      }
    }
  }
}

Also, I checked

GET _alias
{
    "temp-test-000001": {
        "aliases": {
            "temp-test": {
                "is_write_index": false
            }
        }
    },
    "temp-test-000002": {
        "aliases": {
            "temp-test": {
                "is_write_index": true
            }
        }
    }
}

Actually, I didn't set rollover_alias at the index template level. I provided it directly when creating the alias for the very first time.
Also, I don't want to provide it at the index template level, because daily aliases need to be created automatically.
Please let me know if I missed anything here.

Apologies, I'm not sure I understand.

We can see from the second ILM explain that the 000002 index is failing to roll over.

It's not exactly clear to me at this time what's keeping it from rolling over.

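In your template, you need to set rollover_alias, just as the error message says. That is how it gets set on the new index: rollover creates the new index from the matching template and does not copy settings from the previous index.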
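As a rough illustration of that point, here is a Python sketch that builds a request body for PUT _index_template/temp-test_idx_template with the setting included. The helper name template_with_rollover_alias is hypothetical, and the analysis and mappings sections from the original template are left out for brevity.

```python
import json


def template_with_rollover_alias(pattern: str, policy: str, alias: str) -> dict:
    """Build an index template body that carries the ILM rollover alias."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "index.lifecycle.name": policy,
                # Without this, indices created by rollover have no alias set.
                "index.lifecycle.rollover_alias": alias,
            }
        },
    }


body = template_with_rollover_alias("temp-test*", "temp-test_policy", "temp-test")
print(json.dumps(body, indent=2))
```

Because the alias name is fixed in the template, every index matching the pattern gets the same rollover_alias.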

You are right @stephenb,
If I provide "rollover_alias":"temp-test" at index template level, everything works fine.
But my index aliases should be dated ones, like:
temptest-2023-06-20-000001 and rollover should happen to temptest-2023-06-20-000002
temptest-2023-06-21-000001 and rollover should happen to temptest-2023-06-21-000002
temptest-2023-06-22-000001 and rollover should happen to temptest-2023-06-22-000002

And I have to reuse the same index template for all my different aliases, so I provided rollover_alias when creating the index alias:

PUT temp-test-2023-06-18-000001
{
    "settings": {
      "index.lifecycle.name": "temp-test_policy",
      "index.lifecycle.rollover_alias": "temp-test"
    },
    "aliases": {
      "temp-test":{
        "is_write_index": true
      }
    }
  }

Updated the bootstrap

The correct way is (note the dots, not dashes, in the date):

PUT temp-test-2023.06.18-000001
{
    "settings": {
      "index.lifecycle.name": "temp-test_policy",
      "index.lifecycle.rollover_alias": "temp-test"
    },
    "aliases": {
      "temp-test":{
        "is_write_index": true
      }
    }
  }

I am confused. Do not try to combine daily indices with rollover; that will not work, as far as I know.

It should look like this, and this will work automatically as long as you set max_age to 1d:

temptest-2023.06.20-000001 and rollover should happen to temptest-2023.06.20-000002
temptest-2023.06.21-000003 and rollover should happen to temptest-2023.06.21-000004
temptest-2023.06.22-000005 and rollover should happen to temptest-2023.06.22-000006
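One way to get names of this shape is to bootstrap the first index with a date math name, which Elasticsearch resolves at creation and at each rollover (the default date format is yyyy.MM.dd, hence the dots). This is a sketch under that assumption, since the thread does not show a date math request. The date math name has to be URL-encoded in the request path; both helper names below are hypothetical.

```python
from datetime import date
from urllib.parse import quote


def bootstrap_request_path(prefix: str) -> str:
    """URL-encode a date math index name such as <temptest-{now/d}-000001>."""
    date_math_name = f"<{prefix}-{{now/d}}-000001>"
    # safe="" makes sure the "/" inside {now/d} is encoded as well.
    return "/" + quote(date_math_name, safe="")


def resolved_name(prefix: str, day: date, generation: int) -> str:
    """Shape of the index name once the date math is resolved for `day`."""
    return f"{prefix}-{day:%Y.%m.%d}-{generation:06d}"


print("PUT", bootstrap_request_path("temptest"))
print(resolved_name("temptest", date(2023, 6, 20), 1))
```

The counter keeps increasing across days, which matches the 000001 to 000006 sequence listed above.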


I got your point @stephenb
Even when I provide it as below, it didn't work.

PUT temp-test-2023.06.18-000001
{
    "settings": {
      "index.lifecycle.name": "temp-test_policy",
      "index.lifecycle.rollover_alias": "temp-test"
    },
    "aliases": {
      "temp-test":{
        "is_write_index": true
      }
    }
  }

Also, our requirement is that we have to create daily aliases (temp-test-2023-06-22, temp-test-2023-06-23, temp-test-2023-06-24, etc.).

And the underlying indices would be like:
temp-test-2023.06.22-000001, temp-test-2023.06.22-000002, temp-test-2023.06.22-000003
temp-test-2023.06.23-000001, temp-test-2023.06.23-000002, temp-test-2023.06.23-000003
temp-test-2023.06.24-000001, temp-test-2023.06.24-000002, temp-test-2023.06.24-000003
etc.

I think there is a glitch during rollover: the rollover itself happens fine, but it fails to assign the rollover_alias to the newly created underlying index.
The rollover from temp-test-2023.06.22-000001 to temp-test-2023.06.22-000002 happens, but rollover_alias is not assigned to temp-test-2023.06.22-000002.
