How to fix Index Lifecycle Rollover Alias is empty or not defined

How did that index get created?

Did you change your Logstash output to what I showed you?

Looks like you're still not using the write alias in Logstash.

Looks like you're still writing daily indices.

So your output index should be using the write alias:

index => "os-linux"

I changed the output to be index => "os-linux"

However the new index (os-linux) seems to not be associated with any index lifecycle policy. Here is the top portion of the settings where the policy should be:

{
  "settings": {
    "index": {
      "routing": {
        "allocation": {
          "include": {
            "_tier_preference": "data_content"
          }
        }
      },
      "number_of_shards": "1",
      "provided_name": "os-linux",
      "creation_date": "1706226036921",
      "number_of_replicas": "1",
      "uuid": "shQ4er9QS-iWpnNjFDXDWw",
      "version": {
        "created": "7170999"
      }
    }
  },

Hi @roman-tasi

Feel like we are going in circles a bit...

Create the template.

Create the bootstrap index.

Then set Logstash to point to os-linux. That should not create a new index.

I am not sure what you are doing but it does not seem you are following the steps in order...

If the bootstrap index with the write alias has been created, then Logstash will not create a new index named os-linux... it cannot, since the alias exists... it will write to the bootstrap index, which matches the template, which will have the ILM policy...
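For reference, the first two steps might look roughly like the requests below. This is only a sketch: the template name os-linux-template is made up for illustration, and a composable index template is assumed (a legacy template would work along the same lines). The alias os-linux and the policy os-linux-policy are the names used in this thread. The bootstrap index name uses URL-encoded date math, so it resolves to something like os-linux-2024.01.26-000001.

```console
# Step 1: template that attaches the ILM policy and the rollover alias
# to every index matching os-linux-*.
# "os-linux-template" is a hypothetical name.
PUT _index_template/os-linux-template
{
  "index_patterns": ["os-linux-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "os-linux-policy",
      "index.lifecycle.rollover_alias": "os-linux"
    }
  }
}

# Step 2: bootstrap index, <os-linux-{now/d}-000001> URL-encoded,
# created with os-linux as its write alias.
PUT %3Cos-linux-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "os-linux": {
      "is_write_index": true
    }
  }
}
```

With both in place, index => "os-linux" in the Logstash output writes through the alias into the bootstrap index instead of creating a plain index called os-linux.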

I suggested you read the documentation, which also lays out these steps.

After restarting and following these steps, I now have an index called os-linux-2024.01.26-000001 associated with os-linux-policy. Hopefully it doesn't give a rollover alias error (I'll keep an eye on it).


@stephenb

os-linux-2024.01.26-000001 still exists three days later and no new os-linux index has been created. All os-linux output logs are going into os-linux-2024.01.26-000001. The current phase is hot and the current action is rollover. Keep in mind my os-linux-policy goes from the hot phase to the cold phase after 20 hours, then to the delete phase after 2 days.

Run

GET /os-linux-2024.01.26-000001/_ilm/explain

And

GET _ilm/policy/os-linux-policy

Show the complete results of both

And do you have a node with the role data_cold?

@stephenb -

GET /os-linux-2024.01.26-000001/_ilm/explain :

{
  "indices" : {
    "os-linux-2024.01.26-000001" : {
      "index" : "os-linux-2024.01.26-000001",
      "managed" : true,
      "policy" : "os-linux-policy",
      "lifecycle_date_millis" : 1706303495029,
      "age" : "3.15d",
      "phase" : "hot",
      "phase_time_millis" : 1706303495228,
      "action" : "rollover",
      "action_time_millis" : 1706303495837,
      "step" : "check-rollover-ready",
      "step_time_millis" : 1706303495837,
      "phase_execution" : {
        "policy" : "os-linux-policy",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            },
            "rollover" : {
              "max_primary_shard_size" : "50gb",
              "max_age" : "30d"
            }
          }
        },
        "version" : 1,
        "modified_date_in_millis" : 1705694441892
      }
    }
  }
}

GET _ilm/policy/os-linux-policy :

{
  "os-linux-policy" : {
    "version" : 1,
    "modified_date" : "2024-01-19T20:00:41.892Z",
    "policy" : {
      "phases" : {
        "cold" : {
          "min_age" : "20h",
          "actions" : {
            "set_priority" : {
              "priority" : 0
            }
          }
        },
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            },
            "rollover" : {
              "max_primary_shard_size" : "50gb",
              "max_age" : "30d"
            }
          }
        },
        "delete" : {
          "min_age" : "2d",
          "actions" : {
            "delete" : {
              "delete_searchable_snapshot" : true
            }
          }
        }
      }
    },
    "in_use_by" : {
      "indices" : [
        "os-linux-2024.01.26-000001"
      ],
      "data_streams" : [ ],
      "composable_templates" : [ ]
    }
  }
}

Can you explain how to find this?
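One way to check for a cold node, as a sketch: list the nodes with their role abbreviations. The column names below come from the cat nodes API. In the node.role column, c should indicate a cold data node, and d a node with the generic data role, which as far as I know can hold data for any tier.

```console
# Show each node's name and abbreviated roles (c = cold, h = hot, w = warm, d = data)
GET _cat/nodes?v=true&h=name,node.role
```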

Well, first, your max age for rollover is 30d, and currently the index is only 3+ days old and its primary shard is not 50 GB.

So the index will not roll over until it reaches 50 GB per primary shard or 30 days.

Then, really important: the next phases are calculated from rollover.

So right now your policy says cold 20h after rollover.

Then delete 2 days after rollover.

And your index has not even rolled over yet... because its shard is not 50 GB and it is not 30 days old.

So everything is working as expected, but your policy is not right... You need to set the rollover properly...

Read this here

Why don't you just set rollover to 1 day max, then delete at 3 days... Not sure what cold is doing for you...

Update the policy
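A sketch of what that updated policy could look like, assuming the cold phase is dropped as suggested above. The 1d and 3d values are the ones proposed in this thread, and the 50gb size condition is kept from the original policy. Note that the delete phase's min_age is counted from rollover, so each index would be removed about 3 days after it rolls over.

```console
# Replaces the existing os-linux-policy: roll over daily (or at 50 GB per
# primary shard), then delete 3 days after rollover. No cold phase.
PUT _ilm/policy/os-linux-policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          },
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "3d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}
```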

Force rollover and it should start working

You can force rollover with

POST os-linux/_rollover

@stephenb

Running GET /os-linux-2024.01.30-000002/_ilm/explain outputs:

{
  "indices" : {
    "os-linux-2024.01.30-000002" : {
      "index" : "os-linux-2024.01.30-000002",
      "managed" : true,
      "policy" : "os-linux-policy",
      "lifecycle_date_millis" : 1706729701942,
      "age" : "1.99d",
      "phase" : "cold",
      "phase_time_millis" : 1706802301657,
      "action" : "migrate",
      "action_time_millis" : 1706802301858,
      "step" : "check-migration",
      "step_time_millis" : 1706802302276,
      "step_info" : {
        "message" : "Waiting for all shard copies to be active",
        "shards_left_to_allocate" : -1,
        "all_shards_active" : false,
        "number_of_replicas" : 1
      },
      "phase_execution" : {
        "policy" : "os-linux-policy",
        "phase_definition" : {
          "min_age" : "20h",
          "actions" : {
            "set_priority" : {
              "priority" : 0
            }
          }
        },
        "version" : 3,
        "modified_date_in_millis" : 1706642876728
      }
    }
  }
}

Just wondering why it is "Waiting for all shard copies to be active"

No clue...

Could be lots of reasons... You should open a separate topic.

Reasons could be...
You have no appropriate cold node.
You do not have enough disk space.
Something else...

GET _cat/shards/os-linux-2024.01.30-000002

Then use that info to run this for whichever shard is not allocated (adjust the shard number and primary flag to match):

GET _cluster/allocation/explain
{
  "index": "os-linux-2024.01.30-000002",
  "shard": 0,
  "primary": false
}

Open a new thread with that info... not going to debug in this thread..

It managed to eventually delete itself later that day so I'll mark this thread as solved. Thanks for all the help!
