Index Lifecycle Policy Not Deleting Indices

I have an index lifecycle policy that is supposed to delete indices after 30 days. To achieve this, the policy has two phases:

  1. Hot Phase (with 'Delete data after this phase' selected)

  2. Delete Phase (indices moved here 30 days after creation)

The Elasticsearch request displayed in the UI to create this policy is as follows:

PUT _ilm/policy/bounce-logs-rollover-30-days
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "wait_for_snapshot": {
            "policy": "bounce-daily-snapshots"
          },
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

As far as I can tell everything is set up correctly. The snapshot policy which is being waited upon is also executing successfully. However, I still find indices older than 30 days and have to delete them manually.

This index policy is attached to 5 index templates, which are linked to ~400 indices. You can see an example of the index template attachment below:

Note: I don't want to perform any rollover to any other phases, just a straightforward deletion of indices after 30 days.

Anyone have any idea?

Can you post the explain output for one of the indices? You can get it with:

GET /<index-name>/_ilm/explain?human
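With ~400 indices it may be easier to request the explain output for a whole index pattern and filter the response in a script. Below is a minimal Python sketch, assuming the response JSON has already been loaded into a dict. The function name `overdue_indices` and the `max_age_days` parameter are made up for illustration and are not part of any Elasticsearch client.

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000

def overdue_indices(explain, now_millis, max_age_days=30):
    """Return managed indices older than max_age_days that are not in the delete phase.

    `explain` is the parsed body of an _ilm/explain response.
    `now_millis` is the current time in epoch milliseconds.
    """
    overdue = []
    for name, info in explain.get("indices", {}).items():
        if not info.get("managed"):
            continue  # unmanaged indices have no lifecycle to check
        origin = info.get("lifecycle_date_millis")
        if origin is None:
            continue  # no lifecycle date reported for this index
        age_days = (now_millis - origin) / MILLIS_PER_DAY
        if age_days > max_age_days and info.get("phase") != "delete":
            overdue.append(name)
    return sorted(overdue)
```

Any index this returns is a candidate for a closer look at its individual explain output.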

Hi Lee,

Please find below the explain output for the index prod-api-logs-2022.01.01 (it is more than 30 days old and should have been deleted by the policy):

{
  "indices" : {
    "prod-api-logs-2022.01.01" : {
      "index" : "prod-api-logs-2022.01.01",
      "managed" : true,
      "policy" : "bounce-logs-rollover-30-days",
      "lifecycle_date" : "2022-01-01T00:00:00.365Z",
      "lifecycle_date_millis" : 1640995200365,
      "age" : "31.66d",
      "phase" : "hot",
      "phase_time" : "2022-01-11T12:34:03.322Z",
      "phase_time_millis" : 1641904443322,
      "action" : "rollover",
      "action_time" : "2022-01-01T00:00:03.168Z",
      "action_time_millis" : 1640995203168,
      "step" : "check-rollover-ready",
      "step_time" : "2022-01-11T12:34:03.322Z",
      "step_time_millis" : 1641904443322,
      "is_auto_retryable_error" : true,
      "failed_step_retry_count" : 758,
      "phase_execution" : {
        "policy" : "bounce-logs-rollover-30-days",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "version" : 6,
        "modified_date" : "2022-01-11T12:26:41.119Z",
        "modified_date_in_millis" : 1641904001119
      }
    }
  }
}

For a newish index (prod-api-logs-2022.01.28) please see below:

{
  "indices" : {
    "prod-api-logs-2022.01.28" : {
      "index" : "prod-api-logs-2022.01.28",
      "managed" : true,
      "policy" : "bounce-logs-rollover-30-days",
      "lifecycle_date" : "2022-01-28T00:00:06.073Z",
      "lifecycle_date_millis" : 1643328006073,
      "age" : "4.67d",
      "phase" : "hot",
      "phase_time" : "2022-01-28T00:00:07.206Z",
      "phase_time_millis" : 1643328007206,
      "action" : "complete",
      "action_time" : "2022-01-28T00:00:08.207Z",
      "action_time_millis" : 1643328008207,
      "step" : "complete",
      "step_time" : "2022-01-28T00:00:08.207Z",
      "step_time_millis" : 1643328008207,
      "phase_execution" : {
        "policy" : "bounce-logs-rollover-30-days",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "version" : 6,
        "modified_date" : "2022-01-11T12:26:41.119Z",
        "modified_date_in_millis" : 1641904001119
      }
    }
  }
}

Okay, this sounds like you hit this bug: "ILM: Retrying a failed step refreshes the cached ILM phase" (elastic/elasticsearch issue #81921), which was fixed in 7.17.0+ by "ILM step retry safe refresh of the cached phase" (elastic/elasticsearch pull request #82613, by andreidan).

Your newer indices don't have this problem. It stems from the older index sitting in the rollover step with an error when rollover was removed from your ILM policy, so indices created after that change are not affected.
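The symptom is visible in the explain output for the old index: it reports the action "rollover", yet the cached phase_definition only lists set_priority. A rough Python heuristic for spotting other indices in the same state could look like the sketch below. The function name `has_stale_action` is hypothetical, and the check is only an approximation: it may also flag actions that ILM adds implicitly without them appearing in the policy.

```python
def has_stale_action(info):
    """Return True if the index's current action is absent from its cached phase definition.

    `info` is the per-index entry from an _ilm/explain response.
    """
    action = info.get("action")
    if action in (None, "complete"):
        return False  # the phase has finished, or no action is reported
    cached_actions = (
        info.get("phase_execution", {})
        .get("phase_definition", {})
        .get("actions", {})
    )
    return action not in cached_actions
```

For the two outputs posted above, this would flag prod-api-logs-2022.01.01 but not prod-api-logs-2022.01.28.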

To fix that index in particular, you'll need to use the move-to-step API. I would recommend the following request:

POST /_ilm/move/prod-api-logs-2022.01.01
{
  "current_step": {
    "phase": "hot",
    "action": "rollover",
    "name": "check-rollover-ready"
  },
  "next_step": {
    "phase": "hot",
    "action": "complete",
    "name": "complete"
  }
}
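If several indices are stuck the same way, the same request body can be generated from each index's explain entry rather than typed by hand. This is a small Python sketch. The helper name `move_to_complete_request` is made up for illustration, and it only builds the path and body, so sending the request is left to whatever client you use.

```python
def move_to_complete_request(index_name, info):
    """Build the move-to-step path and body that marks the current phase as complete.

    `info` is the per-index entry from an _ilm/explain response.
    """
    current_step = {
        "phase": info["phase"],
        "action": info["action"],
        "name": info["step"],
    }
    next_step = {
        "phase": info["phase"],
        "action": "complete",
        "name": "complete",
    }
    path = f"/_ilm/move/{index_name}"
    body = {"current_step": current_step, "next_step": next_step}
    return path, body
```

The current_step has to match what explain reports at the time of the call, so it is worth fetching fresh explain output right before building the request.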

You were spot on. Seems to be rolling over my indices fine now, thank you.
