ILM: empty indices don't age out

Hi,

I created an ILM policy and applied it to my index template. Everything works fine at first, but after a day an empty index is left behind.

Here is my ILM policy

{
  "test-policy" : {
    "version" : 4,
    "modified_date" : "2023-01-09T08:58:04.738Z",
    "policy" : {
      "phases" : {
        "warm" : {
          "min_age" : "4m",
          "actions" : {
            "shrink" : {
              "number_of_shards" : 1
            },
            "forcemerge" : {
              "max_num_segments" : 1
            }
          }
        },
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_primary_shard_size" : "100mb",
              "max_age" : "3m",
              "min_docs" : 1
            }
          }
        },
        "delete" : {
          "min_age" : "5m",
          "actions" : {
            "delete" : {
              "delete_searchable_snapshot" : true
            }
          }
        }
      },
      "_meta" : {
        "managed" : true,
        "description" : "test ILM policy"
      }
    },
    "in_use_by" : {
      "indices" : [
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000533",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000535",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000537",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10-2023.01.11-000721"
      ],
      "data_streams" : [
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11",
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10",
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09"
      ],
      "composable_templates" : [
        "ie2c4nrngq4c25ljnrqs25tjmm======"
      ]
    }
  }
}

I also set poll_interval to 1 minute for testing:

{
  "persistent" : { },
  "transient" : {
    "indices" : {
      "lifecycle" : {
        "poll_interval" : "1m"
      }
    }
  }
}

Rollover worked fine during the day and each index changed phase normally, but an index got stuck in the hot phase after a day, or after the data stream rolled over to a new one.

These are my indices:

health status index                                                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707 NSNisOVOSRahrnkHCV0D2Q   1   0          0            0       225b           225b
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10-2023.01.11-000721 69dcpu3DQqq_tk7g5g9HAg   1   0          0            0       225b           225b
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000555 XsuEb-3dT62oPtt_gEEmXA   1   0        173            0      104kb          104kb
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000557 OT6_Z5xTTZCBes39uDh7og   1   0        977            0    250.3kb        250.3kb
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000559 Sj8-2AASTqq0GBoH2U8Vfw   1   0         13            0     65.1kb         65.1kb

I created a custom index template:

{
	"index_templates": [
		{
			"name": "ie2c4nrngq4c25ljnrqs25tjmm======",
			"index_template": {
				"index_patterns": [
					"ie2c4nrngq4c25ljnrqs25tjmm======-*"
				],
				"template": {
					"settings": {
						"index": {
							"lifecycle": {
								"name": "test-policy",
								"rollover_alias": "ie2c4nrngq4c25ljnrqs25tjmm======"
							},
							"mapping": {
								"total_fields": {
									"limit": "10000"
								}
							},
							"refresh_interval": "5s",
							"number_of_shards": "1",
							"max_docvalue_fields_search": "200",
							"query": {
								...
							},
							"number_of_replicas": "0"
						}
					},
					"mappings": {
						...
					}
				},
				"composed_of": [],
				"priority": 150,
				"data_stream": {
					"hidden": false,
					"allow_custom_routing": false
				}
			}
		}
	]
}

The ILM explain output for the stuck index is below. Its age is 1.77 days, but it is still in the hot phase.

{
  "indices" : {
    ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707" : {
      "index" : ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707",
      "managed" : true,
      "policy" : "test-policy",
      "index_creation_date_millis" : 1673337803379,
      "time_since_index_creation" : "1.77d",
      "lifecycle_date_millis" : 1673337803379,
      "age" : "1.77d",
      "phase" : "hot",
      "phase_time_millis" : 1673337804232,
      "action" : "rollover",
      "action_time_millis" : 1673337804843,
      "step" : "check-rollover-ready",
      "step_time_millis" : 1673337804843,
      "phase_execution" : {
        "policy" : "test-policy",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_primary_shard_size" : "100mb",
              "max_age" : "3m",
              "min_docs" : 1
            }
          }
        },
        "version" : 4,
        "modified_date_in_millis" : 1673254684738
      }
    }
  }
}

Is there a setting I am missing, or is something set up wrong?
I also found that the data stream wasn't removed after a day. Is this related?

Thanks
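
For anyone reading along, here is a minimal Python sketch of how the ILM explain output above could be checked for indices that are waiting on rollover even though their age is past max_age. The helper names (to_seconds, overdue_rollovers) are made up for illustration, and the parser assumes "age" only uses the ms, s, m, h, and d suffixes.

```python
import re

# Multipliers for the time-unit suffixes assumed to appear in "age" strings.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value: str) -> float:
    """Convert a value such as '1.77d' or '3m' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h|d)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]


def overdue_rollovers(explain: dict) -> list[str]:
    """Return indices waiting on rollover whose age exceeds max_age."""
    overdue = []
    for name, info in explain.get("indices", {}).items():
        if info.get("step") != "check-rollover-ready":
            continue
        rollover = (
            info.get("phase_execution", {})
            .get("phase_definition", {})
            .get("actions", {})
            .get("rollover", {})
        )
        max_age = rollover.get("max_age")
        if max_age and to_seconds(info["age"]) > to_seconds(max_age):
            overdue.append(name)
    return overdue


# Trimmed copy of the explain output posted above.
sample = {
    "indices": {
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707": {
            "age": "1.77d",
            "phase": "hot",
            "step": "check-rollover-ready",
            "phase_execution": {
                "phase_definition": {
                    "actions": {"rollover": {"max_age": "3m", "min_docs": 1}}
                }
            },
        }
    }
}

print(overdue_rollovers(sample))
```

In the output above the rollover action also lists min_docs: 1 while the stuck index holds 0 documents, so it may be worth checking whether that condition is what keeps the step waiting.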

Welcome to our community! :smiley:

ILM timings of less than 10 minutes will usually not work, as it doesn't poll that often. I would increase those times and then see if it works.
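
As a rough illustration of the polling point: ILM checks policy conditions once per indices.lifecycle.poll_interval, which defaults to 10m. The sketch below is a simplified model, not how Elasticsearch schedules anything internally. It assumes polls land at exact multiples of the interval counted from index creation, and first_check_after is a made-up helper name.

```python
import math


def first_check_after(threshold_s: int, poll_interval_s: int) -> int:
    """Earliest poll tick (in seconds) at or after the threshold.

    Simplified model: polls happen at exact multiples of the interval,
    counted from the moment the index was created.
    """
    return math.ceil(threshold_s / poll_interval_s) * poll_interval_s


DEFAULT_POLL_S = 10 * 60  # indices.lifecycle.poll_interval defaults to 10m

# Thresholds taken from the policy posted in this thread.
for label, minutes in [
    ("rollover max_age", 3),
    ("warm min_age", 4),
    ("delete min_age", 5),
]:
    seen_at = first_check_after(minutes * 60, DEFAULT_POLL_S)
    print(f"{label}: configured {minutes}m, first noticed at {seen_at // 60}m")
```

Under this model, with the default interval, every threshold shorter than the poll interval is first noticed at the same poll, which is why sub-10-minute values tend to blur together.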

Thanks for your reply.

Should I increase the hot, warm, and delete times,
like this:

hot : {
   min_age : 0ms,
   ...
},
warm : {
   min_age : 10m,
   ...
},
delete: {
   min_age : 20m,
   ...
}

or should I increase the poll interval?

{
  "persistent" : { },
  "transient" : {
    "indices" : {
      "lifecycle" : {
        "poll_interval" : "10m"
      }
    }
  }
}

Hi

I increased the times in the ILM policy:

{
  "test-policy" : {
    "version" : 5,
    "modified_date" : "2023-01-12T03:17:34.856Z",
    "policy" : {
      "phases" : {
        "warm" : {
          "min_age" : "15m",
          "actions" : {
            "shrink" : {
              "number_of_shards" : 1
            },
            "forcemerge" : {
              "max_num_segments" : 1
            }
          }
        },
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_primary_shard_size" : "100mb",
              "max_age" : "10m",
              "min_docs" : 1
            }
          }
        },
        "delete" : {
          "min_age" : "20m",
          "actions" : {
            "delete" : {
              "delete_searchable_snapshot" : true
            }
          }
        }
      },
      "_meta" : {
        "managed" : true,
        "description" : "test ILM policy"
      }
    },
    "in_use_by" : {
      "indices" : [
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000607",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000602",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.12-2023.01.12-000001",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.12-2023.01.12-000002",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10-2023.01.11-000721",
        ".ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000605"
      ],
      "data_streams" : [
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11",
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10",
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09",
        "ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.12"
      ],
      "composable_templates" : [
        "ie2c4nrngq4c25ljnrqs25tjmm======"
      ]
    }
  }
}

and also increased the lifecycle poll interval:

{
  "persistent" : { },
  "transient" : {
    "indices" : {
      "lifecycle" : {
        "poll_interval" : "10m"
      }
    }
  }
}

It happened again after 5 hours.

health status index                                                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.09-2023.01.10-000707 NSNisOVOSRahrnkHCV0D2Q   1   0          0            0       225b           225b
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.10-2023.01.11-000721 69dcpu3DQqq_tk7g5g9HAg   1   0          0            0       225b           225b
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000605 Y8pEB2UQSY6YJP0y06GAkg   1   0       1401            0    400.3kb        400.3kb
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.11-2023.01.12-000607 29WGumA1SsyzZKEfL_CyUg   1   0          0            0       225b           225b
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.12-2023.01.12-000001 wz2XJozbTe6PPi_f-JXjzw   1   0       1581            0    519.5kb        519.5kb
green  open   .ds-ie2c4nrngq4c25ljnrqs25tjmm======-2023.01.12-2023.01.12-000002 2wnX6AZKTt-qzJmZFQesWg   1   0        485            0    311.7kb        311.7kb

I don't know where I set it up wrong. The empty indices are all stuck in the hot phase. I believed they should be in the delete phase: the documents have already been deleted, but the indices are still there.

Could someone give me a hand?

Hi @Adam_Lin, welcome to the community.

ILM is not meant to be based on minutes and/or MB. It's meant for GB, hours, and days, and it works very well for production use cases.

ILM is a background, lower-priority task and runs opportunistically based on the policy.

You are one of many people who have tried to build ILM policies based on minutes or small sizes. It's not going to work as you expect it to.

Perhaps look at these threads

It will work as you expect for actual production volumes and timelines.

I assume you are just testing to see how it works.

Go ahead and set realistic values, 10-50GB and/or multiple hours or daily, and it will roll over within a small percentage of the configured thresholds.

Also, I'm not sure how you set up your cluster. Do you have warm nodes that you expect the indices to move to? If so, how did you define them? If there is nowhere valid to move the index, it will stay in its current phase: no warm nodes means no warm phase, and then no delete phase.

Hi setphenb,

Thanks for your reply.

There is only just one node in my cluster for testing.

I expect the index to roll over following the ILM policy, like this:
index created -> moves to warm phase after 1 day -> deleted after 7 days.

For convenience in testing, I reduced the times from 1 day and 7 days to 15 and 20 minutes.

I'll try increasing the times in the ILM policy to see if it works.

If you only have 1 node, there is no need for warm; just use hot with rollover, then delete.

So hot > rollover at 1 day > delete 7 days after rollover, which will be 8 days total (delete is calculated from rollover).
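
The 8-day figure can be worked through with a few lines of Python. The creation date here is a made-up placeholder, and the sketch assumes the index rolls over exactly at the 1-day mark.

```python
from datetime import datetime, timedelta

created = datetime(2023, 1, 1)          # hypothetical index creation time
rollover_after = timedelta(days=1)      # hot phase: rollover max_age
delete_min_age = timedelta(days=7)      # delete phase: min_age

rolled_over = created + rollover_after
deleted = rolled_over + delete_min_age  # min_age counts from rollover

print(deleted - created)  # 8 days, 0:00:00
```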

And to repeat, 10-20 min is probably not going to be consistent. Set it to 1 day and watch it work.
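
A hot-then-delete policy along those lines might look something like the sketch below, written as a Python dict that prints the JSON body for PUT _ilm/policy/<policy-name>. The 50gb, 1d, and 7d values are only examples picked from the ranges mentioned in this thread, not tested recommendations.

```python
import json

# Hypothetical policy body for a single-node test cluster: hot with rollover,
# then delete, and no warm phase. The 50gb and 1d/7d values are examples
# taken from the ranges mentioned in this thread.
policy_body = {
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "rollover": {
                        "max_primary_shard_size": "50gb",
                        "max_age": "1d",
                    }
                },
            },
            "delete": {
                "min_age": "7d",
                "actions": {"delete": {}},
            },
        }
    }
}

# Body to send with: PUT _ilm/policy/<policy-name>
print(json.dumps(policy_body, indent=2))
```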
