Recommended setup using ILM to implement a document retention of n days

What is the recommended setup using ILM to implement a document retention of n days?

I'm trying to get ILM working with data retention based on number of days since document @timestamp to offer a log service where n days of log data is available in Kibana.
Pre-ILM, I've used a cron job that curled a DELETE request to the index with the date n days ago.
That solution required the index name (for example logstash-2019.12.29) to match the document's timestamp, thus requiring the data source to pick the index name based on the document data.
Since ILM came around I could use it to delete indices n days old, which works fine as long as the indices are created on the date in the name.
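For context, a minimal sketch of such a pre-ILM cleanup job (the host, retention value, and index prefix are assumptions; it only prints the DELETE request rather than sending it):

```python
# Sketch of a pre-ILM retention job: compute the index name for n days ago
# and print the curl DELETE command a daily cron job would run.
from datetime import datetime, timedelta, timezone

ES_HOST = "http://localhost:9200"  # assumption: adjust to your cluster
N_DAYS = 30                        # assumption: retention in days

old_day = datetime.now(timezone.utc) - timedelta(days=N_DAYS)
index = f"logstash-{old_day:%Y.%m.%d}"
print(f"curl -X DELETE {ES_HOST}/{index}")
```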

Now that I've upgraded our Filebeat agents and ES to 7.5.1 and activated ILM, it has started using index names like filebeat-2019.12.20-000001 and, the next day, filebeat-2019.12.20-000002, with rollover set to max_age 1d. This means the date part isn't changing in the index name.
I also discovered that the delete phase min_age now acts on the rollover date and not the index creation date.
My conclusion is that, to meet our requirements, I should turn off rollover in the ILM policy but keep the delete phase at an age of n days, and make sure our agents continue to use the index pattern filebeat-Y.m.d with ILM disabled in the Filebeat config.

This results in very different index sizes from day to day, but with rollover + delete I can't see how to provide n days of log data, because the delete phase then acts on the rollover date (which, with max_size set, could be just an hour after creation on some days) and not on the creation date.

I think what you need is:

index.lifecycle.parse_origination_date

(available since v7.5.0). As you use daily indices, this should do the trick: it takes the time from the index name and not from the start of the phase.
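For reference, a sketch of applying the setting to an existing index (the index name here is illustrative; normally you would put the setting in the index template):

PUT /filebeat-2019.12.20-000001/_settings
{
  "index.lifecycle.parse_origination_date": true
}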

We sometimes use monthly indices, and unfortunately you cannot set the date pattern. But you also have the option to set:

index.lifecycle.origination_date

although it seems you need to set it manually. It would be nice to have an option to select a timestamp field, plus an option to choose the 'youngest' or 'oldest' value (like Curator's filter direction).
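Setting it manually could look like this (index name and value are illustrative; the value is epoch milliseconds, here 2020-01-01 00:00 UTC):

PUT /my-index-000001/_settings
{
  "index.lifecycle.origination_date": 1577836800000
}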

We're also testing ILM (with LS) and the behaviour we get in 7.5.1 is as expected:

  • (We create a custom policy)
  • LS bootstraps the index (to test we use ilm_pattern => "{now/m{yyyy.MM.dd.HH.m}}-000001" )
  • The rollover phase of ILM uses the index.provided_name from the index (I assume), and we see that the new index gets the correct name (if the new index is created in minute 45, it gets the name rollover_alias-2020.01.02.10.45-000002)

So I would expect it to work with daily indices as well. You can check how filebeat bootstraps the index. Maybe "filebeat-2019.12.20" is hardcoded and is not "filebeat-{now/d}". Check the index setting index.provided_name after filebeat creates the first index.
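Something along these lines should show it (index name illustrative; filter_path just trims the response down to the one setting):

GET /filebeat-7.5.1-2019.12.20-000001/_settings?filter_path=*.settings.index.provided_name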

@gjelu, thank you very much for your answer. It helped me a lot.
Nothing in my tests made sense to me until I finally re-read the last sentence of what you wrote:

"Check the index setting index.provided_name after filebeat creates the first index."

I didn't get it until I tried out the example of using date math from the docs:
https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices-rollover-index.html#_using_date_math_with_the_rollover_api

Then I understood what you meant by using a minute-based pattern for testing. In my first attempt I turned on parse_origination_date and created an index with the hard-coded name of yesterday's date, then checked the result with a dry_run rollover. The parse_origination_date setting must of course be turned off in the template when the index name doesn't contain a parseable date, or index creation fails.

I used the example from the docs above with your pattern like this:

PUT /%3Clogs-%7Bnow%2Fm%7Byyyy.MM.dd.HH.m%7D%7D-1%3E 
{
  "aliases": {
    "logs_write": {}
  }
}

And then tried the rollover like this:

POST /logs_write/_rollover?dry_run=true 
{
  "conditions": {
    "max_docs":   "1"
  }
}
....
{
  "old_index" : "logs-2020.01.04.13.57-1",
  "new_index" : "logs-2020.01.04.14.13-000002"
}
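As an aside, the percent-encoded target name in the PUT request above is just the date-math expression URL-encoded. A quick check with nothing beyond the Python standard library:

```python
from urllib.parse import quote, unquote

# The rollover target in the PUT request is the date-math index name,
# percent-encoded for use in a URL path.
encoded = "%3Clogs-%7Bnow%2Fm%7Byyyy.MM.dd.HH.m%7D%7D-1%3E"
decoded = unquote(encoded)
print(decoded)  # <logs-{now/m{yyyy.MM.dd.HH.m}}-1>

# Encoding it again (escaping every reserved character) round-trips:
assert quote(decoded, safe="") == encoded
```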

So today's big learning is that the index.provided_name of the current write index MUST contain some date math in order to make use of the date (and time) part when creating the new index with the rollover feature. You can't create an index with yesterday's date hard coded in the name to test it. Using a minute based pattern is easiest for testing, to avoid the wait for the next date :wink:

I also learned that the rollover API accepts a target index name, which is useful for testing different date-math patterns: a simplification over first having to create an index with the pattern under test in its provided_name.

Example using the filebeat default rollover alias:

POST filebeat-7.5.1/_rollover/%3Cfilebeat-7.5.1-%7Bnow%2Fm%7Byyyy.MM.dd.HH.m%7D%7D-1%3E?dry_run=true
{
  "conditions": {
    "max_age": "1d"
  }
}
....
{
  "old_index" : "filebeat-7.5.1-2019.12.27-000011",
  "new_index" : "filebeat-7.5.1-2020.01.04.23.21-1"
}

My issues started after upgrading ES + Beats to 7.x, where the Beats default is to use ILM if it's enabled in ES. The write index in use at the point of the upgrade was not created using date math, which is why ILM didn't change the date part during rollover.

So lesson 2 of today is to check the provided_name index setting of the current write index when manually enabling the rollover phase in an ILM policy.

The reason for my issue with the hard-coded sticky date between rollovers (2019.12.20-000001, 2019.12.20-000002, ... instead of 2019.12.20-000001, 2019.12.21-000001) was that I didn't use the rollover API when migrating away from the pattern filebeat-7.5.1-000001, the default ILM pattern activated when upgrading to Beats 7.x. Instead, I used the clone API to rename the index, so the date ended up hard-coded in the index name and in provided_name, which therefore lacked the date math.

Expected behavior

  • Use current date in index name
  • Rollover based on date in index name

Solution

Step 1: Manual rollover using date math to get the date math pattern into index provided_name

POST filebeat-7.5.1/_rollover/%3Cfilebeat-7.5.1-%7Bnow%2Fd%7D-000001%3E
{
}
....
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : "filebeat-7.5.1-2019.12.20-000012",
  "new_index" : "filebeat-7.5.1-2020.01.04-000001",
  "rolled_over" : true,
  "dry_run" : false,
  "conditions" : { }
}

Step 2: Update template to enable parsing of origination date from index name

Edit the filebeat-7.5.1 index template (matching filebeat-*) and add the parse_origination_date setting to the index settings:

{
  "index": {
    "lifecycle": {
      "name": "filebeat-7.5.1",
      "rollover_alias": "filebeat-7.5.1",
      "parse_origination_date": "true"
    },
    ...

Tests

  • Assert "parse_origination_date": "true" of indices created by rollover
  • Assert origination_date of indices created by rollover (must not be -1)
  • Assert use of date math in provided_name of indices created by rollover
{
  "settings": {
    "index": {
      "lifecycle": {
        "name": "filebeat-7.5.1",
        "parse_origination_date": "true",
        "rollover_alias": "filebeat-7.5.1",
        "origination_date": "1578096000000" # 2020.01.04 00:00 UTC
      },
      ...
      "provided_name": "<filebeat-7.5.1-{now/d}-000001>",
      ...
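The origination_date value above is epoch milliseconds; a quick sanity check of the inline comment (standard library only):

```python
from datetime import datetime, timezone

# 1578096000000 is milliseconds since the epoch; verify that it
# corresponds to 2020.01.04 00:00 UTC as noted in the settings above.
dt = datetime.fromtimestamp(1578096000000 / 1000, tz=timezone.utc)
print(dt.isoformat())  # 2020-01-04T00:00:00+00:00
assert dt == datetime(2020, 1, 4, tzinfo=timezone.utc)
```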
