Sudden rollover of APM indices

Hi,
I am new to Elastic APM and have been facing an issue with ILM policies.

I have done the following:

  1. Edited metrics-apm.app_metrics-default_policy and saved it as a new policy, custom-metrics-apm.app_metrics-default_policy.
  2. Set this ILM policy in the metrics-apm.app@custom component template, then applied and saved it.
  3. While saving the template, chose the option to wait for the next rollover before applying the policy.
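
For reference, step 2 roughly corresponds to the request sketched below. This is only a sketch: the helper name custom_template_body is hypothetical, and the body assumes the only setting changed in the @custom component template is index.lifecycle.name. Keep in mind that a template change only affects backing indices created after the change; existing backing indices keep the policy they were created with.

```python
import json


def custom_template_body(policy_name: str) -> dict:
    """Build a body for PUT _component_template/metrics-apm.app@custom.

    Hypothetical helper: it assumes the ILM policy name is the only
    setting overridden in the @custom component template.
    """
    return {
        "template": {
            "settings": {
                "index": {"lifecycle": {"name": policy_name}}
            }
        }
    }


body = custom_template_body("custom-metrics-apm.app_metrics-default_policy")
print("PUT _component_template/metrics-apm.app@custom")
print(json.dumps(body, indent=2))
```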

The next day, all the metrics-apm-default indices were suddenly rolled over, even though the 30-day condition had not been met.

To rule out interference from Fleet, I created a new ILM policy, new-metrics-apm.app_metrics-default_policy, with a hot-phase rollover condition of 90d or 50GB max primary shard size, and a delete phase at 180d.
I then set it in the metrics-apm.app@custom component template, applied and saved it.
This time, while saving, I chose "Rollover now and apply policy".
All the indices were rolled over immediately, as expected.

But after about 5 minutes, all the metrics-apm-default indices were rolled over again automatically.

Please help me understand this abrupt rollover behaviour.

Elasticsearch version: 8.8.0
APM server version: 8.8.1
No fleet server was used in the installation.

Please reply. This is causing a lot of problems, as all the metrics-apm.app indices are rolled over almost every other day.

Hello,

You need to share your policies, both the original one and the one that you changed.

Default policy

{
  "metrics-apm.app_metrics-default_policy": {
    "version": 1,
    "modified_date": "2024-03-26T10:16:30.889Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_age": "30d",
              "max_size": "50gb"
            },
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "90d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      },
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      }
    },
    "in_use_by": {
      "indices": [],
      "data_streams": [],
      "composable_templates": []
    }
  }
}

New Policy

{
  "new-metrics-apm.app_metrics-default_policy": {
    "version": 1,
    "modified_date": "2024-03-15T09:34:30.393Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_age": "90d",
              "max_primary_shard_size": "50gb"
            },
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "180d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
    "in_use_by": {
      "indices": [".ds-metrics-apm.app.test-default-2024.03.26-000015"],
      "data_streams": ["metrics-apm.app.test-default"],
      "composable_templates": [
        "metrics-apm.app"
      ]
    }
  }
}

Please reply to this. It is causing a lot of issues in my Elasticsearch cluster.

Hello,

I see no issues in the policy. The new policy should roll over after 90 days or when a primary shard reaches 50 GB, so I am not sure what is happening. You will need to provide more evidence.
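
To make the expected behaviour concrete, here is a minimal sketch of the rollover check implied by the new policy. It is a simplification, not how ILM is actually implemented: the function name and its inputs are hypothetical, and it only models the two conditions from the policy above (max_age 90d, max_primary_shard_size 50gb).

```python
from datetime import timedelta

GB = 1024 ** 3  # Elasticsearch byte-size units are powers of 1024


def should_roll_over(
    index_age: timedelta,
    largest_primary_shard_bytes: int,
    max_age: timedelta = timedelta(days=90),
    max_primary_shard_bytes: int = 50 * GB,
) -> bool:
    """Return True if either rollover condition is met (conditions are OR-ed)."""
    return (
        index_age >= max_age
        or largest_primary_shard_bytes >= max_primary_shard_bytes
    )


# A one-day-old index with a 2 GB primary shard meets neither condition.
print(should_roll_over(timedelta(days=1), 2 * GB))
```

If an index rolls over while this check would return False, the rollover was most likely triggered by something other than this policy, for example a different policy still attached to the backing index or a manual rollover.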

What does GET _cat/shards return? It will list your shards and help us understand what your indices look like.
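
If the output is long, something like the sketch below can reduce it to the largest primary shard per index. It assumes the request was GET _cat/shards/metrics-apm.app*?v&h=index,shard,prirep,state,docs,store&bytes=b, so that a header row is present and store is in plain bytes. The sample rows are made up for illustration (the -000014 index name and all numbers are not from your cluster), and the function name is hypothetical.

```python
def largest_primary_per_index(cat_output: str) -> dict:
    """Map each index to the size in bytes of its largest primary shard.

    Expects _cat/shards text output with a header row (?v) and the
    store column in bytes (bytes=b).
    """
    lines = [ln for ln in cat_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    result = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        # Skip replicas and rows without a store value (e.g. unassigned shards).
        if row.get("prirep") != "p" or "store" not in row:
            continue
        size = int(row["store"])
        result[row["index"]] = max(size, result.get(row["index"], 0))
    return result


# Illustrative sample only, not real cluster output.
SAMPLE = """\
index                                              shard prirep state   docs store
.ds-metrics-apm.app.test-default-2024.03.26-000015 0     p      STARTED 1200 734003
.ds-metrics-apm.app.test-default-2024.03.26-000015 0     r      STARTED 1200 734003
.ds-metrics-apm.app.test-default-2024.03.25-000014 0     p      STARTED 5400 2097152
"""

for index, size in sorted(largest_primary_per_index(SAMPLE).items()):
    print(index, size)
```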

Also, check the Elasticsearch logs and see if there are any log entries about ILM.
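
Besides the logs, the ILM explain API (GET metrics-apm.app*/_ilm/explain) shows which policy each backing index is actually using. Below is a rough sketch for summarizing its response. The function name is hypothetical, and the sample response is hand-written for illustration, trimmed to a few fields, and not taken from a real cluster.

```python
def summarize_ilm_explain(response: dict) -> list:
    """Return (index, policy, phase, step) tuples from an _ilm/explain response.

    Indices not managed by ILM are reported with None for the other fields.
    """
    rows = []
    for name, info in sorted(response.get("indices", {}).items()):
        if not info.get("managed"):
            rows.append((name, None, None, None))
            continue
        rows.append((name, info.get("policy"), info.get("phase"), info.get("step")))
    return rows


# Hand-written sample for illustration only.
SAMPLE_RESPONSE = {
    "indices": {
        ".ds-metrics-apm.app.test-default-2024.03.26-000015": {
            "index": ".ds-metrics-apm.app.test-default-2024.03.26-000015",
            "managed": True,
            "policy": "new-metrics-apm.app_metrics-default_policy",
            "phase": "hot",
            "action": "rollover",
            "step": "check-rollover-ready",
        }
    }
}

for row in summarize_ilm_explain(SAMPLE_RESPONSE):
    print(row)
```

If any backing index still reports the Fleet-managed default policy rather than your new one, that would be worth including in your reply.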
