Sudden rollover of apm indices

Hi,
I am new to Elastic APM and have been facing an issue with ILM policies.

I have done the following:

  1. Edited metrics-apm.app_metrics-default_policy and saved it as a new policy, custom-metrics-apm.app_metrics-default_policy.
  2. Updated the ILM policy in the metrics-apm.app@custom component template, then applied and saved it.
  3. While saving the template, chose the option to wait for the next rollover before applying the policy.

The next day, all the metrics-apm-default indices suddenly rolled over, even though the 30-day condition had not been met.

To rule out interference from Fleet, I created a new ILM policy, new-metrics-apm.app_metrics-default_policy, with a rollover condition of 90d or 50GB max primary shard size in the hot phase, and a min age of 180d for the delete phase.
I then set it in the metrics-apm.app@custom component template, and applied and saved it.
This time, while saving, I chose "Rollover now and apply policy".
All the indices rolled over immediately, as expected.

But about 5 minutes later, all the metrics-apm-default indices rolled over again automatically.

Please help me understand this abrupt rollover behaviour.

Elasticsearch version: 8.8.0
APM server version: 8.8.1
No fleet server was used in the installation.

Please reply. This is causing a lot of problems, as all the metrics-apm.app indices roll over almost every other day.

Hello,

You need to share your policies, both the original one and the one that you changed.

Default policy

{
  "metrics-apm.app_metrics-default_policy": {
    "version": 1,
    "modified_date": "2024-03-26T10:16:30.889Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_age": "30d",
              "max_size": "50gb"
            },
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "90d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      },
      "_meta": {
        "package": {
          "name": "apm"
        },
        "managed_by": "fleet",
        "managed": true
      }
    },
    "in_use_by": {
      "indices": [],
      "data_streams": [],
      "composable_templates": []
    }
  }
}

New Policy

{
  "new-metrics-apm.app_metrics-default_policy": {
    "version": 1,
    "modified_date": "2024-03-15T09:34:30.393Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_age": "90d",
              "max_primary_shard_size": "50gb"
            },
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "180d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
    "in_use_by": {
      "indices": [".ds-metrics-apm.app.test-default-2024.03.26-000015"],
      "data_streams": ["metrics-apm.app.test-default"],
      "composable_templates": [
        "metrics-apm.app"
      ]
    }
  }
}

Please reply to this. It is causing a lot of issues in my Elasticsearch cluster.

Hello,

I see no issues in the policy. The new policy should roll over after 90 days or when a primary shard reaches 50 GB, so I am not sure what is happening. You will need to provide more evidence.
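
To make the expected behaviour concrete, here is a minimal Python sketch of how the two rollover conditions in the new policy combine: rollover should happen as soon as either max_* condition is met. The helper names (`_parse`, `met_conditions`) and the example index age and shard size are made up for illustration; only the `90d` and `50gb` values come from the policy posted above.

```python
import re

# Units for the two condition types used in the policy above.
# Elasticsearch byte units are powers of 1024.
_AGE_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
_SIZE_BYTES = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def _parse(value, units):
    """Turn a string such as '90d' or '50gb' into a number of base units."""
    match = re.fullmatch(r"(\d+)([a-z]+)", value.strip().lower())
    if not match or match.group(2) not in units:
        raise ValueError(f"unsupported value: {value!r}")
    return int(match.group(1)) * units[match.group(2)]


def met_conditions(rollover, age_seconds, largest_primary_shard_bytes):
    """Return the names of the max_* conditions that are currently satisfied."""
    met = []
    max_age = rollover.get("max_age")
    if max_age is not None and age_seconds >= _parse(max_age, _AGE_SECONDS):
        met.append("max_age")
    max_shard = rollover.get("max_primary_shard_size")
    if max_shard is not None and largest_primary_shard_bytes >= _parse(max_shard, _SIZE_BYTES):
        met.append("max_primary_shard_size")
    return met


# Rollover action copied from new-metrics-apm.app_metrics-default_policy.
new_policy_rollover = {"max_age": "90d", "max_primary_shard_size": "50gb"}

# Hypothetical index: one day old, largest primary shard of 2 GiB.
print(met_conditions(new_policy_rollover, 1 * 86400, 2 * 1024**3))
```

If an index that rolled over would return an empty list here, the rollover was most likely not triggered by this policy's conditions, and something else (a different policy, or a manual or API-driven rollover) is worth looking for.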

What does GET _cat/shards return? It lists your shards, which helps to understand what your indices look like.
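
If the raw output is long, something like the following Python sketch could condense it to the largest primary shard per index, which is the number the max_primary_shard_size condition cares about. It assumes a cluster reachable at a base URL without authentication (adjust for a secured cluster), and the sample rows and their sizes are invented for illustration. The `format=json`, `bytes=b` and `h=` parameters are standard options of the cat APIs.

```python
import json
import urllib.request


def fetch_shards(base_url, target="metrics-apm*"):
    """Fetch _cat/shards rows as JSON, with store sizes reported in bytes."""
    url = (f"{base_url}/_cat/shards/{target}"
           "?format=json&bytes=b&h=index,shard,prirep,state,docs,store")
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def largest_primary_per_index(rows):
    """Map each index to the size in bytes of its largest primary shard."""
    sizes = {}
    for row in rows:
        if row.get("prirep") != "p" or row.get("store") is None:
            continue  # skip replicas and unassigned shards
        size = int(row["store"])
        sizes[row["index"]] = max(size, sizes.get(row["index"], 0))
    return sizes


# Made-up rows in the shape returned by fetch_shards(); sizes are invented.
sample_rows = [
    {"index": ".ds-metrics-apm.app.test-default-2024.03.26-000015",
     "shard": "0", "prirep": "p", "state": "STARTED", "docs": "1200", "store": "524288"},
    {"index": ".ds-metrics-apm.app.test-default-2024.03.26-000015",
     "shard": "0", "prirep": "r", "state": "UNASSIGNED", "docs": None, "store": None},
]

for index, size in largest_primary_per_index(sample_rows).items():
    print(index, size)
```

Against a live cluster the call would be `largest_primary_per_index(fetch_shards("http://localhost:9200"))`, where the URL is an assumption about your setup.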

Also, check the Elasticsearch logs and see if there are any entries about ILM.
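
Besides the logs, the ILM explain API (GET <target>/_ilm/explain) reports which policy manages each backing index and which phase, action and step it is in. Below is a rough Python sketch that flags indices not managed by the policy you expect. It assumes an unauthenticated cluster URL, and the sample response is invented: the second index name and all phase, action and step values are hypothetical, not taken from this thread.

```python
import json
import urllib.request


def fetch_ilm_explain(base_url, target="metrics-apm.app*"):
    """Call GET <target>/_ilm/explain and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/{target}/_ilm/explain") as response:
        return json.load(response)


def summarize(explain, expected_policy):
    """Return (index, policy, phase, action, step, matches_expected) tuples."""
    rows = []
    for name, info in sorted(explain.get("indices", {}).items()):
        # Unmanaged indices carry no policy, so report None for them.
        policy = info.get("policy") if info.get("managed") else None
        rows.append((name, policy, info.get("phase"), info.get("action"),
                     info.get("step"), policy == expected_policy))
    return rows


# Made-up response for illustration only.
sample = {
    "indices": {
        ".ds-metrics-apm.app.test-default-2024.03.26-000015": {
            "managed": True,
            "policy": "new-metrics-apm.app_metrics-default_policy",
            "phase": "hot", "action": "rollover", "step": "check-rollover-ready",
        },
        ".ds-metrics-apm.app.test-default-2024.03.25-000014": {
            "managed": True,
            "policy": "metrics-apm.app_metrics-default_policy",
            "phase": "hot", "action": "complete", "step": "complete",
        },
    }
}

for row in summarize(sample, "new-metrics-apm.app_metrics-default_policy"):
    print(row)
```

Any row whose last field is False points at an index that is unmanaged or still attached to a different policy, which would be useful evidence to post here alongside the log entries.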