ILM with rollover for APM indices on Elastic Cloud

Hi everyone,

I've been trying to set up ILM with rollover for APM indices, but I still can't achieve what I want, and I don't even know whether it is possible at all on Elastic Cloud.

The context:
Our deployment is used for APM only. It runs on a hot-warm architecture, as this seemed the most suitable for our usage.
We had set up an ILM policy that moves indices from HOT to WARM after 3 days, reducing the number of replicas at the same time, and then deletes them after 14 days.
A 14-day retention on our APM data is good enough for now; we will look at optimising this later, but that's not the focus here.

APM is configured out of the box to split indices by day.
On a normal day, our span index can contain nearly 20M documents, which translates to a size upward of 30GB.
After discussing some intermittent performance issues on our deployment with support, we ended up looking for a way to roll over our APM indices and split them into smaller ones.
Luckily for us, version 7.2 came out recently and provides ILM with rollover out of the box, and there is even a documentation page dedicated to ILM + rollover + APM: https://www.elastic.co/guide/en/apm/server/7.2/manual-ilm-setup.html
Unfortunately, this page contains some instructions that cannot be performed on Elastic Cloud (point 7).

Support pointed me to https://www.elastic.co/guide/en/cloud/current/ec-configure-index-management.html, but that page barely mentions rollover.

The goal:

  • ILM with rollover for APM-* indices, especially: span & transaction, but if it could be applied to error & metric as well that would make things easier.
  • Phases definitions
    • HOT
      • Rollover conditions
        • size > 10GB
        • doc # > 5M
        • age > 1d
      • Priority 100
    • WARM
      • 3 days from rollover
      • 1 replica (instead of 2 that are set by default via the index template)
      • Change allocation requirement to WARM nodes
      • Priority 50
    • DELETE
      • 14 days from rollover

This is what I've done so far:

Define an APM Policy with Rollover

GET /_ilm/policy/apm-policy-with-rollover
{
  "apm-policy-with-rollover": {
    "version": 1,
    "modified_date": "2019-07-23T13:31:52.635Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_size": "10gb",
              "max_age": "1d",
              "max_docs": 5000000
            },
            "set_priority": {
              "priority": 100
            }
          }
        },
        "delete": {
          "min_age": "14d",
          "actions": {
            "delete": {}
          }
        },
        "warm": {
          "min_age": "3d",
          "actions": {
            "allocate": {
              "number_of_replicas": 1,
              "include": {},
              "exclude": {},
              "require": {
                "data": "warm"
              }
            },
            "set_priority": {
              "priority": 50
            }
          }
        }
      }
    }
  }
}

I have created aliases for the APM Indices:

GET /_alias/apm-*
{
  "apm-7.2.0-error-000001": {
    "aliases": {
      "apm-7.2.0-error": {
        "is_write_index": true
      }
    }
  },
  "apm-7.2.0-error-2019.07.24": {
    "aliases": {
      "apm-7.2.0-error": {}
    }
  },
  "apm-7.2.0-metric-2019.07.24": {
    "aliases": {
      "apm-7.2.0-metric": {}
    }
  },
  "apm-7.2.0-metric-000001": {
    "aliases": {
      "apm-7.2.0-metric": {
        "is_write_index": true
      }
    }
  },
  "apm-7.2.0-span-000001": {
    "aliases": {
      "apm-7.2.0-span": {
        "is_write_index": true
      }
    }
  },
  "apm-7.2.0-span-2019.07.24": {
    "aliases": {
      "apm-7.2.0-span": {}
    }
  },
  "apm-7.2.0-transaction-2019.07.24": {
    "aliases": {
      "apm-7.2.0-transaction": {}
    }
  },
  "apm-7.2.0-transaction-000001": {
    "aliases": {
      "apm-7.2.0-transaction": {
        "is_write_index": true
      }
    }
  }
}

(Edited for brevity: I have many more "day" indices, not just today's.)
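
For readers following along, a bootstrap index of this shape is typically created with a request along the following lines. This is a minimal sketch and not necessarily the exact request used here: it assumes the policy name and alias shown in this post, and it sets the lifecycle settings directly on the index because a name ending in -000001 is not guaranteed to match a template that carries them.

```console
PUT /apm-7.2.0-span-000001
{
  "settings": {
    "index.lifecycle.name": "apm-policy-with-rollover",
    "index.lifecycle.rollover_alias": "apm-7.2.0-span"
  },
  "aliases": {
    "apm-7.2.0-span": {
      "is_write_index": true
    }
  }
}
```

The same request would be repeated for the transaction, metric and error aliases. A write alias only helps if the writer actually indexes through the alias rather than into a dated index name.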

And I've set up new index templates to configure future indices with the lifecycle policy, rollover alias and index alias:

GET /_template/apm-7.2.0-*
{
  "apm-7.2.0-span": {
    "order": 2,
    "index_patterns": [
      "apm-7.2.0-span-*.*.*"
    ],
    "settings": {
      "index": {
        "lifecycle": {
          "name": "apm-policy-with-rollover",
          "rollover_alias": "apm-7.2.0-span"
        }
      }
    },
    "mappings": {},
    "aliases": {
      "apm-7.2.0-span": {}
    }
  },
  "apm-7.2.0-transaction": {
    "order": 2,
    "index_patterns": [
      "apm-7.2.0-transaction-*.*.*"
    ],
    "settings": {
      "index": {
        "lifecycle": {
          "name": "apm-policy-with-rollover",
          "rollover_alias": "apm-7.2.0-transaction"
        }
      }
    },
    "mappings": {},
    "aliases": {
      "apm-7.2.0-transaction": {}
    }
  },
  "apm-7.2.0-metric": {
    "order": 2,
    "index_patterns": [
      "apm-7.2.0-metric-*.*.*"
    ],
    "settings": {
      "index": {
        "lifecycle": {
          "name": "apm-policy-with-rollover",
          "rollover_alias": "apm-7.2.0-metric"
        }
      }
    },
    "mappings": {},
    "aliases": {
      "apm-7.2.0-metric": {}
    }
  },
  "apm-7.2.0-error": {
    "order": 2,
    "index_patterns": [
      "apm-7.2.0-error-*.*.*"
    ],
    "settings": {
      "index": {
        "lifecycle": {
          "name": "apm-policy-with-rollover",
          "rollover_alias": "apm-7.2.0-error"
        }
      }
    },
    "mappings": {},
    "aliases": {
      "apm-7.2.0-error": {}
    }
  }
}

And on today's indices I have:

GET /apm-7.2.0-span-2019.07.24
{
  "apm-7.2.0-span-2019.07.24": {
    "aliases": {
      "apm-7.2.0-span": {}
    },
    "mappings": {...},
    "settings": {
      "index": {
        "mapping": {...},
        "auto_expand_replicas": "false",
        "provided_name": "apm-7.2.0-span-2019.07.24",
        "query": {...},
        "creation_date": "1563926406411",
        "priority": "100",
        "number_of_replicas": "2",
        ...
        "lifecycle": {
          "name": "apm-policy-with-rollover",
          "rollover_alias": "apm-7.2.0-span"
        },
        "codec": "best_compression",
        "routing": {
          "allocation": {
            "require": {
              "data": "hot"
            }
          }
        },
        "number_of_shards": "1"
      }
    }
  }
}

Again edited for brevity.

Can someone help me find what piece I'm missing to make this whole thing work?

The pieces I added today are the apm-7.2.0-*-000001 indices, each set as the write index for its alias. I hope this was the only thing missing and that it will magically start working tonight...
But if you have other ideas I'm all ears.

So it did not magically start working: my APM indices are still being created as apm-7.2.0-*-2019.07.25, and the apm-7.2.0-*-000001 indices remain unused.
I'm starting to think that without being able to change the APM Server configuration as described in the on-premise documentation (point 7), it is not possible to achieve the setup I'm looking for.

Update: my past indices (apm-7.2.0-*-2019.07.24 & apm-7.2.0-*-2019.07.25) now have a lifecycle error that says:

illegal_state_exception: no rollover info found for [apm-7.2.0-span-2019.07.25] with alias [apm-7.2.0-span], the index has not yet rolled over with that alias

I don't know what to do about that... I guess it is linked to the same issue above
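
The error is consistent with ILM trying to run the rollover-based policy on a daily index that carries the rollover alias but has never been rolled over through it. A sketch of how this could be inspected and, if desired, cleared with the standard ILM APIs follows. The index name is taken from the error above. Removing the policy is only one possible choice and is an assumption here, since the index would then need some other retention mechanism.

```console
# Show the current phase, action, step and any step error for the index
GET /apm-7.2.0-span-2019.07.25/_ilm/explain

# Detach the lifecycle policy from the daily index so ILM stops managing it
POST /apm-7.2.0-span-2019.07.25/_ilm/remove
```

After the remove call, the daily index is no longer moved to warm nodes or deleted by the policy, so it would have to be cleaned up by hand or attached to a policy without a rollover action.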

For anyone who is interested, I found out that I needed to change a setting in APM Server:
https://www.elastic.co/guide/en/cloud/current/ec-manage-apm-settings.html#ec-apm-settings

apm-server.ilm.enabled

Enables index lifecycle management (ILM) for the indices created by the APM Server. Defaults to false. Please make sure that, for existing APM server, you add that setting together with: setup.template.overwrite: true (defaults to false). Otherwise the index template will not be overridden and ILM changes will not take effect.

Once it was set to true and the index templates were overridden/reconfigured as shared above, it all started working, and I now have ILM with rollover with a specific configuration for each APM index.
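
A few console requests that could be used to check the result after changing the APM Server settings. This is only a sketch: the alias and index names are the ones used earlier in this thread, and the actual output depends on the deployment.

```console
# Which indices sit behind the alias, and which one is the write index
GET /_alias/apm-7.2.0-span

# List all APM 7.2.0 indices sorted by name, to see whether new dated indices still appear
GET /_cat/indices/apm-7.2.0-*?v&s=index

# Lifecycle state (phase, action, step) of every APM 7.2.0 index
GET /apm-7.2.0-*/_ilm/explain
```

If the change took effect, one would expect new data to land in numbered indices behind the alias rather than in new apm-7.2.0-*-YYYY.MM.DD indices.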
