ILM timing - based on index pattern name

I think ILM is a kind of user-friendly replacement for Curator. Am I right?
It would be great to have all of Curator's features in ILM!
For instance, I'd like to define a policy that deletes old indices based on the date in their name (index.YYYY-MM-DD) rather than on their creation date, which is what ILM currently uses (see the timing documentation).
Is it possible to implement this with the current ILM version?
Thanks!

Short answer: No. Not as yet, anyway.

I do not wish to give false hope, as it may never be added to ILM. ILM presently identifies index age in two different ways: creation_date (which you already pointed out), and time since rollover. Since it's evident you are using daily indices and not rollover-style indices, you are outside what has become best practice for time-series data.

You have considerably less control over shard and index sizing when you use exclusively date-based retention. Elasticsearch works efficiently with individual shards in the 10-50 GB range. An Elasticsearch node also begins to suffer memory constraints when it exceeds roughly 20 shards per gigabyte of heap space. One of the reasons rollover indices were created was to help reduce the shard count per node, by allowing an index to grow to a larger, expected size before a new one is created. We often saw newer users who were unfamiliar with these constraints running many, many more shards per node than Elasticsearch could comfortably maintain (based on the aforementioned math).

My recommendation is to plan to migrate to rollover indices in your cluster, and then the ILM approach will begin to make more sense.

If, however, your concern about index date ranges is because you frequently ingest older data and want it to remain in named indices so you can identify it by index name, you should continue to use Curator, as it has many more index selection/filtering options for exactly this sort of thing.
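
To make the shard math above concrete, here is a rough sketch in Python. It only applies the "20 shards per gigabyte of heap" rule of thumb quoted in this reply; the function names, the node count, the heap size, and the index layout are all hypothetical values chosen for illustration, not anything Elasticsearch reports.

```python
def cluster_shard_budget(heap_gb_per_node: float, data_nodes: int,
                         shards_per_heap_gb: int = 20) -> int:
    """Rule-of-thumb ceiling on shards for the whole cluster.

    Assumption: every data node has the same heap size and the
    20-shards-per-GB guideline from the reply above is applied as-is.
    """
    return int(heap_gb_per_node * shards_per_heap_gb) * data_nodes


def daily_index_shards(retention_days: int, primaries: int, replicas: int) -> int:
    """Total shards kept around when one index is created per day."""
    return retention_days * primaries * (1 + replicas)


# Hypothetical cluster: 3 data nodes with 8 GB of heap each.
budget = cluster_shard_budget(heap_gb_per_node=8, data_nodes=3)

# Hypothetical layout: 90 days of daily indices, 5 primaries, 1 replica.
in_use = daily_index_shards(retention_days=90, primaries=5, replicas=1)

over_budget = in_use > budget
print(f"budget={budget} in_use={in_use} over_budget={over_budget}")
```

With daily indices the shard count grows with the retention period whether or not the shards are anywhere near the 10-50 GB range, which is the sizing problem rollover is meant to address.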

Thank you very much, Aaron, that makes sense.
Yes, this particular use case is not about indexing time-series data as you described it: we need to index older data from time to time.
I wanted to confirm that Curator is the best choice at the moment. Thanks again!

We do have this functionality coming in 7.5.0.

https://github.com/elastic/elasticsearch/pull/46561 added the ability to set a custom origination date for an index (a date from which the index's age is calculated, instead of the creation or rollover date).

https://github.com/elastic/elasticsearch/pull/46755 adds an additional setting (index.lifecycle.parse_origination_date) where the index.lifecycle.origination_date can be automatically parsed from the index's name, so an index named foo-2019-03-23 would have March 23rd as the date by which its age is calculated.
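
To illustrate the idea, here is a minimal Python sketch of deriving an index's age from a date at the end of its name. This is not Elasticsearch's implementation: the regular expression, the function names, and the assumption that the date is written as YYYY-MM-DD at the end of the name are mine, based only on the foo-2019-03-23 example above. The exact name pattern that index.lifecycle.parse_origination_date accepts should be checked in the documentation for your version.

```python
import re
from datetime import date

# Assumption: the index name ends with a date written as YYYY-MM-DD,
# as in the "foo-2019-03-23" example. Hypothetical pattern, not the one
# Elasticsearch uses internally.
_DATE_SUFFIX = re.compile(r"(\d{4})-(\d{2})-(\d{2})$")


def origination_date_from_name(index_name: str) -> date:
    """Return the date embedded at the end of an index name."""
    match = _DATE_SUFFIX.search(index_name)
    if match is None:
        raise ValueError(f"no YYYY-MM-DD suffix in index name: {index_name!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def index_age_days(index_name: str, today: date) -> int:
    """Age in days, counted from the date in the name, not from creation."""
    return (today - origination_date_from_name(index_name)).days


def is_expired(index_name: str, today: date, max_age_days: int) -> bool:
    """True when the name-derived age has reached the retention limit."""
    return index_age_days(index_name, today) >= max_age_days
```

For example, index_age_days("foo-2019-03-23", date(2019, 4, 22)) counts from March 23rd regardless of when the index was actually created, which is what matters when older data is ingested late.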

Great news! It's really cool!

P.S. It is fun to work with Elastic (once Elastic migrates all Logstash plugins to ingest nodes, the whole stack will be just perfect :))

“All” is not likely to happen. Some, yes. But many will not be, particularly input plugins which listen on a port. Some of these are being mitigated with new inputs in Filebeat (which are then forwarded to the ingest nodes), but others will not be easily migrated.
