Why ILM is impractical for managing logs / what curator can and cannot do better

In our setup we have been using curator for a long time, but it does not support recent features, namely cold nodes with searchable snapshots. I want to point out the issues we are having with ILM and the workarounds we tried.

the general problem

Unlike curator, ILM only handles indexes individually. If you want ILM to delete the indexes holding your oldest logs, for example, you can only do that based on the age of each index. An age-based rule cannot prevent your cluster from running out of space when your log volume increases unexpectedly.
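To make the limitation concrete, here is a minimal sketch of an age-based ILM delete policy, built as a Python dict. The `30d` retention and the helper name `age_based_delete_policy` are assumptions for illustration; the resulting body is what would be sent to the ILM put-policy endpoint. Note that the only trigger is `min_age`: nothing in the policy refers to disk usage.

```python
import json


def age_based_delete_policy(min_age: str = "30d") -> dict:
    """Build an ILM policy body that deletes an index once it is min_age old.

    The helper name and the default retention are hypothetical; only the
    policy structure (phases -> delete -> min_age / actions) follows ILM.
    """
    return {
        "policy": {
            "phases": {
                "delete": {
                    # The only condition available here is the index age.
                    "min_age": min_age,
                    "actions": {"delete": {}},
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(age_based_delete_policy("30d"), indent=2))
```

If the ingest volume doubles, this policy still keeps 30 days of indexes, so the disk fills up twice as fast.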

That leaves you with two options: overprovision your cluster (permanently increasing your costs for resources that are unused most of the time), or have someone woken up at night whenever the cluster is running full (who will most likely just delete the oldest logs manually anyway!).

Issues addressing this problem have unfortunately seen no progress for almost 2 years.

what curator can do (better)

You can still use curator if you just need to perform basic actions like:

  • rollover when indexes get too big
  • allocate to different nodes (hot to warm)*
  • delete old indexes*

*unlike ILM, you can do these actions using space filters based on the disk usage in your cluster. This can reliably ensure that your cluster doesn't run out of disk space, independent of the ingest volume of logs!
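The idea behind a space-based filter can be sketched in a few lines of Python. This is a simplified illustration of the concept, not Curator's actual implementation: the `IndexInfo` type, the `select_for_deletion` function, and the sample sizes are all hypothetical. It keeps the newest indexes as long as their cumulative size fits into a budget and selects everything older for deletion.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class IndexInfo:
    name: str
    creation_ts: int   # creation time, seconds since epoch
    size_gb: float     # on-disk size of the index


def select_for_deletion(indexes: Iterable[IndexInfo], budget_gb: float) -> List[str]:
    """Return the names of indexes to delete, oldest first.

    Walks the indexes from newest to oldest and adds up their sizes.
    Every index that pushes the running total over budget_gb is selected.
    """
    running_total = 0.0
    selected = []
    for idx in sorted(indexes, key=lambda i: i.creation_ts, reverse=True):
        running_total += idx.size_gb
        if running_total > budget_gb:
            selected.append(idx.name)
    return list(reversed(selected))


if __name__ == "__main__":
    sample = [
        IndexInfo("logs-01", 1, 40.0),
        IndexInfo("logs-02", 2, 40.0),
        IndexInfo("logs-03", 3, 40.0),
    ]
    print(select_for_deletion(sample, budget_gb=100.0))
```

Because the selection depends on size rather than age, a sudden increase in log volume simply shortens the retention instead of filling the disk.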

what curator cannot do

Unfortunately, development on curator has basically stopped. Because of that, new features of elasticsearch are only supported by ILM.

One recent example is searchable snapshots, used on cold data nodes.

Moving an index to a searchable snapshot requires multiple steps, including mounting the snapshot, which curator cannot do.

using curator together with ILM

Since curator can update index settings, you can actually use it to apply a certain ILM policy to an index. But that introduces a few problems:

This approach sounds like it would solve all the issues discussed: just create multiple ILM policies and use curator to switch between them. Unfortunately, changing the lifecycle policy of an index has no effect, by design: the index won't leave the phase of its old policy until that phase ends, and that ending condition can again only be time based. If it is the final phase of the policy, it never ends.

Another thing I tried, to let curator move an index to the next phase of an ILM policy, was updating its index.lifecycle.origination_date, which ILM should use to determine the index age. But that is not usable as a workaround either.
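For illustration, this is roughly what such an attempt could look like: a sketch that computes a backdated `index.lifecycle.origination_date` (epoch milliseconds) and wraps it in an index settings body. The function name `backdated_origination_settings` and the 30-day offset are assumptions; as described above, this did not turn out to be a usable workaround.

```python
from datetime import datetime, timedelta, timezone


def backdated_origination_settings(now: datetime, age_days: int) -> dict:
    """Build an index settings body that makes an index look age_days old.

    index.lifecycle.origination_date is given in milliseconds since the
    epoch. The helper itself is hypothetical.
    """
    origin = now - timedelta(days=age_days)
    millis = int(origin.timestamp() * 1000)
    return {"index": {"lifecycle": {"origination_date": millis}}}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(backdated_origination_settings(now, age_days=30))
```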

I'd be happy for any feedback and other ideas that may work better than the approaches we tried so far!


One pitfall of deleting indices by free space is that if you have something that ingests your available space "today", you delete all your old indices. If the "new" stuff was a bug, you've burnt your bridge.

Many of us have retention required by some policy, even if we're out of space. (That's the "enterprise model", demands are made without funding to meet them :) )

But you make good points, if that usage fits your policy, Curator lives!

I apologize for the slow pace of Curator development and releases. I am no longer in a development role, but I still do what I can, when I have time (which is unfortunately not as often as I'd like).

It's not exactly orphaned, but it will take time to get some of these features in place (frozen tier/cold nodes, etc.). A big change is coming in the next few days (time permitting) that will remove support for Python 2.7 (AWS and the boto3 package are forcing this). My immediate focus is a release which takes care of this.

@theuntergeek thanks for the outlook and for still giving curator some support!
It would just be nice if elastic could provide some dedicated development resources to the tool when you basically have other tasks now that consume most of your time.

I appreciate that. You should know, though, that Curator has always been my project. I am the only "resource" that has ever been dedicated to it. I'm the only maintainer that has been officially attached to the project. With ILM being the de facto standard now, I'm not sure Curator will ever be allocated any other resources.
