Elasticsearch Curator - deleting indices by priorities (with Python API)

I want to delete indices when my cluster storage reaches a fixed watermark (for example, 1 TB).

While I can sympathize with your desire for this, it is not now, nor has it ever been, a recommended approach to data retention in Elasticsearch. A search through the issues in the Curator source code repository will reveal that I added filter_by_space (or the older disk_space variant before v4) with frequently expressed reservations, which are also included as caveats in the Curator documentation. Why? Because shard allocation can result in an unequal distribution of data, which means that while this approach might be just fine for some users, it would be wholly inadequate, if not outright dangerous, for others. This becomes even more of a problem when dealing with differing indices containing different data in different amounts.
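For reference, this is roughly what the existing filter looks like when driven from the Curator Python API. This is only a sketch, assuming Curator 5.x and a local cluster; verify the method signatures against your installed version:

```python
# Minimal sketch: flag for deletion the indices that push cumulative
# storage past a fixed watermark (1024 GB here, as a placeholder).
import elasticsearch
import curator

client = elasticsearch.Elasticsearch(['http://localhost:9200'])

ilo = curator.IndexList(client)
# use_age=True sorts by creation_date before summing sizes, so the
# most recent indices are retained and the oldest are the ones
# flagged once the cumulative total exceeds the threshold.
ilo.filter_by_space(disk_space=1024, use_age=True, source='creation_date')

if ilo.indices:
    curator.DeleteIndices(ilo).do_action()
```

Even with this available, all of the caveats below still apply: the filter sums index sizes cluster-wide and cannot see how those bytes are actually distributed across your nodes.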

The second reason this is not recommended is that Elasticsearch cannot report the amount of space consumed by indices in the closed state, which can render disk usage API calls completely inaccurate. This could result in closed indices being deleted erroneously, or worse, open indices being deleted when that behavior was not desired.
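You can see the blind spot directly: the stats API only reports on open indices. A quick illustration with elasticsearch-py, reusing the `client` from the sketch above:

```python
# _stats only covers open indices, so this total silently undercounts
# whenever closed indices still hold data on disk.
stats = client.indices.stats(metric='store')
open_bytes = sum(
    idx['total']['store']['size_in_bytes']
    for idx in stats['indices'].values()
)
print(f'Visible (open) data: {open_bytes / 2**30:.1f} GiB')
```

Any watermark computed from that number is blind to whatever your closed indices occupy.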

The third reason this approach is not recommended is that a high shard count per node will affect a cluster's performance regardless of how much disk space is used (or not used). Generally, on a node with a 30 GB heap (heap, not total system RAM), you should not exceed 600 shards per node. This value scales (not necessarily linearly) with your heap size: a smaller heap means fewer shards per node before things start to go south (indexing speed decreases, search performance decreases, garbage collections increase in frequency and duration, memory pressure increases). Setting an arbitrary watermark ignores these constraints. Users who haven't learned them, or been affected by them, believe they should be able to fit as many shards on a node as there is disk space to accommodate, which can lead to memory pressure, followed by a cascade of failures.
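As a rough self-check against that guideline (roughly 20 shards per GB of heap, hence 600 at a 30 GB heap), something like this works with elasticsearch-py. The 20-per-GB figure is a rule of thumb, not an API-enforced limit:

```python
# Count shards per node via the cat API and compare against the
# ~20-shards-per-GB-of-heap rule of thumb. HEAP_GB is an assumed value.
HEAP_GB = 30
max_shards = 20 * HEAP_GB  # 600 at a 30 GB heap

per_node = {}
for s in client.cat.shards(format='json'):
    node = s.get('node') or 'UNASSIGNED'  # unassigned shards have no node
    per_node[node] = per_node.get(node, 0) + 1

for node, count in sorted(per_node.items()):
    flag = 'OVER' if count > max_shards else 'ok'
    print(f'{node}: {count} shards ({flag})')
```

If your nodes are already near that ceiling, freeing disk space will not fix the memory pressure that the shard count itself creates.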

For these reasons, deleting indices by priority once a disk space watermark is exceeded is not likely to be added to Curator until it becomes a practice marked as either acceptable or recommended by the core Elasticsearch developers. While you might successfully argue that your particular use case is a safe one for this approach, providing it as an out-of-the-box feature in Curator makes it look not only acceptable, but completely normal. I just can't do that.

My personal recommendation would be to stick with the hard limits you suggested: delete anything older than the last 30 days for group 1 and the last 90 days for group 2, while ignoring group 3. I know that's not likely to be viewed as an improvement on how you are doing things, but that is my considered opinion, for all of the reasons given above.
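If it helps, the age-based approach is straightforward with the same Python API. A sketch, where `group1-` and `group2-` are hypothetical prefixes standing in for your actual index naming scheme:

```python
# Age-based retention sketch with the Curator Python API (5.x assumed).
# The index-name prefixes here are placeholders.
import elasticsearch
import curator

client = elasticsearch.Elasticsearch(['http://localhost:9200'])

for prefix, days in (('group1-', 30), ('group2-', 90)):
    ilo = curator.IndexList(client)
    ilo.filter_by_regex(kind='prefix', value=prefix)
    ilo.filter_by_age(source='creation_date', direction='older',
                      unit='days', unit_count=days)
    if ilo.indices:
        curator.DeleteIndices(ilo).do_action()
# Group 3 is deliberately left untouched, per the recommendation above.
```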
