Curator should be able to shrink indices by size

Hi all, I've been experimenting with Elasticsearch and Curator for a while, but I'm stuck on figuring out how to shrink indices by their size.

My current problem is this: I have been handed a cluster with over 2000 shards per node (clearly, some poor cluster management here!), and I'm trying to reduce the shard count by shrinking indices down to fewer shards. The issue is that the shards are highly imbalanced in size, ranging from a few KB to 100+ GB. I have already created templates for the incoming indices, but there are still tens of thousands of existing shards to handle downstream. My plan is to shrink every index smaller than 25 GB down to a single shard. From what I have explored so far, Curator seems to have no option to curate by index size; the closest I see is the "space" filter. When I tried it, Curator terminated the moment the next index was >25 GB, because disk_space is implemented as a cumulative sum.
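In the meantime, the per-index selection I want can be sketched outside Curator: take the output of `_cat/indices?h=index,pri.store.size&bytes=b` and keep only the indices whose primary store size is under the 25 GB cutoff. This is only a sketch; the index names in the sample are made up.

```python
# Sketch: select indices under a size cutoff from `_cat/indices` output.
# Assumes the two-column format produced by:
#   GET _cat/indices?h=index,pri.store.size&bytes=b
CUTOFF = 25 * 1024 ** 3  # 25 GB in bytes

def indices_under(cat_output, cutoff=CUTOFF):
    """Return names of indices whose primary store size is below `cutoff` bytes."""
    selected = []
    for line in cat_output.strip().splitlines():
        index, size = line.split()
        if int(size) < cutoff:
            selected.append(index)
    return selected

# Hypothetical sample output (names and sizes are made up)
sample = """\
logs-2018.07.01 53687091200
logs-2018.07.02 1048576
metrics-2018.07.01 2147483648
"""
print(indices_under(sample))  # ['logs-2018.07.02', 'metrics-2018.07.01']
```

The resulting list could then be fed to the shrink API index by index, which is exactly the loop I was hoping Curator would run for me.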

What I suggest is that Curator either 1) gain a filtertype for index size, or 2) allow the disk_space implementation to sort by index size.

At the same time, I'd also like to verify with the community that the way I'm approaching this is appropriate. The cluster details are: Elasticsearch v6.3.2, Curator 5.5.4, 3 master, 3 client, and 9 data nodes, all running on Kubernetes.

You mean, like the disk space filter? You need to do more filtering besides space: for example, a combination of pattern filters and age filters to identify indices of a certain kind and size within a given time frame.
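A sketch of what that combination could look like in an action file. The index prefix, timestring, and option values here are placeholders for hypothetical daily `logs-*` indices, not a drop-in config:

```yaml
actions:
  1:
    action: shrink
    description: >-
      Shrink daily logs indices older than 7 days down to one shard.
      Sketch only; adjust the pattern, timestring, and options to your cluster.
    options:
      shrink_node: DETERMINISTIC
      number_of_shards: 1
      number_of_replicas: 1
      delete_after: True
      wait_for_completion: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logs-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 7
```

With narrow enough patterns, each action block only ever matches indices you already know are small enough to shrink, which sidesteps the need for a size filter.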

It's unlikely that Curator will be modified to do the kind of disk-consumption-based selection you're looking for, as it's an edge case. You're probably better off reindexing multiple small indices into larger ones, rather than trying to winnow them out by size, and going from there.
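For example, the reindex API accepts multiple source indices, so several small daily indices can be merged into one (the index names here are hypothetical):

```
POST _reindex
{
  "source": {
    "index": ["logs-2018.07.01", "logs-2018.07.02", "logs-2018.07.03"]
  },
  "dest": {
    "index": "logs-2018.07"
  }
}
```

Once the combined index is verified, the small source indices can simply be deleted, which removes far more shards per operation than shrinking each index individually.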

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.