Hi all, I've been experimenting with Elasticsearch and Curator for a while, but I'm stuck on how to shrink indices based on their size.
My current problem is this: I have been handed a cluster with over 2000 shards per node (clearly, some poor cluster management here!), and I'm trying to reduce the shard count by shrinking over-sharded indices down to fewer shards. The issue is that the indices are highly imbalanced in size, ranging from a few KBs to 100+ GBs. I have already created templates for incoming indices, but there are still tens of thousands of existing shards to handle downstream.

What I'm planning to do is shrink every index smaller than 25GB back to a single shard. From what I have explored so far, Curator has no option to filter by individual index size - the closest I see is the "space" filtertype. When I tried it, Curator stopped selecting indices the moment the next index pushed the total over 25GB, because disk_space is implemented as a cumulative sum rather than a per-index check.
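For what it's worth, the workaround I'm currently considering is to script the selection and shrink outside of Curator with the plain elasticsearch-py client. This is only a rough sketch: the host, the target data node name ("data-node-0"), and the "-shrunk" suffix are placeholders for my setup, and it skips error handling entirely:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])
LIMIT = 25 * 1024 ** 3  # 25 GB expressed in bytes

# _cat/indices with bytes=b returns store sizes as plain byte counts.
for row in es.cat.indices(format="json", bytes="b"):
    index = row["index"]
    if index.startswith("."):
        continue  # leave system indices alone
    if int(row["pri"]) <= 1 or int(row["pri.store.size"]) >= LIMIT:
        continue  # already single-shard, or too big to shrink to one shard

    # A shrink requires all shards of the source on one node and a write block.
    es.indices.put_settings(index=index, body={
        "index.routing.allocation.require._name": "data-node-0",  # placeholder node
        "index.blocks.write": True,
    })
    # Wait for the relocation triggered above to finish before shrinking.
    es.cluster.health(index=index, wait_for_no_relocating_shards=True,
                      timeout="1h", request_timeout=3700)

    # Shrink into a new single-shard index and clear the allocation filter
    # on the target so it can be rebalanced normally afterwards.
    es.indices.shrink(index=index, target=index + "-shrunk", body={
        "settings": {
            "index.number_of_shards": 1,
            "index.routing.allocation.require._name": None,
        }
    })
```

That would work, but it re-implements exactly the per-index selection logic I was hoping to get from Curator's filters, which leads to my suggestion below.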
What I'd suggest is either 1) adding a filtertype that selects indices by individual index size, or 2) allowing the disk_space implementation to sort indices by size before accumulating.
At the same time, I'd also like to check with the community whether the way I'm approaching this is sensible. The cluster details are: Elasticsearch v6.3.2, Curator 5.5.4, 3 master, 3 client, and 9 data nodes, all running on Kubernetes.