What's the best way to use the Shrink Index API?

My cluster has the usual 5 primary shards with 1 replica of each. Disks are running low on space and I'm looking at my options.

Do I:

  1. Delete the oldest indices after backing them up to cloud storage such as AWS S3 or Glacier?
  2. Delete the old indices outright?
  3. Attempt to shrink the indices down to 1 primary shard and 1 replica?

Focusing on option 3 for a minute: the first step is to relocate a copy of every shard of the index (primary or replica) onto the same node. The immediate implication is that this node's disk usage will be out of balance with the rest of the cluster. The other implication is that, with the defaults, an index that started out with 5 primary shards can only be shrunk to 1 primary shard, because the target shard count has to be a factor of the source shard count. That feels like a single point of failure, but is it better to keep the data a little longer than to have nothing at all?
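
To illustrate the shard-count constraint, here is a minimal Python sketch. It assumes the only rule is "the target must be a factor of the source and smaller than it"; the function name `valid_shrink_targets` is made up for this example and is not part of any Elasticsearch client.

```python
def valid_shrink_targets(source_shards: int) -> list[int]:
    """Return the primary shard counts an index could be shrunk to.

    Assumption: a target is valid when it divides the source shard
    count evenly and is strictly smaller than it.
    """
    if source_shards < 1:
        raise ValueError("source_shards must be at least 1")
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(5))  # [1]
print(valid_shrink_targets(8))  # [1, 2, 4]
```

With the default of 5 primaries the only choice is 1, whereas an index created with 8 primaries could go to 4, 2 or 1.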

Is the best practice to have a data node that's only used for shrinking? Then maybe put an allocation filter on that node, so that the only things ever written to it are indices that are manually being shrunk?

How do you automate this when you have hundreds of indices that you want to re-allocate to shrink?

Any suggestions appreciated!


Options 1, 2 and 3 are very different, and you'll choose one or another depending on your use case.

For example, if you don't care at all about your "old" indices, then the best option is to delete them directly. If you need to be able to restore them in the future, then option 1 is the obvious choice. Option 3, using the Shrink API, will help you reduce the number of shards, but you won't gain much disk space.
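
If you do go with option 3 for many indices, the sequence of requests per index could be sketched roughly as below. This is only a sketch under assumptions: it builds the request paths and bodies without sending them, the node name `shrink-node-1`, the index name and the `-shrunk` suffix are placeholders, and you would still need to wait for shard relocation to finish between the two steps before sending them with whatever HTTP client you use.

```python
import json


def build_shrink_plan(index_names, shrink_node, target_suffix="-shrunk"):
    """Build (method, path, body) tuples for shrinking each index to 1 primary.

    Nothing is sent to the cluster; the caller decides how to execute
    the plan. `shrink_node` and `target_suffix` are hypothetical values.
    """
    plan = []
    for name in index_names:
        # Step 1: require a copy of every shard on one node and block writes.
        plan.append((
            "PUT",
            f"/{name}/_settings",
            {
                "settings": {
                    "index.routing.allocation.require._name": shrink_node,
                    "index.blocks.write": True,
                }
            },
        ))
        # Step 2: once relocation has finished, shrink into a new index.
        plan.append((
            "POST",
            f"/{name}/_shrink/{name}{target_suffix}",
            {
                "settings": {
                    "index.number_of_shards": 1,
                    "index.number_of_replicas": 1,
                }
            },
        ))
    return plan


for method, path, body in build_shrink_plan(["logs-2017.01.01"], "shrink-node-1"):
    print(method, path, json.dumps(body))
```

The source index is left in place after the shrink, so the disk space is only freed once you delete it yourself.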