Unable to Split Large Index

I have an index with 21 primary shards and 2 replicas that has grown such that each shard is between 90-120 GB. This is spread between 9 nodes that sit at around 37-40% disk usage right now.

I attempted a split operation that grew the number of primary shards to 168 (increase by a factor of 8) with the goal to reduce shard size to around 10-15GB. The result was that the split created and started all 168 primary shards, but they were all the original sizes. The replica shards never got allocated though, and the shards did not shrink. I reduced the number of replicas down to 0 and then the cluster began to move the large shards around to balance. I saw that this was going to take forever given the size, and could potentially make the cluster run out of disk space since I was now copying all that data.

I believe the shard size is expected as the split must be creating duplicates and then deleting the data that isn't needed in each shard. However, I don't think my cluster has enough resources to handle a split by a factor of 8. Does anyone know if I would have more luck splitting by a factor of 2 and then repeating this a couple more times after the merge has taken care of the extra dup docs?

I believe the difficulties with the split operation came down to the factor by which I was splitting. I think it can take a while for ES to clean up the new shards, especially if the cluster is busy. Elasticsearch appeared to not want to allocate these many oversized shards when the index was so huge (even though actual disk usage was lowish). I did the following to successfully split:

  1. Scale the cluster up a bit so that disk usage was below 30%
  2. Split by a factor of 2 (double number of primary shards) on the weekend when stuff is less busy.
  3. Upgrade the replica count to 2 (like the original index) after the split has finished.

I'm not exactly sure how long the actual split took as I left it Saturday morning and checked on it Saturday afternoon, but increasing the replica count to 2 (the split defaults to 1, even if the original index has 2) only took an hour or so. At the end of the day all of the shards were even smaller than expected, because the split operation appears to have cleaned up a lot of deleted documents.

1 Like