Documentation bug on splitting shards -- free disk space requirement?

This doc says Elasticsearch requires enough free disk space for a second copy of the index when splitting shards:

  • The node handling the split process must have sufficient free disk space to accommodate a second copy of the existing index.

But this discussion seems to indicate that it depends on how many shards are being created. When splitting a 5-shard index into 20 shards, it requires 4x the disk space, right?
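For concreteness, this is roughly the kind of split I mean, using the _split API against a local cluster (the index names here are made up):

```python
import requests

ES = "http://localhost:9200"  # assumes a local cluster; adjust as needed

# The source index has to be made read-only before it can be split.
requests.put(f"{ES}/logs-v1/_settings",
             json={"index.blocks.write": True}).raise_for_status()

# Split 5 primary shards into 20; the target count must be a multiple of the source count.
requests.post(f"{ES}/logs-v1/_split/logs-v2",
              json={"settings": {"index.number_of_shards": 20}}).raise_for_status()
```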

Should I file a GitHub issue to get the documentation fixed? Or am I misunderstanding, and the hard links that splitting creates just make it seem like it's using more space than it really is?

Yeah I think you're misunderstanding. The total size of all the shards will indeed go up by a large amount, but this metric double-counts any files that are hard-linked between shards. The actual disk space needed should only be approximately 2x the size of the original shards.
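If it helps to see the effect outside Elasticsearch, here's a small Python sketch (not Elasticsearch code, just an illustration of how hard links get double-counted when you sum file sizes per path):

```python
import os
import tempfile

# Hard-link one 1 MiB "segment file" into four "shard" directories.
# Summing file sizes per path counts the data four times, but deduplicating
# by inode shows it only occupies the space once on disk.
root = tempfile.mkdtemp()
src = os.path.join(root, "shard0", "segment.bin")
os.makedirs(os.path.dirname(src))
with open(src, "wb") as f:
    f.write(b"\0" * (1024 * 1024))

for i in range(1, 4):
    shard_dir = os.path.join(root, f"shard{i}")
    os.makedirs(shard_dir)
    os.link(src, os.path.join(shard_dir, "segment.bin"))  # hard link, no extra data blocks

paths = [os.path.join(root, f"shard{i}", "segment.bin") for i in range(4)]
apparent = sum(os.stat(p).st_size for p in paths)  # double-counts hard-linked files
actual = sum(st.st_size for st in {os.stat(p).st_ino: os.stat(p) for p in paths}.values())
print(f"apparent total: {apparent} bytes, actual on disk: {actual} bytes")
# apparent total: 4194304 bytes, actual on disk: 1048576 bytes
```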


Thanks, David. To be clear, splitting shards from 5 => 20 does not even temporarily require 4x the disk space?

Correct