Cloned/Split Indexes Take Double Disk Space When Increasing Shards

Currently I have some large indices (100 GB to 500 GB) that have been allocated 5 shards each. I would like to increase these to 10-20 shards each.

If I use the Split API or Clone API it takes 1-2 minutes for the new index to be created, but the reported disk space doubles if I set it to 10 shards and quadruples if I set it to 20 shards (replicas are still 0). Is there a way to reduce the disk space? The document counts are the same, so I'm guessing it has to do with how the new index links to the physical files at the OS level. (If I mess around by opening/closing the index, etc., sometimes it will shrink closer to the initial value.)
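
For reference, the calls I'm making look roughly like this (my-index and the target names are placeholders for my real indices):

```
# The source index has to be made read-only before a split or clone
PUT /my-index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

# Split into 10 primary shards (the target count must be a multiple of the source's 5)
POST /my-index/_split/my-index-split
{
  "settings": {
    "index.number_of_shards": 10
  }
}

# Or clone 1:1 with the same shard count
POST /my-index/_clone/my-index-clone
```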

If I use the Reindex API the new index ends up about the same size, but it takes a very long time (5-7 minutes per GB). Is there a way to speed this up?
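
The reindex call is along these lines (index names are placeholders again). As far as I can tell the Reindex API supports automatic slicing, which parallelizes the copy:

```
# The destination must be created up front with the desired shard count,
# since reindex does not copy the source index's settings.
# slices=auto parallelizes the copy; wait_for_completion=false runs it as a background task.
POST _reindex?slices=auto&wait_for_completion=false
{
  "source": { "index": "my-index" },
  "dest": { "index": "my-index-reindexed" }
}
```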

It should reduce after a bit of time, once the split has been done and it cleans up after itself. Is that not what you are seeing?

The initial index size was 165 GB; after the split it sat at 288 GB for 10 hours.

After 10 hours I cloned it and deleted the split one, and the clone dropped down to 235GB and has been that way for 4 hours.

I tried with a 5 GB index with 5 shards and increased it to 20 shards; disk space increased to 20 GB for 4-5 minutes, then it dropped down to 6 GB.

Next I tried a 5-shard, 200 GB index split to 20 shards; it went up to 440 GB, then after 2 minutes dropped to 420 GB.

Seems like I am getting inconsistent results.

Are you asking about the total disk consumption as reported by the OS (e.g. using df) or do you mean just for the cloned/split index (e.g. using GET _cat/indices)? The latter double-counts the actual disk space used because of the use of hard links.
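
For example (my-index-split is a placeholder name):

```
# Per-index size as reported by Elasticsearch; right after a clone or split
# this double-counts segment files that are shared via hard links
GET _cat/indices/my-index-split?v&h=index,pri,docs.count,pri.store.size

# Per-node view: disk.used comes from the OS (like df),
# while disk.indices sums the reported index sizes
GET _cat/allocation?v&h=node,disk.indices,disk.used,disk.avail
```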

GET _cat/indices should report the size of a clone to be identical to the size of the original index.

Splitting the index works by cloning all the shards (multiple times) and then effectively running a delete-by-query on them, which certainly increases the reported size until merging cleans up the deleted docs. If you're still writing to this index then that'll happen in time; if you're not still writing to this index then you can try force-merging to make it happen sooner. There's also some per-shard disk space overhead -- particularly the terms dictionary tends to be large and not to get much smaller after a split since most shards contain roughly the same set of terms.
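
For example, once you have stopped writing to the index (placeholder name again):

```
# Rewrites each shard down to a single segment, reclaiming the space
# held by documents deleted during the split; only do this on an index
# that is no longer receiving writes
POST /my-index-split/_forcemerge?max_num_segments=1
```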

I'm using the size reported in Kibana or via GET _cat/indices. (I currently do not have access to the underlying file system.)

I attempted a force merge but that did not seem to help. I also verified that both indices are set to writable, but no new data has been added to them.

I will wait another day to see if they get smaller or not.
