Does _split actually split data or just copy it across shards?

I want to create a new index with a larger number of shards and copy my data across. _reindex would probably be the best solution, but I expected _split to do what I wanted.
I have a 38 GB index with 1 shard, and I wanted to split it into an index with 4 shards. I expected the new index to have the same store.size and 4 shards, but instead the store.size of the new index turned out to be 97 GB. What is the reason behind this, and how does _split actually work?

Hi Danyil,

The number of times an index can be split, and the number of shards each original shard can be split into, is determined by the index setting index.number_of_routing_shards. The docs entry here includes more details and examples that should help explain how splitting works.
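
As a rough illustration of that constraint, here is a small Python sketch. It assumes the rule described in the split docs: the target shard count must be a multiple of the source shard count, and index.number_of_routing_shards must be a multiple of the target. The function name valid_split_targets is made up for this example and is not part of any Elasticsearch client. It also ignores the special handling of single-shard source indices, so treat it as a sketch rather than a definitive check.

```python
def valid_split_targets(source_shards: int, routing_shards: int) -> list[int]:
    """List the shard counts an index could be split into (hypothetical helper).

    Assumes a target is valid when it is a multiple of the source shard
    count and divides index.number_of_routing_shards evenly.
    """
    return [
        n
        for n in range(source_shards + 1, routing_shards + 1)
        if n % source_shards == 0 and routing_shards % n == 0
    ]


# Example: 5 primary shards with number_of_routing_shards = 30 (5 x 2 x 3)
targets = valid_split_targets(5, 30)
print(targets)
```

With 5 primary shards and 30 routing shards, the candidates are the multiples of 5 that divide 30.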

Hope that helps.

I would recommend looking at the discussion in this recent thread.


I understand the concept, but it's still confusing to me.
In a nutshell, if I split a 1-shard index into a 4-shard index, does it split that shard's data, or does it instead copy it across all the newly created shards?

Looking at this answer in another thread, which is referenced in the discussion posted by @Christian_Dahlqvist, the shards are cloned, and then the docs that do not belong to each new shard are deleted, in a similar way to delete by query.

The thread referenced above does walk through an example which may help explain further.
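
To picture the clone-then-delete behaviour, here is a toy Python model. It is only a sketch under stated assumptions, not how Elasticsearch is implemented: zlib.crc32 stands in for the real routing hash, the simple modulo stands in for the real routing formula, and the names target_shard and split_by_clone_and_delete are invented for this example. The point is that every new shard starts with all of the source docs, and only keeps the ones routed to it as live.

```python
import zlib


def target_shard(doc_id: str, num_targets: int) -> int:
    # Stand-in routing: Elasticsearch uses its own hash and routing formula.
    return zlib.crc32(doc_id.encode("utf-8")) % num_targets


def split_by_clone_and_delete(source_docs: list[str], num_targets: int) -> list[dict]:
    """Toy model: clone the source into every target, then mark foreign docs deleted."""
    targets = []
    for shard in range(num_targets):
        clone = list(source_docs)  # each target starts as a full copy of the source
        live = [d for d in clone if target_shard(d, num_targets) == shard]
        targets.append({"live": live, "deleted": len(clone) - len(live)})
    return targets


docs = [f"doc-{i}" for i in range(1000)]
shards = split_by_clone_and_delete(docs, 4)
for number, shard in enumerate(shards):
    print(number, "live:", len(shard["live"]), "deleted:", shard["deleted"])
```

In this model the live docs across the 4 shards add up to the original 1000, but each shard still carries the other docs as deleted entries until something cleans them up. That would be consistent with a new index reporting a larger store.size than the source right after a split.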
