Reindex vs Split index speeds

Pratik_Wadodkar · December 9, 2022, 8:04am

Hi Daithi,
It's better to keep shard size around 50 GB for best performance. If an index has shards measured in terabytes, it becomes quite difficult to handle when you have to take a snapshot or move the index from one data tier to another.
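As a rough sketch of the sizing arithmetic, the snippet below picks a primary shard count that keeps shards at or under about 50 GB. It assumes you intend to reach that count with the split API, where the target count has to be a multiple of the source's primary shards and a factor of `index.number_of_routing_shards`. The function name and the example numbers are made up for illustration; read the real values from your own index settings.

```python
import math
from typing import Optional


def split_target_shards(index_size_gb: float,
                        source_shards: int,
                        routing_shards: int,
                        target_shard_gb: float = 50.0) -> Optional[int]:
    """Smallest valid shard count that keeps shards <= target_shard_gb.

    A valid count is a multiple of source_shards and divides routing_shards.
    Returns None if no valid count is large enough. If the source already has
    enough shards, the source count itself is returned (no split needed).
    """
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    if source_shards < 1 or routing_shards < source_shards:
        raise ValueError("shard counts are inconsistent")

    # Minimum number of shards so that each one is at most target_shard_gb.
    needed = math.ceil(index_size_gb / target_shard_gb)

    # Walk the multiples of source_shards in ascending order.
    for n in range(source_shards, routing_shards + 1, source_shards):
        if routing_shards % n == 0 and n >= needed:
            return n
    return None


# Hypothetical 2 TB index with 1 primary shard and 1024 routing shards.
print(split_target_shards(2048, source_shards=1, routing_shards=1024))
```

If the function returns `None`, the index cannot get to that shard size by splitting alone and reindexing into a new index is the remaining option.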
Split vs reindex:
If you go with reindexing, it will take a long time. For example, reindexing 1 GB of data takes around 4-5 minutes, so for terabytes of data it is going to take days. On the other hand, if you go with the split API, it will quickly split the index into the number of primary shards that you provide in the split request.
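To put numbers on that, here is a small sketch. The first function just scales the 4-5 minutes per GB figure above (4.5 is used as a midpoint, which is an assumption, and real throughput depends on the cluster). The second builds the two REST calls a split typically needs: making the source index read-only, then calling `_split` with the new primary shard count. The index names are placeholders, and nothing is sent to a cluster.

```python
import json


def estimate_reindex_hours(size_gb: float, minutes_per_gb: float = 4.5) -> float:
    """Linear estimate of reindex time from a per-GB rate (assumed constant)."""
    return size_gb * minutes_per_gb / 60.0


def split_requests(source: str, target: str, target_shards: int):
    """Return (method, path, body) tuples for a split, in the order to run them."""
    return [
        # The source index must be blocked for writes before it can be split.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # Create the target index with the larger primary shard count.
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": target_shards}}),
    ]


hours = estimate_reindex_hours(1024)  # 1 TB expressed in GB
print(f"reindex estimate: {hours:.1f} h (~{hours / 24:.1f} days)")

# "my-index" and "my-index-split" are hypothetical names.
for method, path, body in split_requests("my-index", "my-index-split", 64):
    print(method, path, json.dumps(body))
```

With these assumptions 1 TB works out to roughly three days of reindexing, which is why split looks attractive when the target shard count is one the source index can actually be split into.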
One more thing: when you apply the split API, make sure you have a good amount of free storage. In the beginning the index tries to allocate all the shards on different nodes and then allocates the data, so during this process you might see storage usage increase by maybe 3-4 times, but it will come back to its original state after some time.
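A quick way to sanity-check the headroom before splitting is sketched below. The peak factor of 4 is taken from the 3-4x figure above, not from a measured value, and the function name is hypothetical, so adjust it to what you actually observe on your cluster.

```python
def split_headroom_gb(index_size_gb: float,
                      free_gb: float,
                      peak_factor: float = 4.0) -> float:
    """Free space left at the assumed peak; a negative value is a shortfall."""
    if peak_factor < 1:
        raise ValueError("peak_factor must be at least 1")
    # The index already occupies 1x, so only the growth above that is new usage.
    extra_needed_gb = index_size_gb * (peak_factor - 1)
    return free_gb - extra_needed_gb


# Hypothetical 1000 GB index on a cluster with 2500 GB free.
print(split_headroom_gb(1000, 2500))
```

A negative result means the cluster could run short of disk during the split if usage really does peak at that factor.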
Here are the links for reference:
