I am currently using ES v6.1.2. I have an index of size 1.3 TB consisting of 16 shards (8 primaries).
The primary shard size amounts to around 700GB.
I am planning to perform a forcemerge operation on the index.
Any suggestions on what the max_num_segments value should be? As far as I know, by default, it takes the value 1.
I want my search speed to improve, so what would be the ideal value for max_num_segments?
Sorry for the typing error before.
Just to confirm, should I go ahead with max_num_segments set to 1?
Or should we start with a higher number for this volume of data?
Performing a forcemerge on shards that large will use up a lot of resources. I would recommend trying it once and seeing whether it brings the benefits you are hoping for. I think it should be OK to set max_num_segments to 1, so that is what I would try first.
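For reference, here is a minimal sketch of how the force merge request with max_num_segments set to 1 could be built in Python using only the standard library. The cluster address `http://localhost:9200`, the index name `my-index`, and the helper name `build_forcemerge_request` are assumptions for illustration, not details from this thread. The sketch only builds the request; sending it is left commented out because the call blocks until the merge finishes.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_forcemerge_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST request for the _forcemerge endpoint."""
    query = urlencode({"max_num_segments": max_num_segments})
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"
    return Request(url, method="POST")


if __name__ == "__main__":
    # Hypothetical cluster address and index name; replace with your own.
    req = build_forcemerge_request("http://localhost:9200", "my-index")
    print(req.get_method(), req.full_url)
    # Uncomment to actually start the merge. The call blocks until the
    # merge completes, which can take hours on shards of this size.
    # with urlopen(req) as resp:
    #     print(resp.read().decode())
```

Keeping the request construction separate from sending it makes it easy to check the exact URL before running something this expensive.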
Thanks for your prompt response.
Apart from the process being resource intensive, I would like to know whether it has any other disadvantages.
Since the data set is quite large, I just want to be sure about it.
It can end up taking a long time and use a lot of disk I/O, so it could affect users while it is in progress. For that reason I would recommend trying it out in a test environment rather than in production.
Thank you for the help. I completed the force merge operation on the 1.3 TB index.
The segment count is now 1 per shard.
The force merge took 3.5 hours to complete.
We ran this force merge 1 year after creating the index. If we now plan to run a force merge every 6 months, will it take the same amount of time, or should it take less?
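The per-shard segment count mentioned above can be checked from the JSON returned by `GET /<index>/_segments`. The sketch below counts segments per shard copy from such a response. The helper names and the sample response are made up for illustration (a single shard with one primary and one replica on hypothetical nodes), and only the fields the code reads are included.

```python
# Made-up, trimmed sample of a GET /my-index/_segments response.
SAMPLE_RESPONSE = {
    "indices": {
        "my-index": {
            "shards": {
                "0": [
                    {
                        "routing": {"state": "STARTED", "primary": True, "node": "node-a"},
                        "segments": {"_0": {"num_docs": 100}},
                    },
                    {
                        "routing": {"state": "STARTED", "primary": False, "node": "node-b"},
                        "segments": {"_0": {"num_docs": 100}},
                    },
                ]
            }
        }
    }
}


def segments_per_shard(segments_response, index):
    """Return a list of (shard_id, role, segment_count), one per shard copy."""
    rows = []
    shards = segments_response["indices"][index]["shards"]
    for shard_id, copies in shards.items():
        for shard_copy in copies:
            role = "primary" if shard_copy["routing"]["primary"] else "replica"
            rows.append((shard_id, role, len(shard_copy["segments"])))
    return rows


def fully_merged(segments_response, index, target=1):
    """True if every shard copy has at most `target` segments."""
    return all(n <= target for _, _, n in segments_per_shard(segments_response, index))


if __name__ == "__main__":
    for shard_id, role, count in segments_per_shard(SAMPLE_RESPONSE, "my-index"):
        print(f"shard {shard_id} ({role}): {count} segment(s)")
    print("fully merged:", fully_merged(SAMPLE_RESPONSE, "my-index"))
```

Running the same check again before the next planned force merge would show how many new segments have accumulated since this one.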