I need to run force_merge frequently with max_num_segments = 1, and it takes a lot of time. I want to make force_merge faster. What options do I have?
I read that the force_merge thread pool has a default size of 1. Since I have an 8-thread machine, I changed it to 4 in the elasticsearch.yml config file, and I can see the change reflected in the node settings.
However, when I check during a force_merge, it shows only 1 active thread even though the force_merge thread pool size is now 4.
I am planning to use the AWS UltraWarm feature. Before that, I want to run a force merge on all the available indices.
Currently force merge uses only one thread. I have added more threads through the yml file, but while a force merge is running it still shows only one active thread.
You will likely need to approach AWS support for help with this question. AWS Elasticsearch is different from the official Elasticsearch that we support here.
Can you reproduce the issue outside of AWS Elasticsearch? The only people who have access to the AWS Elasticsearch code are AWS themselves, and it's pretty tricky to debug this kind of thing without access to the code that you're actually running.
If I recall correctly, each shard is force merged in a single thread. Increasing the thread pool size allows multiple shards to be force merged in parallel but, as far as I know, would not speed up the force merge of a single shard. I am not aware of any setting that would speed that up. The time it takes is generally proportional to the shard size, though, so you could try reducing the shard size.
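To make use of the extra threads, you could submit force merge requests for several indices at the same time, so that more than one shard is being merged at once. Below is a minimal Python sketch of that idea, not a tested recipe. The base URL, the index names and the helper names (`forcemerge_url`, `send_post`, `force_merge_all`) are my own assumptions for illustration, and it assumes the cluster accepts unauthenticated HTTP requests, which is typically not the case on AWS, where requests usually need to be signed.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote
from urllib.request import Request, urlopen


def forcemerge_url(base_url, index, max_num_segments=1):
    """Build the force merge endpoint URL for one index."""
    base = base_url.rstrip("/")
    name = quote(index, safe="")
    return f"{base}/{name}/_forcemerge?max_num_segments={max_num_segments}"


def send_post(url):
    """Issue a blocking POST and return the HTTP status code.

    Assumption: no authentication or request signing is required.
    """
    with urlopen(Request(url, method="POST")) as resp:
        return resp.status


def force_merge_all(base_url, indices, workers=4, send=send_post):
    """Force merge several indices concurrently, one request per index.

    `workers` limits how many requests are in flight at once; it only
    helps if the node's force_merge thread pool is at least that large.
    """
    urls = [forcemerge_url(base_url, name) for name in indices]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with indices.
        return dict(zip(indices, pool.map(send, urls)))
```

Calling `force_merge_all("http://localhost:9200", ["index-a", "index-b"], workers=4)` with hypothetical index names would send both requests in parallel. Each individual shard is still merged by a single thread, so this shortens the total time across many shards rather than the time for any one of them.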