Force_merge Thread_pool

Hi Everyone,

I need to run force_merge frequently with max_num_segments = 1, and it takes a lot of time. I want to make force_merge faster. What options do I have?

I read that the maximum number of threads available for force_merge is 1. Since I have an 8-thread machine, I changed that to 4 in the elasticsearch.yml config file, and I can see the change reflected in the node settings.

However, when I check while a force_merge is running, it shows only 1 active thread even though the force_merge thread pool size is now 4.
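
For anyone who wants to run the same check, here is a minimal Python sketch of reading the active thread count. It assumes the node is reachable at http://localhost:9200 without authentication (a placeholder; a managed AWS domain will likely need a different URL and signed requests) and uses the _cat/thread_pool API with JSON output. The helper names are my own and purely illustrative.

```python
import json
import urllib.request


def force_merge_pool_url(base_url):
    """Build the _cat/thread_pool URL for the force_merge pool (JSON output)."""
    return (base_url.rstrip("/")
            + "/_cat/thread_pool/force_merge"
            + "?format=json&h=node_name,name,active,queue,size")


def summarize_pool(cat_json_text):
    """Map each node name to (active threads, configured pool size)."""
    rows = json.loads(cat_json_text)
    # _cat JSON output encodes numbers as strings, so convert explicitly.
    return {row["node_name"]: (int(row["active"]), int(row["size"]))
            for row in rows}


def fetch_pool_summary(base_url="http://localhost:9200"):
    """Query the cluster; base_url is an assumed placeholder address."""
    with urllib.request.urlopen(force_merge_pool_url(base_url)) as resp:
        return summarize_pool(resp.read().decode("utf-8"))
```

Calling fetch_pool_summary() while a force merge is running should show, per node, how many force_merge threads are busy next to the configured size.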

How many indices and shards are you forcemerging? Why do you need to run it so often? What is the size of the cluster?

I am planning to use the AWS UltraWarm feature. Before that, I want to run a force merge on all the available indices.

Currently force merge uses only one thread. I have added more threads through the yml file, but while a force merge is running it still shows only one active thread.

You will likely need to approach AWS support for help with this question. AWS Elasticsearch is different from the official Elasticsearch that we support here.

Hey David,

I know that the feature is from AWS, but I am looking to make force_merge faster.

Right now, force_merge to max_num_segments = 1 with default settings takes 25 minutes for a 50 GB shard that has around 45 segments. I want to make it faster.

What will be a good way to do this?

Thanks in advance.

Can you reproduce the issue outside of AWS Elasticsearch? The only people who have access to the AWS Elasticsearch code are AWS themselves, and it's pretty tricky to debug this kind of thing without access to the code that you're actually running.

I am not sure if I can reproduce it, but for now I am looking for parameters that will help make force_merge faster.

Parameters such as

  • threads assigned to force_merge through the THREAD_POOL node setting
  • index.merge.scheduler.max_thread_count

If I recall correctly, each shard is force-merged in a single thread. Increasing the thread pool size allows multiple shards to be force-merged in parallel but, as far as I know, would not speed up a single force merge. I am not aware of any setting that would speed it up. The time it takes is, however, generally proportional to the shard size, so you could try to reduce the shard size.
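
To make that reasoning concrete, here is a rough Python sketch, not a measurement. It assumes one thread per shard, merge time strictly linear in shard size at the rate quoted earlier in this thread (25 minutes for 50 GB), and no slowdown when several merges share the disk, which is optimistic. The function name and parameters are made up for illustration.

```python
import heapq

# Rate taken from the figures quoted in this thread: 25 minutes per 50 GB shard.
MINUTES_PER_GB = 25 / 50


def estimate_wall_clock_minutes(shard_sizes_gb, pool_size,
                                minutes_per_gb=MINUTES_PER_GB):
    """Estimate wall-clock time to force-merge the given shards on one node.

    Assumptions: each shard is merged by exactly one thread, time scales
    linearly with shard size, and concurrent merges do not slow each other.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # Each heap entry is the time at which one pool thread becomes free.
    free_at = [0.0] * pool_size
    heapq.heapify(free_at)
    # Hand the largest shards out first, always to the earliest free thread.
    for size in sorted(shard_sizes_gb, reverse=True):
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + size * minutes_per_gb)
    return max(free_at)


print(estimate_wall_clock_minutes([50], pool_size=1))      # 25.0
print(estimate_wall_clock_minutes([50], pool_size=4))      # 25.0, no gain for one shard
print(estimate_wall_clock_minutes([50] * 4, pool_size=1))  # 100.0
print(estimate_wall_clock_minutes([50] * 4, pool_size=4))  # 25.0
```

Under these assumptions, a bigger pool only helps when there are several shards to merge, and a single shard only gets faster if the shard itself is smaller.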

Yes, you are correct. Increasing the thread pool size allows force merge to run on multiple shards in parallel.
