Curator: Too many Forcemerge Tasks

I have enabled forcemerge management via Curator and it is causing us an issue: after enabling it late on Thursday, we now have 1400+ running tasks. This is not a large cluster, so this is causing widespread issues.
The config file looks like this:

  action_forcemerge.yml: |-
    actions:
      1:
        action: forcemerge
        description: "Perform forcemerge operation on rolled indices"
        options:
          max_num_segments: 1
          timeout_override: 21600
          ignore_empty_list: true
          delay: 300
          disable_action: true
        filters:
        - filtertype: age
          source: creation_date
          direction: older
          unit: days
          unit_count: 1
        - filtertype: forcemerged
          max_num_segments: 1
          exclude: true

It's triggered from a cronjob which runs once per day...

However, the running tasks are not cancellable, so how can I clear them down? They don't seem to be doing anything: segment counts have not fallen to 1 per shard, and 4 merges are currently running against today's index (1 per shard).
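One way to check whether segment counts have actually dropped is to inspect the output of the index segments API (`GET <index>/_segments`). The following is only a sketch: it assumes the response has already been fetched and parsed into a dict with the documented `indices -> shards -> [copies]` layout, and the function name and the sample index name are hypothetical.

```python
from typing import Dict, List, Tuple


def shards_over_segment_limit(
    segments_response: Dict, max_num_segments: int = 1
) -> List[Tuple[str, str, int]]:
    """Return (index, shard, segment_count) for each primary shard above the limit.

    Assumes the dict mirrors the JSON returned by GET <index>/_segments.
    """
    over = []
    for index_name, index_data in segments_response.get("indices", {}).items():
        for shard_id, copies in index_data.get("shards", {}).items():
            for copy in copies:
                # Only look at primaries so each shard is reported once.
                if not copy.get("routing", {}).get("primary", False):
                    continue
                count = copy.get("num_search_segments", 0)
                if count > max_num_segments:
                    over.append((index_name, shard_id, count))
    return sorted(over)


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape (not real cluster output).
    sample = {
        "indices": {
            "example-index": {
                "shards": {
                    "0": [{"routing": {"primary": True}, "num_search_segments": 12}],
                    "1": [{"routing": {"primary": True}, "num_search_segments": 1}],
                }
            }
        }
    }
    print(shards_over_segment_limit(sample))
```

Any shard listed by this check still has more segments than the `max_num_segments` target, which would suggest the forcemerge for that index has not completed.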


Action 1 will not run with disable_action: true set. I don't think your forcemerges are even happening.

Also, if your Curator filters select 10 indices, Curator will only request one forcemerge at a time, because sending them concurrently would be too much I/O for Elasticsearch to manage at once. I repeat: Curator only permits a single forcemerge to run at a time. Additionally, once a forcemerge request has been made to Elasticsearch, others are blocked until the first one finishes. Elasticsearch will only handle 1 forcemerge request at a time.
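The one-at-a-time behaviour described above can be pictured with a short sketch. This is not Curator's actual code: `forcemerge_sequentially` and its parameters are hypothetical, and the `forcemerge` callable stands in for whatever blocking request is sent to Elasticsearch.

```python
import time
from typing import Callable, Iterable, List


def forcemerge_sequentially(
    indices: Iterable[str],
    forcemerge: Callable[[str], None],
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Forcemerge indices strictly one after another, pausing between them.

    `forcemerge` is assumed to block until the merge for that index is done,
    so a second request is never issued while the first is still running.
    """
    names = list(indices)
    done = []
    for position, name in enumerate(names):
        forcemerge(name)  # blocking call: the next index waits for this one
        done.append(name)
        # Pause between merges (compare the `delay` option), not after the last.
        if delay_seconds and position < len(names) - 1:
            sleep(delay_seconds)
    return done


if __name__ == "__main__":
    # Stand-in merge function that only records the call order.
    calls = []
    forcemerge_sequentially(["index-a", "index-b"], calls.append)
    print(calls)
```

Under this model, a daily run that selects 10 indices issues 10 requests in sequence, never 10 at once.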

I'm not sure where you got the 1400 number, but forcemerges do not show up in the _tasks API that I have ever seen. I looked to see if I could detect them that way, and found that I could not.
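To see what a large task count is actually made of, the `_tasks` output can be grouped by action name. The sketch below assumes the response of `GET _tasks` has already been parsed into a dict with the default `nodes -> tasks` layout; the function name is hypothetical and the sample data is hand-written, not taken from a real cluster.

```python
from collections import Counter
from typing import Dict


def count_tasks_by_action(tasks_response: Dict) -> Counter:
    """Count tasks per (action, cancellable) pair.

    Assumes the dict mirrors the JSON returned by GET _tasks,
    which groups tasks under each node by default.
    """
    counts: Counter = Counter()
    for node in tasks_response.get("nodes", {}).values():
        for task in node.get("tasks", {}).values():
            key = (task.get("action", "unknown"), bool(task.get("cancellable", False)))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = {
        "nodes": {
            "node-1": {
                "tasks": {
                    "node-1:1": {"action": "indices:data/read/search", "cancellable": True},
                    "node-1:2": {"action": "cluster:monitor/tasks/lists", "cancellable": False},
                }
            }
        }
    }
    for (action, cancellable), total in count_tasks_by_action(sample).most_common():
        print(total, action, "cancellable" if cancellable else "not cancellable")
```

Sorting by count makes it easier to tell whether the 1400+ tasks belong to a single action or are spread across unrelated ones.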

Ah sorry, this is an updated version of the action, which is now disabled. It was active until this morning.

@theuntergeek So, treat that config as if it is active...

The rest of my initial answer does treat it as if it is active. My answer is still applicable.

Perhaps there's something happening with the container running the job. I'll focus on that for now. Thank you...
