Curator: Too many Forcemerge Tasks


#1

Hi,
I have enabled forcemerge management via Curator and it is causing us an issue: after enabling it late on Thursday, we now have 1400+ running tasks. This is not a large cluster, so the extra load is causing widespread problems.
Config file looks like this:

  action_forcemerge.yml: |-
    ---
    actions:
      1:
        action: forcemerge
        description: "Perform forcemerge operation on rolled indices"
        options:
          max_num_segments: 1
          timeout_override: 21600
          ignore_empty_list: true
          delay: 300
          disable_action: true
        filters:
        - filtertype: age
          source: creation_date
          direction: older
          unit: days
          unit_count: 1
        - filtertype: forcemerged
          max_num_segments: 1
          exclude: true

It's triggered from a cronjob which runs once per day...

However, the running tasks are not cancellable, so how can I clear them down? They don't appear to be making progress: segment counts have not fallen to 1 per shard, and there are currently 4 merges running against today's index (1 per shard).
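For reference, this is roughly how I am checking the segment counts and merge activity. It's a quick Python sketch using the requests library; the index name and the localhost:9200 address are placeholders for our real values:

    # Rough sketch: segment counts per shard and current merge activity.
    # Assumes the cluster is reachable on http://localhost:9200 without auth,
    # and "todays-index" stands in for the real daily index name.
    import requests

    ES = "http://localhost:9200"

    segments = requests.get(
        ES + "/_cat/segments/todays-index",
        params={"format": "json", "h": "index,shard,segment"},
    ).json()

    # Count segments per (index, shard) pair.
    per_shard = {}
    for seg in segments:
        key = (seg["index"], seg["shard"])
        per_shard[key] = per_shard.get(key, 0) + 1
    for (index, shard), count in sorted(per_shard.items()):
        print(index, "shard", shard, "->", count, "segments")

    # Current merge activity reported by each node.
    stats = requests.get(ES + "/_nodes/stats/indices/merges").json()
    for node in stats["nodes"].values():
        print(node["name"], "current merges:", node["indices"]["merges"]["current"])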

Regards,
D


(Aaron Mildenstein) #2

Action 1 will not run with disable_action: true set, so I don't think your forcemerges are even happening.

Also, if your Curator filters select 10 indices, Curator will only request one forcemerge at a time, because sending those requests concurrently would be more I/O than Elasticsearch should be asked to manage at once. Once a single forcemerge request has been made to Elasticsearch, the others wait until it is finished; Elasticsearch will only handle one forcemerge request at a time.
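In other words, the behaviour is roughly equivalent to this sketch. It is illustrative only, not Curator's actual code, and the index names, address, and values are placeholders:

    # Illustrative only: requests go out one index at a time, and each
    # call blocks until Elasticsearch has finished that forcemerge.
    import time
    import requests

    ES = "http://localhost:9200"
    indices = ["index-001", "index-002"]   # whatever the filters selected
    delay = 300                            # the "delay" option, in seconds

    for index in indices:
        requests.post(
            ES + "/" + index + "/_forcemerge",
            params={"max_num_segments": 1},
            timeout=21600,                 # the "timeout_override" option
        )
        time.sleep(delay)                  # pause before the next index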

I'm not sure where the 1400 number comes from, but I have never seen forcemerges show up in the _tasks API. I looked to see whether I could detect them that way, and I could not.
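For what it's worth, this is the sort of check I mean, as a quick sketch against a cluster assumed to be on localhost:9200:

    # List any running tasks whose action name mentions forcemerge.
    import requests

    resp = requests.get(
        "http://localhost:9200/_tasks",
        params={"actions": "*forcemerge*", "detailed": "true"},
    ).json()

    for node in resp.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            print(task_id, task["action"], task.get("running_time_in_nanos"))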


#3

Ah, sorry: the config I posted is an updated version of the action and is now disabled; it was active until this morning. Sorry for that 🙂


#4

@theuntergeek So, treat that config as if it is active...


(Aaron Mildenstein) #5

The rest of my initial answer does treat it as if it is active. My answer is still applicable.


#6

Perhaps there's something happening with the container running the job. I'll focus on that for now. Thank you...


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.