Curator throws an error while processing bulk-size Elasticsearch data

I am using Curator in production. I have a total of 12 actions in my actions.yml, but Curator throws an exception at action 9, so the remaining actions are not executed. The following exception occurs:

2020-05-19 07:32:20,257 INFO      Trying Action ID: 9, "delete_indices": Delete indices older than (based on index name), for logstash- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2020-05-19 07:32:22,191 INFO      Deleting 3 selected indices: ['logstash-2020.05.18.14', 'logstash-2020.05.18.13', 'logstash-2020.05.18.12']
2020-05-19 07:32:22,191 INFO      ---deleting index logstash-2020.05.18.14
2020-05-19 07:32:22,191 INFO      ---deleting index logstash-2020.05.18.13
2020-05-19 07:32:22,191 INFO      ---deleting index logstash-2020.05.18.12
2020-05-19 07:32:24,460 ERROR     Failed to complete action: delete_indices.  <class 'curator.exceptions.FailedExecution'>: Exception encountered.  Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: Failed to get indices. Error: NotFoundError(404, 'index_not_found_exception', 'no such index [logstash-2020.05.18.14]', logstash-2020.05.18.14, index_or_alias)

Please help me solve this issue.

My config.yml is:

    port: 9200
    use_ssl: False
    ssl_no_validate: False
    http_auth: "elastic:elastic"
    timeout: 30
    master_only: False
    loglevel: INFO
    logformat: default
    blacklist: ['elasticsearch', 'urllib3']

What this implies is that your cluster is being very slow to update the cluster state during/after the index delete steps, such that even after Curator has received an "okay, the index is deleted" message from the client connection, on a subsequent API call, the index still appears to be present. In such a case, Curator attempts to delete it again. Somehow, in the very few moments between Curator finding that the index still appears to be present and the attempt to re-delete the index, the cluster state finally refreshes and the index is truly gone, so you get a 404, index not found error.

As stated, this is a relatively rare occurrence in a fully performant cluster. This only tends to happen on clusters which are overtaxed and/or overloaded, which can be from one or more of the following (or other scenarios, too):

  • Too many shards per node
  • The cluster state is too large from having too many fields in one or more index mappings
  • The master nodes are both master & data, which can result in long Java garbage collection pauses on a master node, leading to the cluster state update race condition mentioned
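On that last point: the usual mitigation is dedicated master-eligible nodes. A sketch of the relevant elasticsearch.yml settings on a master-only node, assuming the legacy `node.master`/`node.data` flags of the 6.x/7.x era rather than the newer `node.roles` syntax:

    # elasticsearch.yml on a dedicated master node (legacy flag syntax; adjust for your version)
    node.master: true
    node.data: false
    node.ingest: false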

You will need to set `loglevel: DEBUG` and share the more verbose output to demonstrate the retry I mentioned. Considerably more debugging will then be necessary to track down why your cluster state is slow to update, resulting in that race condition.
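For reference, that is just a change to the logging settings already present in the config.yml you posted:

    loglevel: DEBUG
    logformat: default

Rerun the same actions with this in place, and the debug log should show the second delete attempt that precedes the 404.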

Thank you @theuntergeek for your valuable reply.
For now, however, I cannot change anything except by updating the config maps.
Is there a workaround that would prevent the error, such as an ideal value for "timeout" in config.yml, or some other solution?
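For example, would raising the client timeout in config.yml help? Something like this (300 is only a guess on my part):

    timeout: 300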

There are no tricks to fixing an overtaxed cluster other than to eliminate the bottlenecks. You either fix it, or it keeps causing problems.

As far as trying to make Curator proceed in the face of these cluster state update delays, you could (though I emphatically do not recommend it) configure Curator to ignore errors and just keep proceeding through actions. That's not a fix at all, though, and it could lead to other errors and unexpected outcomes.
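For completeness, the option being alluded to is `continue_if_exception`, set per action in actions.yml. A sketch against an action like your action 9 (again, not recommended, since it only masks failures rather than fixing them):

    actions:
      9:
        action: delete_indices
        description: >-
          Delete logstash- prefixed indices; keep running later actions
          even if this action throws an exception.
        options:
          ignore_empty_list: True
          continue_if_exception: True
        filters:
        - filtertype: pattern
          kind: prefix
          value: logstash-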

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.