ILM Retry API not honoring timeouts

Himanshu_Mishra · August 18, 2021, 6:32am

I'm trying to retry ILM steps for all indexes using the following command:

POST */_ilm/retry
{
  "timeout": "30m",
  "master_timeout": "30m"
}

The following error is still thrown. Is there another parameter that overrides the process_cluster_event_timeout_exception?

{
  "error" : {
    "root_cause" : [
      {
        "type" : "process_cluster_event_timeout_exception",
        "reason" : "failed to process cluster event (ilm-re-run) within 30s"
      }
    ],
    "type" : "process_cluster_event_timeout_exception",
    "reason" : "failed to process cluster event (ilm-re-run) within 30s"
  },
  "status" : 503
}

DavidTurner · August 18, 2021, 7:06am

The parameters go in the URL: POST */_ilm/retry?master_timeout=30m&timeout=30m. However, if it's taking more than 30s to do this then there's something very wrong with your cluster: your master is far too busy doing other things. You should investigate (e.g. look at cluster pending tasks) and fix that, not just pile more work onto an already-overloaded master.
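For readers scripting this, here is a minimal sketch of the point above: the timeouts travel as URL query parameters and the request carries no body. It only builds the requests and does not send them. The base URL `http://localhost:9200` and the helper names are assumptions made for illustration; `_ilm/retry` and `_cluster/pending_tasks` are the endpoints discussed in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_ilm_retry_request(base_url, index_pattern, **params):
    """Build (but do not send) a POST <index_pattern>/_ilm/retry request.

    Timeouts are encoded as URL query parameters; no request body is attached.
    """
    url = f"{base_url.rstrip('/')}/{index_pattern}/_ilm/retry"
    query = urlencode(params)
    if query:
        url = f"{url}?{query}"
    return Request(url, method="POST")


def build_pending_tasks_request(base_url):
    """Build (but do not send) a GET _cluster/pending_tasks request."""
    return Request(f"{base_url.rstrip('/')}/_cluster/pending_tasks", method="GET")


# Hypothetical local cluster address, used only for illustration.
retry_req = build_ilm_retry_request(
    "http://localhost:9200", "*", master_timeout="30m", timeout="30m"
)
tasks_req = build_pending_tasks_request("http://localhost:9200")
print(retry_req.get_method(), retry_req.full_url)
print(tasks_req.get_method(), tasks_req.full_url)
```

Passing either object to `urllib.request.urlopen` would send it. Checking pending tasks first is the more useful step if the master is already overloaded.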

Himanshu_Mishra · August 18, 2021, 9:01am

Ouch. Thanks a lot.

Himanshu_Mishra · August 18, 2021, 9:07am

A quick noob question: should our Logstash point to master nodes or data nodes? We've pointed it at the master node on the assumption that the master decides which shard the data should be stored on.

If we send it to data nodes, can we simply point it at the DNS name of a pool of data nodes?

DavidTurner · August 18, 2021, 9:43am

You should always prefer to send traffic to non-master nodes.
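As a rough sketch of how one might pick such targets programmatically, the function below filters a nodes-info style response (the shape returned by `GET _nodes/http`, with a `roles` list and an `http.publish_address` per node) down to nodes that are not master-eligible. The sample data, node IDs and addresses are invented for illustration, and the function name is hypothetical. In a small cluster where every node is both master-eligible and a data node, this filter would return nothing and a different choice would be needed.

```python
def non_master_http_addresses(nodes_response):
    """Return sorted HTTP publish addresses of nodes without the 'master' role."""
    addresses = []
    for node in nodes_response.get("nodes", {}).values():
        if "master" in node.get("roles", []):
            continue  # skip master-eligible nodes
        http = node.get("http", {})
        if "publish_address" in http:
            addresses.append(http["publish_address"])
    return sorted(addresses)


# Invented sample in the shape of a nodes-info response (illustration only).
sample_response = {
    "nodes": {
        "node-a": {"roles": ["master"], "http": {"publish_address": "10.0.0.1:9200"}},
        "node-b": {"roles": ["data", "ingest"], "http": {"publish_address": "10.0.0.2:9200"}},
        "node-c": {"roles": ["data"], "http": {"publish_address": "10.0.0.3:9200"}},
    }
}

targets = non_master_http_addresses(sample_response)
print(targets)
```

The resulting list is what would go into the Logstash output's host list, or sit behind the DNS name of the data-node pool.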

DavidTurner · August 19, 2021, 7:10am

I opened a bug report since I think it would have helped to reject the request rather than silently ignore the bits we didn't expect:

github.com/elastic/elasticsearch

REST handlers silently ignore request body if unexpected

opened 07:10AM - 19 Aug 21 UTC

DaveCTurner

>bug :Core/Infra/REST API Team:Core/Infra

In https://discuss.elastic.co/t/ilm-retry-api-not-honoring-timeouts/281772 a user reported confusion over the way that Elasticsearch ignored the parameters in this request:

```
POST */_ilm/retry
{
  "timeout": "30m",
  "master_timeout": "30m"
}
```

The immediate solution was to move the parameters into the URL where they belong, but the question highlights a broader problem: if an API endpoint expects a request body then we parse the body fairly strictly and reject requests that contain unexpected things, but if the endpoint doesn't expect a body at all then we seem to leniently accept requests with a body anyway. Silently ignoring input like this is confusing, and I think we should have rejected this request with a `400 Bad request` instead.

