Retrying failed action with response code 429

I'm seeing the following message in Elasticsearch's log:

elk | [2017-07-08T23:59:51,847][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of org.elasticsearch.transport.TransportService$7@1228229a on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@b45c186[Running, pool size = 8, active threads = 8, queued tasks = 200, completed tasks = 16346]]"})

I made the following change, yet it did not help; I'm still seeing the error:

thread_pool.search.queue_size: -1

Please advise.

I also get these:

elk | [2017-07-10T16:53:17,399][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://elk:9200/, :path=>"/"}
elk | [2017-07-10T16:53:17,447][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<Java::JavaNet::URI:0x77e2d1fb>}
elk | [2017-07-10T16:53:45,021][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::Elasticsearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://elk:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://elk:9200/, :error_message=>"Elasticsearch Unreachable: [http://elk:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::Elasticsearch::HttpClient::Pool::HostUnreachableError"}
elk | [2017-07-10T16:53:45,026][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://elk:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::Elasticsearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>32}

Hey,

you should not configure an unlimited queue size (and note that you changed the wrong thread pool anyway: the 429 comes from the bulk pool, not the search pool), as you are only delaying the problem. Elasticsearch cannot keep up with the number of documents you are sending via Logstash. It also appears to be unreachable, judging by the second set of errors.
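If you still wanted to experiment with the queue (again, this only postpones the problem), it would have to be the bulk pool named in the exception. A minimal elasticsearch.yml sketch, with an illustrative value rather than a recommendation:

# elasticsearch.yml -- the 429 was rejected by the bulk pool, not the search pool
# a bigger queue only buys buffer space; it does not remove the backpressure
thread_pool.bulk.queue_size: 500

You can check which pool is actually overloaded with GET _cat/thread_pool/bulk?v and watch the rejected column while Logstash is indexing.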

You should try to reduce the load, scale the cluster out, or find a way to increase the throughput of your existing cluster to solve this problem.
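On the Logstash side you can also smooth out the bursts by sending smaller bulk requests. A sketch of the relevant logstash.yml knobs; the values are assumptions you would need to tune for your hardware:

# logstash.yml -- fewer/smaller concurrent bulk requests to Elasticsearch
pipeline.workers: 2        # assumed value; defaults to the number of CPU cores
pipeline.batch.size: 125   # events per worker per bulk request (the default)
pipeline.batch.delay: 50   # assumed value, in ms, before flushing a partial batch

Keep in mind this only shrinks each request; if the overall event rate stays above what the cluster can index, you still need more indexing capacity.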

--Alex
