Logstash: retrying failed action with response code: 429

I am getting the below error continuously in Logstash (v5.5).

Pushing logs using Filebeat (v5.5) with 2-3 log files works fine.
But when pushing logs using Filebeat with 150+ log files, I get the error below. (Since we are pushing logs for the first time, these are past-dated logs.)

Each log file has a single day's logs, and for each date a new index is created in Elasticsearch v5.5.
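For reference, the Logstash elasticsearch output is roughly the following sketch; the hosts are placeholders and the daily index pattern is inferred from index names like tt_ts-270917 that show up in the logs further down:

    output {
      elasticsearch {
        # placeholder hosts for the 3 ES nodes
        hosts => ["es-node1:9200", "es-node2:9200", "es-node3:9200"]
        # one index per day, e.g. tt_ts-270917 (ddMMyy), formatted from the event @timestamp
        index => "tt_ts-%{+ddMMyy}"
      }
    }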

I have 3 ES nodes and 1 LS instance, each with 4 GB RAM.

[2017-10-04T20:05:36,111][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of org.elasticsearch.transport.TransportService$7@85be457 on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@538c9d8a[Running, pool size = 16, active threads = 16, queued tasks = 200, completed tasks = 685]]"})


This is logged at level INFO, meaning it's not an error.
From: Elasticsearch get stuck while indexing a few record · Issue #16224 · elastic/elasticsearch · GitHub

This is the index thread pool inside Elasticsearch rejecting requests because its task queue is full. The EsRejectedExecutionException is a way of communicating to your client thread that it needs to back off.

This is exactly what the elasticsearch output is doing: backing off, and it's "telling" you so (OK, admittedly, it is unclear).

Unfortunately, you need to do some work on the ES performance.

Open a new discussion in the elasticsearch channel by asking for help improving ES responsiveness when getting {"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of org.elasticsearch.transport.TransportService$7@85be457 on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@538c9d8a[Running, pool size = 16, active threads = 16, queued tasks = 200, completed tasks = 685]]"}
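To see that back-pressure directly, you can watch the bulk thread pool on each node; the rejected column is what turns into those 429 responses (the host/port here is an assumption, adjust to your cluster):

    curl -s 'http://localhost:9200/_cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected'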

Thanks for your reply.

thread_pool.bulk.queue_size: 2000

I have added the above line in the ES yml file and it solved my problem to a large extent. Now data is getting processed from Filebeat > Logstash > Elasticsearch, shards and indices are getting created, and logs are getting pushed into Elasticsearch.
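A quick way to confirm the daily indices and their shards are actually appearing (host is a placeholder):

    curl -s 'http://localhost:9200/_cat/indices/tt_ts-*?v'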

But now I am getting the below error in Logstash:

[2017-10-05T19:10:17,423][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::Elasticsearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>8}

[2017-10-05T19:16:27,987][WARN ][logstash.outputs.elasticsearch] UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::Elasticsearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}

and it seems the reason is the below error, which is coming from Elasticsearch:

[2017-10-05T18:11:45,663][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [node0_tcl] failed to put mappings on indices [[[tt_ts-270917/IDP1PR87QKWsvxQ7FDCX5A]]], type [logs]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$null$0(ClusterService.java:255) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.ArrayList.forEach(ArrayList.java:1249) ~[?:1.8.0_131]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$onTimeout$1(ClusterService.java:254) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
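In case it helps with diagnosis, the backlog of cluster-state updates that the put-mapping timeout points at can be inspected like this (host is a placeholder):

    curl -s 'http://localhost:9200/_cat/pending_tasks?v'
    curl -s 'http://localhost:9200/_cluster/health?pretty'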


Consider configuring Filebeat to generate monthly rather than daily indices so you do not overload your cluster. Having a large number of very small indices and shards is very inefficient, and the smaller your heap is, the more careful you need to be.
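If the index name is set in the Logstash elasticsearch output (as in the Filebeat > Logstash > Elasticsearch pipeline described above), switching from daily to monthly indices would look roughly like this sketch; the hosts and index prefix are assumptions based on the tt_ts-270917 names in the logs:

    output {
      elasticsearch {
        hosts => ["es-node1:9200", "es-node2:9200", "es-node3:9200"]
        # one index per month, e.g. tt_ts-0917, instead of one per day
        index => "tt_ts-%{+MMyy}"
      }
    }

Since %{+...} is formatted from the event @timestamp, past-dated logs still land in the month they belong to, provided the timestamp is being parsed from the log lines (e.g. with a date filter).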


A tenfold increase in the bulk.queue_size ES setting, are you sure?

From that same link (elasticsearch issue), Jason wrote:

Investigate increasing the size of the index queue, but only within reason. Increasing queue size is not a panacea for constantly-full queues. If you're shoving data into Elasticsearch faster than your hardware can handle, increasing the queue size will not solve your problem because that new queue size will just fill up. Stuffed queues are almost always a sign of another problem.

Jason knows his stuff.
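If the cluster genuinely cannot keep up, the usual alternative to growing the queue is to reduce the pressure from the Logstash side instead; in logstash.yml that looks roughly like this (the values are only illustrative starting points, not tuned for your hardware):

    # fewer/smaller concurrent bulk requests per Logstash instance
    pipeline.workers: 2        # defaults to one worker per CPU core
    pipeline.batch.size: 125   # events per worker per flush; this is the default, lower it if rejections persist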

You really, really, really should move this conversation to the elasticsearch discuss channel to get help with tuning your ES cluster.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.