Hello,
we are using Elasticsearch to create daily indices for our log files, which means that every night at 00:00 UTC all indices for the new day are created.
For the past two days we have been having issues in one of our clusters at exactly that point in time.
Some of the new indices are created but contain no data; the rest are not created at all. The pending task queue also grows significantly, with very few tasks being processed. Cluster health is green the whole time, with 0 shards relocating, initializing, unassigned, or delayed.
On the first day I was able to get ES processing data again by executing:
curl -XPUT localhost:19210/_cluster/settings -d '{ "transient" : { "threadpool.bulk.queue_size" : 1000 } }'
After that all new indices were being created, as well as new data was flowing in again.
The next day the exact same problem occurred. I ran the same command again, this time with a queue size of 1100, and it fixed the issue again.
However, I am not sure whether this is really related to the command, or whether things simply started working again because the command flushed the task queue. I also tried rolling restarts of the master nodes to flush the task queue, but when that flushed the queue, it did not help.
So either it is related to the bulk queue size, or I got lucky twice by flushing the queue at the right moment.
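If it does turn out to be the bulk queue size, one thing I am considering (not yet verified on this cluster) is making the setting permanent in elasticsearch.yml on each data node, since the transient setting above would be lost on a full cluster restart:

```yaml
# elasticsearch.yml (Elasticsearch 2.x setting name) - requires a node restart to take effect
threadpool.bulk.queue_size: 1000
```

I would still prefer to understand the root cause first rather than just raising the limit permanently.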
ES Version: 2.4.1
Logstash Versions: 5.3.0, 2.4.1 running in parallel at the moment
Additional log info:
A very high number of pending tasks, e.g.:
"tasks" : [ { "insert_order" : 183826, "priority" : "URGENT", "source" : "create-index-template [metricbeat], cause [api]", "executing" : true, "time_in_queue_millis" : 5655, "time_in_queue" : "5.6s" }, { "insert_order" : 183830,
and
{ "insert_order" : 183939, "priority" : "HIGH", "source" : "_add_listener_", "executing" : false, "time_in_queue_millis" : 108, "time_in_queue" : "108ms" }
Logstash logs filled with:
[2017-05-27T12:38:44,365][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of org.elasticsearch.transport.TransportService$4@7cdd3880 on EsThreadPoolExecutor[bulk, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@562cd8c1[Running, pool size = 32, active threads = 32, queued tasks = 1852, completed tasks = 8490655]]"})
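To see whether the bulk queue really fills up right at midnight, I am planning to poll the thread pool stats around 00:00 with something like the following (host and port match the curl command above; this uses the 2.x _cat API):

```shell
# Poll bulk thread pool usage every 5 seconds to watch for queue growth and rejections
while true; do
  curl -s 'localhost:19210/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'
  sleep 5
done
```

That should at least show whether the rejections start before or after the index creation tasks pile up.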
Can you help me out here? It looks like increasing the value was only a short-term fix and the problem will reoccur.
Best Regards!