Hi,
Our Elasticsearch cluster gets overloaded from time to time.
We see thread pool rejections on the Elasticsearch side, and Logstash logs errors like the one below:
[logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of org.elasticsearch.transport.TransportService$7@57757508 on EsThreadPoolExecutor[bulk, queue capacity = 500, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@127e2b61[Running, pool size = 32, active threads = 32, queued tasks = 507, completed tasks = 575009786]]"})
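For reference, the per-node rejection counts also show up in the bulk thread pool stats, e.g. with something along these lines (exact column list may vary by version):
GET _cat/thread_pool/bulk?v&h=node_name,active,queue,rejected,completed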
(Increasing thread_pool.bulk.queue_size to 500 has helped a bit.)
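That setting lives in elasticsearch.yml on each node, i.e. roughly:
thread_pool.bulk.queue_size: 500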
Our cluster holds 26 TB of data on 8 hot, 5 warm and 10 cold nodes. We have ~2000 indices across 6800 primary shards, each replicated once. We are running Elasticsearch 5.5, and each Elasticsearch instance has 30 GB of memory.
The hot nodes have a max of 80 shards each.
Looking at our servers, CPU load and IO are low: CPU is around 15% and IO around 30%, with peaks of 50%. File descriptors / ulimits are also fine.
We are wondering why Elasticsearch isn't using more of the available resources if it is under load / overloaded, and whether there are Elasticsearch settings we could tune to improve performance and get rid of these errors.
Most of the load seems to come from indexing, so we increased indices.memory.index_buffer_size to 30%.
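(That one is also a node-level setting in elasticsearch.yml, set roughly like this:
indices.memory.index_buffer_size: 30%)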
Any tips would be appreciated.
Cheers,
Felix