Es_rejected_execution exception with index failure changing

Santi · January 23, 2019, 10:21am

Hello team,

After restarting the Elasticsearch service several times and increasing thread_pool.bulk.queue_size, I try to query Elasticsearch as follows:

q = {
  "query": {
    "match_all": {}
  }
}

q = es.search(index='my_index', body=q)

I get the following response:

{'took': 38,
 'timed_out': False,
 'num_reduce_phases': 3,
 '_shards': {'total': 1396,
  'successful': 1395,
  'failed': 1,
  'failures': [{'shard': 1,
    'index': 'SOME_OTHER_INDEX',
    'node': 'My_node',
    'reason': {'type': 'es_rejected_execution_exception',
     'reason': 'rejected execution of org.elasticsearch.transport.TransportService$7@37f5c738 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@5c51cea6[Running, pool size = 13, active threads = 13, queued tasks = 996, completed tasks = 4347]]'}}]},
 'hits': {'total': 0, 'max_score': None, 'hits': []}}

When I rerun the query, the failure is reported on a different index. I checked the cluster's health, but everything seems normal, as you can see:

curl localhost:9201/_cluster/health?pretty
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1396,
  "active_shards" : 1396,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Any thoughts on why this is happening?

Best.

DavidTurner · January 23, 2019, 10:31am

You have far too many shards. Elasticsearch creates a task for each shard to be searched, so that's 1396 search tasks (plus some for coordination). In order to protect the cluster, Elasticsearch rejects attempts to make so many tasks. The solution is to reduce the number of shards in your cluster. Here is an article on this subject:
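
To make the numbers concrete, here is a rough Python sketch (not from the linked article). It assumes the simplified picture above, with one search task per shard all landing on a single node, and takes the pool size and queue capacity from the error message. The function names are made up for illustration, and real scheduling is more involved than this, since tasks also finish while others are still queued.

```python
from collections import Counter


def tasks_fit(shard_count, pool_size, queue_capacity):
    """True if one task per shard fits in the running threads plus the queue.

    Simplification: assumes every task arrives before any of them completes.
    """
    return shard_count <= pool_size + queue_capacity


def shards_per_index(cat_shards_rows):
    """Count shard copies per index from rows shaped like `_cat/shards` JSON."""
    return Counter(row["index"] for row in cat_shards_rows)


# Numbers from the error message: 13 threads, queue capacity 1000, 1396 shards.
print(tasks_fit(1396, pool_size=13, queue_capacity=1000))  # False
```

With the Python client, the rows could come from es.cat.shards(format='json'), and shards_per_index(rows).most_common(10) would then list the indices that contribute the most shards.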

Santi · January 23, 2019, 11:09am

I have hundreds of indices. Is it possible for several indices to share one shard? And once I know which indices I must shrink, is it possible to shrink several of them at once? The Shrink index page seems to show that it can only be done one index at a time.

DavidTurner · January 23, 2019, 11:13am

Shrinking an index can be used to combine shards within that index, but not across indices, because different indices may have different mappings and may contain multiple documents with the same ID.

If you want to combine some of your indices together, I think the best way forward would be to reindex them.
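
As a minimal sketch of what that could look like with the Python client: the helper below only builds the request body for the reindex API, and both the helper name and the index names are hypothetical. It assumes the source indices have compatible mappings; documents that share an ID across the sources would be expected to overwrite one another in the destination, so check that before trying it on real data.

```python
def build_reindex_body(source_indices, dest_index):
    """Build a _reindex request body that copies several indices into one."""
    return {
        "source": {"index": list(source_indices)},
        "dest": {"index": dest_index},
    }


# Hypothetical index names, purely for illustration.
body = build_reindex_body(["logs-2019.01.01", "logs-2019.01.02"], "logs-2019.01")
print(body)
# With a client object `es`, this could be submitted as: es.reindex(body=body)
```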

Santi · January 23, 2019, 1:07pm

I have exactly 0 intention of combining indices. I want to know how I am supposed to deal with the fact that, according to you, I have too many shards, while I cannot shrink them across indices. What would be a good range for the number of shards per cluster? How can I increase the queue capacity so that my Elasticsearch will function again?

DavidTurner · January 23, 2019, 1:38pm

This is covered in the article to which I linked, but here are some choice quotes:

Aim to keep the average shard size between at least a few GB and a few tens of GB. For use-cases with time-based data, it is common to see shards between 20GB and 40GB in size.

A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured.

The queue capacity seems correct to me; it's the shard count that needs work.
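
The second rule of thumb is easy to turn into arithmetic. The sketch below is only an illustration: the function names are made up, the 8 GB heap is a hypothetical value (the node's actual heap size was not given in this thread), and 20 shards per GB is the upper bound from the quote above, not a hard limit.

```python
import math


def max_recommended_shards(heap_gb, shards_per_gb=20):
    """Upper bound on shards for a node with the given heap, per the rule of thumb."""
    return int(heap_gb * shards_per_gb)


def min_heap_gb_for(shard_count, shards_per_gb=20):
    """Smallest whole-GB heap for which shard_count stays within the rule of thumb."""
    return math.ceil(shard_count / shards_per_gb)


print(max_recommended_shards(8))  # 160 (8 GB is a hypothetical heap size)
print(min_heap_gb_for(1396))      # 70
```

In other words, 1396 shards on a single node would call for roughly 70 GB of heap under this guideline, which is why reducing the shard count is the more practical fix.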

Santi · January 23, 2019, 2:26pm

I will modify my structure to fit the best practices, thank you for your time.

system · February 20, 2019, 2:26pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
