EsThreadPoolExecutor error

hi there!

I can't solve this error while using Kibana:

rejected execution of org.elasticsearch.action.search.FetchSearchPhase$1@3f6d8e9e on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@71704db3[Running, pool size = 2, active threads = 2, queued tasks = 1000, completed tasks = 259214]]

This is the GET /_cat/thread_pool?v result:

node_name name active queue rejected
xVfqIa9 bulk 0 0 0
xVfqIa9 fetch_shard_started 0 0 0
xVfqIa9 fetch_shard_store 0 0 0
xVfqIa9 flush 0 0 0
xVfqIa9 force_merge 0 0 0
xVfqIa9 generic 0 0 0
xVfqIa9 get 0 0 0
xVfqIa9 index 0 0 0
xVfqIa9 listener 0 0 0
xVfqIa9 management 1 0 0
xVfqIa9 refresh 0 0 0
xVfqIa9 search 0 0 106502
xVfqIa9 snapshot 0 0 0
xVfqIa9 warmer 0 0 0
jZURHTt bulk 0 0 0
jZURHTt fetch_shard_started 0 0 0
jZURHTt fetch_shard_store 0 0 0
jZURHTt flush 0 0 0
jZURHTt force_merge 0 0 0
jZURHTt generic 0 0 0
jZURHTt get 0 0 0
jZURHTt index 0 0 0
jZURHTt listener 0 0 0
jZURHTt management 1 0 0
jZURHTt refresh 0 0 0
jZURHTt search 0 0 40538
jZURHTt snapshot 0 0 0
jZURHTt warmer 0 0 0
hmMrMMb bulk 0 0 0
hmMrMMb fetch_shard_started 0 0 0
hmMrMMb fetch_shard_store 0 0 0
hmMrMMb flush 0 0 0
hmMrMMb force_merge 0 0 0
hmMrMMb generic 0 0 0
hmMrMMb get 0 0 0
hmMrMMb index 0 0 0
hmMrMMb listener 0 0 0
hmMrMMb management 1 0 0
hmMrMMb refresh 0 0 0
hmMrMMb search 0 0 213796
hmMrMMb snapshot 0 0 0
hmMrMMb warmer 0 0 0
xaGXJwM bulk 0 0 0
xaGXJwM fetch_shard_started 0 0 0
xaGXJwM fetch_shard_store 0 0 0
xaGXJwM flush 0 0 0
xaGXJwM force_merge 0 0 0
xaGXJwM generic 0 0 0
xaGXJwM get 0 0 0
xaGXJwM index 0 0 0
xaGXJwM listener 0 0 0
xaGXJwM management 1 0 0
xaGXJwM refresh 0 0 0
xaGXJwM search 0 0 41379
xaGXJwM snapshot 0 0 0
xaGXJwM warmer 0 0 0

Please point out how I can fix this EsThreadPoolExecutor error.

It means your cluster is overloaded and cannot process the requests that Kibana is sending.

You'd need to describe more about your cluster and the use case for us to provide further assistance.

Can you please provide the output of the cluster stats API?

GET _cluster/stats?pretty

{
  "_nodes": {
    "total": 9,
    "successful": 9,
    "failed": 0
  },
  "cluster_name": "066789247273:cluster-name",
  "timestamp": 1535525504791,
  "status": "green",
  "indices": {
    "count": 194,
    "shards": {
      "total": 1932,
      "primaries": 966,
      "replication": 1,
      "index": {
        "shards": {
          "min": 2,
          "max": 10,
          "avg": 9.958762886597938
        },
        "primaries": {
          "min": 1,
          "max": 5,
          "avg": 4.979381443298969
        },
        "replication": {
          "min": 1,
          "max": 1,
          "avg": 1
        }
      }
    },
    "docs": {
      "count": 121074770,
      "deleted": 4
    },
    "store": {
      "size_in_bytes": 269101490152,
      "throttle_time_in_millis": 0
    },
    "fielddata": {
      "memory_size_in_bytes": 14201952,
      "evictions": 0
    },
    "query_cache": {
      "memory_size_in_bytes": 367089553,
      "total_count": 8813864,
      "hit_count": 4746820,
      "miss_count": 4067044,
      "cache_size": 205870,
      "cache_count": 205870,
      "evictions": 0
    },
    "completion": {
      "size_in_bytes": 0
    },
    "segments": {
      "count": 21829,
      "memory_in_bytes": 1265913086,
      "terms_memory_in_bytes": 1157292887,
      "stored_fields_memory_in_bytes": 63663248,
      "term_vectors_memory_in_bytes": 0,
      "norms_memory_in_bytes": 3968,
      "points_memory_in_bytes": 2535659,
      "doc_values_memory_in_bytes": 42417324,
      "index_writer_memory_in_bytes": 0,
      "version_map_memory_in_bytes": 0,
      "fixed_bit_set_memory_in_bytes": 0,
      "max_unsafe_auto_id_timestamp": 1535445184828,
      "file_sizes": {}
    }
  },
  "nodes": {
    "count": {
      "total": 9,
      "data": 9,
      "coordinating_only": 0,
      "master": 9,
      "ingest": 9
    },
    "versions": [
      "5.5.2"
    ],
    "os": {
      "available_processors": 9,
      "allocated_processors": 9,
      "names": [
        {
          "count": 9
        }
      ],
      "mem": {
        "total_in_bytes": 35488604160,
        "free_in_bytes": 1055694848,
        "used_in_bytes": 34432909312,
        "free_percent": 3,
        "used_percent": 97
      }
    },
    "process": {
      "cpu": {
        "percent": 484
      },
      "open_file_descriptors": {
        "min": 829,
        "max": 1812,
        "avg": 1263
      }
    },
    "jvm": {
      "max_uptime_in_millis": 86833492,
      "mem": {
        "heap_used_in_bytes": 5919426048,
        "heap_max_in_bytes": 14417068032
      },
      "threads": 707
    },
    "fs": {
      "total_in_bytes": 949996781568,
      "free_in_bytes": 677929021440,
      "available_in_bytes": 629459644416
    },
    "network_types": {
      "transport_types": {
        "netty4": 9
      },
      "http_types": {
        "filter-jetty": 9
      }
    }
  }
}

GET _cluster/settings?pretty

{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "watermark": {
              "low": "14.7gb",
              "high": "9.8gb"
            }
          },
          "node_initial_primaries_recoveries": "4"
        }
      },
      "blocks": {
        "create_index": "false"
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "20mb"
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "watermark": {
              "low": "14.7gb",
              "high": "9.8gb"
            }
          },
          "exclude": {},
          "node_initial_primaries_recoveries": "4"
        }
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "20mb"
      }
    }
  }
}

GET _cluster/health

{
  "cluster_name": "066789247273:cluster-name",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 9,
  "number_of_data_nodes": 9,
  "active_primary_shards": 966,
  "active_shards": 1932,
  "relocating_shards": 2,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 1,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}

Given the amount of data in the cluster, you seem to have a very large number of shards. Each visualisation generates an aggregation, and each shard that aggregation addresses basically takes up one search queue slot, so a dashboard with several visualisations over many small shards can fill the queue very quickly. You should therefore be able to get rid of this error by significantly reducing the number of shards in your cluster. Read this blog post for guidance on shards and sharding.
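To see where those shards come from, a request along these lines (the column selection is just a suggestion) lists each index with its primary and replica shard counts, document count, and on-disk size:

GET _cat/indices?v&h=index,pri,rep,docs.count,store.size

Indices that hold only a gigabyte or two but still carry 5 primaries plus replicas are the obvious candidates for consolidation.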


To reduce shards, I've changed from a daily index format to a monthly one.
After deleting two months' worth of indices, 1412 shards remain active; the others I can't delete.
So now I have both daily and monthly indices, but I still get the EsThreadPoolExecutor error.
Could you advise which API I should use to reduce shards?
And may I know which index format is suitable for about 3 GB of data per day?

You might be able to use the shrink index API to reduce the number of primary shards per index. Another option is to reindex the data you have in daily indices into monthly ones instead.
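As a rough sketch only (the index names and node name below are placeholders, not values from your cluster), the two options could look like this. The shrink API first requires the source index to be read-only and to have a complete copy of its shards on a single node:

# 1. Prepare the daily index for shrinking: block writes and pull a copy
#    of every shard onto one node (node name is a placeholder)
PUT /logs-2018.08.01/_settings
{
  "index.blocks.write": true,
  "index.routing.allocation.require._name": "some-data-node"
}

# 2. Shrink it into a new index with a single primary shard
POST /logs-2018.08.01/_shrink/logs-2018.08.01-shrunk
{
  "settings": {
    "index.number_of_shards": 1
  }
}

# 3. Alternatively, reindex the daily index into a monthly one
POST _reindex
{
  "source": { "index": "logs-2018.08.01" },
  "dest": { "index": "logs-2018.08" }
}

On a hosted service such as AWS ES some of these APIs or settings may be restricted, so the reindex route may turn out to be the more practical of the two.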


Thank you, @Christian_Dahlqvist.

I tried to reindex but got a "Request failed to get to the server (status code: 504)" message.
How can I solve this problem?
My daily indices are generally around 2 GiB each.
I use AWS ES.

I do not know why the reindex request failed. I believe AWS ES has some APIs disabled, so you may want to bring this issue up with them. Each index by default has 5 primary shards and 5 replica shards, so based on the stats you provided before the delete (roughly 250 GiB spread over 1932 shards), the average shard size in your cluster is around 130 MB, which is very small.


Thank you for the help, @Christian_Dahlqvist.
Even though I got the "Request failed to get to the server (status code: 504)" message, AWS ES keeps running the reindex in the background, so it completes properly if I just wait a while and monitor it via the task API.
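For anyone else who runs into the same 504: the gateway timeout only cuts off the HTTP response, not the reindex itself, so one option (again a sketch with placeholder index names) is to submit the reindex asynchronously and poll the tasks API for progress instead of waiting on the request:

POST _reindex?wait_for_completion=false
{
  "source": { "index": "logs-2018.08.01" },
  "dest": { "index": "logs-2018.08" }
}

# the call above returns a task id; check progress with either of these
GET _tasks?detailed=true&actions=*reindex
GET _tasks/<task_id>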
