Failed to execute bulk item error in elasticsearch pods

Hi Team,

Recently, I configured an Elasticsearch cluster with the official Helm chart using the default values; I have not changed any settings. I am seeing a few error messages related to indexing. Could you help me understand why I am getting the error messages below?

"took":2,"errors":true,"items":[{"index":{"_index":"cccwsgisvc-dev-2020.04.02","_type":"flb_type","_id":"l_cMOnEBWA4zUKT4Ykpx","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [11011680][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[cccwsgisvc-dev-2020.04.02][0]] containing [6] requests, target allocation id: _NLLuww1THatLL-HK_JK4g, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-master-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@23849df1[Running, pool size = 1, active threads = 1, queued tasks = 200, completed tasks = 5204108]]"}}},{"index":{"_index":"cccwsgisvc-dev-2020.04.02","_type":"flb_type","_id":"mPcMOnEBWA4zUKT4Ykpx","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [11011680][indices:data/write/bulk[s][p]]: request: Bulk
[2020/04/02 08:37:06] [error] [out_es] could not pack/validate JSON response
{"took":2,"errors":true,"items":[{"index":{"_index":"cccwsgisvc-dev-2020.04.02","_type":"flb_type","_id":"S_cMOnEBWA4zUKT4eUvN","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [11011883][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[cccwsgisvc-dev-2020.04.02][0]] containing [6] requests, target allocation id: _NLLuww1THatLL-HK_JK4g, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-master-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@23849df1[Running, pool size = 1, active threads = 1, queued tasks = 200, completed tasks = 5204217]]"}}},{"index":{"_index":"cccwsgisvc-dev-2020.04.02","_type":"flb_type","_id":"TPcMOnEBWA4zUKT4eUvN","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [11011883][indices:data/write/bulk[s][p]]: request: Bulk
[2020/04/02 08:39:33] [error] [out_es] could not pack/validate JSON response
{"took":3,"errors":true,"items":[{"index":{"_index":"default-dev-2020.04.02","_type":"flb_type","_id":"GQAOOnEB1qfuqzyLuGkD","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [9362295][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[default-dev-2020.04.02][0]] containing [9] requests, target allocation id: gkcTSbSoTOWQkGerzj1L1g, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-master-2/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@f6bd99b[Running, pool size = 1, active threads = 1, queued tasks = 200, completed tasks = 4816121]]"}}},{"index":{"_index":"default-dev-2020.04.02","_type":"flb_type","_id":"GgAOOnEB1qfuqzyLuGkD","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [9362295][indices:data/write/bulk[s][p]]: request: BulkShardRequest

{"type": "server", "timestamp": "2020-04-01T11:14:33,192Z", "level": "DEBUG", "component": "o.e.a.b.TransportShardBulkAction", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0", "message": "[default-qa-2020.03.31][0] failed to execute bulk item (index) index {[default-qa-2020.03.31][flb_type][8VFyNXEBsYO_PAXv4f8f], source[{\"@timestamp\":\"2020-03-31T00:02:56.743Z\",\"log\":\"2020-03-31 00:02:56.743 [INFO][50] int_dataplane.go 921: Finished applying updates to dataplane. msecToApply=4.213096999999999\\n\",\"stream\":\"stdout\",\"time\":\"2020-03-31T00:02:56.74364832Z\",\"kubernetes\":{\"pod_name\":\"calico-node-82bd8\",\"namespace_name\":\"kube-system\",\"pod_id\":\"e0ae5b55-bfd1-427a-ab58-6bad55d12cd8\",\"labels\":{\"controller-revision-hash\":\"56759c96bf\",\"k8s-app\":\"calico-node\",\"pod-template-generation\":\"1\"},\"annotations\":{\"kubespray_etcd-cert/serial\":\"95E28E833DF522CD\"},\"host\":\"cesium-qal3.cisco.com\",\"container_name\":\"calico-node\",\"docker_id\":\"41f2f134cdb4a935ac944e5613db8b0cc22f1d465b7bff4491420c561e375f4f\",\"container_hash\":\"a2782b53500c96e35299b8af729eaf39423f9ffd903d9fda675073f4a063502a\"}}]}", "cluster.uuid": "NgR0MLuWQwKXe03Ut61JNA", "node.id": "TEvL_Dt2RdmD6jtyCHlAfg" ,
"stacktrace": ["org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.action.bulk.TransportShardBulkAction$2@7fd2a567 on EsThreadPoolExecutor[name = elasticsearch-master-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@23849df1[Running, pool size = 1, active threads = 1, queued tasks = 200, completed tasks = 4001952]]",
"at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-7.5.2.jar:7.5.2]",
"at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?]",
"at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:84) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction$2.lambda$doRun$0(TransportShardBulkAction.java:162) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction$3.lambda$onResponse$0(TransportShardBulkAction.java:281) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.action.ActionListener$4.onResponse(ActionListener.java:215) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.action.bulk.TransportShardBulkAction$1.onNewClusterState(TransportShardBulkAction.java:127) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onNewClusterState(ClusterStateObserver.java:311) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.clusterChanged(ClusterStateObserver.java:196) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateListeners$6(ClusterApplierService.java:527) [elasticsearch-7.5.2.jar:7.5.2]",
"at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3566) [?:?]",
"at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:735) [?:?]",
"at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658) [?:?]",
"at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:523) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:498) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) [elasticsearch-7.5.2.jar:7.5.2]",
"at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) [elasticsearch-7.5.2.jar:7.5.2]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:830) [?:?]"] }

Please take the time to format your messages properly; this exception is really hard to read.

It looks as if you are sending more write requests to Elasticsearch than it can sustain. Elasticsearch's write thread pool is sized based on the number of cores, and you seem to have allocated only a single core to Elasticsearch.
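
If you want to confirm that, the nodes info API reports how many processors each node was given and how the write pool was sized from that. A rough sketch in Python, assuming the cluster is reachable on localhost:9200 without authentication (adjust the host, port, and credentials for your setup):

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumption: replace with your cluster endpoint

def get_json(path):
    # Plain stdlib HTTP GET; add auth/TLS handling if your cluster needs it.
    with urllib.request.urlopen(ES + path) as resp:
        return json.load(resp)

# Processors Elasticsearch detected on each node (this drives the write pool size).
print(json.dumps(get_json("/_nodes/os?filter_path=nodes.*.os.allocated_processors"), indent=2))

# Configured size and queue_size of the write thread pool on each node.
print(json.dumps(get_json("/_nodes/thread_pool?filter_path=nodes.*.thread_pool.write"), indent=2))
```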

Because the thread pool and its queue are full, Elasticsearch returns this error to tell the sender to retry later.
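
One way to check whether the rejections stop after giving the nodes more CPU is to watch the write pool's queue and rejected counters over time. Another rough sketch, with the same localhost:9200 assumption:

```python
import time
import urllib.request

ES = "http://localhost:9200"  # assumption: replace with your cluster endpoint

# Poll the cat thread pool API; a steadily growing "rejected" column means
# bulk requests are still being turned away with 429 responses.
COLUMNS = "node_name,size,active,queue,rejected"
while True:
    with urllib.request.urlopen(f"{ES}/_cat/thread_pool/write?v&h={COLUMNS}") as resp:
        print(resp.read().decode())
    time.sleep(30)
```

If the rejected count keeps climbing even after adding CPU, the indexing load itself probably needs to be reduced or spread across more shards and nodes.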

Hope this helps!

@spinscale, thanks for your response. I have increased the number of CPUs to 5. Right now, I am not seeing the error in the Elasticsearch pods; I will monitor it for a few more days.

Thanks,
Kasim Shaik.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.