Elasticsearch bulk rejection error in Logstash logs

Hello,

I am using the ELK stack v6.5.3.
I have a 7-node cluster (5 data nodes [Elasticsearch + Logstash] and 2 coordinating nodes [Elasticsearch + Kibana]).
For the past few days I have been getting the error below on one of my Logstash nodes:
[2020-09-21T16:05:50,889][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [142901911][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[indexName-2020.39][0]] containing [5] requests, target allocation id: F50tII90RQuy4HdjO_G6Kw, primary term: 1 on EsThreadPoolExecutor[name = node1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@638276c2[Running, pool size = 8, active threads = 8, queued tasks = 200, completed tasks = 99902900]]"})
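For reference, the per-node write thread pool usage and rejection counts behind these 429s can be checked with the cat thread pool API (a generic check, nothing specific to my setup):

GET _cat/thread_pool/write?v&h=node_name,name,active,queue,rejected,completed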

There is a lot of GC activity as well.
I read the document below but was not able to figure out the problem:
https://www.elastic.co/blog/why-am-i-seeing-bulk-rejections-in-my-elasticsearch-cluster
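The GC pressure can also be confirmed from the node stats API; something like the following (the filter_path parameter is only there to trim the response) shows heap usage and collector counts per node:

GET _nodes/stats/jvm?filter_path=nodes.*.name,nodes.*.jvm.mem.heap_used_percent,nodes.*.jvm.gc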
CPU utilization on the same node is also high, around 98% (measured as a 95th-percentile aggregation over one-second @timestamp buckets).
There is also a huge lag in the logs; I assume this issue is the only reason for it, so please advise on that as well.
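To see what that node's CPU is actually spending time on, the hot threads API is a quick check (this is the whole-cluster form; a specific node can be named before /hot_threads):

GET _nodes/hot_threads?threads=5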
Please help.

What is the output from GET /_cluster/stats?

{
  "_nodes" : {
    "total" : 7,
    "successful" : 7,
    "failed" : 0
  },
  "cluster_name" : "clusterName",
  "cluster_uuid" : "WmJ66kYmSaa6osmtVFnkrQ",
  "timestamp" : 1600794514198,
  "status" : "green",
  "indices" : {
    "count" : 652,
    "shards" : {
      "total" : 1336,
      "primaries" : 682,
      "replication" : 0.9589442815249267,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 6,
          "avg" : 2.049079754601227
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 1.0460122699386503
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.9877300613496932
        }
      }
    },
    "docs" : {
      "count" : 9137129079,
      "deleted" : 2309
    },
    "store" : {
      "size_in_bytes" : 5226509834484
    },
    "fielddata" : {
      "memory_size_in_bytes" : 326647976,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 8315906,
      "total_count" : 2175939,
      "hit_count" : 44890,
      "miss_count" : 2131049,
      "cache_size" : 1548,
      "cache_count" : 3290,
      "evictions" : 1742
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 19411,
      "memory_in_bytes" : 10035861353,
      "terms_memory_in_bytes" : 7088193492,
      "stored_fields_memory_in_bytes" : 1451567376,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 48147968,
      "points_memory_in_bytes" : 991423913,
      "doc_values_memory_in_bytes" : 456528604,
      "index_writer_memory_in_bytes" : 399487824,
      "version_map_memory_in_bytes" : 247484,
      "fixed_bit_set_memory_in_bytes" : 0,
      "max_unsafe_auto_id_timestamp" : 1600761340377,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 7,
      "data" : 5,
      "coordinating_only" : 2,
      "master" : 5,
      "ingest" : 5
    },
    "versions" : [
      "6.5.3"
    ],
    "os" : {
      "available_processors" : 48,
      "allocated_processors" : 48,
      "names" : [
        {
          "name" : "Linux",
          "count" : 7
        }
      ],
      "mem" : {
        "total_in_bytes" : 405139738624,
        "free_in_bytes" : 16613928960,
        "used_in_bytes" : 388525809664,
        "free_percent" : 4,
        "used_percent" : 96
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 174
      },
      "open_file_descriptors" : {
        "min" : 413,
        "max" : 2374,
        "avg" : 1681
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 1884453538,
      "versions" : [
        {
          "version" : "1.8.0_131",
          "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
          "vm_version" : "25.131-b11",
          "vm_vendor" : "Oracle Corporation",
          "count" : 7
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 76398398920,
        "heap_max_in_bytes" : 182117728256
      },
      "threads" : 924
    },
    "fs" : {
      "total_in_bytes" : 15218212196352,
      "free_in_bytes" : 9864063373312,
      "available_in_bytes" : 9167827349504
    }
  }
}

hope this helps :slight_smile:

I followed the doc below and reduced the heap from 32 GB to 26 GB. (I think I had allocated a very large heap for a relatively small number of shards; going by the basic guideline of up to 20 shards per GB of heap, I have only around 266 shards per node.)
https://www.elastic.co/blog/a-heap-of-trouble#fn4
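For completeness, the change itself is just the standard heap settings in jvm.options on each data node (a sketch of the values described above, assuming the default config layout; nothing else was changed):

-Xms26g
-Xmx26g

With 1336 shards across 5 data nodes (~267 shards per node) and the guideline of staying below 20 shards per GB of heap, a 26 GB heap still allows roughly 20 x 26 = 520 shards per node, so there is plenty of headroom.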

@warkolm Gentle reminder, please.

How many indices and shards are you actively indexing into? How are you indexing into Elasticsearch (what does your Elasticsearch output config in Logstash look like)?

I have around 50 weekly indices and 1 daily index. The weekly indices have 1 shard each and the daily index has 3 shards.
I am simply using a plain output.elasticsearch block for my outputs.
Sorry, I cannot share the file here.
Do you want any specific info from that file?
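For reference, the generic shape of such an output block is sketched below; the host names and index pattern are placeholders (the weekly pattern is only a guess based on the indexName-2020.39 name in the error), not my real values:

output {
  elasticsearch {
    hosts => ["http://es-data-1:9200", "http://es-data-2:9200", "http://es-data-3:9200"]
    index => "indexName-%{+xxxx.ww}"
  }
}

Listing all data nodes in hosts spreads the bulk coordination load across the cluster instead of sending every request to a single node.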