Elasticsearch (7.7.1) becomes unresponsive for some time due to circuit breaker

Hi Team,

I am using Elasticsearch 7.7.1 with 3 data nodes (31GB) and 3 master nodes (16GB). Under heavy load, I get the error below and cannot see my data node in the monitoring tool for 10-15 minutes. During that time, search does not work either. After 10-15 minutes, I can see all data nodes in the monitoring tool again.

Data too large, data for [<transport_request>] would be [16329231844/15.2gb], which is larger than the limit of [16320875724/15.1gb], real usage: [16329230592/15.2gb], new bytes reserved: [1252/1.2kb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=1252/1.2kb, accounting=0/0b]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:347) ~[na:na]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[na:na]
at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:171) ~[na:na]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:119) ~[na:na]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:103) ~[na:na]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:676) ~[na:na]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) ~[na:na]

During that time, my search is not working. Can you help me with that?
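
For reference, the numbers in the breaker message above can be checked with a little arithmetic. The sketch below is only an illustration: the function name `parse_breaker_message` is made up for this example, and the 95% factor assumes the parent breaker is still at its 7.x default (`indices.breaker.total.use_real_memory: true` with `indices.breaker.total.limit` unchanged), which the post does not confirm.

```python
import re

# Relevant part of the breaker message, copied from the log above.
MESSAGE = (
    "Data too large, data for [<transport_request>] would be "
    "[16329231844/15.2gb], which is larger than the limit of "
    "[16320875724/15.1gb], real usage: [16329230592/15.2gb], "
    "new bytes reserved: [1252/1.2kb]"
)


def parse_breaker_message(message):
    """Extract the raw byte counts from a parent circuit breaker message."""
    pattern = (
        r"would be \[(\d+)/[^\]]+\], which is larger than the limit of "
        r"\[(\d+)/[^\]]+\], real usage: \[(\d+)/[^\]]+\], "
        r"new bytes reserved: \[(\d+)/"
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("not a parent breaker message")
    would_be, limit, real_usage, reserved = (int(g) for g in match.groups())
    return {
        "would_be": would_be,
        "limit": limit,
        "real_usage": real_usage,
        "reserved": reserved,
    }


stats = parse_breaker_message(MESSAGE)

# The rejected total is simply current heap usage plus the new request.
assert stats["would_be"] == stats["real_usage"] + stats["reserved"]

# How far over the limit the request would have pushed the heap.
overrun_bytes = stats["would_be"] - stats["limit"]

# ASSUMPTION: parent limit is the default 95% of the JVM heap.
implied_heap_gib = stats["limit"] / 0.95 / 2**30

print(f"over the limit by {overrun_bytes} bytes")
print(f"implied heap size: {implied_heap_gib:.2f} GiB")
```

Under that assumption, the limit corresponds to a heap of about 16 GiB. The message also shows that the child breakers (request, fielddata, in_flight_requests, accounting) account for almost nothing, so the breaker appears to be tripping on real heap usage rather than on one oversized request.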

Other relevant modifications to my jvm.options file

8-13:-XX:+UseG1GC
8-13:-XX:G1ReservePercent=15
8-13:-XX:InitiatingHeapOccupancyPercent=40
8-13:-XX:MaxGCPauseMillis=400
8-13:-XX:ConcGCThreads=4
8-13:-XX:+ParallelRefProcEnabled
8-13:-XX:+UseTLAB
8-13:-XX:+UseStringDeduplication
8-13:-Xlog:gc*=info:file=gc.log:time,tags:filecount=100,filesize=1024k

Other relevant modifications to my elasticsearch.yml file

node.master: true
node.data: false
node.ingest: false
bootstrap.system_call_filter: false
xpack.ml.enabled: false
node.transform: false
node.ml: false
xpack.transform.enabled: false
node.remote_cluster_client: false
discovery.seed_providers: file
thread_pool.write.size: 16
thread_pool.write.queue_size: 8000

Other stats:
Total Indices: 63
2 Primary Shards, 1 Replica
Total Size: 61 GB

We have changed the cluster-level settings below.

"refresh_interval": "15s"
"search.idle.after" : "10000d"

Can anyone suggest what is wrong here?

Thanks

What is the output from hot_threads during that time?
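
In case it helps, here is a minimal sketch of one way to capture that output while the node is unresponsive. It calls the `_nodes/hot_threads` API using only the Python standard library. The address `http://localhost:9200`, the absence of authentication, and the helper names `hot_threads_url` and `fetch_hot_threads` are assumptions for illustration, so adjust them to your cluster.

```python
import urllib.parse
import urllib.request


def hot_threads_url(base_url, threads=3, interval="500ms"):
    """Build the URL for the nodes hot threads API."""
    query = urllib.parse.urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_hot_threads(base_url, timeout=30):
    """Return the plain-text hot threads report for all nodes.

    ASSUMPTION: the cluster is reachable at base_url without authentication.
    """
    url = hot_threads_url(base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `print(fetch_hot_threads("http://localhost:9200"))` a few times during the 10-15 minute window, and saving each result, should show what the busiest threads on each node are doing.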
