We are seeing frequent occurrences of CircuitBreakingException in our ES cluster (7.2.0). The stack trace of the exception is as follows:
[2019-10-16T00:34:30,850][DEBUG][o.e.a.a.c.n.i.TransportNodesInfoAction] [es-data-565858079-3-701791534] failed to execute on node [6R0-yLYrSPi6l2az_0LQ9g]
org.elasticsearch.transport.RemoteTransportException: [es-data-565858085-2-701791749][10.118.18.234:9300][cluster:monitor/nodes/info[n]]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [20396991666/18.9gb], which is larger than the limit of [20293386240/18.8gb], real usage: [20396986048/18.9gb], new bytes reserved: [5618/5.4kb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:343) ~[elasticsearch-7.2.0.jar:7.2.0]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.2.0.jar:7.2.0]
at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:173) [elasticsearch-7.2.0.jar:7.2.0]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:121) [elasticsearch-7.2.0.jar:7.2.0]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:105) [elasticsearch-7.2.0.jar:7.2.0]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:660) [elasticsearch-7.2.0.jar:7.2.0]
The node on which the above exception occurred then left the cluster, and the cluster status reported by localhost:9200/_cluster/health?format=json&pretty turned red.
Please tell us the root cause and how to avoid it.
Max JVM heap of the failed data node: 20 GB
RAM of the failed data node: 110 GB
Index size on the failed data node: 51.8 GB
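From our reading of the exception message, and assuming the 7.x defaults (indices.breaker.total.use_real_memory: true with the parent breaker limit at 95% of the heap), the numbers in the message line up exactly:

0.95 × 21,361,459,200 B (the configured heap, ~19.9 GiB) = 20,293,386,240 B, which is the reported limit of 18.8gb
20,396,986,048 B (real usage) + 5,618 B (new reservation) = 20,396,991,666 B, which exceeds the limit, so the breaker trips

In other words, the heap was already above the 95% threshold before the 5.4 KB transport request arrived; the tiny request is only the trigger, and the underlying problem appears to be overall heap pressure on the node.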
Thanks.
The information about our cluster is as below:
Number_of_nodes: 18
Number_of_data_nodes: 12
Active_primary_shards: 5 (including .kibana_1)
Replication factor: 3
Active_shards: 14 (4 data shards × replication factor 3 = 12, plus the remaining 2 shards for .kibana_1)
Documents on one data shard: 19 million (48 GB)
The queries we are running are as below:
To update documents, we use the Update API with Painless scripts.
The BulkProcessor class is used for bulk operations; currently we flush the bulk after every 10 requests (a minimal sketch of this write path follows the list).
Read query: a range query that returns documents whose date field lies within the given range (a sketch of this also follows).
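Here is a minimal sketch of the write path, assuming the 7.2 High Level REST Client; the index name (my-index), the field name (last_updated), and the script body are placeholders rather than our real ones:

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

import java.util.Collections;

public class WritePath {

    public static BulkProcessor buildProcessor(RestHighLevelClient client) {
        BulkProcessor.Listener listener = new BulkProcessor.Listener() {
            @Override public void beforeBulk(long id, BulkRequest request) { }
            @Override public void afterBulk(long id, BulkRequest request, BulkResponse response) { }
            @Override public void afterBulk(long id, BulkRequest request, Throwable failure) {
                failure.printStackTrace(); // real code should log and retry
            }
        };
        return BulkProcessor.builder(
                (request, bulkListener) ->
                        client.bulkAsync(request, RequestOptions.DEFAULT, bulkListener),
                listener)
            .setBulkActions(10)      // flush the bulk after every 10 requests, as described above
            .setConcurrentRequests(1)
            .build();
    }

    public static void queueUpdate(BulkProcessor processor, String docId, long epochMillis) {
        // Painless script that stamps an update time on the document (illustrative only).
        Script script = new Script(ScriptType.INLINE, "painless",
                "ctx._source.last_updated = params.ts",
                Collections.<String, Object>singletonMap("ts", epochMillis));
        processor.add(new UpdateRequest("my-index", docId).script(script));
    }
}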
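And a minimal sketch of the read query, under the same naming assumptions (the page size of 100 is illustrative):

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.io.IOException;

public class ReadPath {
    // Fetch documents whose date field was updated within the last hour.
    public static SearchResponse updatedInLastHour(RestHighLevelClient client) throws IOException {
        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(QueryBuilders.rangeQuery("last_updated").gte("now-1h"))
                .size(100);
        return client.search(new SearchRequest("my-index").source(source), RequestOptions.DEFAULT);
    }
}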
Looking at the query you posted, it seems you have not specified a minimum_should_match parameter for your should clause. Is this query giving the expected results? What is the purpose of this query? What is the average size of your documents?
Hi Christian, thank you for replying to this thread. The query is working as expected, and it returns the same result with or without the minimum_should_match parameter. The document size is less than 5 KB. The purpose of this query is to fetch records updated in the last hour, using the filter "gte": "now-1h".