I've run into trouble with the following exception:
org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<http_request>] would be [32136353583/29.9gb], which is larger than the limit of [32127221760/29.9gb]
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:230) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:232) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:336) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174) [elasticsearch-6.2.4.jar:6.2.4]
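To put the two numbers in the message side by side, here is a minimal Python sketch that parses the exception text and computes how far over the limit the estimate was. The helper name `parse_breaker_message` is purely illustrative, and the regular expression assumes the message always has the shape shown above.

```python
import re

# Message text copied from the exception above.
MESSAGE = (
    "[parent] Data too large, data for [<http_request>] would be "
    "[32136353583/29.9gb], which is larger than the limit of "
    "[32127221760/29.9gb]"
)


def parse_breaker_message(message):
    """Return (breaker, label, would_be_bytes, limit_bytes).

    Hypothetical helper: assumes the 'Data too large' wording shown above.
    """
    pattern = re.compile(
        r"\[(?P<breaker>[^\]]+)\] Data too large, data for \[(?P<label>[^\]]+)\] "
        r"would be \[(?P<would_be>\d+)/[^\]]+\], "
        r"which is larger than the limit of \[(?P<limit>\d+)/[^\]]+\]"
    )
    match = pattern.search(message)
    if match is None:
        raise ValueError("not a 'Data too large' message")
    return (
        match["breaker"],
        match["label"],
        int(match["would_be"]),
        int(match["limit"]),
    )


breaker, label, would_be, limit = parse_breaker_message(MESSAGE)
overshoot = would_be - limit  # bytes above the configured limit
print(
    f"{breaker} breaker, label {label}: over the limit by "
    f"{overshoot} bytes ({overshoot / 2**20:.1f} MiB)"
)
```

The overshoot is 9131823 bytes, roughly 8.7 MiB, so the HTTP request itself looks small. As far as I understand, the parent breaker's estimate combines all child breakers, which would mean the tracked memory was already close to the limit before this request arrived.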
I'm sure our hardware setup is quite tight for the amount of data we want to index, but I'd still like to understand what exactly causes this type of error and whether there is a way to avoid it.
In the past, there was one extremely large index (10+ TB) with shards several times bigger than the recommended maximum of 50 GB per shard. The new setup has many more shards, and the index is split into several smaller indices. At first it was fine, but as the amount of indexed data grew, the very same error started to appear again. There is no cluster, just a single Elasticsearch instance handling all indexing and querying.
Playing with the circuit breaker limits seems to postpone the error a bit, but it still occurs, and with more data it is inevitable.
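For reference, the per-breaker estimates and limits can be read from GET _nodes/stats/breaker, which helps to show which child breaker takes up most of the parent estimate. Below is a minimal Python sketch that summarizes such a response. The sample numbers are hypothetical (only the parent limit is taken from the exception above), and the function names and the http://localhost:9200 address are assumptions for illustration.

```python
import json
import urllib.request


def fetch_breaker_stats(base_url="http://localhost:9200"):
    """Fetch breaker stats from a node (assumed address; not called below)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/breaker") as resp:
        return json.load(resp)


def summarize_breakers(stats):
    """Return (node_id, breaker, estimated, limit, fraction_used, tripped) rows,
    sorted by estimated size, largest first."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        for name, b in node.get("breakers", {}).items():
            est = b["estimated_size_in_bytes"]
            lim = b["limit_size_in_bytes"]
            ratio = est / lim if lim > 0 else 0.0  # guard against a non-positive limit
            rows.append((node_id, name, est, lim, ratio, b.get("tripped", 0)))
    rows.sort(key=lambda r: r[2], reverse=True)
    return rows


# Hypothetical response, trimmed to the fields used here. Only the parent
# limit (32127221760) comes from the exception; every other value is made up.
SAMPLE_STATS = {
    "nodes": {
        "node-1": {
            "breakers": {
                "request": {"limit_size_in_bytes": 27000000000, "estimated_size_in_bytes": 0, "tripped": 0},
                "fielddata": {"limit_size_in_bytes": 27000000000, "estimated_size_in_bytes": 2000000000, "tripped": 0},
                "in_flight_requests": {"limit_size_in_bytes": 45000000000, "estimated_size_in_bytes": 9000000, "tripped": 0},
                "accounting": {"limit_size_in_bytes": 45000000000, "estimated_size_in_bytes": 30000000000, "tripped": 0},
                "parent": {"limit_size_in_bytes": 32127221760, "estimated_size_in_bytes": 32009000000, "tripped": 12},
            }
        }
    }
}

rows = summarize_breakers(SAMPLE_STATS)
for node_id, name, est, lim, ratio, tripped in rows:
    print(f"{node_id} {name:<20} {est / 2**30:6.1f} GiB of {lim / 2**30:6.1f} GiB "
          f"({ratio:.0%}) tripped={tripped}")
```

Against a live node, summarize_breakers(fetch_breaker_stats()) would produce the same table from real numbers. If the accounting breaker dominates, as in this made-up sample, that would point at memory held by the indexed data itself rather than at any single request.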
- What exactly is causing the "Data too large" error on bigger data sets?
- What are the possible ways of avoiding or preventing it, other than adding more nodes or more powerful hardware?
Thanks in advance.