ES Spark Connector - Circuit Breaker error

Daniel_Solow · June 25, 2019, 2:02pm

I'm using ES 7.1.1 and Spark 2.4.2. The ES cluster is on Google Kubernetes Engine and the Spark cluster is on Google Dataproc.

Big jobs are failing with the following error, often several hours into the job:

19/06/25 08:46:15 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 5.0 in stage 2.0 (TID 556, cluster.name, executor 3): org.apache.spark.util.TaskCompletionListenerException: org.elasticsearch.hadoop.rest.EsHadoopRemoteException: circuit_breaking_exception: [parent] Data too large, data for [<http_request>] would be [5127135952/4.7gb], which is larger than the limit of [5067151769/4.7gb], real usage: [5127135952/4.7gb], new bytes reserved: [0/0b]

It then prints the batch request, which is very large.

Any ideas on how to prevent this kind of error? It looks to me like ES is not keeping up with the rate of requests, so memory usage is increasing until requests are rejected.

In this case, it would be nice if Spark slowed down. It looks like retries are enabled, but the back-off time doesn't appear to increase.

Any tips on resolving this problem?

dakrone · July 2, 2019, 5:13pm

Hi Daniel,

This sort of error can have a number of causes. Taking a look at the message:

circuit_breaking_exception: [parent] Data too large, data for [<http_request>] would be [5127135952/4.7gb], which is larger than the limit of [5067151769/4.7gb], real usage: [5127135952/4.7gb], new bytes reserved: [0/0b]

In this case the "parent" breaker was tripped. The parent breaker accounts for the sum of all the other breakers, so the first thing to do is check the nodes stats API with:

GET /_nodes/stats/breaker?human&pretty

This returns all the breakers for each node. You can then see whether any of the other breakers is contributing to the usage that caused the parent breaker to trip.
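If the output is long, a small script can rank the breakers by how full they are. This is only a sketch: it assumes the response has already been parsed into a Python dict (for example with an HTTP client's JSON decoding) and that each breaker entry carries `limit_size_in_bytes`, `estimated_size_in_bytes` and `tripped`. The helper name `rank_breakers` and the sample values are made up for illustration; only the parent numbers are taken from the error message above.

```python
def rank_breakers(stats):
    """Return (node, breaker, used_bytes, limit_bytes, ratio, tripped) rows, fullest first."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        node_name = node.get("name", node_id)
        for breaker, info in node.get("breakers", {}).items():
            limit = info["limit_size_in_bytes"]
            used = info["estimated_size_in_bytes"]
            # Guard against a non-positive limit so the ratio stays well defined.
            ratio = used / limit if limit > 0 else 0.0
            rows.append((node_name, breaker, used, limit, ratio, info.get("tripped", 0)))
    rows.sort(key=lambda row: row[4], reverse=True)
    return rows


# Hypothetical sample shaped like the nodes stats response (values are illustrative).
sample = {
    "nodes": {
        "node-id-1": {
            "name": "es-data-0",
            "breakers": {
                "request": {"limit_size_in_bytes": 3200306380, "estimated_size_in_bytes": 0, "tripped": 0},
                "fielddata": {"limit_size_in_bytes": 2133537587, "estimated_size_in_bytes": 0, "tripped": 0},
                "in_flight_requests": {"limit_size_in_bytes": 5333843968, "estimated_size_in_bytes": 1048576, "tripped": 0},
                "parent": {"limit_size_in_bytes": 5067151769, "estimated_size_in_bytes": 5127135952, "tripped": 1},
            },
        }
    }
}

for node_name, breaker, used, limit, ratio, tripped in rank_breakers(sample):
    print(f"{node_name} {breaker}: {used}/{limit} bytes ({ratio:.1%}), tripped={tripped}")
```

If only the parent breaker is near its limit while the child breakers are close to empty, the memory is probably being used somewhere the individual breakers do not track.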

Next, since this is 7.1, the real memory circuit breaker samples the actual heap usage of ES to try to prevent an OutOfMemoryError. If the individual breakers don't tell you where the memory is being used, it may be worth checking how large the requests you are sending to ES are.
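On the Spark side, the size of each bulk request and the retry behavior are controlled by the connector's `es.batch.*` settings. The sketch below just builds an options dict. The helper name `es_write_options` and the example values are assumptions for illustration, not recommendations, and the defaults shown are what I believe the connector ships with. Note that batches are sent per task, so the total volume hitting ES at once scales with the number of concurrent tasks.

```python
def es_write_options(batch_bytes="1mb", batch_entries=1000,
                     retry_count=3, retry_wait="10s"):
    """Build connector write options as strings, ready to pass to a DataFrame writer."""
    return {
        # Flush a bulk request once either of these thresholds is reached.
        "es.batch.size.bytes": batch_bytes,
        "es.batch.size.entries": str(batch_entries),
        # Retries after bulk rejections; the wait between attempts is a fixed
        # interval as far as I know, not an increasing back-off.
        "es.batch.write.retry.count": str(retry_count),
        "es.batch.write.retry.wait": retry_wait,
    }


# Example: smaller batches and a longer wait between retries (illustrative values).
opts = es_write_options(batch_bytes="512kb", batch_entries=500,
                        retry_count=6, retry_wait="30s")
for key, value in sorted(opts.items()):
    print(f"{key}={value}")

# With PySpark this could then be applied roughly as follows
# ("my-index" is a placeholder index name):
#   df.write.format("org.elasticsearch.spark.sql").options(**opts).save("my-index")
```

Lowering the batch size or the number of concurrent writing tasks reduces the pressure on ES, at the cost of a slower job.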

Daniel_Solow · July 2, 2019, 5:25pm

Thanks for your response.

We figured out the issue: the analyzers we were using unnecessarily included a "completion" analyzer that uses a lot of JVM memory for large indices. Removing this analyzer resolved the problem.

system · July 30, 2019, 5:25pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
