EsSparkSQL.saveToEs() CircuitBreakingException: [parent] Data too large, data for [<transport_request>]

I'm trying to take a single-line JSON datasource of around 26M records, apply some logic (two filters and then a select to get the desired results, 6 fields in total) and save them to ES... so far so good.
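For context, this is roughly the shape of the pipeline (a minimal sketch only; the path, filter conditions, field names and index are placeholders, not my real ones):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.elasticsearch.spark.sql.EsSparkSQL

val spark = SparkSession.builder()
  .appName("json-to-es")
  .config("es.nodes", "localhost")   // placeholder ES node
  .getOrCreate()

// read the single-line JSON source (~26M records)
val source = spark.read.json("/path/to/source.json")   // placeholder path

// two filters plus a select down to the 6 fields I need (all placeholders)
val result = source
  .filter(col("status") === "active")
  .filter(col("amount") > 0)
  .select("id", "name", "status", "amount", "country", "created")

// hand the Dataset over to the connector
EsSparkSQL.saveToEs(result, "myindex/docs")   // placeholder index/type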

The problem seems to be EsSparkSQL.saveToEs(), which always raises the circuit breaker exception. If this connector is responsible for doing whatever it does (I don't really know why it needs such a large number of tasks/jobs to save already formatted data, except that it is not JSON) and then saving to ES, why is this exception raised? Shouldn't it be smart enough to check the maximum data size and flush the bulk before that limit is exceeded?
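If it is a matter of how much data each bulk request carries, I would expect the connector's batch settings (es.batch.size.bytes / es.batch.size.entries) to cap each request. A sketch of what I mean, with example values only:

import org.elasticsearch.spark.sql.EsSparkSQL

// example values only: cap each bulk request by size and by document count
val esCfg = Map(
  "es.batch.size.bytes"        -> "1mb",   // flush once the bulk buffer reaches ~1 MB per task
  "es.batch.size.entries"      -> "500",   // or once it holds 500 documents, whichever comes first
  "es.batch.write.retry.count" -> "3"      // retry a rejected bulk a few times before failing
)

EsSparkSQL.saveToEs(result, "myindex/docs", esCfg)   // "result" is the Dataset from the sketch above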

This is the exact exception it bails out with:
org.elasticsearch.hadoop.rest.EsHadoopRemoteException: circuit_breaking_exception: [parent] Data too large, data for [<transport_request>] would be [259268254/247.2mb], which is larger than the limit of [254332108/242.5mb], real usage: [258560280/246.5mb], new bytes reserved: [707974/691.3kb], usages [request=0/0b, fielddata=38332/37.4kb, in_flight_requests=1395602/1.3mb, accounting=5724890/5.4mb]

I've also tried raising the breaker limit to 99%, but with the same result. KO.

// PUT a persistent cluster setting to raise the parent breaker limit to 99%
String payload = "{\"persistent\" : {\"indices.breaker.total.limit\" : \"99%\"}}";
StringRequestEntity requestEntity = new StringRequestEntity(payload, "application/json", "UTF-8");
PutMethod putMethod = new PutMethod(host + CLUSTER_SETTINGS_ENDPOINT); // e.g. /_cluster/settings
putMethod.setRequestEntity(requestEntity);
int statusCode = httpClient.executeMethod(putMethod); // HTTP status code, 200 when the setting is applied

If the answer is no, how can I fix this?
Thanks.
