Data too large using BulkProcessor with size limit

Hello,

We are using a BulkProcessor (ES 7.8.0) with the following properties:

  • Actions: 250
  • Size: 2097152 bytes (2MB)
  • Flush Time: 3000 ms
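For reference, those three settings mean a bulk request is flushed as soon as any one of them is hit. A simplified sketch of the trigger logic (not the client's actual implementation, just an illustration of the thresholds above):

```java
// Simplified sketch of the BulkProcessor flush conditions listed above.
// Hypothetical helper, not the Elasticsearch client's real code.
public class FlushPolicy {
    static final int MAX_ACTIONS = 250;
    static final long MAX_BYTES = 2L * 1024 * 1024;   // 2 MB
    static final long FLUSH_INTERVAL_MS = 3000L;

    // A flush fires when ANY threshold is reached, whichever comes first.
    static boolean shouldFlush(int pendingActions, long pendingBytes, long msSinceLastFlush) {
        return pendingActions >= MAX_ACTIONS
            || pendingBytes >= MAX_BYTES
            || msSinceLastFlush >= FLUSH_INTERVAL_MS;
    }

    public static void main(String[] args) {
        System.out.println(shouldFlush(250, 1_000, 100));      // true: action count reached
        System.out.println(shouldFlush(10, 2_097_152, 100));   // true: size limit reached
        System.out.println(shouldFlush(10, 1_000, 3_000));     // true: flush interval elapsed
        System.out.println(shouldFlush(10, 1_000, 100));       // false: nothing reached yet
    }
}
```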

During a load test we are getting the following exception:

        org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [<http_request>] would be [1026389342/978.8mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1026388848/978.8mb], new bytes reserved: [494/494b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=494/494b, accounting=96616/94.3kb]]
    	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
    	at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1897)
    	at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1867)
    	at org.elasticsearch.client.RestHighLevelClient$1.onFailure(RestHighLevelClient.java:1783)
    	at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:598)
    	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:343)
    	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:327)
    	at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
    	at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
    	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
    	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
    	at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
    	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
    	at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
    	at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    	at java.base/java.lang.Thread.run(Thread.java:834)
    	Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:32785], URI [/_bulk?timeout=1m], status line [HTTP/1.1 429 Too Many Requests]
    {"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [1026389342/978.8mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1026388848/978.8mb], new bytes reserved: [494/494b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=494/494b, accounting=96616/94.3kb]","bytes_wanted":1026389342,"bytes_limit":1020054732,"durability":"PERMANENT"}],"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [1026389342/978.8mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1026388848/978.8mb], new bytes reserved: [494/494b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=494/494b, accounting=96616/94.3kb]","bytes_wanted":1026389342,"bytes_limit":1020054732,"durability":"PERMANENT"},"status":429}
    		at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:283)
    		at org.elasticsearch.client.RestClient.access$1700(RestClient.java:97)
    		at org.elasticsearch.client.RestClient$1.completed(RestClient.java:331)
    		... 16 common frames omitted

Shouldn't the BulkProcessor prevent these kinds of errors?

Thanks,
Fabrizio

This is a sign you are overloading the cluster. Please see this blog post for further details.
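For context, the limit in the exception corresponds to the default parent circuit breaker, which in 7.x is 95% of the JVM heap when the real-memory circuit breaker is enabled. A quick arithmetic check, assuming a 1 GiB heap (the heap size isn't stated in the thread, so this is an assumption):

```java
public class BreakerLimit {
    public static void main(String[] args) {
        // Assumed heap size: 1 GiB (not stated in the thread).
        long heapBytes = 1024L * 1024 * 1024;            // 1073741824
        // Default parent breaker in 7.x: 95% of the heap
        // (indices.breaker.total.limit, with the real-memory breaker enabled).
        long limit = (long) (heapBytes * 0.95);
        System.out.println(limit);                        // 1020054732
        System.out.printf("%.1fmb%n", limit / 1024.0 / 1024.0); // 972.7mb
    }
}
```

The result matches the "limit of [1026389342 → 1020054732/972.7mb]" figures in the error, i.e. the heap was already near the breaker ceiling before this small request arrived.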

Thanks a lot for the useful resource.

I would expect the failed requests to be retried automatically, since the BulkProcessor has an exponential backoff retry strategy. Instead, the retry mechanism does not seem to kick in in this case. Is that the expected behaviour?
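For reference, exponential backoff simply doubles the wait between attempts. A generic sketch of the schedule (illustrative only; the client's own `BackoffPolicy` may use a different formula and retry count):

```java
public class Backoff {
    // Generic exponential backoff: the delay doubles on each retry.
    // Illustrative only; not the Elasticsearch client's exact schedule.
    static long delayMs(long initialMs, int retry) {
        return initialMs << retry;   // initialMs * 2^retry
    }

    public static void main(String[] args) {
        for (int retry = 0; retry < 5; retry++) {
            System.out.println(delayMs(50, retry)); // 50, 100, 200, 400, 800
        }
    }
}
```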

@Christian_Dahlqvist - two questions from that blog:

  1. Is a bulk request atomic, i.e. if a sub-request fails on a node, is the whole batch rejected? And if so, how does that work, given that most nodes will already have indexed their docs?
  2. Is bulk indexing multi-threaded, either on the coordinating node or the data node, i.e. how does it scale with cores? My impression is that it's better for the sender to issue parallel bulk requests, which would imply it's single-threaded on the coordinating node, the data nodes, or both.

Parts of a bulk request can fail, so it is not atomic. Indexing is, as far as I know, single-threaded per shard, but shards are processed in parallel.

Does the error response indicate which docs failed? Otherwise I'd think the sender would not know what to re-submit. Sorry, my knowledge on this is poor.

Threaded per shard makes sense, as long as the node-level queue is keyed by shard, which I guess it would be since the coordinating node routed each sub-request to a specific shard. So if my node holds 4 shards across two indexes, it can use 4 threads to index in parallel. In that case sending parallel bulk batches won't scale further on the data node, though that may not hold on the coordinating side.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.