Upper limits on replication and recovery heap usage

Pradyumna_Achar · July 13, 2020, 1:59pm

Hello,
I'm running into a strange situation with elasticsearch. I would like to estimate the Xmx needed for running elasticsearch data nodes while accounting for the heap needed during recovery.

All my documents are small (in KBs), and the writers which index the data into elasticsearch use small bulk requests (of a maximum 5000 documents).

However, when an index goes yellow for some time (due to one node going down and then coming back online), the recovery process sends large requests and starts tripping circuit breakers. The request sizes grow and the index never recovers to green.

I would like to budget for these large requests coming in from the recovery process which seem to be reserving several hundreds of MBs from the circuit breaker. Is there an upper limit to these? In the worst case would an entire shard be sent as a single bulk request across to the other node during replication?

Thank you.

failed to perform indices:data/write/bulk[s] on replica [xx][0], node[wy6bebBaQOC85iEi5vnJrA], [R], s[STARTED], a[id=ZxIAa9ZPRgiKxvT42TUCNg]" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data-1][100.64.89.127:9300][indices:data/write/bulk[s][r]]",

CircuitBreakingException: [parent] Data too large, data for [<transport_request>] .. new bytes reserved: [454588504/433.5mb].

DavidTurner · July 13, 2020, 3:06pm

This isn't recovery traffic, it's a normal bulk indexing request. Recovery actions have names that start with internal:index/shard/recovery/

Recovery actions are limited in size to 512kB and there's no more than 5 of them in flight at once, so that's a max heap usage of under 3MB per recovery. There will be some amount overhead on top of that but it's certainly not several hundred MB.

system · August 10, 2020, 3:06pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
CircuitBreakingException: [parent] Data too large, data for [<transport_request>] Elasticsearch	7	23700	September 5, 2018
Facing data too large exception frequently Elasticsearch	5	619	December 24, 2020
Questions about index.allocation.max_retries? Elasticsearch	3	495	April 14, 2021
Recovery heap usage Elasticsearch	2	371	July 5, 2017
Circuit_breaking_exception Elasticsearch	6	2089	May 24, 2022

Upper limits on replication and recovery heap usage

Related topics