In ES 7.8 parent breaker is tripping a lot and causing unallocation of shards

sreekanth · July 14, 2022, 11:14am

We have an 8-node cluster with a high load, mainly bulk ingest. The same load was previously handled well by 6 nodes on ES 6.8. Since moving to 7.8, many replica shards become unassigned under load.

The allocation explain API reports the reason as:

"details" : "failed shard on node [zC2EkvPLQiWpJ_YjnllD5w]: failed to perform indices:data/write/bulk[s] on replica [10fc5a76ee7042b3ad5bf620ac9fdb39-psrtenant15-fa-cse-asset][0], node[zC2EkvPLQiWpJ_YjnllD5w], [R], s[STARTED], a[id=6xKPtXO5TeyjZL12zRA7rA], failure RemoteTransportException[[psrnativefa112521-esdata4][100.104.145.203:9300][indices:data/write/bulk[s][r]]]; nested: CircuitBreakingException[[parent] Data too large, data for [indices:data/write/bulk[s][r]] would be [31182253448/29gb], which is larger than the limit of [30601641984/28.5gb], real usage: [31181936024/29gb], new bytes reserved: [317424/309.9kb], usages [request=256/256b, fielddata=64205239/61.2mb, in_flight_requests=60178048/57.3mb, accounting=1148757896/1gb]]; ",

Issue: the parent breaker is hitting its limit of 28.5GB, and our heap is 30GB.
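
The numbers in the breaker message can be cross-checked with a short script. This is only a sketch: the helper names `parse_breaker_message` and `parent_limit` are made up for illustration, the heap is assumed to be exactly 30 GiB, and the 95% figure assumes the 7.x default for `indices.breaker.total.limit` with real-memory accounting enabled.

```python
import re

GIB = 1024 ** 3

# Breaker portion of the failure message quoted above.
MESSAGE = (
    "CircuitBreakingException[[parent] Data too large, data for "
    "[indices:data/write/bulk[s][r]] would be [31182253448/29gb], "
    "which is larger than the limit of [30601641984/28.5gb], "
    "real usage: [31181936024/29gb], new bytes reserved: [317424/309.9kb], "
    "usages [request=256/256b, fielddata=64205239/61.2mb, "
    "in_flight_requests=60178048/57.3mb, accounting=1148757896/1gb]]"
)


def parse_breaker_message(message: str) -> dict:
    """Extract the byte counts from a parent breaker 'Data too large' message."""
    patterns = {
        "would_be": r"would be \[(\d+)/",
        "limit": r"limit of \[(\d+)/",
        "real_usage": r"real usage: \[(\d+)/",
        "new_bytes": r"new bytes reserved: \[(\d+)/",
    }
    parsed = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"field {name!r} not found in message")
        parsed[name] = int(match.group(1))
    # Child breaker usages appear as name=bytes/human-readable pairs.
    parsed["usages"] = {
        name: int(value) for name, value in re.findall(r"(\w+)=(\d+)/", message)
    }
    return parsed


def parent_limit(heap_bytes: int, limit_percent: int = 95) -> int:
    """Parent breaker limit as an integer percentage of the heap (assumed default: 95)."""
    return heap_bytes * limit_percent // 100


fields = parse_breaker_message(MESSAGE)
tracked = sum(fields["usages"].values())

print("limit is 95% of 30 GiB:", parent_limit(30 * GIB) == fields["limit"])
print("would_be = real usage + new bytes:",
      fields["real_usage"] + fields["new_bytes"] == fields["would_be"])
print("over limit by bytes:", fields["would_be"] - fields["limit"])
print("sum of child breaker usages:", tracked)
```

The child breaker usages listed in the message add up to roughly 1.2gb, so most of the 29gb reported as real usage is heap that no child breaker accounts for.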

If we increase the parent breaker limit to 29.5GB, fewer shards become unassigned, but the issue persists.
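
For reference, a change like this could be expressed through the cluster settings API roughly as follows. This is a sketch under assumptions: the thread does not say how the limit was changed, the base URL `http://localhost:9200` and the helper name `breaker_limit_request` are placeholders, and a transient setting is used only for illustration. The request is built but not sent.

```python
import json
from urllib.request import Request


def breaker_limit_request(base_url: str, limit: str) -> Request:
    """Build (but do not send) a PUT _cluster/settings request for the parent breaker limit."""
    body = {"transient": {"indices.breaker.total.limit": limit}}
    return Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# 29.5GB expressed in whole megabytes: 29.5 * 1024 = 30208mb.
request = breaker_limit_request("http://localhost:9200", "30208mb")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, with whatever authentication the cluster requires.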

Our JVM options already include the flags below, which should help in this case according to a few older discussions, but they are not helping much.

-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30

Please let us know what can be done to avoid this. We could disable this breaker, but it exists for a reason and we don't want to disable it.

sreekanth · July 18, 2022, 5:10am

Hi Team, any update on this? A few questions:

How frequently is the parent breaker usage calculated?
When calculating parent breaker usage, do you take the current heap usage at that moment, or the usage after GC? In my case, JVM heap usage goes above 29GB several times, but GC brings it back down below 22GB. So this should not be treated as a circuit break, right?
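
One way to watch what the parent breaker sees over time is to poll `GET _nodes/stats/breaker` and compare the parent breaker's estimated size with its limit on each node. The sketch below only summarizes a response that has already been fetched. The helper name `summarize_parent_breaker` is hypothetical, and `SAMPLE` is made-up data shaped like the breaker section of the node stats response, not output from this cluster.

```python
def summarize_parent_breaker(stats: dict) -> list:
    """Return per-node parent breaker usage as a percentage of its limit."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        parent = node["breakers"]["parent"]
        limit = parent["limit_size_in_bytes"]
        used = parent["estimated_size_in_bytes"]
        rows.append({
            "node": node.get("name", node_id),
            "used_percent": round(100 * used / limit, 1),
            "tripped": parent["tripped"],  # cumulative trip count for this breaker
        })
    return rows


# Hypothetical sample: 22 GiB in use against the 28.5gb limit from the error above.
SAMPLE = {
    "nodes": {
        "node-a": {
            "name": "esdata-example",
            "breakers": {
                "parent": {
                    "limit_size_in_bytes": 30601641984,
                    "estimated_size_in_bytes": 23622320128,
                    "tripped": 12,
                }
            },
        }
    }
}

rows = summarize_parent_breaker(SAMPLE)
for row in rows:
    print(row)
```

Sampling this every few seconds during bulk ingest, alongside GC logs, should show whether the `tripped` counter increases at heap peaks that a subsequent collection would have cleared.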

DavidTurner · July 18, 2022, 10:58am

7.8 is really old, long past EOL, and newer versions are much more memory-efficient.

The first thing to try is to upgrade to a version that hasn't passed EOL.

system · August 15, 2022, 10:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
