Parent data too large?

I am getting the below error on a shard assignment.

org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [internal:index/shard/recovery/translog_ops] would be [28701173408/26.7gb], which is larger than the limit of [28306518835/26.3gb], real usage: [28700173096/26.7gb], new bytes reserved: [1000312/976.8kb], usages [fielddata=17644219506/16.4gb, eql_sequence=0/0b, model_inference=0/0b, inflight_requests=1461110/1.3mb, request=0/0b]
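For what it's worth, the numbers in the message add up exactly; a quick sketch (values copied verbatim from the exception above) shows the breaker fires on total tracked heap usage plus the new reservation, not on the size of the shard being moved:

```python
# Values taken verbatim from the CircuitBreakingException above.
real_usage = 28_700_173_096   # current tracked heap usage (~26.7gb)
new_bytes = 1_000_312         # ~976.8kb reserved for the translog ops request
limit = 28_306_518_835        # parent breaker limit (~26.3gb)

# The breaker rejects the request when existing usage plus the new
# reservation would exceed the limit -- the small translog batch is
# just the final straw on an already-full heap.
would_be = real_usage + new_bytes
print(would_be, would_be > limit)  # 28701173408, True (matches the message)
```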

But the shard is only a few hundred MB in size. Why is it complaining about tripping a limit in the tens of GB?

I am encountering an issue where the cluster is attempting to assign all the replica shards to one single node out of about 90. This is a very strange balancing algorithm. Digging further, I found that when I tried manually assigning the shard to other nodes, I got this same error.

Any idea?

It is hard to speculate without more information.

What is the full output of the cluster stats API?

We have just added 10 more data nodes, so it is moving shards now. Since it’s a production cluster, I’ll wait for it to finish before investigating further, if the issue persists.

My gut feeling is that the nodes are holding too much data and it tripped the circuit breaker’s limit on how much data can be kept in RAM.

Is that particular circuit breaker designed for the purpose I described above? I guess that was kind of what I was originally asking.

That’s the only way the error message would make sense (guessing here, trying to get clarity). The text description seems to suggest the data being moved is too large, which is clearly not the case.

Thanks.

It looks like you may have high heap pressure, and the stats I requested may help point to possible causes. If you are monitoring heap usage, look for patterns across the nodes in GC frequency and average heap levels.
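If you can query the cluster directly, the per-node breaker and JVM/GC stats come from the nodes stats API (request shown in Kibana Dev Tools syntax; interpret the numbers against your own heap settings):

```
GET _nodes/stats/breaker,jvm
```

The `breakers.parent` section shows each node's current estimated usage against its limit, and `jvm.gc` shows collection counts and times you can compare across nodes.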

The cluster seems fine now. I ended up deleting some old indices and adding new data nodes.

The observed behavior stopped for now.

I’m wondering how much data per data node is recommended?

We are running on AWS m7g.4xlarge (or equivalent with 64GB RAM). EBS is at 5TB. We don’t let it go above 80%. Is 5TB too much per node?

There is no easy general answer to this as it depends a lot on the use case specifics, e.g. the data, mappings and the level and mix of load the cluster is under.

Assuming all indices are active, what is a general guideline?

Is 5TB way too much? About right? Too low (meaning ES can handle much more easily)? Etc.

All indices being active is, as I stated earlier, not enough information. I have seen logging use cases with 5TB of data or more per node. I have also seen search-heavy use cases with a lot of analysis and advanced features be limited at much lower levels.

The actual version you are on also matters, and I forgot to mention this in the previous reply. Recent versions of Elasticsearch require less overhead per data volume compared to older ones.

You are considering a multi-dimensional issue along just one dimension: how much data per node. A few other dimensions are how much new data per node per day, the typical and peak query profiles, your requirements for indexing throughput and query response, and your expectations for recovery time if/when a node fails. Plus plenty of others.

To give a sporting analogy, you are asking how fast your strikers should be for a successful football team.


Thanks for all the responses. I know it’s hard to give a suggestion without knowing more. But the reality is that it’s also hard to give a precise description to someone who’s not involved. Sometimes too much info is counterproductive. 🙂

I think I got my answer from the responses already. 5TB is within range.

If I had said my node contains 5 petabytes of data, someone would have chimed in (out of curiosity) asking why such a large size per node, started giving better suggestions, etc.

Just to give more insight into our usage, in case someone finds this thread in the future with a similar question.

We are very write heavy; therefore, my question was purely predicated on the relationship of EBS size to write performance on AWS. If reading were causing CPU spikes, I could easily determine that and make changes to our cluster (relative to our use case). So my original question had already removed the search dimension.

Thanks again.

As you did not provide any details about the use case, I was not aware that it was write heavy. When I have sized clusters in the past, especially on older versions of Elasticsearch (you have not specified which version you are on), I have generally kept nodes that need to support heavy write loads at a lot less data than 5TB. I also try to make sure these types of nodes have very fast storage, ideally ephemeral SSDs on AWS. You have not specified what type of EBS you are using, nor whether you have any provisioned IOPS. How far you will get with your current setup will depend on a number of factors, e.g. the number of indices and shards actively indexed into, average bulk size, average write size, document size/complexity, and what the load profile looks like.

SSD (we increased the max throughput to 300 from the default of 125), version 8.10. Each node has fewer than 300 shards.

I keep forgetting about these.

Sorry.

Our cluster mainly hosts monthly indices, some daily, and far fewer hourly (the ratio is about 9:2:1).

So every hour, day, and month, new indices are created and old ones deleted.

Is that IOPS you specified? For a write heavy load 300 IOPS is very low and could easily become a bottleneck if it is not already. It may cause operations to take longer than they should, which could in turn result in additional heap pressure.

Of the 300 shards per node, how many are actively written to?

I have never used hourly indices. The reason for this is that they often result in small shards, often less than 10GB in size. If the indexing rate is large enough to result in shard sizes within the recommended range of 30GB to 50GB, I have often found that instead increasing the number of primary shards to spread the load out works better. The only reason I would consider using hourly indices is if I had a very short retention period, typically 24-48 hours or less, but I have yet to come across such a scenario.
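To make that concrete, here is a rough back-of-the-envelope sketch (the daily ingest figure is a made-up example, not from this thread) of checking whether an index interval lands shards in the commonly recommended 30GB to 50GB range:

```python
# Hypothetical sizing sketch: pick an index interval and primary shard
# count so each shard lands near the 30-50 GB recommendation.
TARGET_SHARD_GB = 40  # midpoint of the 30-50 GB range

def primary_shards(ingest_gb_per_interval: float) -> int:
    """Primary shards needed for ~TARGET_SHARD_GB per shard (minimum 1)."""
    return max(1, round(ingest_gb_per_interval / TARGET_SHARD_GB))

# Made-up example: 120 GB of new data per day.
daily_gb = 120
print("daily index:", primary_shards(daily_gb), "primaries")       # 3 shards of ~40 GB each
print("hourly index:", primary_shards(daily_gb / 24), "primary")   # 1 shard of ~5 GB: too small
```

At this ingest rate a daily index with three primaries hits the target, while hourly indices would produce 5GB shards, well under the recommended range, which is exactly the situation where fewer, wider indices tend to work better.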