How many shards should I choose?

Hi, we have a problem on our project and I am trying to understand why it is happening.

One of the factors I'm considering is the number of shards relative to the index size: an index can weigh around 200GB on average, with only 1 shard. How can this affect performance?

The error we get:
[parent] Data too large, data for [indices:data/write/bulk[s]] would be [27047392520/25.1gb], which is larger than the limit of [26521423052/24.6gb], real usage: [27047391936/25.1gb], new bytes reserved: [584/584b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=584/584b, model_inference=0/0b, accounting=216717508/206.6mb]

Welcome!

You normally don't want to exceed 50GB per shard.
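You can see how big each shard currently is with something like this (USER, PASSWORD and ESHOST are placeholders as in the other commands in this thread, and INDEXNAME is whatever your index is called):

curl -sk -u USER:PASSWORD "https://ESHOST:9200/_cat/shards/INDEXNAME?v&h=index,shard,prirep,store"

The store column is the on-disk size of each shard, so a 200GB index with one primary will show up well past that 50GB guideline.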

Please read

Welcome to the forum @Zoree

It's always helpful to include as much info as possible on your setup, e.g. how many nodes, the hardware specs or resources allocated to the nodes, which version of Elasticsearch, a simple one-sentence idea of what your cluster does (logs, security, whatever), the ingest pattern, average document sizes, ... The _cat commands below will give a quick overview of some of that.
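Something like this (same placeholders as above; the column list is just a suggestion) shows the nodes and the indices sorted by size:

curl -sk -u USER:PASSWORD "https://ESHOST:9200/_cat/nodes?v&h=name,node.role,heap.max,ram.max"
curl -sk -u USER:PASSWORD "https://ESHOST:9200/_cat/indices?v&s=store.size:desc"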

My understanding of what you wrote is that you have a number of indices (how many is not given) whose sizes average around 200GB per index, so some bigger and some smaller, and each index has one primary shard and an unknown number of replica shards. You have also tried to bulk ingest 25.1GB of data in one call, which has failed as it's bigger than an Elasticsearch limit. That's the error.

If I've understood wrong, please correct me.

If you want to get past the error without changing anything else, then break the ingest up into smaller chunks, both now and on an ongoing basis. Personally, I think that would be a sensible thing to do anyway.
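As a rough sketch of what "smaller chunks" could look like: if your bulk payload is a single NDJSON file of index actions (one action line plus one source line per document), you can split it on an even line count so the pairs stay together, then send each piece as its own _bulk request. bulk-payload.ndjson is a made-up filename here:

# split the payload into 50,000-line pieces (an even count keeps action+source pairs intact)
split -l 50000 bulk-payload.ndjson chunk-
# send each piece as a separate _bulk request
for f in chunk-*; do
  curl -sk -u USER:PASSWORD -H 'Content-Type: application/x-ndjson' \
       -XPOST "https://ESHOST:9200/_bulk" --data-binary "@$f"
done

If you are using a client library instead, most of them have a bulk helper with a chunk-size option that does the same thing.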

The limits can be seen with

curl -sk -u USER:PASSWORD https://ESHOST:9200/_nodes/stats/breaker

I think there is a way to increase the specific limit, but I'd rather know more about what you are doing before going there.
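For reference, the limit in your error is the parent circuit breaker, controlled by the dynamic cluster setting indices.breaker.total.limit; it defaults to 95% of heap when real-memory checking is on, which looks consistent with the 24.6gb limit in your error if the node has a 26GB heap. Something like this would raise it, but that mostly trades the error for a higher out-of-memory risk, so smaller bulks and/or more heap are the better fix:

curl -sk -u USER:PASSWORD -H 'Content-Type: application/json' \
     -XPUT "https://ESHOST:9200/_cluster/settings" -d '
{
  "persistent": {
    "indices.breaker.total.limit": "98%"
  }
}'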

In the Elasticsearch documentation on sizing your shards, there's a section titled:

"Aim for shards of up to 200M documents, or with sizes between 10GB and 50GB"

for which the one-line summary is "Very large shards can slow down search operations and prolong recovery times after failures".
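If you do want to move to more, smaller shards, one way is to create a new index with more primaries and _reindex into it; 8 primaries on ~200GB of data lands around 25GB per shard, inside that 10GB-50GB band. The index names here are hypothetical:

# my-index is the existing 200GB index, my-index-v2 the replacement with 8 primaries
curl -sk -u USER:PASSWORD -H 'Content-Type: application/json' \
     -XPUT "https://ESHOST:9200/my-index-v2" -d '{"settings":{"number_of_shards":8}}'
curl -sk -u USER:PASSWORD -H 'Content-Type: application/json' \
     -XPOST "https://ESHOST:9200/_reindex" -d '{"source":{"index":"my-index"},"dest":{"index":"my-index-v2"}}'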


Thanks, I found some information about shard sizing and the performance issue.