I'm experiencing performance challenges with an Elasticsearch cluster. The cluster consists of 6 data nodes (3 hot with ~30GB heap, 3 warm with ~15GB heap). The cluster holds around 4TB of data total, with a total of 20,000,000,000 (20 billion) docs. Each index has 6 primary shards, and each shard holds 25-35GB of data.
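For reference, per-shard doc counts and on-disk sizes can be checked with the cat shards API (the index name below is a placeholder for my actual indexes):

```
GET /_cat/shards/my-index?v&h=index,shard,prirep,docs,store
```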
The overwhelming majority of the docs are very small, effectively key-value pairs. Those small docs are nested docs, but each one is still indexed as a separate document in Lucene. I have read on numerous occasions that the optimal shard size is "a few GB to a few tens of GB" or "less than but close to 50GB".
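To illustrate the nested-doc inflation: if I understand correctly, `docs.count` in the index stats reflects Lucene-level documents (including nested ones), while the count API returns only root documents, so comparing the two shows the gap (index name is again a placeholder):

```
// docs.count here includes nested Lucene documents
GET /my-index/_stats/docs

// this counts only top-level (root) documents
GET /my-index/_count
```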
I see less info about the ideal number of documents per shard. I know that Lucene (i.e. a single shard) has an upper limit of ~2 billion docs. But is performance really unaffected by the number of docs below that limit? My empirical findings suggest that the number of docs does in fact matter: search threads consistently appear in hot threads even when no indexing is happening, while memory and CPU usage stay low (CPU at ~20%, heap at ~50%).
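For completeness, these are the standard endpoints I'm using to observe this (the `threads` parameter value is arbitrary):

```
GET /_nodes/hot_threads?threads=5
GET /_cat/nodes?v&h=name,cpu,heap.percent,load_1m
```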
Could this indicate that, for my use case, splitting the indexes into more, smaller shards might improve search latency? Under what circumstances should one conclude that the number of shards in use is too small?
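If more shards turn out to help, my understanding is that the split index API would let me test this without a full reindex (index names are placeholders; the source index must be made read-only first, and the target primary shard count must be a multiple of the source count, e.g. 6 → 12):

```
PUT /my-index/_settings
{
  "index.blocks.write": true
}

POST /my-index/_split/my-index-12
{
  "settings": {
    "index.number_of_shards": 12
  }
}
```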