Numerous small shards or few big shards?

I am planning the number of shards for 2 clusters. One cluster has 3 nodes and the other has 5 nodes. The projected index sizes for both clusters have been estimated.

With regard to search performance and indexing performance, is it better to have numerous small shards or fewer big shards?

I have read that having many shards adds system overhead, while on the other hand there is a recommended size limit of a few tens of GB per shard.

(Usually) more shards means better indexing performance, and fewer means better search performance.
Having a lot of small shards tends to waste resources.
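
As a rough illustration of the "few tens of GB per shard" guideline mentioned in the question, here is a minimal Python sketch that turns a projected index size into a primary shard count. The function name, the 30 GB default target, and the example sizes are all assumptions made for illustration, not official recommendations; adjust the target to whatever limit you settle on.

```python
import math


def estimate_primary_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return the smallest primary shard count that keeps each shard
    at or below target_shard_gb (assumed target, not an official figure)."""
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so no shard exceeds the target; never go below one shard.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    # Hypothetical projected index sizes in GB.
    for size_gb in (20, 120, 400):
        shards = estimate_primary_shards(size_gb)
        print(f"{size_gb} GB -> {shards} primary shard(s)")
```

This only gives a starting point. A small index ends up with a single primary shard, which is consistent with the point above that many small shards waste resources.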

What sort of data is it?

I am indexing unstructured text data like research white papers and thesis papers. For text-heavy data with a focus on search performance, are fewer shards better suited for that?

I am guessing fewer shards would be better suited because the volume of documents to index is not that large.

Fewer primary shards, yes. You can always increase replicas :)
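
For reference, the replica count is a dynamic index setting, so it can be raised on an existing index through the update index settings API (`PUT /<index>/_settings`). Below is a minimal sketch using only the Python standard library. The helper name, the `http://localhost:9200` address, and the `papers` index name are hypothetical; the request is only built here and is not sent.

```python
import json
import urllib.request


def build_replica_update(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build a PUT <index>/_settings request that sets number_of_replicas.

    base_url and index are placeholders supplied by the caller.
    """
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical cluster address and index name.
    req = build_replica_update("http://localhost:9200", "papers", 2)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply the change against a real cluster, send it with:
    # urllib.request.urlopen(req)
```

The number of primary shards, by contrast, is fixed when the index is created, which is why it makes sense to start with fewer primaries and add replicas later if search load grows.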
