Why is the default number_of_shards 1 in future releases?


(Venkata Narasimha Rao Sandu) #1

In older and current versions of Elasticsearch, the default number_of_shards is 5, so that operations are distributed and parallelized across shards, increasing performance and throughput.
But why is the default value for number_of_shards 1 in future releases?


(Tim Vernum) #2

5 is not a universally better number of shards than 1.
For some use cases 5 is fine, but for other use cases, 3 would be better, or 9, or ...
In other cases you want 1 shard, and 4 replicas (or 9 replicas) because you have a relatively small amount of data, but heavy search workloads.

The best practice is to set the number of shards on an index-by-index basis based on the number of nodes you run, your ingest method, your search style, and any data rollover you do.
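As a rough illustration of setting shard counts per index, here is a minimal Python sketch that only builds the create-index request (it does not send it). The settings keys `number_of_shards` and `number_of_replicas` are the real index settings; the helper name `create_index_request` and the index names are made up for the example, and the shard counts are placeholders rather than recommendations.

```python
import json


def create_index_request(index_name, shards, replicas):
    """Return (method, path, body) for creating an index with explicit shard settings."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return "PUT", f"/{index_name}", json.dumps(body)


# Small data set with heavy search load: 1 primary shard, 4 replicas.
print(create_index_request("product-catalog", 1, 4))

# Larger index spread across more nodes: 3 primary shards, 1 replica each.
print(create_index_request("orders", 3, 1))
```

The point is simply that the numbers are chosen per index instead of being inherited from a default.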
While 5 wasn't a bad default, it was by no means "the right answer" for all users.

The problem with a default of 5 is that for many use cases, particularly logging and metrics, it causes significant over-sharding. We see far more issues caused by clusters with too many shards than by clusters with not enough.

Having too few shards tends to show up relatively quickly once you do any sort of performance measurement and optimisation. Having too many shards tends to show up weeks or months after production deployment, when you end up with 3 months' worth of time-based indices, each configured with 5 shards.

A default of 1 prevents customers from running into the oversharding problem that we see a lot of, and encourages those who want to optimise for performance to actually think about the right number for their usage (rather than simply relying on 5).

And we have the Split API if you decide you need additional shards.
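For reference, a split is requested with `POST /<source>/_split/<target>` and a target shard count in the body. The Python sketch below only builds that request and checks one documented constraint, that the target shard count must be a multiple of the source shard count. The helper name `split_request` and the index names are hypothetical, and other prerequisites described in the Split API documentation (such as blocking writes on the source index first) are not shown.

```python
import json


def split_request(source_index, target_index, source_shards, target_shards):
    """Return (method, path, body) for splitting an index into more primary shards."""
    if target_shards <= source_shards or target_shards % source_shards != 0:
        raise ValueError(
            "target shard count must be a larger multiple of the source shard count"
        )
    body = {"settings": {"index.number_of_shards": target_shards}}
    return "POST", f"/{source_index}/_split/{target_index}", json.dumps(body)


# Grow a hypothetical single-shard index into a new 4-shard index.
print(split_request("logs-small", "logs-split", 1, 4))
```

Whether a particular split factor is allowed also depends on the index's routing-shard configuration and the Elasticsearch version, so check the documentation for the release you run.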


(Venkata Narasimha Rao Sandu) #3

I appreciate your time in answering my query.

