Calculate number of replicas

Hi,

We are about to set up a new cluster, and we are not sure whether 6 data nodes with 32 GB of RAM each will cope with 2 replicas for each shard. The index isn't very big, only around 800K documents, so maybe we will go with 1 or 2 shards per index. I guess that the ideal number of replicas is one that ensures that each node holds either a primary or a replica of each index, so the level of parallelization is maximized.
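To make the guess above concrete, here is a minimal sketch of the arithmetic. It assumes the goal is simply to have at least as many shard copies (primaries plus replicas) as data nodes; the function name is made up for illustration, and Elasticsearch's allocator is not guaranteed to spread the copies exactly one per node.

```python
def replicas_for_one_copy_per_node(data_nodes: int, primary_shards: int) -> int:
    """Smallest replica count R such that primary_shards * (1 + R) >= data_nodes.

    Hypothetical helper: it only does the counting, it does not talk to a cluster.
    """
    if data_nodes < 1 or primary_shards < 1:
        raise ValueError("data_nodes and primary_shards must be at least 1")
    # Ceiling division without floats: copies of each shard needed to cover all nodes.
    copies_per_shard = -(-data_nodes // primary_shards)
    return copies_per_shard - 1


if __name__ == "__main__":
    for primaries in (1, 2):
        replicas = replicas_for_one_copy_per_node(6, primaries)
        print(f"6 nodes, {primaries} primary shard(s): {replicas} replica(s)")
```

With 6 data nodes this gives 5 replicas for 1 primary shard and 2 replicas for 2 primary shards, so the "2 shards, 2 replicas" layout does put 6 shard copies on 6 nodes.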

Is there some kind of rule or formula to calculate this?

Thanks in advance,

No rule that I'm aware of, but this blog post from 2017 gives some good guidelines about sharding.

It is also worth mentioning that the number of replica shards is a dynamic index setting which can be changed whenever you want (see Update Indices Settings) so while it makes sense to plan the number of primary shards before creating your indices, the number of replica shards can easily be decided later, for instance after doing some benchmark tests to find the best configuration.
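As a rough illustration of how that setting can be changed on a live index, the sketch below builds the `PUT /<index>/_settings` request with `index.number_of_replicas` in the body. The host `http://localhost:9200`, the index name `my-index` and the helper name are assumptions for the example; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_replica_update(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a request that sets number_of_replicas on an index."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed host and index name, for illustration only.
req = build_replica_update("http://localhost:9200", "my-index", 1)
print(req.get_method(), req.full_url)
# To apply it against a real cluster: urllib.request.urlopen(req)
```

Because the change is dynamic, the same call can be repeated with a different value between benchmark runs without recreating the index.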

Personally I prefer to use just 1 replica for my indices. This is partly because it gives reasonably good search performance and fault tolerance: I can lose one data node and still have all the data available for searching (losing two nodes may cause some indices to go red, but that is very unlikely and usually signals a more serious problem in the infrastructure, which typically renders the cluster inoperable anyway). Another reason I use 1 replica is the cost: every extra replica adds another full copy of the primary data on disk. Hence, if you've got 1 TB of primary data on disk, 1 replica adds 1 TB, 2 replicas add 2 TB, etc. For large data sets this can quickly become expensive.
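The disk cost described above is just primary size times the number of copies. A small sketch, with a hypothetical helper name and ignoring any overhead beyond the raw shard data:

```python
def total_disk_tb(primary_tb: float, replicas: int) -> float:
    """Total disk used by primaries plus replicas, in TB.

    Each replica is a full copy of the primary data, so the total is
    primary_tb * (1 + replicas).
    """
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return primary_tb * (1 + replicas)


if __name__ == "__main__":
    for replicas in (0, 1, 2):
        print(f"{replicas} replica(s): {total_disk_tb(1.0, replicas)} TB in total")
```

For 1 TB of primary data this gives 2 TB in total with 1 replica and 3 TB with 2 replicas.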


I second @Bernt_Rostad's comment. I have found 1 replica to be sufficient for both search performance and fault tolerance.

Typically, the average shard size should be between 20 GB and 40 GB, but it also depends on the number of mapped fields in your index.
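If you want to turn that size guideline into a primary shard count, a rough sketch could look like this. The 30 GB default is just the midpoint of the range above, the helper name is made up, and the number of mapped fields is not taken into account.

```python
import math


def primary_shards_for(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards so that the average shard stays at or below the target size."""
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # An index always has at least one primary shard, however small it is.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (5, 100):
        print(f"{size_gb} GB index: {primary_shards_for(size_gb)} primary shard(s)")
```

An index well below the target size, which is probably the case for 800K documents, ends up with a single primary shard by this rule.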

