Evidence/Benchmarking behind Shards/Datanode Recommendation

I'm curious about the recommendation put forth in https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster, specifically this passage:

A good rule-of-thumb is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health.

I've definitely seen the total count of shards in a cluster affect master node performance, but I'm curious what evidence or benchmarking would lead to the per-datanode recommendation. I'm mainly trying to determine if alerting on this metric is needed as a trigger for scaling a cluster.
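For reference, the rule as quoted is simple enough to turn into an alert condition. Below is a minimal sketch of what such a check could look like, assuming the per-node heap size and shard count have already been collected (for example from the `_cat/nodes` and `_cat/allocation` APIs). The function names and the input shape are hypothetical and chosen only for illustration; the factor of 20 comes from the blog post.

```python
# Rule of thumb from the blog post: at most 20 shards per GB of configured heap.
SHARDS_PER_GB_HEAP = 20


def max_shards_for_heap(heap_gb: float) -> int:
    """Return the rule-of-thumb shard ceiling for a node with the given heap."""
    return int(heap_gb * SHARDS_PER_GB_HEAP)


def nodes_over_limit(nodes: dict[str, tuple[float, int]]) -> list[str]:
    """Return the names of nodes whose shard count exceeds the ceiling.

    `nodes` maps a node name to (heap_gb, shard_count). This input shape is
    an assumption made for this sketch, not an Elasticsearch response format.
    """
    return sorted(
        name
        for name, (heap_gb, shard_count) in nodes.items()
        if shard_count > max_shards_for_heap(heap_gb)
    )


if __name__ == "__main__":
    # Hypothetical sample data: node name -> (heap in GB, shards on the node).
    sample = {
        "data-1": (30, 601),
        "data-2": (30, 600),
        "data-3": (8, 100),
    }
    print(max_shards_for_heap(30))
    print(nodes_over_limit(sample))
```

Since the post says to stay as far below the limit as possible, an alert would probably want to fire at some fraction of the ceiling rather than at the ceiling itself.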

The most common mistake users make is to overshard, and you can see plenty of examples in this forum. I have lost count of the number of times I have linked to the blog post you referenced. This is, as stated, a simple rule of thumb that should work for small as well as reasonably large clusters while still allowing nodes to hold large amounts of data. It has been set based on experience and discussions with Elastic support. Naturally, there are many cases where smaller clusters with lots of heap can handle a larger number of shards, but this can cause problems as the cluster grows, at which point fixing them is often quite difficult.
