The data set is a realistic representation and the requests are indeed CPU bound. However, with 8 clients in parallel, the benchmark against 20 shards seems to consume nearly all of the CPU in the entire cluster. A smaller number of shards per index seems to increase latency and decrease ops/s, but at least the cluster isn't fully overloaded while handling the batch.
Would it be a fair conclusion that fewer shards let you service more concurrent requests without running the risk of overloading the cluster?
We are currently on 20 shards for all 250 indices (=14K shards), which is causing exactly this: a single heavy user brings the entire cluster into critical load, causing timeouts for everyone else.
I am looking into bringing the number of shards down to more sensible values (leaning towards 4 for small indices and 8 for big (>20 GB) ones), but I am having a tough time benchmarking my way to those values, especially since ops/s actually seems to go down with fewer shards.
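For reference, this is roughly what I have in mind as a composable index template (ES 7.8+); the template name, index pattern, and replica count below are placeholders for our actual naming scheme and setup:

```sh
# Hypothetical template: new indices matching "small-*" get 4 primary shards.
curl -X PUT "http://localhost:9200/_index_template/small-indices" \
  -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["small-*"],
  "priority": 100,
  "template": {
    "settings": {
      "index.number_of_shards": 4,
      "index.number_of_replicas": 1
    }
  }
}'
```

Since number_of_shards is fixed at index creation time, this only affects newly created indices; the existing ones would need the _shrink API or a reindex to pick up the lower shard count.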
I am using Rally to fire (just) queries directly at our production cluster (it is not feasible to set up a realistic copy of this environment for lab tests).
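In case the setup matters: the runs use the benchmark-only pipeline against the existing cluster, roughly like this (host, track path, and track parameters are placeholders; the `race` subcommand assumes a reasonably recent Rally 2.x, older versions invoke `esrally` directly):

```sh
# Query-only run against an existing cluster; --pipeline=benchmark-only skips
# provisioning and just drives load at the target hosts.
esrally race --pipeline=benchmark-only \
  --target-hosts=es-prod-1:9200 \
  --track-path=/path/to/query-only-track \
  --track-params="search_clients:8" \
  --client-options="timeout:60"
```

The 8 parallel clients mentioned above come from the clients setting in the track's schedule (exposed here through the hypothetical search_clients track parameter).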