Hi, is there any guideline for the above topic? I've read many posts and people are saying different things.
Some say each shard should not exceed the heap size, which is around 30 to 32GB, and
some say not to exceed 50GB per shard.
If that is the case, since my daily ingestion for index A is 600GB (currently configured to use 4 shards), the above guideline would mean 12 to 20 shards per index!! But I only have 4 nodes in my cluster. Will that be too many for a 4-node cluster?
How can I achieve a good balance among the number of nodes, the shard count and the shard size to maximize my ELK cluster's performance?
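For context, this is roughly how the daily index is created at the moment with 4 primary shards (just a minimal sketch using the Python requests library; the cluster URL and index name are placeholders, not my real setup):

```python
import requests

ES_URL = "http://localhost:9200"       # placeholder cluster address
index_name = "index-a-2018.06.01"      # placeholder daily index name

# Create the daily index with an explicit primary shard count.
# At ~600GB/day, 4 primaries works out to roughly 150GB per shard,
# well above the 30-50GB per shard figures people quote.
settings = {
    "settings": {
        "index": {
            "number_of_shards": 4,
            "number_of_replicas": 1
        }
    }
}

resp = requests.put(f"{ES_URL}/{index_name}", json=settings)
resp.raise_for_status()
print(resp.json())
```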
You do need to make sure you don't have lots of shards.
Hmm.. how many is considered a lot? What is the limit?
The article below says 1.5 to 3 times the number of nodes; in my case, that means I can have only 12 shards (3*4)...
Yeah, which doesn't really make a lot of sense, does it?
Honestly, with time series data, we see anywhere from a few shards to hundreds (or even thousands). Once you get past about 500 shards per ~32GB heap, you start running into major issues: the resources required simply to maintain those shards exceed what is spent on the actual querying and analysis of the data inside them.
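To see where you stand, the `_cat/allocation` API gives a per-node shard count and disk usage in one call (a minimal sketch with the Python requests library; the cluster URL is a placeholder):

```python
import requests

ES_URL = "http://localhost:9200"  # placeholder cluster address

# _cat/allocation returns one row per node with its shard count and the
# disk space taken by index data on that node.
resp = requests.get(f"{ES_URL}/_cat/allocation", params={"format": "json"})
resp.raise_for_status()

for row in resp.json():
    node = row.get("node", "UNASSIGNED")
    print(f"{node}: {row['shards']} shards, {row.get('disk.indices')} of index data")
```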
The optimal number and size of shards depend a lot on the use case, data, queries and query patterns. Elasticsearch can be used for a lot of different types of use cases, and they all require different types of optimisations. The recommendations you quote are not generally applicable, and I would guess they might be more applicable to search use cases. For logging and analytics use cases where time-based indices are used, we generally see a considerably larger number of shards per node, as larger data volumes are involved. To find the ideal size and number of shards for a cluster, we generally recommend benchmarking with real data and queries.
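One practical way to check this for time-based indices is to look at how big the primary shards of your daily indices actually end up, and adjust the shard count of future indices until they land in the range your own benchmarks show works well (again just a sketch with the Python requests library; the URL and index pattern are placeholders):

```python
import requests

ES_URL = "http://localhost:9200"   # placeholder cluster address
pattern = "index-a-*"              # placeholder daily index pattern

# _cat/shards lists every shard with its store size; bytes=gb reports sizes in GB.
resp = requests.get(
    f"{ES_URL}/_cat/shards/{pattern}",
    params={"format": "json", "bytes": "gb"},
)
resp.raise_for_status()

for shard in resp.json():
    # Only look at assigned primary shards.
    if shard["prirep"] == "p" and shard.get("store") is not None:
        print(f"{shard['index']} shard {shard['shard']}: "
              f"{shard['store']} GB on {shard.get('node')}")
```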
Hi Christian, thanks for the reply.
Our use case is that we ingest all kinds of application log files using ELK and build dashboards to show trends (time series data), and we have our in-house alerting system integrated with Elasticsearch for real-time queries and alerting. To conclude, we need the data to be available in as close to real time as possible.