I know the number of shards is limited by the amount of heap. For example, my hot node has 30GB of heap, so the limit is 600 shards on this node (20 shards per GB of heap).
But what about the limit for cold nodes, where shards/indices are not accessed frequently? If, let's say, I have 2000 shards of frozen indices on my cold node, how much heap do I need?
Does the recommended shard limit apply to cold nodes/frozen indices?
The 20-shards-per-GB-of-heap guideline is a good starting point that came from years of experience with different clusters and different datasets, plus a bunch of careful experiments. Frozen indices should in theory be somewhat lighter on heap than regular indices, but I don't think we have built up enough experience or done enough experiments to give a similarly broad guideline specifically for nodes that only hold frozen indices, sorry.
There are other considerations, e.g. a 600-shard node could represent 30TB of data if the shards target a typical size of ~50GB. If you went to 2000 shards you'd be looking at 100TB per node. Nodes are the unit of failure in a cluster; are you sure you are ok losing 100TB in a single event? It might take days or weeks to recover that much data elsewhere.
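To make the arithmetic above concrete, here is a rough sketch in Python. The function names are made up for illustration; the only inputs are the 20-shards-per-GB guideline and the ~50GB target shard size mentioned in this thread. It assumes decimal units (1TB = 1000GB), and, as noted above, the guideline has not been validated for nodes holding only frozen indices, so treat the heap figure as an upper-bound estimate rather than a recommendation.

```python
# Back-of-the-envelope sizing based on the figures quoted in this thread.
# All names here are hypothetical helpers, not part of any Elasticsearch API.

SHARDS_PER_GB_HEAP = 20    # general guideline for regular (non-frozen) indices
TARGET_SHARD_SIZE_GB = 50  # typical target shard size assumed in the answer


def max_shards(heap_gb, shards_per_gb=SHARDS_PER_GB_HEAP):
    """Shard count the guideline allows for a given heap size."""
    return int(heap_gb * shards_per_gb)


def heap_needed_gb(shards, shards_per_gb=SHARDS_PER_GB_HEAP):
    """Heap the guideline would imply for a given shard count."""
    return shards / shards_per_gb


def data_per_node_tb(shards, shard_size_gb=TARGET_SHARD_SIZE_GB):
    """Approximate data held on one node, in decimal TB."""
    return shards * shard_size_gb / 1000


print(max_shards(30))          # hot node with 30GB heap
print(heap_needed_gb(2000))    # 2000 shards if the guideline applied as-is
print(data_per_node_tb(600))   # data at risk on a 600-shard node
print(data_per_node_tb(2000))  # data at risk on a 2000-shard node
```

If the regular guideline applied unchanged, 2000 shards would imply about 100GB of heap, which is one reason a frozen-specific guideline would be useful. The last two numbers are the ones that matter for the failure-domain question.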
Heap usage has been improved in recent versions with the introduction of frozen indices and more heap-efficient handling of document IDs, which likely means less heap pressure due to shards. It is, however, not only heap usage that is affected by the number of indices and shards in a cluster. The more indices and shards you have, the larger the cluster state will likely be, and the slower it can get to update and propagate as it grows. Eventually this may become a bottleneck and cause stability issues. I am sure this is also being worked on and is likely to improve over time, and maybe then a new set of guidelines will be made available.