Frozen indices and shard limitation

Hello all!

I know the number of shards is limited by the amount of heap. For example, my hot node has 30GB of heap, so the limit is 600 shards on this node (20 shards per GB of heap).

But what about the limit for cold nodes, where shards/indices are not accessed frequently? If, say, I have 2000 shards of frozen indices on my cold node, how much heap do I need?
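For reference, here is the arithmetic I'm working from, as a quick Python sketch. The 20-shards-per-GB figure is the general guideline (not specific to frozen indices), and the function names are just for illustration:

```python
# Back-of-the-envelope check of the 20-shards-per-GB-of-heap guideline.
# The inputs (30 GB hot node, 2000 frozen shards) come from my setup;
# SHARDS_PER_GB is the general recommendation, not a hard limit.
SHARDS_PER_GB = 20

def max_shards(heap_gb: float) -> int:
    """Recommended shard ceiling for a node with the given heap."""
    return int(heap_gb * SHARDS_PER_GB)

def min_heap_gb(shard_count: int) -> float:
    """Heap the guideline would call for to host `shard_count` shards."""
    return shard_count / SHARDS_PER_GB

print(max_shards(30))     # hot node with 30 GB heap -> 600 shards
print(min_heap_gb(2000))  # 2000 shards -> 100 GB of heap by the same rule
```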

Does the recommended shard limit apply to cold nodes/frozen indices?

Thanks for your feedback! :slight_smile:

Anyone?

The 20-shards-per-GB-of-heap guideline is a good starting point that came from years of experience with different clusters and different datasets, plus a number of careful experiments. Frozen indices should in theory be somewhat lighter on heap than regular indices, but I don't think we have built up enough experience, or done enough experiments, to give a similarly broad guideline specifically for nodes that only hold frozen indices, sorry.

There are other considerations too. For example, a 600-shard node could represent 30TB of data if the shards target a typical size of ~50GB; if you went to 2000 shards you'd be looking at 100TB per node. Nodes are the unit of failure in a cluster: are you sure you are OK with losing 100TB in a single event? It might take days or weeks to recover that much data elsewhere.
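That data-volume arithmetic can be sketched the same way (assuming the ~50GB target shard size above; these numbers are illustrative, not a rule):

```python
# Rough per-node data volume, assuming a ~50 GB target shard size.
TARGET_SHARD_GB = 50

def node_data_tb(shard_count: int, shard_gb: float = TARGET_SHARD_GB) -> float:
    """Approximate data held by a node, in TB (using 1 TB = 1000 GB)."""
    return shard_count * shard_gb / 1000

print(node_data_tb(600))   # 600 shards  -> 30.0 TB on one node
print(node_data_tb(2000))  # 2000 shards -> 100.0 TB at risk if that node is lost
```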


Heap usage has been improved in recent versions with the introduction of frozen indices and more heap-efficient handling of document IDs, which likely means less heap pressure per shard. It is, however, not only heap usage that is affected by the number of indices and shards in a cluster. The more indices and shards you have, the larger the cluster state will likely be, and it can get slower to update and propagate as it grows. Eventually this may become a bottleneck and cause stability issues. I am sure this is also being worked on and is likely to improve over time, and maybe then a new set of guidelines will be made available.


Very interesting! I will take this advice into account. Thanks a lot @DavidTurner and @Christian_Dahlqvist

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.