Cluster currently has [1000]/[1000] maximum normal shards open


As the title says, this is an error I regularly run into on my Elastic proof of concept.
The workaround is easy: I simply close or delete some indices from time to time.

But I'm now deploying Elastic in a production environment with 3 dedicated master nodes and multiple hot, warm and cold data nodes, plus Kibana and Fleet Server nodes.

For the moment, I've only deployed Elastic Agent on the Fleet Server node (with "fleet" as the namespace), and I can see that many data streams have already been created (6x logs-, 15x metrics-).

And, for the moment, each data stream is backed by a single index.

To summarize, for 1 Elastic Agent and 1 namespace I already have 21 indices!
So I assume that if I create multiple namespaces when deploying Elastic Agent on domain controllers, Windows servers and endpoints, I will quickly reach the limit of 1000 open shards.

I'm aware that I should avoid small/daily indices (and prefer weekly or monthly ones) and that I can increase this limit for my cluster with:

PUT _cluster/settings
{
  "persistent" : {
    "cluster.max_shards_per_node" : 2000
  }
}

But is it recommended?
How far can I push this limit in my production environment?
Is it recommended to use many namespaces?
How can I anticipate the number of open shards? I don't see it in ILM...

Many questions, really sorry about that but I'm a bit confused!
Have a great day.

Not really; the default is the recommended value. A cluster can sometimes run fine with a much higher shard count, but overburdened clusters tend to struggle more when recovering from disruptions. You might get away with it, but the further this number is from the default, the more risk you're taking on.

Instead we recommend adding nodes to your cluster.
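
To illustrate why adding nodes helps and how the open-shard count can be anticipated, here is a rough sketch in Python. It relies on the fact that the cluster-wide limit is `cluster.max_shards_per_node` multiplied by the number of non-frozen data nodes, and that every open index counts its primaries and replicas. The function names, the 5 namespaces and the 4 retained backing indices per data stream are illustrative assumptions, as is the idea that every namespace produces the same 21 data streams seen above.

```python
def cluster_shard_limit(data_nodes: int, max_shards_per_node: int = 1000) -> int:
    """Cluster-wide open-shard limit: per-node setting times non-frozen data nodes."""
    return data_nodes * max_shards_per_node


def estimated_open_shards(data_streams_per_namespace: int,
                          namespaces: int,
                          backing_indices_per_stream: int,
                          primaries: int = 1,
                          replicas: int = 1) -> int:
    """Each open backing index contributes its primaries plus all replica copies."""
    shards_per_index = primaries * (1 + replicas)
    return (data_streams_per_namespace * namespaces
            * backing_indices_per_stream * shards_per_index)


if __name__ == "__main__":
    # Assumed scenario: 4 data nodes, 5 namespaces, 4 backing indices kept per stream.
    limit = cluster_shard_limit(data_nodes=4)
    used = estimated_open_shards(data_streams_per_namespace=21,
                                 namespaces=5,
                                 backing_indices_per_stream=4)
    print(f"estimated {used} of {limit} shards in use")
```

Data streams are named per type, dataset and namespace, so additional agents that share a namespace reuse the same data streams; in this estimate it is the number of namespaces and retained backing indices that drives the count, not the number of agents.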


Hi @DavidTurner,

Thanks a lot for this advice!

So I will also be sparing with namespaces and use as few as possible,
and set up weekly or monthly indices.



Actually, the recommendation is to base your shard sizing (index rollover) on size rather than on time. This helps combat oversharding.
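
A size-based rollover could look roughly like the ILM policy body built below. This is only a sketch: `max_primary_shard_size` is an ILM rollover condition, but the 40gb threshold, the warm-phase settings and the function name are assumptions to adapt to your own cluster. The resulting JSON is the kind of body you would send to `PUT _ilm/policy/<your-policy-name>`.

```python
import json


def size_based_rollover_policy(max_primary_shard_size: str = "40gb") -> dict:
    """Build an ILM policy body that rolls over on primary shard size, not on age."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over once the largest primary shard reaches this size.
                        "rollover": {"max_primary_shard_size": max_primary_shard_size}
                    }
                },
                "warm": {
                    # Assumed: enter the warm phase right after rollover.
                    "min_age": "0ms",
                    "actions": {"set_priority": {"priority": 50}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(size_based_rollover_policy(), indent=2))
```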

In your case, let's assume you have 4 hot nodes, each with 100 GB of disk space.
That means you can store 4 * 100 GB = 400 GB of data on those nodes. As this is the hot tier, you will use 1 primary and 1 replica shard, which halves the usable space to 200 GB. Your shards should be between 30 and 50 GB (for a lot of reasons, but for now just believe me ;) ).

Here comes the math:

  • usable space = (4 * 100 GB) / 2 = 200 GB
  • room for indices = 200 GB / 40 GB = 5 indices

In this example you can hold 5 indices on your hot tier (the ingest tier), and once an index's primary shard reaches 40 GB, the index rolls over and moves to the next tier (warm).
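
The same arithmetic as a small Python sketch, so you can plug in your own numbers. The function name and parameters are illustrative, each index is assumed to have a single primary shard, and the headroom Elasticsearch needs below the disk watermarks is ignored for simplicity.

```python
def hot_tier_index_capacity(hot_nodes: int,
                            disk_per_node_gb: float,
                            replicas: int = 1,
                            target_shard_gb: float = 40) -> int:
    """Number of single-primary indices of the target size that fit on the hot tier."""
    raw_gb = hot_nodes * disk_per_node_gb
    # Each replica stores a full copy, so only 1 / (1 + replicas) holds primaries.
    usable_primary_gb = raw_gb / (1 + replicas)
    return int(usable_primary_gb // target_shard_gb)


if __name__ == "__main__":
    print(hot_tier_index_capacity(hot_nodes=4, disk_per_node_gb=100))  # -> 5
```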


Thanks a lot @sholzhauer, this is perfectly clear for me now!

Have a great day!
