How to fix hitting maximum shards open error

I got the following error when writing docs to an Elasticsearch cluster. It worked fine until today. I guess I need to increase some limit, but I can't find any documentation on this yet.

Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [3005]/[3000] maximum shards open;

Could someone suggest a fix? Thanks!

You probably have too many shards per node.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right


Thanks so much for these great videos. I will take a look.

I have 5 shards per index and all indices are hourly. I just checked: the cluster with this error has 314 indices, each of them pretty small (around 1 GB), and I have 3 data nodes there. Could you suggest how the number of 3000 open shards gets calculated?
Also, do you think I should merge the hourly indices into daily ones? Any suggestion on this?

Thanks! Happy weekend.

The links say it all.

But basically (need to be tested with your real use case):

  • Up to 50 GB per shard
  • No more than 20 shards per GB of heap

So it depends on your actual volume. Here you could probably fit 2 days of data in a single shard. So yes, I'd switch to daily indices or use the rollover API.
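To answer the "how does 3000 get calculated" question, a back-of-the-envelope sketch. It assumes 1 replica per index and the Elasticsearch 7.x default of `cluster.max_shards_per_node: 1000` — neither is confirmed in this thread, so adjust for your settings:

```python
# Where the ~3000 open shards likely come from
# (assumes 1 replica per index; check your index settings).
indices = 314
primaries_per_index = 5
replicas = 1

total_shards = indices * primaries_per_index * (1 + replicas)
print(total_shards)  # 3140 -- right around the reported 3005
                     # (a few indices may have different settings)

# The limit: default cluster.max_shards_per_node is 1000,
# multiplied by the number of data nodes.
data_nodes = 3
max_shards_per_node = 1000
print(data_nodes * max_shards_per_node)  # 3000
```

You can raise `cluster.max_shards_per_node`, but that treats the symptom; reducing the shard count per the sizing guidance above is the better fix.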


Why would you use hourly indices to start with? The only time I have seen hourly indices make sense is when the retention period is less than a few days, which is quite rare.

I see, thanks for the suggestion. In the case where I have 25 GB of docs ingested hourly and 50 GB of index storage, do you think I can still use a daily index, which would mean 600 GB of daily index storage? Also, I am using 15 shards in this case, which seems to speed up indexing. Do you think this makes sense?

My retention period is 30 days.

If you have 25 GB of documents being indexed per hour, resulting in 50 GB of index size on disk (including replica?), that gives 1.2 TB of data on disk per day. Having 15-20 shards per daily index in such a case sounds reasonable, but this will depend on your data and queries. A 30-day retention period, however, gives a total indexed data volume of around 36 TB, which sounds like far too much for a 3-node cluster doing heavy indexing. I would expect you to need to scale out the cluster significantly, potentially using a hot/warm architecture.
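The volume math above, spelled out (assuming the 50 GB/hour figure is the on-disk size including the replica, which the thread doesn't confirm):

```python
# Rough capacity math for 50 GB of on-disk index data per hour.
gb_per_hour_on_disk = 50
gb_per_day = gb_per_hour_on_disk * 24          # daily on-disk volume
print(gb_per_day)                              # 1200 GB, i.e. ~1.2 TB/day

retention_days = 30
total_tb = gb_per_day * retention_days / 1000  # total held over retention
print(total_tb)                                # 36.0 TB
```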

Thanks so much. I think I over-sharded; I should use daily indices and adjust the shard count based on the data volume. Btw, do you think I should shrink the shards once an index is no longer receiving data? From my understanding of the video, this should improve search speed without harming write speed at all (data is not going to those indices :slight_smile: )

If your data volumes are unpredictable, I would recommend you have a look at ILM and rollover, as that allows you to set the number of primary shards to what is optimal for your cluster, e.g. 3 or 6 if you have 3 data nodes doing the indexing, and then create new indices behind the scenes based on time and/or size. You can therefore specify that you want 3 primary shards and that your indices should be 100GB in size or at most a day in duration. During periods when you are receiving more data, new indices will be created more frequently.
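As a sketch, an ILM policy along those lines might look like the following (the policy name is made up, and the thresholds are just the 100 GB / 1 day / 30-day-retention figures from this thread; the shard count itself would be set to 3 in your index template, not in the policy). You'd install it with `PUT _ilm/policy/hourly-logs`:

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "100gb",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```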
