Guidance on increasing cluster.max_shards_per_node

Hi,
We use Elasticsearch (7.x) to store application and system logs. Our indices are time-based, with a new index created every day. We have a requirement to create different indices for every log type and for every different application per day. This results in a large number of indices being created daily, and with a retention period of 15 days we are exceeding the default cluster.max_shards_per_node value of 1000. Each index is not very big (some are only a few GB, MB, or even KB in size).
Could you please advise whether we can safely exceed this limit, and whether we need to increase any parameters such as the JVM heap to do so? Any guidance on memory usage as a function of shard count would be very helpful.

Thanks in advance,

We run this higher, around 2-3K, for the same daily-index reason without issues, but it does depend on having enough RAM (i.e. don't do it with a 1GB heap) and on watching for any problems or circuit breakers. It depends on your loads, queries, etc.
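If you do go down that route, the limit is a dynamic cluster setting, so something along these lines (2000 is just an illustrative value) can be run from Kibana Dev Tools:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 2000
  }
}
```

The limit is enforced cluster-wide as max_shards_per_node multiplied by the number of data nodes, so adding data nodes also raises the overall ceiling.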

Note that you can freeze indexes to save RAM. We also close indexes to retain them without using them unless needed, which lets us keep months of very rarely-used data (it only costs disk space).
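A rough sketch of both operations (the index names are made up for illustration):

```
# Freeze an index: it stays searchable but with a much smaller memory footprint
POST app-logs-2021.05.01/_freeze

# Close an index: it is kept on disk but is unavailable until reopened
POST app-logs-2021.05.01/_close

# Reopen a closed index if the data is needed again
POST app-logs-2021.05.01/_open
```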

You should move to ILM to manage this. Otherwise, look at using _shrink and/or reindex to reduce the index and shard count (e.g. monthly indices).
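As an illustration, a month of daily indices could be consolidated into one monthly index roughly like this (index names are placeholders, and you would verify the result before deleting anything):

```
# Copy all of June's daily indices into a single monthly index
POST _reindex
{
  "source": { "index": "app-logs-2021.06.*" },
  "dest": { "index": "app-logs-2021.06" }
}

# Once verified, drop each daily index to release its shard
DELETE app-logs-2021.06.01
```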

Having lots of small shards is a waste of resources, and your nodes will eventually run out of capacity to manage them.

Thanks for the replies, Steve and Mark.

We use a 2GB JVM heap for data nodes and 1GB for master nodes.

Mark, we have only one shard per index. Because of the requirement to create separate indices every day, I'm afraid we won't be allowed to shrink them.

Could you please let us know if there is any high-level rule of thumb we can use to relate shard size and count to the memory that will be required?
I did look at a blog post that suggests around 20 shards per 1GB of heap (where each shard holds around 20 to 40GB of data).
In our case, however, the shard sizes are much lower. :frowning:
Any help or guidance on sizing would be very much appreciated.

Thanks,

Move to ILM anyway. Or at least move to monthly indices.
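A minimal ILM sketch for this sort of retention (the policy name and thresholds are placeholders to adapt):

```
PUT _ilm/policy/app-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "30gb", "max_age": "30d" }
        }
      },
      "delete": {
        "min_age": "15d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

You would still need an index template that sets index.lifecycle.name and index.lifecycle.rollover_alias, and clients would write to the rollover alias rather than to dated index names.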


Having lots of small shards is very inefficient and can cause a large cluster state as well as performance issues, in addition to heap pressure. If you have that little heap, I would recommend limiting the shard count as outlined in the blog post, either by revisiting your requirements or by adding resources to your cluster. These limitations and guidelines are there for a reason. I have numerous times seen users overshard clusters to the point where they can no longer operate or be fixed, resulting in data loss.

Thanks for the response Christian.

The initial allocation is 2GB per data node, but we are OK with increasing the memory 2x or 4x, etc. We just want to know whether there is any high-level mapping between memory and the amount of shard data on a node.

Thanks,

As I explained above, it is not all about heap usage, so I would recommend following the provided guidelines.
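That said, the _cat APIs give a quick view of where you stand relative to the roughly 20-shards-per-GB-of-heap rule of thumb from that blog post (on a 2GB heap that works out to around 40 shards per node):

```
# Shard count and disk usage per node
GET _cat/allocation?v

# Heap usage per node
GET _cat/nodes?v&h=name,node.role,heap.percent,heap.max
```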

Agreed. We run a big system this way with dozens of daily indexes, and you can't scale it for any length of time beyond a couple of months. It is much better to use ILM (which wasn't available to us), or at least to move to monthly indexes, one per service/type, etc. We run with 8-16GB heaps; on 2GB you can't really handle any sizable shard counts. Our challenge was modifying all the clients to use an alias or the ILM setup, but we bit the bullet and did it.
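In case it helps, the alias flip for monthly indexes can be as simple as this (the index and alias names are examples):

```
POST _aliases
{
  "actions": [
    { "remove": { "index": "app-logs-2021.05", "alias": "app-logs-write" } },
    { "add": { "index": "app-logs-2021.06", "alias": "app-logs-write", "is_write_index": true } }
  ]
}
```

Clients then always index into app-logs-write and never need to know which month is current.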

