Hi, I have a problem testing ES 7.2 (or any 7.x) and the configuration for cluster.max_shards_per_node.
When I set it to, let's say, 30000 in elasticsearch.yml (CentOS 6), ES somehow seems to override that with the default setting (1000).
When I read /_cluster/settings?include_defaults=true I get:
"max_shards_per_node" : "30000",
But when I try to add a new index I get:
"reason": "Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [999]/[1000] maximum shards open;"
The main problem, in my opinion, is that you should not override this and set it to 30000 at all. The limit is there for a reason, and having that many shards can cause performance and stability problems.
It is actually quite helpful even if you do not think so. Every cluster I have encountered with that number of shards per node has basically been inoperable. Usually you experience problems way before reaching that limit.
Is that 2000 shards? Have you spun up a test instance and tried to create 30000 indices/shards? Any plans to increase the size of the cluster, e.g. for high availability?
The reason I am asking is that most of the time I have seen problems due to very high shard counts, it has been in a cluster with more than one node, as the cluster state then needs to be updated as well as propagated to the other nodes, and this slows down as the cluster state grows.
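If you want to keep an eye on the actual shard count while you test, something like this works (again assuming a node on localhost:9200):

```
# status and running shard totals
curl -s 'localhost:9200/_cluster/health?filter_path=status,active_shards,active_primary_shards'

# or simply count the rows in _cat/shards
curl -s 'localhost:9200/_cat/shards?h=index' | wc -l
```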
Hi, we currently have around 30 daily indices (1 index / 1 shard each), around 5-500 MB in size. We plan to keep around a year of data, so that's around 12k indices. At the moment we plan a single-node cluster; I'm testing configurations first and will later try to import that many indices.
Currently we have 1 GB of RAM for ES, for around 1700 indices, and it's working really well.
One thing I'm worried about is RAM usage for >10k indices, and another is file descriptors.
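For what it's worth, this is how I plan to watch both of those while testing; a sketch, again assuming localhost:9200:

```
# open vs. maximum file descriptors per node
curl -s 'localhost:9200/_nodes/stats/process?filter_path=**.open_file_descriptors,**.max_file_descriptors'

# heap usage per node
curl -s 'localhost:9200/_cat/nodes?h=name,heap.percent,heap.max&v'
```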
If you are going to keep data that long, I would recommend you switch from daily to monthly indices. I am pretty sure you will need more RAM and a larger heap as well.
The standard recommendation is to keep the average shard size in the GB range, even tens of GB. Each shard has overhead, so it will use up heap even if empty.
You could also consolidate indices, as it sounds like you are creating quite a few every day (see the sketch below). The problem with going down this route is that, once you realise you are having problems, the node may already be in a state that makes it hard to fix.
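As a sketch of what consolidation could look like, using made-up index names (I am assuming your dailies follow a pattern like logs-2019.07.01): reindex a month of dailies into one monthly index, verify the document counts, and only then drop the dailies:

```
# copy all daily indices for one month into a single monthly index
curl -s -X POST 'localhost:9200/_reindex' \
  -H 'Content-Type: application/json' \
  -d '{ "source": { "index": "logs-2019.07.*" }, "dest": { "index": "logs-2019.07" } }'

# once the document counts match, delete the daily indices
curl -s -X DELETE 'localhost:9200/logs-2019.07.*'
```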