We are currently running into issues with shards being left unassigned during index creation (Logstash with a template). This did not previously occur in the cluster, which has been operational for over a year. We were able to reroute the shards so that they were allocated, but we have not figured out why they are no longer being assigned properly (there are no glaring issues in the logs). Are there additional settings that we may need to enable in ES to help with allocation?
ES Cluster (on version 2.4.0)
20 nodes, roughly 48000 shards
All indexes use predefined templates during creation
mlockall is enabled
file descriptors set to 65536
heap is between 24G and 31G per node
java version is 1.8.0_45
SSD drives are used and settings are configured properly
refresh interval is 30s
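For anyone debugging the same symptom, a minimal sketch of how the unassigned shards could be listed from the plain-text output of `GET _cat/shards`. It assumes the default column order (index, shard, prirep, state, then docs, store, ip, node for assigned shards). The index names and sample rows are made up for illustration and are not taken from this cluster.

```python
def unassigned_shards(cat_shards_text):
    """Return (index, shard, prirep) tuples for rows whose state is UNASSIGNED."""
    result = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Assumed default columns: index shard prirep state [docs store ip node].
        # Unassigned rows normally stop after the state column.
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            result.append((fields[0], int(fields[1]), fields[2]))
    return result


# Hypothetical sample output, for illustration only.
sample = """\
logs-2016.10.01.00 0 p STARTED 1200 1.1mb 10.0.0.1 node-1
logs-2016.10.01.01 0 p UNASSIGNED
"""

print(unassigned_shards(sample))
```

In practice the text would come from the cluster itself rather than a hard-coded string.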
That is far too many shards for a cluster that size, and it is likely related to the issues you are seeing. I would recommend reducing the number of shards significantly, either by reindexing or by using the shrink API (note that the shrink API is only available from Elasticsearch 5.0 onward, so on 2.4 it would require an upgrade). Aim for an average shard size between 10GB and 30GB.
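To put numbers on this, a rough sketch of the arithmetic. The node and shard counts come from the original post; the total data volume is a placeholder assumption, since the thread does not state it.

```python
import math


def shards_per_node(total_shards, nodes):
    """Average number of shards each node has to hold."""
    return total_shards / nodes


def target_shard_count(total_data_gb, target_shard_gb):
    """Number of shards needed if every shard held about target_shard_gb."""
    return math.ceil(total_data_gb / target_shard_gb)


# Figures from the original post: 20 nodes, roughly 48000 shards.
print(shards_per_node(48000, 20))

# ASSUMPTION: 20000 GB of total data is a made-up figure for illustration.
# At 20 GB per shard (middle of the 10GB to 30GB range) that is 1000 shards.
print(target_shard_count(20000, 20))
```

With the figures from the post, each node carries about 2400 shards on average.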
I ran into shard allocation issues when I had a great disparity between the number of nodes and the number of shards. Like @Christian_Dahlqvist said, it would make sense that having such a huge number of shards could cause some issues with proper allocation.
How many shards per index are you all using? And how many replicas?
Shards per index vary: daily indices have 2 primary shards and 0 replicas, and hourly indices have 1 primary shard and 0 replicas. We ran into issues with the cluster indexing rate during initial setup and disabled replicas. Our hourly shards are much smaller than 1GB, and the daily shards, depending on the index, are within the range that Christian mentioned above. I doubt we can use the shrink API, since we are already at pretty much the minimum number of shards per index.
I was pretty sure we have been over that shards-per-node allocation threshold for a while. I am just wondering whether we have any options other than adding more nodes to the cluster?
If you have that few shards per index, it may be that you have too many indices as well. Updates to the cluster state are single-threaded in order to ensure consistency, so that may explain why it takes a long time to create and allocate indices. Having hourly indices with very small shards can be very inefficient unless you have a very short retention period.
How many indices do you have? How many of these are hourly?
Thanks, this is probably the issue we are seeing. We currently have roughly 49000 indices, but the bulk of them have 1 shard and are 1GB or smaller. We planned to keep these shards in the cluster for a long time (retention period of 3 years) so that users of the application can drill down to hourly increments. We will need to work around this and reduce the number of shards over the retention period.
If you are going to keep the data that long I would recommend consolidating indices quite a lot. Fewer indices with multiple, larger shards will work much better over time.
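As a rough illustration of how much consolidation helps over a 3-year retention period, here is a small sketch that maps an hourly index name to a monthly one and compares index counts per data stream. The `logs-` prefix and the `YYYY.MM.dd.HH` naming pattern are assumptions for the example; the thread does not say how the indices are named. Hourly drill-down would then rely on the timestamp field inside the documents rather than on the index name.

```python
from datetime import datetime


def monthly_index_name(hourly_name, prefix="logs-"):
    """Map a hypothetical hourly index name to the monthly index it would fold into."""
    ts = datetime.strptime(hourly_name[len(prefix):], "%Y.%m.%d.%H")
    return prefix + ts.strftime("%Y.%m")


def indices_over_retention(years, indices_per_year):
    """Indices accumulated per data stream over the retention period."""
    return years * indices_per_year


hourly_total = indices_over_retention(3, 24 * 365)  # one index per hour
monthly_total = indices_over_retention(3, 12)       # one index per month

print(monthly_index_name("logs-2016.10.01.13"))
print(hourly_total, monthly_total)
```

Per data stream, that is 26280 hourly indices versus 36 monthly ones over three years (ignoring leap days).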