Hi All,
I am pretty new to Elasticsearch and have been going through a lot of documentation to understand the internals so I can optimize our cluster.
Our Elasticsearch (6.1) cluster with 28 data nodes and 5 master nodes is going Yellow and Red very frequently, with lots of unassigned shards. Here are more details of the cluster:
Data Nodes - 28 (124 GB RAM, 8 cores, 55 GB heap)
Master Nodes - 5
Total Indices - 1500
Total Shards - 7000
Total Docs - 6,920,711,994
Total Data Size - 6 TB
Almost 10% of the indices have 20 shards with 2 replicas; the others have 2 shards with 2 replicas.
Those 10% of indices with 20 shards are about 32 GB in size; the other indices are below 1 GB.
We index on average 50 GB of data every day and discard the same amount each day. We run a lot of aggregation queries, roughly 3-4 requests per minute.
We are seeing unassigned shards very frequently, and it takes a couple of hours for them to recover. We also observe very high heap usage and have hit OOM errors on several occasions.
Is there anything very wrong with our cluster setup?
In Monitoring, check your nodes' segment counts. A huge number of segments can cause problems. Use force merge to consolidate the segments of read-only indices (indices that won't receive new data).
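For example, here is a minimal sketch of both steps against the REST API using Python's requests library, assuming the cluster is reachable at localhost:9200 and using a hypothetical read-only index name:

```python
import requests

ES = "http://localhost:9200"

# Per-node segment counts and heap pressure (cat nodes API columns).
print(requests.get(f"{ES}/_cat/nodes?v&h=name,segments.count,heap.percent").text)

# Force merge a read-only index down to a single segment.
# "logs-2018.01.01" is a hypothetical index name; only run this against
# indices that no longer receive writes.
resp = requests.post(f"{ES}/logs-2018.01.01/_forcemerge",
                     params={"max_num_segments": 1})
print(resp.json())
```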
I believe you have far too many shards in your cluster, but I struggle to make sense of these numbers:
If 10% of your 1500 indices have 20 shards with 2 replicas, those 150 indices alone should account for 150 * 20 * 3 = 9000 shards. If the remaining 1350 indices used only 1 shard each, the total would be at least 10350 shards in your cluster. But you mention 2 shards and 2 replicas for those indices, which would take the total shard count up to 9000 + (1350 * 2 * 3) = 17100.
Are you sure you only have 7000 primary and replica shards in your cluster?
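If it helps, here is a small sketch that redoes that arithmetic and compares it against what the cluster itself reports via the cluster health API (assuming it is reachable at localhost:9200):

```python
import requests

ES = "http://localhost:9200"

# Each index holds primaries * (1 + replicas) shard copies in total.
big = 150 * 20 * (1 + 2)     # 150 indices, 20 primaries, 2 replicas -> 9000
small = 1350 * 2 * (1 + 2)   # 1350 indices, 2 primaries, 2 replicas -> 8100
print("expected shards:", big + small)

# Actual totals reported by the cluster.
health = requests.get(f"{ES}/_cluster/health").json()
print("active:", health["active_shards"],
      "unassigned:", health["unassigned_shards"])
```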
Aside from that, a common rule of thumb is to aim for shard sizes of 20-50 GB, so if you create daily indices with 50 GB of data I would assign only 1 or at most 2 primary shards to each index. That would still result in 1500 to 3000 primary shards in your cluster, plus the replicas.
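One way to enforce that going forward is an index template. Below is a rough sketch assuming a hypothetical daily-logs-* index pattern and a cluster at localhost:9200; adjust the pattern and shard count to your own naming scheme:

```python
import requests

ES = "http://localhost:9200"

# Hypothetical template: new indices matching daily-logs-* get 1 primary shard.
template = {
    "index_patterns": ["daily-logs-*"],
    "settings": {
        "index.number_of_shards": 1
    }
}
resp = requests.put(f"{ES}/_template/daily-logs", json=template)
print(resp.json())
```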
Finally, I notice that you list your data node heap space at 55 GB:
Is this the Java Heap Space, as specified in Xmx and Xms? If so, that number is also too high, as the official documentation points out:
Ideally set Xmx and Xms to no more than the threshold for zero-based compressed oops; the exact threshold varies but 26 GB is safe on most systems, but can be as large as 30 GB on some systems.
For the record, I use either 8 GB or at most 16 GB of java heap space on the data nodes in my clusters (depending on what the specific cluster is set up for).
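If it is useful, here is a sketch of checking the configured heap and the compressed-oops status through the nodes info API (assuming localhost:9200; the using_compressed_ordinary_object_pointers field should be present on recent versions, but verify on yours):

```python
import requests

ES = "http://localhost:9200"

# JVM section of the nodes info API: configured heap and compressed-oops flag.
nodes = requests.get(f"{ES}/_nodes/jvm").json()["nodes"]
for node in nodes.values():
    jvm = node["jvm"]
    heap_gb = jvm["mem"]["heap_max_in_bytes"] / 1024 ** 3
    print(node["name"],
          f"heap max: {heap_gb:.1f} GB",
          "compressed oops:",
          jvm.get("using_compressed_ordinary_object_pointers", "unknown"))
```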
In addition to the other very valid points raised in this thread: version 6.1 is past the end of its supported life. More recent versions, particularly in the 7.x series, have seen improvements to resiliency against OOMs and more.
After you have upgraded to a more recent version, if you are still seeing unassigned shards you should use the allocation explain API to determine why they are unassigned. You should also look at the Elasticsearch logs for more detailed errors as they occur; a properly configured Elasticsearch cluster emits very few log messages.
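For reference, a minimal sketch of calling the allocation explain API with Python's requests library, assuming the cluster is reachable at localhost:9200; called without a body it explains the first unassigned shard it finds:

```python
import requests

ES = "http://localhost:9200"

# With no request body, the API picks the first unassigned shard it finds.
explanation = requests.get(f"{ES}/_cluster/allocation/explain").json()
print(explanation.get("index"), explanation.get("shard"))
print(explanation.get("unassigned_info", {}).get("reason"))
print(explanation.get("allocate_explanation"))
```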