We have a cluster of 8 nodes with 3,248 indices and 29,544 shards, with 1 replica. Each index has 5 primary shards. These are daily indices ranging in size from 10 MB up to 2 GB. Of the 8 nodes, 3 are master, 3 are data and 2 are client nodes. We are using ELK 5.2.1. I believe the high shard count is impacting Kibana. Could you please suggest whether 5 shards per index is a good strategy, or whether decreasing it would be better?
That is an excessive number of shards for a cluster with only 3 data nodes. Please read this blog post, which contains discussion and guidelines on this topic.
@Christian_Dahlqvist Thanks for the post. As per the post, we have 3 data nodes with 32 GB heap each, so I should need only 1 primary shard per index. But if I set the number of shards to 1 for each daily index instead of the default 5, how do we handle node failure? If a node or a shard fails somehow, will we lose the data on that shard? Thanks.
The number of primary shards is important as it affects the total shard count, but each primary shard holds different data. If you want to be able to cope with a node going down without losing data, you need to make sure you have at least 2 copies of each shard, which is what replicas give you.
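For example, roughly like this (the index name below is just a placeholder for one of your daily indices):

```
# Set 1 replica on an example index, giving 2 copies of each shard
# (1 primary + 1 replica). Elasticsearch will not place a replica on
# the same node as its primary, so the copies end up on different nodes.
PUT my-daily-index-2017.03.01/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```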
1 replica is often sufficient, as it results in 2 copies of each shard. Given the number of indices and shards in your cluster, however, I suspect you may need to reduce the total shard count further than you would by reducing the primary shard count alone. I would recommend switching to weekly or monthly indices for smaller indices with a long retention period.
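On 5.x you can apply this through an index template, along these lines (the template name and the logstash-* pattern are placeholders for whatever naming scheme you use):

```
# Sketch of a 5.x index template: 1 primary shard and 1 replica for all
# indices matching the pattern, so new weekly indices pick it up automatically.
PUT _template/weekly_logs
{
  "template": "logstash-*",
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}
```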
@Christian_Dahlqvist I have started implementing weekly indices. But I have a quick doubt about the 2 copies of each shard. As we have 3 data nodes, if I set the primary shards to 1 and replicas to 1, how will I get 2 copies of each shard?
@Christian_Dahlqvist Thanks for the response. As I am going to have 1 primary shard and 1 replica, what if 2 of the 3 data nodes go down (the ones holding the primary and the replica respectively)? How can I handle this failure?
If 2 out of 3 nodes are down and they are all master-eligible, the cluster should no longer be able to accept writes if configured correctly, as doing so could otherwise lead to data loss. If you need protection against 2 nodes going down, you will need a cluster of at least 5 master-eligible nodes and to set replicas to 2, so that you have 3 copies of each shard.
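As a sketch of what that looks like in practice on 5.x (the index pattern is a placeholder for your own indices):

```
# In elasticsearch.yml, with 3 master-eligible nodes the quorum is 2,
# which is what stops a minority from accepting writes after a partition:
#   discovery.zen.minimum_master_nodes: 2

# Raise replicas to 2 on existing indices to keep 3 copies of each shard.
PUT logstash-*/_settings
{
  "index.number_of_replicas": 2
}
```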
Sure @Christian_Dahlqvist, thanks. We have 3 master nodes, 3 data nodes and 2 client nodes. As per our configuration, the data nodes are not master-eligible. To protect against 2 nodes going down, I will try setting replicas to 2. Thanks for the help @Christian_Dahlqvist.
@Christian_Dahlqvist Shards can improve query performance. But if I set primary shards to 1 and replicas to 1 for each weekly index, does this impact the performance of queries?