Sharding Strategy


We have a cluster of 8nodes with 3,248 indices and 29,544 shards with 1 replica. We have 5shards per each index. These are daily indices with size ranging from 10MB to max 2GB. Out of these 8 nodes, 3Master,3Data and 2 client nodes. We are using ELK 5.2.1. I believe that having more shards is impacting Kibana. Could you please suggest me whether it is a good strategy to have 5 shards per index or decreasing would be better?

That is an excessive amount of shards for a cluster with only 3 data nodes. Please read this blog post, which contains discussion and guidelines on this topic.

1 Like

@Christian_Dahlqvist Thanks for the post. As per the post, we have 3 data nodes of 32 gb heap. So I need have only 1 primary shard for each index. But if I make number of shards to 1 for each daily index instead of default 5 shards, how can we handle node failure. If a node or shard fails somehow, are we going to lose the data on that shard. Thanks.

You handle node failures by having a replica shard configured. The number of primary shards does not really matter.

1 Like

So, currently we have 1 replica in place. Shall I make it to 2 replica by having 1 shard or Shall I keep the same 1 replica with 3 shards?

@Christian_Dahlqvist Could you also explain why number of primary shards does not matter here. Sorry if I am asking so many questions. Thanks.

The number of primary shards are important as it affects the total shard count. All primary shards however hold different data. If you want to make sure you can cope with a node going down without data loss, you need to make sure you have at least 2 copies of each shard, which is what replicas give you.

@Christian_Dahlqvist Thanks. Then I will go with 1 primary shard for each daily index along with 2 replicas. Thanks.

1 replica is often sufficient as it results in 2 copies of each shard. Given the number of indices and shards in your cluster I however suspect you may need to reduce it even further than what you would get just by reducing the primary shard count. I would recommend starting to use weekly or monthly indices for smaller indices with a long retention period.

1 Like

Sure @Christian_Dahlqvist. Thanks for the valuable the inputs. I will try to implement the same. Thanks.

@Christian_Dahlqvist I have started implementing weekly indices. But I have quick doubt about 2 copies of each shard. As we have 3 data nodes, If I set my primary shard as 1 and replica as 1, how will I get 2 copies of each shard?

If replica is set to 1, it means that you will have one replica in addition to the primary shards, resulting in two copies of the same shard.

@Christian_Dahlqvist Thanks for the response. As I am going to have 1 primary shard and 1 replica, what if 2 data nodes were down(which are holding 1 primary and 1 replica respectively) out of 3 data nodes. How can i handle this failure?

If 2 out of 3 nodes are down and they are all master eligible, the cluster should no longer be able to accept writes if configured correctly, as this could otherwise lead to data loss. If you need protection against 2 nodes going down you will need a cluster of at least 5 master-eligible nodes and set replicas to 2 so that you have 3 copies of each shard.

Sure @Christian_Dahlqvist Thanks. We have 3 master nodes, 3 data nodes and 2 client nodes. As per our configuration, Data nodes are not eligible for becoming master. In order to protect against 2 nodes going down, I will try to setup replicas to 2. Thank for the help @Christian_Dahlqvist

@Christian_Dahlqvist Shards improve performance of the queries. But if i set primary shards to 1 and replica as 1 fro each weekly index, does this impact the performance of queries.

@Christian_Dahlqvist Shards improve performance of the queries. But if i set primary shards to 1 and replica as 1 fro each weekly index, does this impact the performance of queries.

You would need to test that, but more smaller shards does not necessarily mean faster queries.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.