With auto_expand_replicas set to "0-all", primary shards are concentrated on specific nodes

wedul_chul · July 25, 2023, 12:02pm

Because of our service requirements, "auto_expand_replicas" is set to "0-all" so that every node holds a copy of each shard. However, the primary shards end up concentrated on one specific node, which can negatively impact performance.

To address this, we currently use the following steps. We would like to know whether they carry any risks and whether there is a better approach:

Step 1: Set "auto_expand_replicas" to false and "number_of_replicas" to 0.
Step 2: Verify that primary shards are evenly distributed across all nodes.
Step 3: Set "auto_expand_replicas" back to "0-all."
Step 4: Done.
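
For reference, the steps above can be sketched as the REST requests they correspond to. This is only a minimal sketch: the index name "my-index" and the helper name rebalance_requests are made up for illustration, and the code only builds the requests, it does not send anything to a cluster.

```python
import json


def rebalance_requests(index):
    """Return (method, path, body) triples for steps 1-3, in order.

    `index` is a hypothetical index name; nothing is sent over the network.
    """
    settings_path = f"/{index}/_settings"
    return [
        # Step 1: turn off auto-expansion and drop all replicas.
        ("PUT", settings_path,
         {"index": {"auto_expand_replicas": "false", "number_of_replicas": 0}}),
        # Step 2: list shard copies so primary placement can be inspected.
        ("GET", f"/_cat/shards/{index}?format=json&h=index,shard,prirep,node", None),
        # Step 3: let replicas expand to every data node again.
        ("PUT", settings_path, {"index": {"auto_expand_replicas": "0-all"}}),
    ]


if __name__ == "__main__":
    for method, path, body in rebalance_requests("my-index"):
        print(method, path, json.dumps(body) if body is not None else "")
```

Step 2 in particular needs a human (or a script) to look at the "prirep" and "node" columns before moving on to step 3.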

Please let us know if there are any potential risks involved in these steps, and if there are any better alternatives.

I'm linking to a Stack Overflow question because I can't upload the picture here.

Opster_support · July 25, 2023, 6:09pm

The steps you've outlined are generally safe, but there are a few potential risks and considerations:

While "number_of_replicas" is 0, each shard exists only as a single primary copy, so your data is at risk. If a node fails during this window, you could lose the data in the shards it holds.
Dropping the replicas and then setting "auto_expand_replicas" back to "0-all" causes a lot of shard activity: the existing replicas are deleted, and every replica later has to be copied again from its primary. This can put a significant load on your cluster and impact performance, especially if your indices are large.
Verifying that primary shards are evenly distributed across all nodes can be tricky. Elasticsearch tries to balance shards across the nodes, but the result is not always perfect. There's no guarantee that the primaries will stay evenly distributed after "auto_expand_replicas" is set back to "0-all."
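
One way to make the verification less error-prone is to count primaries per node from the JSON output of the _cat/shards API instead of eyeballing it. The sketch below assumes rows with the "prirep" and "node" fields that _cat/shards returns with format=json; the node names, the helper names and the tolerance of 1 are assumptions for illustration only.

```python
from collections import Counter


def primaries_per_node(rows):
    """Count primary shards per node from _cat/shards JSON rows."""
    return Counter(
        row["node"]
        for row in rows
        # "p" marks a primary; unassigned shards have no node and are skipped.
        if row.get("prirep") == "p" and row.get("node")
    )


def is_balanced(counts, nodes, tolerance=1):
    """True if the busiest and emptiest node differ by at most `tolerance`."""
    if not nodes:
        return True
    per_node = [counts.get(node, 0) for node in nodes]
    return max(per_node) - min(per_node) <= tolerance


if __name__ == "__main__":
    # Hypothetical sample: all three primaries sit on node-1.
    rows = [
        {"index": "my-index", "shard": "0", "prirep": "p", "node": "node-1"},
        {"index": "my-index", "shard": "1", "prirep": "p", "node": "node-1"},
        {"index": "my-index", "shard": "2", "prirep": "p", "node": "node-1"},
        {"index": "my-index", "shard": "0", "prirep": "r", "node": "node-2"},
    ]
    counts = primaries_per_node(rows)
    print(dict(counts), is_balanced(counts, ["node-1", "node-2", "node-3"]))
```

Passing the full node list matters, because a node that holds no primaries at all would otherwise not show up in the counts.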

As for alternatives, you could consider using shard allocation filtering to control the allocation of the shards. This allows you to specify which nodes a shard can be allocated to, giving you more control over the distribution of your shards. However, this requires careful planning and understanding of your cluster's capacity.
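
As a rough illustration of what such a filter looks like, the sketch below builds the body for a PUT to an index's _settings endpoint using the index.routing.allocation.include / exclude / require settings. The helper name and the node names are hypothetical. Keep in mind that, as far as I know, these filters apply to every copy of the index's shards, primaries and replicas alike, so on their own they do not pin only the primaries to particular nodes.

```python
def allocation_filter_settings(rule, attribute, values):
    """Build an index-level shard allocation filter setting.

    rule      -- "include", "exclude" or "require"
    attribute -- a node attribute such as the built-in "_name"
    values    -- list of attribute values, joined with commas
    """
    if rule not in ("include", "exclude", "require"):
        raise ValueError(f"unknown allocation rule: {rule!r}")
    key = f"index.routing.allocation.{rule}.{attribute}"
    return {key: ",".join(values)}


if __name__ == "__main__":
    # Hypothetical node names; the result is the body for PUT /my-index/_settings.
    print(allocation_filter_settings("include", "_name", ["node-1", "node-2"]))
```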

wedul_chul · July 26, 2023, 4:31am

I fully agree with the risks you mentioned.

Given these risks, I think rerouting the primary shards in this situation may not bring much benefit. What do you think?

In my opinion, since there is a replica shard on every node anyway, nothing breaks even if the node holding the primary shards goes down, so it seems better not to rebalance at all.

Still, I'd like to hear your opinion on whether it is worth rebalancing when the primary shards are crowded onto one node.

Thanks in advance for your reply.

system · August 23, 2023, 4:31am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
