Unbalanced shards

Sashi · March 1, 2022, 2:08pm

Hello,
I have a cluster of over 25 nodes, and the indices are balanced across all nodes except one. Let's call it the problematic node. It has fewer indices and shards than the other nodes in the cluster, and there are no unassigned shards in the cluster.

I have checked:

Node settings: the same as on the other nodes
Settings and mappings on the indices that are missing from the problematic node: its IP is not excluded
ES logs: nothing relevant that points to the cause
Allocation explain (GET /_cluster/allocation/explain): it seems to work only when there are unassigned shards
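
For reference, allocation explain can also be pointed at a shard that is already assigned by sending a request body that names the index, the shard number, and whether the copy is a primary. Without a body it looks for an unassigned shard, which would explain the behavior above. A minimal Python sketch, assuming an unauthenticated cluster at a hypothetical http://localhost:9200; `my-index` and the helper names are placeholders, not anything from this cluster:

```python
import json
import urllib.request


def build_explain_body(index, shard, primary, current_node=None):
    """Build the request body for the cluster allocation explain API."""
    body = {"index": index, "shard": shard, "primary": primary}
    if current_node is not None:
        # Optional: only explain the copy that currently lives on this node.
        body["current_node"] = current_node
    return body


def explain_allocation(base_url, body):
    """POST the body to allocation explain (base_url is an assumed endpoint)."""
    req = urllib.request.Request(
        base_url + "/_cluster/allocation/explain?include_yes_decisions=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    body = build_explain_body("my-index", 0, primary=True)
    print(json.dumps(body))
    # Uncomment against a real cluster:
    # print(json.dumps(explain_allocation("http://localhost:9200", body), indent=2))
```

For an assigned shard, the response should say whether the shard can remain on its current node and list a per-node decision, which is where a decider that keeps shards off the problematic node would show up.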

I also observed constant shard relocation on the problematic node. Its shard count keeps fluctuating.
I don't see a reason why the shard allocator would put fewer shards on this node. Can someone please shed some light here?

Thanks

warkolm · March 1, 2022, 10:21pm

What do the master logs that mention that node show?
What do the actual logs on that node show?

What's the output from _cat/allocation?v?
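
To make that output easier to compare across 25+ nodes, the same data can be requested as JSON and scanned for nodes whose shard count is far from the median. This is only a sketch: the base URL is a hypothetical unauthenticated endpoint, and the 20% tolerance and the function names are arbitrary choices for illustration.

```python
import json
import urllib.request
from statistics import median


def fetch_allocation(base_url):
    """Fetch _cat/allocation as JSON (base_url is an assumed endpoint)."""
    with urllib.request.urlopen(base_url + "/_cat/allocation?format=json") as resp:
        return json.load(resp)


def shard_outliers(rows, tolerance=0.2):
    """Return {node: shard_count} for nodes far from the median shard count.

    A node is an outlier when its count differs from the median by more
    than `tolerance` times the median. The UNASSIGNED row is skipped.
    """
    counts = {
        row["node"]: int(row["shards"])
        for row in rows
        if row.get("node") != "UNASSIGNED"
    }
    if not counts:
        return {}
    mid = median(counts.values())
    return {n: c for n, c in counts.items() if abs(c - mid) > tolerance * mid}


if __name__ == "__main__":
    # Made-up rows in the shape of the JSON output, for illustration only.
    sample = [
        {"node": "node-1", "shards": "100"},
        {"node": "node-2", "shards": "101"},
        {"node": "node-3", "shards": "99"},
        {"node": "node-4", "shards": "100"},
        {"node": "node-5", "shards": "40"},
    ]
    print(shard_outliers(sample))
```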

Sashi · March 3, 2022, 4:24pm

I went through the master logs and the problematic node's logs. I didn't find anything related to the issue. The node looks like the other nodes.

The _cat/allocation output shows less data on that node.

I have tried replacing the node with a new node, and I see the same behavior.

DineshNaik · March 3, 2022, 5:59pm

Is the storage the same on every node?

Sashi · March 3, 2022, 6:31pm

Yes, the storage is the same as on the other nodes. The node config (same heap, disk, machine, CPU) and the ES settings are the same as on the other nodes.

DineshNaik · March 3, 2022, 6:46pm

Can you share the output of:
GET _cluster/settings

I need to look at the routing settings in the cluster.
It would also be good to verify that the elasticsearch.yml file is consistent across all nodes.
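
If the full settings output is too noisy, one way to narrow it down is to request it with defaults included and flattened keys, then keep only the `cluster.routing.*` entries. A rough Python sketch, assuming an unauthenticated cluster at a hypothetical base URL; the helper names are made up for illustration, and the merge order (defaults, then persistent, then transient) reflects the usual precedence of those sections.

```python
import json
import urllib.request

ROUTING_PREFIX = "cluster.routing."


def fetch_cluster_settings(base_url):
    """Fetch cluster settings with defaults, using flat dotted keys."""
    url = base_url + "/_cluster/settings?include_defaults=true&flat_settings=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def effective_routing_settings(response):
    """Merge defaults < persistent < transient and keep cluster.routing.* keys."""
    merged = {}
    for section in ("defaults", "persistent", "transient"):
        merged.update(response.get(section, {}))
    return {k: v for k, v in sorted(merged.items()) if k.startswith(ROUTING_PREFIX)}


if __name__ == "__main__":
    # Made-up response fragment, for illustration only.
    sample = {
        "defaults": {
            "cluster.routing.allocation.balance.threshold": "1.0",
            "cluster.name": "example",
        },
        "persistent": {"cluster.routing.allocation.balance.threshold": "2.0"},
        "transient": {},
    }
    print(effective_routing_settings(sample))
```

Any cluster-level allocation filter, such as an entry under `cluster.routing.allocation.exclude`, would appear in this filtered view.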

Sashi · March 7, 2022, 2:17pm

Thanks, Dinesh. These are my routing settings:

    "cluster.routing.allocation.allow_rebalance" : "indices_all_active",
    "cluster.routing.allocation.awareness.attributes" : [ ],
    "cluster.routing.allocation.balance.index" : "0.55",
    "cluster.routing.allocation.balance.shard" : "0.45",
    "cluster.routing.allocation.balance.threshold" : "1.0",
    "cluster.routing.allocation.disk.include_relocations" : "true",
    "cluster.routing.allocation.disk.reroute_interval" : "60s",
    "cluster.routing.allocation.disk.threshold_enabled" : "true",
    "cluster.routing.allocation.disk.watermark.enable_for_single_data_node" : "false",
    "cluster.routing.allocation.enable" : "all",
    "cluster.routing.allocation.node_concurrent_incoming_recoveries" : "2",
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries" : "2",
    "cluster.routing.allocation.node_initial_primaries_recoveries" : "4",
    "cluster.routing.allocation.same_shard.host" : "false",
    "cluster.routing.allocation.shard_state.reroute.priority" : "NORMAL",
    "cluster.routing.allocation.total_shards_per_node" : "-1",
    "cluster.routing.allocation.type" : "balanced",
    "cluster.routing.rebalance.enable" : "all",
    "cluster.routing.use_adaptive_replica_selection" : "true",

We spin up the clusters with a pipeline, so every node gets the same settings as the others. I verified that they are identical.
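
As a rough way to reason about the balance.index, balance.shard, and balance.threshold values above: to the best of my understanding, the 7.x balanced allocator scores each node per index by combining how far the node's total shard count is from the cluster average with how far its count for that index is from the per-index average, and only rebalances when the gap between nodes exceeds the threshold. The Python below is a simplified model of that idea under this assumption, not the actual Elasticsearch code, and newer versions take additional factors into account. The numbers in the example are invented.

```python
def node_weight(node_shards, avg_shards, node_index_shards, avg_index_shards,
                index_balance=0.55, shard_balance=0.45):
    """Simplified per-node, per-index weight (a model, not the real allocator).

    The two balance factors are normalised so they sum to 1, then applied to
    the node's deviation from the cluster-wide and per-index averages.
    """
    total = index_balance + shard_balance
    theta_shard = shard_balance / total
    theta_index = index_balance / total
    return (theta_shard * (node_shards - avg_shards)
            + theta_index * (node_index_shards - avg_index_shards))


def should_rebalance(heaviest_weight, lightest_weight, threshold=1.0):
    """In this model, a move is considered only if the gap exceeds the threshold."""
    return (heaviest_weight - lightest_weight) > threshold


if __name__ == "__main__":
    # Invented numbers: a light node (40 shards) versus a typical node (100),
    # with a cluster average of about 97.7 shards per node.
    light = node_weight(40, 97.7, 0, 0.2)
    typical = node_weight(100, 97.7, 1, 0.2)
    print(light, typical, should_rebalance(typical, light))
```

In this model a node with far fewer shards gets a strongly negative weight, so the balancer itself would keep trying to move shards onto it. If the node still stays light while relocations continue, the cause is more likely an allocation decider (disk watermarks, filtering, throttling) than the balance settings listed here.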

system · April 4, 2022, 2:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
