Incorrect shard allocation

Petr.Simik · August 3, 2021, 6:39pm

I have a 15-node cluster with a mix of data sources: about 8% of the shards are large (10-30GB), 30% are smaller (1-15GB), and the majority, about 60%, are very small (megabytes).
The reason for the small shards is that we have a lot of small, different indices.

When ILM removes small shards from a node, the cluster allocates new shards onto that node even though its disk is already 90% full.

The problem is amplified because Elasticsearch creates all the new shards on the same node (the one with 90% of its disk used and the lowest number of shards). This causes a complete cluster freeze.
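Editor's sketch: this behavior is consistent with how the 7.x balancer ranks nodes. As far as I understand it, the weight combines the node's total shard count and its shard count for the index in question, and does not include disk usage. The code below is a simplified model of that weight function, not the actual Elasticsearch implementation. The defaults 0.45 and 0.55 correspond to `cluster.routing.allocation.balance.shard` and `cluster.routing.allocation.balance.index`; the node names and shard counts are made up for illustration.

```python
def node_weight(node_shards, avg_shards, node_index_shards, avg_index_shards,
                shard_balance=0.45, index_balance=0.55):
    """Simplified balance weight of one node for one index.

    A lower weight makes the node more attractive for a new shard.
    Disk usage is deliberately absent: in this model only shard counts matter.
    """
    total = shard_balance + index_balance
    theta_shard = shard_balance / total
    theta_index = index_balance / total
    return (theta_shard * (node_shards - avg_shards)
            + theta_index * (node_index_shards - avg_index_shards))


def rank_nodes(shards_per_node, index_shards_per_node):
    """Return node names ordered from most to least attractive."""
    n = len(shards_per_node)
    avg_shards = sum(shards_per_node.values()) / n
    avg_index = sum(index_shards_per_node.get(name, 0)
                    for name in shards_per_node) / n
    weights = {
        name: node_weight(count, avg_shards,
                          index_shards_per_node.get(name, 0), avg_index)
        for name, count in shards_per_node.items()
    }
    return sorted(weights, key=weights.get)


# Hypothetical cluster: node-c just lost many small shards to ILM,
# so it has the fewest shards even though its disk is 90% full.
shards_per_node = {"node-a": 120, "node-b": 118, "node-c": 40}
new_index = {}  # a brand-new index has no shards on any node yet

print(rank_nodes(shards_per_node, new_index))
```

In this model node-c comes first for every new index until its shard count catches up. The disk watermarks are enforced separately: by default the low watermark (85%) does not block primaries of newly created indices, so a node between the low and high (90%) watermarks can still receive them.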

The Cluster

v7.13.3.
15x datanode(32GB RAM, 4core, 500GB SSD HOT)
3x master node
3x coordinator node
5x hot node (32GB RAM, 4core, 2TB SAS)

When I analyse the shard allocation to find out why all the shards are on the same node, it reports this:

      "node_decision" : "worse_balance",
      "weight_ranking" : 8

Can you explain the meaning of weight_ranking and worse_balance?
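Editor's sketch: these two fields appear per node under `node_allocation_decisions` in the response of the cluster allocation explain API (`GET _cluster/allocation/explain`). As I understand it, `weight_ranking` is the node's position when all nodes are ordered by balance weight (1 being the most preferred), and `worse_balance` means that moving the shard to that node would not improve the cluster balance. The helper below only sorts an already fetched response; the sample response is hypothetical and trimmed to the three fields used.

```python
def summarize_decisions(explain_response):
    """Return (weight_ranking, node_name, node_decision) tuples, best rank first."""
    decisions = explain_response.get("node_allocation_decisions", [])
    rows = [
        (d.get("weight_ranking", float("inf")),
         d.get("node_name"),
         d.get("node_decision"))
        for d in decisions
    ]
    return sorted(rows, key=lambda row: row[0])


# Hypothetical, trimmed response body; a real one carries many more fields.
sample_response = {
    "node_allocation_decisions": [
        {"node_name": "node-b", "node_decision": "worse_balance", "weight_ranking": 8},
        {"node_name": "node-a", "node_decision": "no", "weight_ranking": 2},
    ]
}

for ranking, node, decision in summarize_decisions(sample_response):
    print(ranking, node, decision)
```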

Can anyone please help me with this problem?
Am I missing some important understanding of how allocation is done?

Can I configure the cluster to avoid allocating all the shards to the same node?

The problem repeats on different nodes every day. I have to stop the data load, disable shard allocation on the problematic node, do a manual rollover, and then enable shard allocation again.
I am thinking of a quick and dirty solution, like a cron script that evaluates disk space and number of shards and places fake shards on the problematic node. I believe there is a proper solution.
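Editor's sketch: the detection half of such a cron script could look roughly like this. It assumes the rows come from `GET _cat/allocation?format=json`, where `shards` and `disk.percent` arrive as strings and the `UNASSIGNED` row has no disk figures. The 85% limit and the "fewer shards than average" rule are illustrative choices, not Elasticsearch defaults for this purpose, and the sample rows are made up.

```python
def risky_nodes(cat_allocation_rows, disk_percent_limit=85):
    """Return nodes that are nearly full yet hold fewer shards than average.

    These are the nodes a shard-count-based balancer is likely to pick next.
    """
    # Skip rows without disk figures (for example the UNASSIGNED row).
    rows = [r for r in cat_allocation_rows if r.get("disk.percent") is not None]
    if not rows:
        return []
    avg_shards = sum(int(r["shards"]) for r in rows) / len(rows)
    return [
        r["node"]
        for r in rows
        if int(r["disk.percent"]) >= disk_percent_limit
        and int(r["shards"]) < avg_shards
    ]


# Hypothetical rows shaped like the _cat/allocation JSON output.
sample_rows = [
    {"node": "node-a", "shards": "120", "disk.percent": "70"},
    {"node": "node-b", "shards": "118", "disk.percent": "72"},
    {"node": "node-c", "shards": "40", "disk.percent": "90"},
    {"node": "UNASSIGNED", "shards": "3", "disk.percent": None},
]

print(risky_nodes(sample_rows))
```

What to do with the flagged nodes (alerting, temporarily excluding them from allocation) is left open here.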

Thank you.

Stef_Nestor · August 3, 2021, 6:59pm

It sounds like you may need to layer index-level shard allocation awareness into your setup (alternative reference). With ILM, Elastic recommends using data tiers for automated routing based on node roles; however, it still works in tandem with node attributes. (Note: node attributes override node roles during allocation routing.)

Petr.Simik · August 3, 2021, 7:25pm

Thank you @Stef_Nestor for the very prompt response.
I recently worked around the problem by adding this to the index template:

    "routing.allocation.total_shards_per_node":"1"
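Editor's sketch: for context, this is roughly where the setting sits in a composable index template body (`PUT _index_template/<name>`). The pattern `logs-small-*` is a hypothetical placeholder. Note that `index.routing.allocation.total_shards_per_node` is a hard limit per index: if an index has more shard copies than there are eligible nodes, the extra copies stay unassigned.

```python
import json


def build_template(index_patterns, shards_per_node=1):
    """Build a composable index template body that caps shards per node."""
    return {
        "index_patterns": list(index_patterns),
        "template": {
            "settings": {
                # Hard cap on how many shards of one index a single node may hold.
                "index.routing.allocation.total_shards_per_node": shards_per_node
            }
        },
    }


# "logs-small-*" is a made-up pattern for illustration.
body = build_template(["logs-small-*"])
print(json.dumps(body, indent=2))
```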

Petr.Simik · August 9, 2021, 10:32am

@stef you recommend migrating from node attributes to node roles. What is the reason?
Will node attributes be replaced by tiers in a new version?
Thank you.

system · September 6, 2021, 10:32am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
