Shard allocation on a single node causes cluster overload

The cluster creates an index with 5 shards all on the same node, and it picks the node with the least disk space. A high-throughput ETL process then writes all of its data to that single node instead of spreading it across 5 nodes. This overloads the node's CPU (100%) and the whole cluster becomes high-latency or unavailable (writing a single document takes several seconds).

The problem is related to cluster shard allocation. When ILM creates a new index, it places all of its shards on this one node. I tried temporarily excluding the node from allocation and rolling the index over, and that helped: the new index's shards were spread across several servers and everything was fine. However, after some time, when ILM rolled over to new indices, it happened again.

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : "elastic12"
  }
}
Our cluster has 15 data nodes, 3 warm nodes, 3 master nodes, and 2 coordinator nodes.
Data nodes: 32 GB RAM, 500 GB SSD, 8 cores.
Elasticsearch version 7.13.3.

Do you have any clue why this happens and how to prevent it from recurring?

shards disk.indices disk.used disk.avail disk.total disk.percent node
   111      406.8gb     440gb       51gb      491gb           89 elastic12 <-- this one; note the shard count and disk space
   228      400.3gb   433.2gb     57.7gb      491gb           88 elastic04
   230      394.1gb   432.7gb     58.3gb      491gb           88 elastic05
   237      283.6gb   314.6gb    176.4gb      491gb           64 elastic13
   237        271gb   301.8gb    189.1gb      491gb           61 elastic07
   237        321gb   350.8gb    140.1gb      491gb           71 elastic14
   237      245.6gb   276.6gb    214.4gb      491gb           56 elastic09
   237      195.5gb   226.4gb    264.6gb      491gb           46 elastic02
   237      277.4gb     308gb      183gb      491gb           62 elastic06
   237      303.1gb   333.4gb    157.6gb      491gb           67 elastic15
   237      236.1gb   266.8gb    224.2gb      491gb           54 elastic08
   237      287.3gb   318.8gb    172.2gb      491gb           64 elastic03
   237      251.9gb     283gb      208gb      491gb           57 elastic01
   238      343.5gb   375.3gb    115.6gb      491gb           76 elastic11
   238      322.8gb   354.9gb      136gb      491gb           72 elastic10
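For reference, a table like the one above can be produced with the `_cat/allocation` API; the exact column list used here is an assumption:

```
GET _cat/allocation?v&h=shards,disk.indices,disk.used,disk.avail,disk.total,disk.percent,node&s=shards
```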

Was that node removed or added in the recent past?
It seems odd to have that few shards compared to the others.

Thank you warkolm.

No, the nodes have been in the cluster for a long time (1 year+).

The problematic node changes over time;
for instance, right now elastic04 has the fewest shards and the least free disk space.
A few days ago it was elastic12.

Elasticsearch tries to allocate new shards to the fullest node, elastic04,
and sometimes it creates the whole 5-shard index on it.

shards disk.indices disk.used disk.avail disk.total disk.percent node
   209      406.5gb   441.3gb     49.6gb      491gb           89 elastic04 <-- this one now
   229      398.3gb   428.6gb     62.4gb      491gb           87 elastic14
   236      313.7gb   345.1gb    145.8gb      491gb           70 elastic07
   236      309.1gb   340.2gb    150.7gb      491gb           69 elastic06
   236      344.2gb   375.5gb    115.5gb      491gb           76 elastic09
   237      382.7gb   414.1gb     76.8gb      491gb           84 elastic11
   237        186gb     218gb    272.9gb      491gb           44 elastic10
   237      364.5gb   395.8gb     95.1gb      491gb           80 elastic03
   237        278gb   309.3gb    181.7gb      491gb           62 elastic02
   237      334.3gb   365.1gb    125.8gb      491gb           74 elastic15
   237      172.5gb   203.8gb    287.1gb      491gb           41 elastic01
   237      348.7gb   387.3gb    103.7gb      491gb           78 elastic05
   237      264.8gb     295gb    195.9gb      491gb           60 elastic12 <-- this had problem before
   237      285.4gb   316.4gb    174.5gb      491gb           64 elastic13

It is all caused by one high-throughput pipeline, which creates a new index every 30 GB, roughly 5 times per day.
When the ILM rollover runs, it creates the new index on the node with the lowest shard count.
BUT the index has 5 shards, and all of them go to that same node. Unfortunately, that node is also the one with the least free disk space, so it gets overloaded and the problem is back.
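One common guardrail for this situation (not mentioned in the thread, so treat it as a suggestion to verify against your setup) is to cap how many shards of a single index may land on one node via `index.routing.allocation.total_shards_per_node`. A sketch using the index template API; the template and index pattern names are hypothetical:

```
PUT _index_template/etl-template
{
  "index_patterns": ["etl-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 5,
      "index.routing.allocation.total_shards_per_node": 1
    }
  }
}
```

With this setting, a 5-shard index is forced onto 5 different nodes. Note that a strict value like 1 can leave shards unassigned if too few eligible nodes are available.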

The reason for this imbalance might be the hot-warm architecture: I move indices older than 2 days to the warm nodes.
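The rollover-every-30-GB and move-to-warm-after-2-days behavior described above could be expressed as an ILM policy along these lines (the policy name and the `data: warm` node attribute are assumptions about this cluster's configuration):

```
PUT _ilm/policy/etl-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "30gb" }
        }
      },
      "warm": {
        "min_age": "2d",
        "actions": {
          "allocate": { "require": { "data": "warm" } }
        }
      }
    }
  }
}
```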

I have tried a workaround that helps in the short term:

  1. disable shard allocation on the problematic node
  2. perform a manual rollover
  3. re-enable shard allocation
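The three workaround steps above can be sketched as console commands (using elastic04 as the current problematic node; the rollover alias name is a placeholder):

```
# 1. Exclude the problematic node from shard allocation
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : "elastic04"
  }
}

# 2. Roll the write index over manually (alias name is hypothetical)
POST etl-write-alias/_rollover

# 3. Re-enable allocation on the node by clearing the exclusion
PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : null
  }
}
```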

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.