Elasticsearch cluster is getting overloaded by uneven shard allocation

Hi Community,
May I ask for help:

We have Elasticsearch v7.17.0 with 43 nodes, each with 2TB SSD, 8 cores, 32GB RAM, and a 16GB heap.
Currently there are about 2,400 indices and 6,400 shards.

Some nodes have fewer shards but their disks are almost full, which pushes the node CPU to 100% and impacts the whole cluster: all indexing requests going through the ingest pipelines are being rejected with this error:

{
  "error": {
    "root_cause": [
      {
        "type": "es_rejected_execution_exception",
        "reason": "rejected execution of coordinating operation [coordinating_and_primary_bytes=750630256, replica_bytes=0, all_bytes=750630256, coordinating_operation_bytes=1477151, max_coordinating_and_primary_bytes=751619276]"
      }
    ],
    "type": "es_rejected_execution_exception",
    "reason": "rejected execution of coordinating operation [coordinating_and_primary_bytes=750630256, replica_bytes=0, all_bytes=750630256, coordinating_operation_bytes=1477151, max_coordinating_and_primary_bytes=751619276]"
  },
  "status": 429
}
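
If I understand the error correctly, this is the per-node indexing pressure limit (indexing_pressure.memory.limit, 10% of the heap by default) being hit on the coordinating node. The per-node counters can be checked with:

GET _nodes/stats/indexing_pressure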

Shards are distributed automatically by Elasticsearch across the 43 nodes. All indices are time-series indices with index templates and ILM policies that roll them over daily/weekly/monthly, depending on the size of the data source.

There is a mix of very small indices (xx MB) and huge indices (xx GB). We keep the recommended default of a 50GB maximum shard size, but some data sources are very small.
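
For reference, the hot phase of the policies looks roughly like this (a simplified sketch; the policy name and exact thresholds are only illustrative):

PUT _ilm/policy/daily-rollover-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "1d"
          }
        }
      }
    }
  }
}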

The combination of small and large indices is a problem for Elasticsearch: it can allocate the small indices to one node while the huge ones end up on another, so some nodes run out of storage while others are half empty, because Elasticsearch allocates new shards to the nodes with the fewest shards rather than by disk usage.
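
(For completeness, one possible stop-gap on 7.17 would be to lower the disk watermarks so that relocation away from the nearly full nodes starts earlier; the values below are only an illustration, not a recommendation:)

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "80%",
    "cluster.routing.allocation.disk.watermark.high": "85%"
  }
}

Another knob that might help is index.routing.allocation.total_shards_per_node in the templates of the largest data sources, so their shards are forced to spread across more nodes.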

My workaround is to remove the problem node from the cluster and reinsert it a few hours later.
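
(A gentler variant of this workaround might be to drain the node with an allocation exclusion filter and clear the filter afterwards, rather than actually removing the node; the node name below is just the first problem node from the table further down:)

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": "tela01prahkz"
  }
}

# once the node has drained, clear the filter again
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": null
  }
}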

The problem is that this situation recurs every few days and breaks production.

GET _cat/allocation?v&s=node

shards disk.indices disk.used disk.avail disk.total disk.percent node
   136        1.6tb     1.6tb    232.2gb      1.9tb           88 tela01prahkz --> problem node, smallest number of shards (136) but disk nearly full
   134        1.6tb     1.7tb    201.8gb      1.9tb           89 tela02prahkz --> problem node, smallest number of shards (134) but disk nearly full
   179      643.8gb     736gb      1.2tb      1.9tb           37 tela03prahkz
   179          1tb     1.1tb    822.2gb      1.9tb           58 tela04prahkz
   179      738.2gb   836.4gb      1.1tb      1.9tb           42 tela05prahkz
....
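
To see which shards actually occupy the disk on the two full nodes, something like this helps (the index name in the allocation explain request is only a placeholder):

GET _cat/shards?v&s=store:desc&h=index,shard,prirep,store,node

GET _cluster/allocation/explain
{
  "index": "my-large-index-000123",
  "shard": 0,
  "primary": true
}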

Thank you for any advice.

I already reported this issue before, but there was no resolution:

Try upgrading to 8.6, noting that the release blog calls out some improvements to shard balancing in this version:
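
If I remember the 8.6 release notes correctly, the new desired-balance allocator also weighs disk usage and write load (cluster.routing.allocation.balance.disk_usage and cluster.routing.allocation.balance.write_load); after the upgrade their effective defaults can be inspected with:

GET _cluster/settings?include_defaults=true&filter_path=defaults.cluster.routing.allocation.balance.*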


Thank you, this is definitely my plan.

