Uneven disk usage after removing a node set

I'm using ECK 1.8.0 and ES 7.16.2.

To increase the storage available to an ES cluster managed by the ECK operator, I proceeded in two steps: first I added a new node set with the disk capacity I need and waited for the shards to relocate, then I removed the old node set and waited for the shards to relocate again.
I did that for the data node set (composed of 3 nodes across 3 zones) as well as for the master node set (also 3 nodes across 3 zones).
I did both at the same time because I wanted to rename the master node set, so I took the opportunity.
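For reference, the change amounts to adding a new nodeSet entry alongside the old one, waiting for the data to move, then deleting the old entry so ECK drains and removes those pods. The fragment below is illustrative (names and sizes are reconstructed from the node names in the output further down, not my exact manifest):

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-site-search
spec:
  version: 7.16.2
  nodeSets:
    # New node set with the larger disks. Once its shards are in place,
    # the old node set entry is removed from this list and ECK tears it down.
    - name: data140g-zonea
      count: 1
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            resources:
              requests:
                storage: 140Gi
    # ...same for data140g-zoneb and data140g-zonec
```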

However, after the shards relocated, one data node's disk usage is way out of balance even though the large shards are well distributed across the nodes. Here is the status.
You can see that the node with the fewest shards is the one with the disk usage anomaly. Note that there is a single large index, and its 6 shards are evenly balanced across the nodes (2 on each node):

shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
    70       77.3gb    77.5gb     50.2gb    127.8gb           60 10.208.63. 10.208.63. elastic-site-search-es-data140g-zoneb-0
    69       77.3gb    77.6gb     50.2gb    127.8gb           60 10.208.36.  10.208.36.  elastic-site-search-es-data140g-zonec-0
     4       76.2gb   113.5gb     14.3gb    127.8gb           88 10.208.59.  10.208.59.  elastic-site-search-es-data140g-zonea-0
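A quick arithmetic check on the table above (plain Python, not an Elasticsearch command) makes the anomaly concrete: on the two healthy nodes disk.used is almost exactly disk.indices, but on zonea-0 roughly 37gb of disk.used is not accounted for by shard data:

```python
# Columns copied from the _cat/allocation output above, in GB:
# (disk.indices, disk.used, disk.total)
rows = {
    "data140g-zoneb-0": (77.3, 77.5, 127.8),
    "data140g-zonec-0": (77.3, 77.6, 127.8),
    "data140g-zonea-0": (76.2, 113.5, 127.8),
}
for node, (indices, used, total) in rows.items():
    # disk.used minus disk.indices = space consumed by something other than shards
    gap = used - indices
    print(f"{node}: non-index usage = {gap:.1f}gb, disk.percent = {int(used / total * 100)}")
```

The ~37gb gap is the space the question is about: it is on disk but not attributed to any index.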

Also, I can see this warning in the master node logs:

{"type": "server", "timestamp": "2022-10-20T11:49:47,792Z", "level": "WARN", "component": "o.e.c.r.a.d.DiskThresholdDecider", "cluster.name": "elastic-site-search", "node.name": "elastic-site-search-es-master-zonea-0", "message": "after allocating [[members][2], node[LjEUE5PgTiqB65tygsCQAQ], [R], s[STARTED], a[id=b6KSlUKPSVaWSclfM5IelQ]] node [bCZMIAM6RkS6pqH_MEukjg] would have more than the allowed 10% free disk threshold (9.7% free), preventing allocation", "cluster.uuid": "ykOg819RTCKt73Ehvjylxg", "node.id": "ydGRqsigTbScRiwWZk9ztw"  }
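The "allowed 10% free disk threshold" in that warning comes from the high disk watermark (`cluster.routing.allocation.disk.watermark.high`, 90% used by default): the master refuses to place a shard on a node if doing so would leave it with less than 10% free. A minimal sketch of that decision (simplified; the real DiskThresholdDecider also handles byte-valued watermarks and in-flight relocations):

```python
# Default for cluster.routing.allocation.disk.watermark.high (percent used).
HIGH_WATERMARK_USED = 90.0

def can_allocate(used_percent_after_shard: float) -> bool:
    """Allow a shard on a node only if the node stays below the high watermark."""
    return used_percent_after_shard < HIGH_WATERMARK_USED

# The node in the warning would end up at 9.7% free, i.e. 90.3% used,
# so the replica of [members][2] cannot be placed there.
print(can_allocate(100 - 9.7))  # False
```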

What is taking up that space on the over-used node? What can I do to unblock the situation?


I decided to scale the data node set from one node per zone to two nodes per zone to "unblock" the cluster, hoping it gets fixed when I scale down again later.
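For anyone hitting the same thing, the scale-out is just a count bump on the data node sets (illustrative fragment, one zone shown; the same change applies to the other zones):

```yaml
  nodeSets:
    - name: data140g-zonea
      count: 2   # was 1
```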
