disk.used is about 6 times more than disk.indices

We are facing shard allocation issues because of the disk watermark, and when we checked the allocation API, strangely disk.used is 5-6 times more than disk.indices:

[root@bcmt-ovs-control-0 ~]# curl -s elasticsearch.cgnat.svc.cluster.local:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
 148 2.8tb 16.7tb 2.8tb 19.6tb 85 172.16.4.76 172.16.4.76 cgnat-belk-elasticsearch-data-3
 147 2.8tb 16.7tb 2.8tb 19.6tb 85 172.16.4.234 172.16.4.234 cgnat-belk-elasticsearch-data-0
 151 2.9tb 16.7tb 2.8tb 19.6tb 85 172.16.4.114 172.16.4.114 cgnat-belk-elasticsearch-data-12
 151 2.8tb 16.8tb 2.7tb 19.6tb 85 172.16.4.172 172.16.4.172 cgnat-belk-elasticsearch-data-19
 149 2.9tb 16.7tb 2.8tb 19.6tb 85 172.16.4.92 172.16.4.92 cgnat-belk-elasticsearch-data-13
 155 2.9tb 16.8tb 2.8tb 19.6tb 85 172.16.4.74 172.16.4.74 cgnat-belk-elasticsearch-data-6
 154 2.9tb 16.7tb 2.8tb 19.6tb 85 172.16.4.129 172.16.4.129 cgnat-belk-elasticsearch-data-18
 147 2.7tb 16.9tb 2.6tb 19.6tb 86 172.16.4.21 172.16.4.21 cgnat-belk-elasticsearch-data-1
 157 3tb 16.8tb 2.8tb 19.6tb 85 172.16.4.177 172.16.4.177 cgnat-belk-elasticsearch-data-15
 147 2.8tb 16.7tb 2.8tb 19.6tb 85 172.16.4.29 172.16.4.29 cgnat-belk-elasticsearch-data-5
 184 3tb 16.8tb 2.8tb 19.6tb 85 172.16.4.139 172.16.4.139 cgnat-belk-elasticsearch-data-17
 148 2.7tb 16.7tb 2.8tb 19.6tb 85 172.16.4.4 172.16.4.4 cgnat-belk-elasticsearch-data-9
 152 2.9tb 16.7tb 2.9tb 19.6tb 85 172.16.4.109 172.16.4.109 cgnat-belk-elasticsearch-data-14
 152 2.8tb 16.8tb 2.8tb 19.6tb 85 172.16.4.133 172.16.4.133 cgnat-belk-elasticsearch-data-8
 149 2.8tb 16.8tb 2.8tb 19.6tb 85 172.16.4.3 172.16.4.3 cgnat-belk-elasticsearch-data-7
 151 2.9tb 16.7tb 2.9tb 19.6tb 85 172.16.4.50 172.16.4.50 cgnat-belk-elasticsearch-data-11
 154 3tb 16.8tb 2.8tb 19.6tb 85 172.16.4.110 172.16.4.110 cgnat-belk-elasticsearch-data-4
 184 3.1tb 16.9tb 2.6tb 19.6tb 86 172.16.4.32 172.16.4.32 cgnat-belk-elasticsearch-data-10
 150 3tb 16.7tb 2.8tb 19.6tb 85 172.16.4.170 172.16.4.170 cgnat-belk-elasticsearch-data-16
 157 3.1tb 16.8tb 2.7tb 19.6tb 85 172.16.4.117 172.16.4.117 cgnat-belk-elasticsearch-data-2

Can you please let us know why there can be such a huge difference?
Also, please let us know if there is any process to clean up this disk storage.
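For reference, the gap can be quantified directly from the allocation output above. A small shell sketch using the figures from the first data row (2.8 TB of index data versus 16.7 TB reported used):

```shell
# Ratio of disk.used to disk.indices for the first data node above
# (2.8 TB of index data vs 16.7 TB reported used).
indices_tb=2.8
used_tb=16.7
ratio=$(awk -v u="$used_tb" -v i="$indices_tb" 'BEGIN { printf "%.1f", u / i }')
echo "disk.used is ${ratio}x disk.indices"
# prints: disk.used is 6.0x disk.indices
```

So roughly 14 TB per node is consumed by something other than Elasticsearch index data.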

Thanks,
Mahesh

Can you tell us a bit more about the cluster? What type of storage are you using? Does each node have a dedicated data volume? Is there anything else running on the nodes or using the same storage?


Hi,

Elasticsearch is running in a k8s cluster. Each node is mounted with a dedicated PVC of the Cinder storage class.

Thanks,

How are the PVCs configured? What is the total capacity of the underlying storage?

I have not seen this problem before, so I wonder if it is related to how the underlying storage reports statistics, capacity, and usage.
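One way to narrow this down is to compare what `du` reports for the Elasticsearch data path inside a data pod with the disk.used figure from `_cat/allocation`; if `du` agrees with disk.indices but `df` agrees with disk.used, the extra consumption is outside Elasticsearch (or a storage-layer reporting artifact). A minimal illustration of the `du`-vs-`df` technique, run on a scratch directory since I don't have access to your pods (on a real data pod you would point `du` at the data path, e.g. `/usr/share/elasticsearch/data` by default):

```shell
# Illustration: measure actual bytes in a directory the same way you would
# measure the Elasticsearch data path inside each data pod.
scratch=$(mktemp -d)
dd if=/dev/zero of="$scratch/segment.bin" bs=1M count=8 status=none
du -sh "$scratch"   # what the files under the path really occupy
df -h "$scratch"    # what the filesystem reports for the whole volume
rm -rf "$scratch"
```

A large disagreement between the two on the real volume would point at deleted-but-open files, snapshots/thin-provisioning artifacts, or something else writing to the same mount.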

Hi,

PVCs are configured via the Kubernetes "volumeClaimTemplates" field. Each PVC is 20 TB. I'm not currently aware of the total size of the underlying storage; I will check and reply back.
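For context, a StatefulSet claiming 20 TB per data node via "volumeClaimTemplates" would typically look something like the fragment below (the claim name and storage class name are illustrative, not taken from the actual chart):

```yaml
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: cinder   # the Cinder storage class mentioned above
      resources:
        requests:
          storage: 20Ti
```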

Thanks,