I followed the Helm chart example and created a 3-node Elasticsearch 8.4.3 cluster. It ran fine until the data volume filled up.
The first issue I encountered was that Kibana would not open. I deleted one of the largest indices with `-X DELETE` through the REST API, and the problem went away after a short while.
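For reference, the deletion was along these lines (the index name `my-big-index` below is a placeholder, not the actual index I removed):

```shell
# Find the largest indices first (sorted by store size, descending)
curl -k -u elastic:elasticsearch \
  -X GET "https://elasticsearch-master:9200/_cat/indices?v=true&s=store.size:desc"

# Delete a single index by name; -k skips TLS verification because
# the chart uses self-signed certificates. Index name is hypothetical.
curl -k -u elastic:elasticsearch \
  -X DELETE "https://elasticsearch-master:9200/my-big-index?pretty"
```

These are API request fragments that need a live cluster, so they are shown for illustration only.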
This morning, I noticed that one non-master node's status is still Running but shown in blue, with the message "Readiness probe failed." The node appears to be offline and no longer reachable by the master. I then found that the disk is full again; I believe that's because a number of RELOCATING shards moved a lot of data onto this node.
I feel this is a bug, but I'd like to confirm before filing it on GitHub. Please let me know if more info should be provided.
In addition, is there a graceful way to purge/empty the data volume in this situation? I looked in the bin directory but couldn't figure out which tool might help.
By the way, I realized that, since the default settings were used, the default ILM policy rolls indices over at 50gb, while the Helm chart example provisions only a 30Gi data volume, so the lifecycle policy was never triggered before the disk filled up.
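For context, this is roughly how the mismatch can be inspected and worked around. The policy name `metrics` and the `20gb` threshold below are assumptions for illustration, not values the chart ships:

```shell
# Inspect the current ILM policy (policy name "metrics" assumed here)
curl -k -u elastic:elasticsearch \
  -X GET "https://elasticsearch-master:9200/_ilm/policy/metrics?pretty"

# Sketch of a fix: lower the rollover threshold below the 30Gi volume size
# so rollover fires before the disk fills. 20gb is an illustrative value.
curl -k -u elastic:elasticsearch \
  -X PUT "https://elasticsearch-master:9200/_ilm/policy/metrics" \
  -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "20gb" }
        }
      }
    }
  }
}'
```

These requests assume a reachable cluster with the same credentials shown above.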
/ $ curl -k -u elastic:elasticsearch -X GET "https://elasticsearch-master:9200/_cat/nodes?v=true&pretty"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.42.15.185 55 63 2 0.03 0.07 0.08 cdfhilmrstw - elasticsearch-master-0
10.42.5.129 50 64 4 0.03 0.07 0.10 cdfhilmrstw * elasticsearch-master-2
elasticsearch@elasticsearch-master-1:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 108241468 49320008 53400052 49% /
tmpfs 65536 0 65536 0% /dev
tmpfs 8164656 0 8164656 0% /sys/fs/cgroup
/dev/nvme0n1p3 108241468 49320008 53400052 49% /etc/hosts
shm 65536 0 65536 0% /dev/shm
/dev/longhorn/pvc-8e3c241c-092f-456f-9e8b-255157fd3d25 30832548 30816164 0 100% /usr/share/elasticsearch/data
tmpfs 16329316 12 16329304 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 16329316 16 16329300 1% /usr/share/elasticsearch/config/certs
tmpfs 8164656 0 8164656 0% /proc/acpi
tmpfs 8164656 0 8164656 0% /proc/scsi
tmpfs 8164656 0 8164656 0% /sys/firmware
Wed Dec 14 06:30:35 UTC 2022,.ds-metrics-xxxxxx-default-2022.12.11-000002,0,p,RELOCATING,167670950,24.2gb,10.42.15.185,elasticsearch-master-0,->,10.42.16.243,dnlqpEz8SCCV3znZ63Vj4A,elasticsearch-master-1