I'm trying to improve my Elasticsearch monitoring and need a reliable way to query the total free disk space available to Elasticsearch across the cluster.
In Kibana > Monitoring I see
Using the Elasticsearch /_cluster/stats API endpoint I see
I have 20 ES nodes running on 4 machines. Each machine has one 894 GB SSD per ES node, plus one more 894 GB drive for the OS etc., which I hope to leave out of my calculations.
This means ES has 20 x 894 GB ~ 17.5 TB of total disk space it can use, which does not match anything I see in Kibana monitoring or from the API.
df -B 1 for one of the SSD drives shows 959727210496 bytes.
This means that total_in_bytes from the ES API is exactly 4 x the df result for a single SSD, not 20 x.
SSD: 959727210496
20 x SSD: 19194544209920
total_in_bytes: 3838908841984
total_in_bytes/SSD: 4
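The arithmetic above can be reproduced with plain bash integer arithmetic, which is 64-bit on typical Linux systems. This is only a small sketch using the numbers from this post; the variable names are mine and just for illustration.

```shell
#!/bin/bash
# Values copied from the post; nothing here talks to a cluster.
ssd_bytes=959727210496          # df result for one SSD
total_in_bytes=3838908841984    # total_in_bytes as reported by the ES API
node_count=20

# What 20 SSDs would add up to, and how many SSDs total_in_bytes covers.
expected=$(( ssd_bytes * node_count ))
ratio=$(( total_in_bytes / ssd_bytes ))
remainder=$(( total_in_bytes % ssd_bytes ))

echo "expected for $node_count SSDs: $expected"
echo "total_in_bytes / SSD: $ratio (remainder $remainder)"
```

The remainder of 0 shows the reported total is an exact multiple of one SSD, so it is not a rounding difference.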
I have default replica and sharding settings, so 5 shards and 1 replica.
Playing around a bit I came to this solution for now...
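#!/bin/bash
# Averages disk.used_percent over all ES nodes and compares it to thresholds.
set -o nounset
set -o errexit
# Default thresholds in percent; override with -w and -c.
WARN=70
CRIT=80
while getopts w:c: option; do
    case $option in
        w) WARN=$OPTARG;;
        c) CRIT=$OPTARG;;
    esac
done
ES_HOST=$(hostname -s)
# One used-percent value per node, one per line.
DISK_USE=$(curl -s -XGET "http://$ES_HOST:9200/_cat/nodes?h=disk.used_percent")
USED_PERCENT=0
NODE_COUNT=0
for i in $DISK_USE; do
    USED_PERCENT=$(echo "$USED_PERCENT + $i" | bc)
    NODE_COUNT=$(( NODE_COUNT + 1 ))
done
# An empty response would otherwise divide by zero and fall through to "All good".
if [ "$NODE_COUNT" -eq 0 ]; then
    echo "No disk usage data returned"
    exit 3
fi
# bc truncates to an integer here because its scale defaults to 0.
DISK_USAGE=$(echo "$USED_PERCENT / $NODE_COUNT" | bc)
echo "Disk space used is $DISK_USAGE percent"
if [ "$DISK_USAGE" -gt "$CRIT" ]; then
    echo "Very bad"
    exit 2
elif [ "$DISK_USAGE" -gt "$WARN" ]; then
    echo "Pretty bad"
    exit 1
else
    echo "All good"
    exit 0
fi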
As all nodes have the same disk size this should work, right?
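If the disks ever stop being identical, averaging percentages would no longer give the cluster-wide figure. A minimal sketch that sums bytes instead, assuming _cat/nodes is asked for the disk.used and disk.total columns with bytes=b so both come back as plain integers. I have not run this against my cluster, and the function name cluster_disk_percent is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads "used total" byte pairs, one node per line,
# and prints the cluster-wide used percentage as an integer (rounded down).
cluster_disk_percent() {
    awk '
        NF >= 2 { used += $1; total += $2 }
        END {
            if (total == 0) exit 1      # no data: let the caller decide
            printf "%d\n", int(used * 100 / total)
        }'
}

# Assumed call against a live cluster (not executed here):
#   curl -s "http://$ES_HOST:9200/_cat/nodes?h=disk.used,disk.total&bytes=b" | cluster_disk_percent

# Made-up sample: one half-full 100-byte disk and one empty 300-byte disk.
printf '50 100\n0 300\n' | cluster_disk_percent
```

For the made-up sample this prints 12, whereas averaging the two percentages (50 and 0) would give 25. With equal disk sizes the two approaches agree.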
$ curl -s -XGET "http://$ES_HOST:9200/_cat/nodes?h=disk.used_percent"
37.41
71.41
75.18
26.79
66.21
30.23
65.56
37.23
24.99
71.71
42.00
77.94
37.48
69.25
24.89
64.06
36.70
66.85
74.68
20.76
$ ./es_disk_available.sh
Disk space used is 51 percent
All good
The question is whether I can trust this number, since Kibana is reporting something completely different...
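One thing I noticed about the mean: it can sit below the warning threshold while individual nodes are above it. A small sketch that takes the _cat/nodes values shown above and prints the mean next to the maximum and the number of nodes over a threshold. The function name summarize_disk_percent and its threshold argument are my own invention, not anything from Elasticsearch.

```shell
#!/bin/bash
# Hypothetical helper: reads one used-percent value per line and prints
# "mean max nodes_over_threshold". The threshold defaults to 70.
summarize_disk_percent() {
    awk -v warn="${1:-70}" '
        NF {
            sum += $1; n++
            if ($1 > max) max = $1
            if ($1 > warn) over++
        }
        END {
            if (n == 0) exit 1          # no input: nothing to summarize
            printf "%d %.2f %d\n", int(sum / n), max, over
        }'
}

# The 20 values from the _cat/nodes output in this post.
summary=$(printf '%s\n' 37.41 71.41 75.18 26.79 66.21 30.23 65.56 37.23 24.99 71.71 \
    42.00 77.94 37.48 69.25 24.89 64.06 36.70 66.85 74.68 20.76 | summarize_disk_percent 70)
echo "$summary"
```

With my values this prints "51 77.94 5": the mean is 51 percent, but five nodes are already above 70 percent, so a per-node check might be worth adding next to the average.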