Get correct disk usage stats ES API vs. x-pack monitoring

A_B · June 12, 2018, 11:18am

Hello all,

I'm trying to improve my Elasticsearch monitoring and need a reliable way to query the total free disk space of the cluster (that is, the space actually available to Elasticsearch).

In Kibana > Monitoring I see

Using the Elasticsearch /_cluster/stats API endpoint I see

$ curl -s -XGET 'http://es_server:9200/_cluster/stats?human&pretty' | jq .nodes.fs
{
  "total": "3.4tb",
  "total_in_bytes": 3838908841984,
  "free": "1.7tb",
  "free_in_bytes": 1878063632384,
  "available": "1.7tb",
  "available_in_bytes": 1878063632384
}

Which shows about 50% free...
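To put a number on "about 50%", here is a minimal sketch that turns the two byte counts into a percentage. The function name pct_free is made up for this illustration, and the values are copied from the output above rather than fetched live.

```shell
#!/bin/bash
# pct_free TOTAL_BYTES FREE_BYTES
# Hypothetical helper: prints the free share as a percentage with one decimal.
pct_free() {
  awk -v total="$1" -v free="$2" 'BEGIN {
    if (total <= 0) exit 1          # avoid dividing by zero on empty input
    printf "%.1f\n", free * 100 / total
  }'
}

# total_in_bytes and free_in_bytes from the /_cluster/stats output above
pct_free 3838908841984 1878063632384
```

With the numbers from this thread it prints 48.9, so a little under half of what the API reports as total is free.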

I have 20 ES nodes running on 4 machines. On each machine there is one 894GB SSD drive for each ES node. (On each machine there is one more 894GB drive for OS, etc. which I hope to disregard in my calculations...)

This means ES has 20 x 894 GB ≈ 17.5 TB of total disk space it can use, which does not match anything I see in Kibana monitoring or from the API.

df -B 1 for one of the SSD drives shows 959727210496 bytes.
This means that the ES API total_in_bytes corresponds to exactly 4 x the df result for a single SSD.

SSD: 959727210496
20 x SSD: 19194544209920
total_in_bytes: 3838908841984
total_in_bytes/SSD: 4
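The same comparison can be redone with plain bash integer arithmetic, as a sketch using only the figures quoted in this post. The variable names are chosen for this example. The result is a factor of 4 with no remainder, which equals the number of machines (4) rather than the number of nodes (20). That may mean the endpoint counts one filesystem per host instead of one per node, but that is an assumption and not something confirmed in this thread.

```shell
#!/bin/bash
ssd_bytes=959727210496         # size of one SSD as reported by df, in bytes
total_in_bytes=3838908841984   # nodes.fs.total_in_bytes from /_cluster/stats
node_count=20                  # ES nodes in the cluster, one SSD each

# How many SSDs does the API total account for, and is it an exact multiple?
echo "SSDs counted by the API: $(( total_in_bytes / ssd_bytes ))"
echo "remainder in bytes:      $(( total_in_bytes % ssd_bytes ))"

# What the total would be if every node's SSD were counted
echo "expected for all nodes:  $(( node_count * ssd_bytes ))"
```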

I have default replica and sharding settings, so 5 shards and 1 replica.

How do you guys do it?

Cheers,
AB

A_B · June 12, 2018, 1:29pm

Playing around a bit I came to this solution for now...

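#!/bin/bash
# Averages disk.used_percent over all nodes.
# Exit codes: 0 = OK, 1 = warning, 2 = critical, 3 = unknown (bad option or no data).

set -o nounset
set -o errexit

# Default thresholds in percent; override with -w and -c.
WARN=70
CRIT=80

while getopts w:c: option; do
  case $option in
    w) WARN=$OPTARG;;
    c) CRIT=$OPTARG;;
    *) echo "Usage: $0 [-w warn_percent] [-c crit_percent]" >&2; exit 3;;
  esac
done

ES_HOST=$(hostname -s)
# One disk.used_percent value per node, one value per line.
DISK_USE=$(curl -s -XGET "http://$ES_HOST:9200/_cat/nodes?h=disk.used_percent")

USED_PERCENT=0
NODE_COUNT=0

# Sum the percentages (bc handles the decimals) and count the nodes.
for i in $DISK_USE; do
  USED_PERCENT=$(echo "$USED_PERCENT + $i" | bc)
  NODE_COUNT=$(( NODE_COUNT + 1 ))
done

# Guard against an empty response, which would otherwise divide by zero.
if [ "$NODE_COUNT" -eq 0 ]; then
  echo "No disk usage data returned from $ES_HOST" >&2
  exit 3
fi

# bc without a scale truncates to an integer, which the -gt tests below need.
DISK_USAGE=$(echo "$USED_PERCENT / $NODE_COUNT" | bc)

echo "Disk space used is $DISK_USAGE percent"

if [ "$DISK_USAGE" -gt "$CRIT" ]; then
  echo "Very bad"
  exit 2
elif [ "$DISK_USAGE" -gt "$WARN" ]; then
  echo "Pretty bad"
  exit 1
else
  echo "All good"
  exit 0
fi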

As all nodes have the same disk size this should work, right?
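The equal-size assumption is what makes a plain average of percentages valid. A minimal sketch of the size-weighted alternative follows, for the case where disks differ. The function name weighted_used_percent and the sample numbers are hypothetical. The curl line in the comment assumes the disk.total and disk.used columns and the bytes=b parameter of _cat/nodes are available in the Elasticsearch version in use.

```shell
#!/bin/bash
# weighted_used_percent: reads "total_bytes used_bytes" pairs on stdin, one
# node per line, and prints the overall used percentage with one decimal.
weighted_used_percent() {
  awk 'NF >= 2 { total += $1; used += $2 }
       END {
         if (total <= 0) exit 1      # no usable input
         printf "%.1f\n", used * 100 / total
       }'
}

# Hypothetical data: two equal disks at 25% and 75% used.
# The plain average of the percentages is 50, and so is the weighted value.
printf '%s\n' "1000 250" "1000 750" | weighted_used_percent

# Hypothetical data: unequal disks at 25% and 75% used.
# The plain average is still 50, but 2500 of 4000 units are really in use.
printf '%s\n' "1000 250" "3000 2250" | weighted_used_percent

# Against a live cluster (assumed columns, not run here):
#   curl -s "http://$ES_HOST:9200/_cat/nodes?h=disk.total,disk.used&bytes=b" \
#     | weighted_used_percent
```

The first call prints 50.0 and the second 62.5, so the two approaches only agree when all disks have the same size.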

$ curl -s -XGET "http://$ES_HOST:9200/_cat/nodes?h=disk.used_percent"
37.41
71.41
75.18
26.79
66.21
30.23
65.56
37.23
24.99
71.71
42.00
77.94
37.48
69.25
24.89
64.06
36.70
66.85
74.68
20.76
$ ./es_disk_available.sh
Disk space used is 51 percent
All good
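An average can hide individual nodes that are much fuller than the rest, so it may be worth reporting the maximum as well. The sketch below does that with awk instead of a bc loop. The function name mean_and_max is made up for this example, and the input is the per-node list shown above, pasted in rather than fetched.

```shell
#!/bin/bash
# mean_and_max: reads one used-percent value per line on stdin and prints
# the mean (one decimal) and the largest single value (two decimals).
mean_and_max() {
  awk 'NF {
         if (n == 0 || $1 > max) max = $1   # track the fullest node
         sum += $1
         n++
       }
       END {
         if (n == 0) exit 1                 # no input at all
         printf "mean=%.1f max=%.2f\n", sum / n, max
       }'
}

# The 20 per-node values from the _cat/nodes output above
printf '%s\n' 37.41 71.41 75.18 26.79 66.21 30.23 65.56 37.23 24.99 71.71 \
              42.00 77.94 37.48 69.25 24.89 64.06 36.70 66.85 74.68 20.76 \
  | mean_and_max
```

For these values the mean is about 51 percent, matching the script, while the fullest node is at 77.94 percent and already above the default WARN threshold of 70.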

The question is whether I can trust this number, as Kibana is reporting something completely different...

-AB

