Continuous Monitoring of Disk Allocation in Kibana Stack Monitoring

Hello,
we are using Elastic Cloud Enterprise, and we send all the Logs and Metrics of our deployments to a monitoring cluster that we created solely for that purpose, just as the best practices suggest.
But one feature seems to be missing...
Is it possible to continuously monitor the disk allocation in Kibana Stack Monitoring?
There are a lot of interesting charts in Kibana Stack Monitoring for a cluster node in the Overview and Advanced sections, but there is no disk allocation monitoring!
I am talking about this section of Kibana for example:


We would like to see how disk space allocation develops over time, whether there were any spikes, how our ILM policies behave, and so on.
Is it possible to monitor the disk allocation?
Thank you kindly in advance!
Regards!

What I have found out so far:
In the ECE "Monitoring" cluster, the .ds-metricbeat-* data stream is collected, and according to the index mapping the field system.filesystem.used.pct does exist... but it is not filled in! The other fields in the system.filesystem.* section are also empty across all .ds-metricbeat-* indices :(((
So disk space usage is not monitored at all in ECE? That's very impractical :(((

So all I have are these real-time views of the disk space usage, but no continuous collection of monitoring data shown in charts :frowning:

Hi Igor!

You're correct that, unfortunately, we don't have any graph for the disk usage in the Stack Monitoring product.

However, it should be easy to create one that you can put on a Dashboard for when it is needed, and since you already have Logs & Metrics collection enabled you already have all the needed data.

The monitoring data is a little special in ESS: it doesn't live in metricbeat-* or metrics-* as you would expect, but rather in .monitoring-*.
The fields you're looking for come from the node stats dataset:
elasticsearch.node.stats.fs.total.available_in_bytes and elasticsearch.node.stats.fs.total.total_in_bytes.
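To illustrate how those two fields turn into a usage figure over time, here is a minimal Python sketch. It assumes the monitoring documents have already been fetched and flattened into plain dicts. The keys `@timestamp`, `available_in_bytes` and `total_in_bytes` and the helper names are illustrative assumptions, not an official API.

```python
def used_pct(available_in_bytes: int, total_in_bytes: int) -> float:
    """Percentage of disk space in use, derived from the two node stats values."""
    if total_in_bytes <= 0:
        raise ValueError("total_in_bytes must be positive")
    return (total_in_bytes - available_in_bytes) / total_in_bytes * 100.0


def usage_series(samples):
    """Return (timestamp, used %) pairs ordered by timestamp.

    `samples` is a hypothetical list of flattened monitoring documents.
    """
    ordered = sorted(samples, key=lambda s: s["@timestamp"])
    return [
        (s["@timestamp"], round(used_pct(s["available_in_bytes"], s["total_in_bytes"]), 1))
        for s in ordered
    ]


# Made-up sample values, for illustration only.
samples = [
    {"@timestamp": "2024-01-01T00:10:00Z", "available_in_bytes": 400, "total_in_bytes": 1000},
    {"@timestamp": "2024-01-01T00:00:00Z", "available_in_bytes": 600, "total_in_bytes": 1000},
]
series = usage_series(samples)
```

Plotting such a series per node is essentially what a line chart on a dashboard would show.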

Here is a dashboard to get you started: Elasticsearch Disk usage monitoring (GitHub)

Hi @ibollman Welcome to the community.

Exactly which version are you on?

I ask because I was just building some of these visualizations and alerts using the new ES|QL.

Hi Milton!
Thank you for the Dashboard JSON! That was an awesome idea to show these metrics as a graph! :slight_smile:
Unfortunately it doesn't work: regardless of what I select in the Cluster or Node dropdowns (I could recognize the cluster item IDs), the graph always stays empty and looks like this:


But then I looked more closely at the monitoring data in Kibana Discover and found that the relevant fields (elasticsearch.node.stats.fs.total.* and elasticsearch.cluster.stats.nodes.fs.*) are mostly not filled in (see the next post for the screenshot).
In most documents in the corresponding index those fields stay empty :frowning:
I'll analyze my data a bit more, but that seems to be the issue for now.
I am using only the default configuration in ECE to ship the Logs and Metrics to my Monitoring cluster.
Regards and thanks again for your help!
Igor

This is how it looks in the Metrics View:


The fields that the dashboard needs are mostly not filled in :frowning:

Hi @stephenb,
currently we are using Elastic Cloud Enterprise 3.6.2 on-premises and Elastic Stack v8.11.4 in all of our deployments.
Regards
Igor

Hm, that might be my bad. You probably need to filter by the right data set, which would be elasticsearch.node.stats (off the top of my head)!

@miltonhultgren

Yes, now it works! Thank you!

es_disk_usage_v2.ndjson

@ibollman Glad you got it working

Just for grins, if you want, here is the ES|QL I just wrote for the same thing.

This does not create the line graph, but it is a good, easy way to see results and create an alert.

Create a data view for .monitoring-*

Then take a look at ES|QL; it is in Discover.

This query shows the disk usage and concatenates the node roles:

FROM .monitoring-es-* |
WHERE node_stats.fs.summary.total.bytes IS NOT null | 
EVAL role = MV_CONCAT(elasticsearch.node.roles,"-") |
EVAL name_role = CONCAT(elasticsearch.node.name,"-",role) |
STATS avail_disk = AVG(node_stats.fs.summary.available.bytes), tot_disk = AVG(node_stats.fs.summary.total.bytes) by name_role | 
EVAL used_pct = (tot_disk - avail_disk) / tot_disk |
DROP avail_disk,tot_disk 

and the alert can be...

FROM .monitoring-es-* |
WHERE node_stats.fs.summary.total.bytes IS NOT null | 
EVAL role = MV_CONCAT(elasticsearch.node.roles,"-") |
EVAL name_role = CONCAT(elasticsearch.node.name,"-",role) |
STATS avail_disk = AVG(node_stats.fs.summary.available.bytes), tot_disk = AVG(node_stats.fs.summary.total.bytes) by name_role | 
EVAL used_pct = (tot_disk - avail_disk) / tot_disk |
DROP avail_disk,tot_disk | 
WHERE used_pct > 0.7

Here is a version with just the node name that keeps the other stats:

FROM .monitoring-es-* |
WHERE node_stats.fs.summary.total.bytes IS NOT null | 
STATS avail_disk = AVG(node_stats.fs.summary.available.bytes), tot_disk = AVG(node_stats.fs.summary.total.bytes) by elasticsearch.node.name | 
EVAL used_pct = (tot_disk - avail_disk) / tot_disk |
WHERE used_pct > 0.7
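For readers who want to check the arithmetic of these queries outside Kibana, the same steps (skip documents without a total, average per node, derive the used fraction, keep nodes above the threshold) can be sketched in Python. The flattened keys `node_name`, `available_bytes` and `total_bytes` and the function name are illustrative assumptions. The sketch works on documents already in memory and does not query Elasticsearch.

```python
from collections import defaultdict
from statistics import mean


def nodes_over_threshold(docs, threshold=0.7):
    """Return {node name: used fraction} for nodes above `threshold`."""
    per_node = defaultdict(lambda: {"avail": [], "total": []})
    for doc in docs:
        total = doc.get("total_bytes")
        if total is None:  # mirrors: WHERE ... total.bytes IS NOT null
            continue
        per_node[doc["node_name"]]["avail"].append(doc["available_bytes"])
        per_node[doc["node_name"]]["total"].append(total)

    result = {}
    for name, values in per_node.items():
        avail_disk = mean(values["avail"])  # mirrors: STATS avail_disk = AVG(...)
        tot_disk = mean(values["total"])    # mirrors: STATS tot_disk = AVG(...)
        used_pct = (tot_disk - avail_disk) / tot_disk
        if used_pct > threshold:            # mirrors: WHERE used_pct > 0.7
            result[name] = used_pct
    return result


# Made-up documents, for illustration only.
docs = [
    {"node_name": "node-a", "available_bytes": 20, "total_bytes": 100},
    {"node_name": "node-a", "available_bytes": 30, "total_bytes": 100},
    {"node_name": "node-b", "available_bytes": 50, "total_bytes": 100},
    {"node_name": "node-c", "total_bytes": None},
]
alerts = nodes_over_threshold(docs)
```

Note that used_pct is a fraction between 0 and 1 here, as in the queries, so 0.7 means 70 % used.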

Thank you for the help again.

One other question about dashboards: is there an index field for monitoring the JVM memory pressure that is visible in the ECE UI on the Elasticsearch nodes?

As far as I understand, it's not the JVM heap usage, right?

According to the linked documentation, "The indicator uses the fill percentage of the old generation pool".

I have looked for "*old*" across the fields in the Monitoring Data View and found these fields: elasticsearch.node.stats.jvm.mem.pools.old.*

But they are all empty and not filled in.

Is the JVM memory pressure recorded somewhere?

Thank you kindly in advance!

Hi @miltonhultgren ,

I have found what I was looking for in my previous post. I can execute this query directly on a deployment:

GET /_nodes/stats?filter_path=nodes.*.name,nodes.*.jvm.mem.pools.old

From the response, JVM memory pressure in % = used_in_bytes / max_in_bytes * 100
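As a rough sketch of that formula, the following Python function applies it to a response that has already been parsed into a dict. It assumes the filtered response keeps the nodes.<id>.name and nodes.<id>.jvm.mem.pools.old structure requested by the filter_path above. The function name and the sample values are made up for illustration.

```python
def jvm_memory_pressure(nodes_stats: dict) -> dict:
    """Map node name to old-generation pool fill percentage."""
    pressure = {}
    for node in nodes_stats.get("nodes", {}).values():
        old = node["jvm"]["mem"]["pools"]["old"]
        max_bytes = old["max_in_bytes"]
        if max_bytes > 0:  # skip nodes that report no usable maximum
            pressure[node["name"]] = old["used_in_bytes"] / max_bytes * 100
    return pressure


# Made-up response body, for illustration only.
response = {
    "nodes": {
        "abc123": {
            "name": "instance-0000000001",
            "jvm": {"mem": {"pools": {"old": {"used_in_bytes": 750, "max_in_bytes": 1000}}}},
        }
    }
}
pressure = jvm_memory_pressure(response)
```

Running this on a schedule and indexing the result would be one way to build a history, though that is an assumption about a possible workaround and not something the thread confirms.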

But the node stats API seems to return real-time data only, not data stored in an index... :frowning:

Is there a way to get this data into the monitoring cluster so that it shows up in the Monitoring data view?

Thank you kindly in advance!