Metricbeat - is it accurate?

I want to replace all my home-grown scripts and Nagios plugins with the marvelous Metricbeat, but I don't trust the Linux memory figures, i.e. I rely on available memory to alert when a system is about to swap.

For older kernels I use the values discussed here:

For newer ones I lazily run:

$ cat /proc/meminfo | grep Avail
MemAvailable: 1996488 kB

but on the same system at the same time Metricbeat shows the below. Can I trust the beat? :wink:

"system": {
"memory": {
"actual": {
"free": 619864064,
"used": {
"bytes": 3355496448,
"pct": 0.8441

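To put the two readings side by side, here is a minimal sketch that converts them to the same unit. The numbers are copied from the outputs above; the variable names are made up for illustration, and it assumes `used.pct` is `used.bytes` divided by `free + used.bytes`, which is consistent with the 0.8441 shown.

```python
KIB = 1024  # /proc/meminfo labels values "kB" but they are units of 1024 bytes

mem_available_kb = 1996488      # MemAvailable from /proc/meminfo above
actual_free_bytes = 619864064   # system.memory.actual.free from Metricbeat
actual_used_bytes = 3355496448  # system.memory.actual.used.bytes from Metricbeat

# Convert the kernel figure to bytes so it can be compared directly.
mem_available_bytes = mem_available_kb * KIB

# Assumption: the percentage is used / (free + used).
total_bytes = actual_free_bytes + actual_used_bytes
used_pct = round(actual_used_bytes / total_bytes, 4)

# How much more memory the kernel considers available than Metricbeat reports.
gap_bytes = mem_available_bytes - actual_free_bytes

print(f"MemAvailable:        {mem_available_bytes} bytes")
print(f"actual.free:         {actual_free_bytes} bytes")
print(f"gap:                 {gap_bytes} bytes")
print(f"recomputed used.pct: {used_pct}")
```

On these numbers the kernel's estimate is well over a gigabyte higher than what the beat reports as free.
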
From looking at the library used by Metricbeat, the `actual.free` value reported is

`MemFree + Buffers + Cached`


The values reported on other operating systems take a similar approach.

Based on that commit message, it sounds like Metricbeat could report a more accurate estimate if it used MemAvailable. For older kernels it would need to do the calculation on its own, like you do with your Nagios plugin. Would you be interested in contributing the improvement to the gosigar library?

It is wrong because Cached includes memory that is not freeable as page
cache, for example shared memory segments, tmpfs, and ramfs, and it does
not include reclaimable slab memory, which can take up a large fraction
of system memory on mostly idle systems with lots of files.
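
For anyone wanting to experiment, here is a rough Python sketch of the "use MemAvailable when present, otherwise calculate" idea. It is not gosigar's code and not the kernel's algorithm: the function names are hypothetical, and the fallback simply takes `MemFree + Buffers + Cached` and adjusts it along the lines of the quote above (subtract `Shmem`, add `SReclaimable`). The kernel's own MemAvailable estimate is more careful than that, so treat the fallback as an approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def estimate_available_kb(info):
    """Return (estimate in kB, method name).

    Prefers the kernel's MemAvailable. The fallback is an assumption for
    illustration only, not what the kernel or gosigar computes.
    """
    if "MemAvailable" in info:
        return info["MemAvailable"], "MemAvailable"
    naive = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    # Shmem (shared memory, tmpfs) is counted in Cached but is not freeable;
    # SReclaimable is slab memory that can be reclaimed but is not in Cached.
    adjusted = naive - info.get("Shmem", 0) + info.get("SReclaimable", 0)
    return adjusted, "fallback"


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        kb, method = estimate_available_kb(parse_meminfo(f.read()))
    print(f"{kb} kB available (via {method})")
```

On a kernel that exposes MemAvailable this just echoes the kernel's figure; the fallback path only matters on older kernels.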

Ay Up Andrew, thanks for your reply. I've seen 5 different fancy monitoring products show 5 different values for this. Servers start swapping and no one understands why. The truth is out there (or in the kernel) ...

At the moment I'm in the middle of pumping Beats and Nagios perf data into your cloud service in some kind of crazy monitoring face-off; Beats wins hands down in most cases up to yet. If I get some time I'll consider contributing in the future.

Cheers. Simon
