Hello,
I want to replace all my home-grown and Nagios plugins with the marvelous Metricbeat, but I don't trust the Linux memory figures, i.e. I rely on available memory to alert when a system is about to start swapping.
For older kernels I use the values discussed here:
The values reported on other operating systems take a similar approach.
Based on that commit message it sounds like Metricbeat could report a more accurate estimate if it used MemAvailable. For older kernels it would need to do the calculation on its own, like you do with your Nagios plugin. Would you be interested in contributing the improvement to the gosigar library?
It is wrong because Cached includes memory that is not freeable as page
cache, for example shared memory segments, tmpfs, and ramfs, and it does
not include reclaimable slab memory, which can take up a large fraction
of system memory on mostly idle systems with lots of files.
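To make the point above concrete, here is a minimal sketch of the idea in Python: prefer MemAvailable when the kernel exposes it, and otherwise fall back to a rough estimate built from fields that avoid the problems with Cached. This is not what Metricbeat or gosigar actually does. The function names are made up for illustration, and the fallback formula (MemFree + Active(file) + Inactive(file) + SReclaimable) is a simplifying assumption that ignores the kernel's low watermarks, so it will tend to overestimate slightly.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def estimate_available_kb(meminfo):
    """Return available memory in kB.

    Uses MemAvailable when present. Otherwise falls back to a rough
    estimate (an assumption of this sketch, not the kernel's exact
    algorithm): free memory plus file-backed page cache plus
    reclaimable slab, with no watermark correction.
    """
    if "MemAvailable" in meminfo:
        return meminfo["MemAvailable"]
    return (
        meminfo.get("MemFree", 0)
        + meminfo.get("Active(file)", 0)
        + meminfo.get("Inactive(file)", 0)
        + meminfo.get("SReclaimable", 0)
    )


if __name__ == "__main__":
    # Only works on Linux, where /proc/meminfo exists.
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(estimate_available_kb(info), "kB available (estimate)")
```

Note that the fallback deliberately does not use Cached, since that figure counts shared memory and tmpfs, which cannot simply be dropped under memory pressure.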
Ay Up Andrew, thanks for your reply. I've seen 5 different fancy monitoring products show 5 different values for this. Servers start swapping and no one understands why. The truth is out there (or in the kernel) ...
At the moment I'm in the middle of pumping Beats and Nagios perf data into your cloud service in some kind of crazy monitoring face-off; Beats wins hands down in most cases so far. If I get some time I'll consider contributing in the future.