All numbers reported by Metricbeat too high (bug?)

pkrueger · July 21, 2016, 12:53pm

I was wondering why the numbers reported for free disk space were way too high, so I investigated. I switched on debug logging and found this in the log:

2016-07-21T14:33:34+02:00 DBG  Publish: {
  "@timestamp": "2016-07-21T12:33:34.783Z",
  "beat": {
    "hostname": "CCT-WEB008-C7V",
    "name": "CCT-WEB008-C7V"
  },
  "metricset": {
    "module": "system",
    "name": "fsstat",
    "rtt": 379
  },
  "system": {
    "fsstat": {
      "count": 32,
      "total_files": 1410425,
      "total_size": {
        "free": 23922294784,
        "total": 35769786368,
        "used": 11847491584
      }
    }
  },
  "type": "metricsets"
}

Compare that with the correct results obtained with df (German locale; the columns are Filesystem, Size, Used, Avail, Use%, Mounted on):

# df -h
Dateisystem    Größe Benutzt Verf. Verw% Eingehängt auf
/dev/sda3        16G    5,5G  8,4G   40% /
devtmpfs        483M       0  483M    0% /dev
tmpfs           492M       0  492M    0% /dev/shm
tmpfs           492M     19M  473M    4% /run
tmpfs           492M       0  492M    0% /sys/fs/cgroup
/dev/sda2       494M    210M  285M   43% /boot
/dev/sda1       200M    9,5M  191M    5% /boot/efi
tmpfs            99M       0   99M    0% /run/user/0

Compare the ~8.4 GB of free disk space shown by df with the reported free value of 23922294784 bytes, and the ~16 GB total shown by df with the reported total of 35769786368 bytes.
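
To put the raw byte counts on the same scale as the df -h output, which uses powers of 1024, a quick conversion helps. This is a minimal sketch; the helper name to_gib is made up for illustration, and the values are copied from the fsstat event above.

```python
def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes), the unit df -h prints as 'G'."""
    return n_bytes / 2**30

# Values copied from the fsstat event in the debug log.
fsstat = {"free": 23922294784, "total": 35769786368, "used": 11847491584}

for name, value in fsstat.items():
    print(f"{name}: {to_gib(value):.1f} GiB")

# Sanity check: the three numbers are at least consistent with each other.
print("used + free == total:", fsstat["used"] + fsstat["free"] == fsstat["total"])
```

This works out to roughly 22.3 GiB free, 33.3 GiB total and 11.0 GiB used, so the mismatch with df is not a units problem.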

I'm running the 5.0.0-alpha4 versions of ELK+beats.

Any ideas what could be wrong?

andrewkroh · July 21, 2016, 4:58pm

I did a comparison of the values reported and did not find any differences. Here's the data I collected: https://docs.google.com/spreadsheets/d/1VOfpgccSSxbnbWtUlvjotA-2le6fLPhDM1C97NU-6c4/pubhtml

It looks like you are comparing Verf. (df's "Avail" column) to system.fsstat.total_size.free, which are not the same. Metricbeat's fsstat metricset does not report available space, but I think it should (care to file an issue or open a PR to fix it?). If you sum the system.filesystem.avail values, they will equal 8.4G.

See http://linux.die.net/man/2/statfs for a low level answer to the difference between "free" and "available".
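
For a quick look at the two numbers on a given machine, Python's os.statvfs exposes the same fields as the statfs/statvfs system calls. The sketch below is only an illustration (the function name free_vs_available is made up): f_bfree counts all free blocks, while f_bavail counts only the free blocks an unprivileged user may use, which is what df shows as available.

```python
import os

def free_vs_available(path: str = "/") -> dict:
    """Return free and available bytes for the filesystem containing path."""
    st = os.statvfs(path)
    free = st.f_bfree * st.f_frsize        # all free blocks, including reserved ones
    available = st.f_bavail * st.f_frsize  # free blocks usable by unprivileged users
    return {"free": free, "available": available, "reserved": free - available}

if __name__ == "__main__":
    for key, value in free_vs_available("/").items():
        print(f"{key}: {value / 2**30:.2f} GiB")
```

On filesystems that reserve blocks for root (ext4 does by default), "free" comes out larger than "available".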

pkrueger · July 22, 2016, 9:45am

I will look into a PR if I have time. Any idea about the "total" and "used" sizes, which seem wrong as well? The VM's disk is only 16 GB in total, so how can fsstat report 35 GB total and 11 GB used?

andrewkroh · July 22, 2016, 12:13pm

Try doing the same analysis that I did; I expect you'll be better able to determine the source of the error or bug. Compare the output of df --total against one set of samples from the fsstat and filesystem metricsets.
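
As a rough illustration of that comparison, the Size column of the df -h output in the first post can be summed and set against system.fsstat.total_size.total from the debug log. This is only a sketch: df -h rounds its figures, the helper name parse_df_size is made up, and it assumes the decimal comma used in that German-locale output.

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

def parse_df_size(text: str) -> float:
    """Parse a df -h size such as '16G' or '5,5G' into bytes."""
    if text == "0":
        return 0.0
    number, unit = text[:-1], text[-1]
    return float(number.replace(",", ".")) * UNITS[unit]

# Size column copied from the df -h output in the first post.
df_sizes = ["16G", "483M", "492M", "492M", "492M", "494M", "200M", "99M"]
df_total = sum(parse_df_size(s) for s in df_sizes)

# system.fsstat.total_size.total from the debug log in the first post.
fsstat_total = 35769786368

print(f"df total:     {df_total / 2**30:.1f} GiB")
print(f"fsstat total: {fsstat_total / 2**30:.1f} GiB")
print(f"difference:   {(fsstat_total - df_total) / 2**30:.1f} GiB")
```

With those inputs df adds up to about 18.7 GiB against 33.3 GiB from fsstat, a gap of about 14.6 GiB that df does not account for. Running the same sum over the per-filesystem events from the filesystem metricset would show which entries make up that gap.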

What OS is this?

pkrueger · July 22, 2016, 3:15pm

OK, these are the results from my little analysis. It seems Metricbeat is counting a number of spurious filesystems...

Spreadsheet of results

I'm on CentOS Linux release 7.2.1511 (Core)

andrewkroh · July 25, 2016, 4:29pm

Thanks for the information. Looks like the mount table contains entries for rootfs and /dev/sda3 that are identical. It seems that df ignores the rootfs entry.

Where is the rootfs device coming from? Is that LVM? Why does df ignore that device? At this point I have more questions than answers. I need to do some reading.
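
To picture the double counting: if the mount table lists the same filesystem twice (here rootfs and /dev/sda3, both mounted at /), a naive sum counts its size twice, while keeping one entry per mount point does not. The sketch below uses hypothetical entries with illustrative round sizes, and the names Mount, naive_total and deduplicated_total are made up; it is not a description of what Metricbeat or df actually do.

```python
from typing import NamedTuple

class Mount(NamedTuple):
    device: str
    mount_point: str
    total_bytes: int

def naive_total(mounts):
    """Sum every mount table entry, duplicates included."""
    return sum(m.total_bytes for m in mounts)

def deduplicated_total(mounts):
    """Sum sizes keeping only the last entry seen for each mount point."""
    by_mount_point = {}
    for m in mounts:
        by_mount_point[m.mount_point] = m  # a later entry replaces an earlier one
    return sum(m.total_bytes for m in by_mount_point.values())

# Hypothetical mount table; sizes are illustrative, not measurements.
mounts = [
    Mount("rootfs", "/", 16 * 2**30),
    Mount("/dev/sda3", "/", 16 * 2**30),
    Mount("/dev/sda2", "/boot", 494 * 2**20),
]

print(f"naive:        {naive_total(mounts) / 2**30:.1f} GiB")
print(f"deduplicated: {deduplicated_total(mounts) / 2**30:.1f} GiB")
```

Keying on the mount point is just one possible choice; comparing device IDs would be another way to spot entries that refer to the same filesystem.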

monica · October 12, 2016, 8:19pm

I did a bit of research, and it looks like df doesn't always list all the mounted file systems. Here is an example where df doesn't list all NFS-mounted file systems; they only show up with df -a.
