Filesystem stats not including xfs device

I have two devices I expect to see logs for: "xvda" and "nvme0n1". The second doesn't appear in Kibana. It is an xfs filesystem. I do see one error showing up repeatedly in the Metricbeat logs, but it's hard to tell whether it's related. The error is:

ERROR elasticsearch/client.go:374 Failed to encode event: unsupported float value: NaN

Output from "lsblk"
xvda 202:0 0 50G 0 disk
nvme0n1 259:0 0 1.7T 0 disk /data

Output from "cat /proc/filesystems"

nodev sysfs
nodev rootfs
nodev ramfs
nodev bdev
nodev proc
nodev cpuset
nodev cgroup
nodev cgroup2
nodev tmpfs
nodev devtmpfs
nodev debugfs
nodev tracefs
nodev securityfs
nodev sockfs
nodev bpf
nodev pipefs
nodev hugetlbfs
nodev devpts
nodev autofs
nodev pstore
nodev mqueue
nodev selinuxfs
nodev xenfs
nodev overlay
nodev binfmt_misc

Output from "cat /etc/fstab"
/dev/nvme0n1 /data xfs defaults,noatime,nodiratime 0 0

Output from "df -h"
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1 1.8T 1.7T 114G 94% /data

What am I doing wrong?

Hi @Eric_Herbrandson and welcome to discuss :slight_smile:

I tried to reproduce this issue with an xfs device but I couldn't, the device was properly reported.

Could you send the output of stat -f /dev/nvme0n1?

This error could be caused by other metricsets, could you also try to disable all metricsets except filesystem to check if this is the module generating this error?
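To isolate the filesystem metricset, a minimal module configuration could look like the sketch below. It assumes the system module is configured through a `system.yml` file in a modules directory. The `CONF_DIR` variable, the temporary directory it defaults to, and the `10s` period are illustrative assumptions, not taken from your setup.

```shell
#!/bin/sh
# Sketch: write a system module config that enables only the filesystem metricset.
# CONF_DIR is a hypothetical location; point it at your real modules.d directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Only "filesystem" is listed, so every other system metricset stays disabled.
cat > "$CONF_DIR/system.yml" <<'EOF'
- module: system
  period: 10s
  metricsets:
    - filesystem
EOF

echo "wrote $CONF_DIR/system.yml"
```

If the NaN error disappears with only this metricset enabled, another metricset is producing it.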


Thanks so much for your reply @jsoriano! Your response got me thinking, and this issue is almost certainly because Metricbeat is running inside a Docker container. I can't believe I forgot to mention that in my original post :frowning:

What's the best way to resolve that? Do I just need to make sure some path on that device is mounted into the container?

There are two parts here, I think. First, if you run Metricbeat inside a container, you need to mount the device into that container.

Second, the error you see should not happen in any case; it means we hit some edge case. Any chance you could provide the output from the command @jsoriano posted above?

Here is the output:

File: "/dev/nvme0n1"
ID: 0 Namelen: 255 Type: tmpfs
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 7853870 Free: 7853870 Available: 7853870
Inodes: Total: 7853870 Free: 7853537

However, all of the commands and output I've listed are from the host, not the container. I assume you want them from the container instead? However, stat -f /dev/nvme0n1 from inside the container errors with "stat: cannot read file system information for '/dev/nvme0n1': No such file or directory"

@Eric_Herbrandson it should be possible to access the devices if you mount /dev as a volume in /hostfs/dev and you start metricbeat with --system.hostfs=/hostfs. If this doesn't work we might need to take a deeper look.

Another option is to use the --device docker run flag, for example --device /dev/nvme0n1 would expose this device in the container.
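As a rough sketch, the two options could look like this. The image tag and the `/proc` and cgroup volumes are assumptions based on a typical `--system.hostfs` setup, not taken from your command line; keep whatever other flags you already pass. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: build a docker run command that exposes the host's /dev to Metricbeat.
# IMAGE is an assumption; substitute the tag you actually run.
IMAGE="${IMAGE:-docker.elastic.co/beats/metricbeat:6.7.1}"

# Option 1: mount the host's /dev under /hostfs/dev and point Metricbeat at /hostfs.
# The /proc and cgroup volumes are assumed companions of --system.hostfs.
set -- docker run \
  --volume=/proc:/hostfs/proc:ro \
  --volume=/sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro \
  --volume=/dev:/hostfs/dev:ro \
  "$IMAGE" -e --system.hostfs=/hostfs

# Option 2: expose only the one device instead of all of /dev.
ALT="docker run --device /dev/nvme0n1 $IMAGE -e"

if [ "${RUN:-0}" = "1" ]; then
  "$@"                      # execute option 1
else
  echo "option 1: $*"       # dry run: print both variants
  echo "option 2: $ALT"
fi
```

With option 1, the device should then be visible inside the container as /hostfs/dev/nvme0n1.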

@jsoriano That did the trick! Though, I did also need to add the -e argument before it started working. Thanks so much for the help!

I am still getting the error "elasticsearch/client.go:374 Failed to encode event: unsupported float value: NaN". Would you like to try to track that down? If so, would you like to continue discussing in this thread or would you prefer we start a separate thread for that?

Good to read that it is working for you now! For the record, what did the trick? Mounting /dev under /hostfs, or the --device flag?

The -e argument only changes the logging configuration so that logs are written to stderr. It shouldn't affect anything else.

Are you still missing events? To identify which metricset is creating the incorrect events, it would help to enable them one by one and isolate the problem.

And what version of metricbeat are you using?

Mounting /dev did it.

I don't know that we're missing events, but I guess it could be possible. I've narrowed it down to the diskio metricset. When I remove that, the errors go away.

A fix for the diskio metricset was recently merged:

This will be available in 6.7.2 and 7.0.1.

Great. Thank you so much for all the help!
