Elastic Agent fails to read io.stat file with more than one $MAJ:$MIN device number

We have a cluster with multiple VM nodes and one physical node, call it "PH". The physical node has network storage attached with the multipath option enabled. When the elastic-agent (8.7.0) DaemonSet is deployed in this cluster, the agent running on the "PH" node starts printing error messages:

{"log.level":"error","@timestamp":"2023-04-12T08:48:08.152Z","message":"error getting cgroup stats for V2: error fetching stats for controller io: error fetching IO stats: error getting io.stats for path /hostfs/sys/fs/cgroup: error scanning file: /hostfs/sys/fs/cgroup/io.stat: input does not match format","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"kubernetes/metrics-default","type":"kubernetes/metrics"},"log":{"source":"kubernetes/metrics-default"},"log.logger":"metrics","log.origin":{"file.line":234,"file.name":"report/report.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}

On the VM nodes, elastic-agent can read io.stat:

# cat /hostfs/sys/fs/cgroup/io.stat
7:8 rbytes=14336 wbytes=0 rios=11 wios=0 dbytes=0 dios=0
253:0 rbytes=966835712 wbytes=15229238784 rios=22127 wios=1035229 dbytes=0 dios=0
8:0 rbytes=1060506112 wbytes=15229890048 rios=23426 wios=900790 dbytes=0 dios=0
11:0 rbytes=2048 wbytes=0 rios=10 wios=0 dbytes=0 dios=0
7:7 rbytes=89740288 wbytes=0 rios=2610 wios=0 dbytes=0 dios=0
7:6 rbytes=484352 wbytes=0 rios=42 wios=0 dbytes=0 dios=0
7:5 rbytes=1259520 wbytes=0 rios=68 wios=0 dbytes=0 dios=0
7:4 rbytes=1224704 wbytes=0 rios=54 wios=0 dbytes=0 dios=0
7:3 rbytes=2428928 wbytes=0 rios=244 wios=0 dbytes=0 dios=0
7:2 rbytes=485376 wbytes=0 rios=44 wios=0 dbytes=0 dios=0
7:1 rbytes=483328 wbytes=0 rios=42 wios=0 dbytes=0 dios=0
7:0 rbytes=499712 wbytes=0 rios=49 wios=0 dbytes=0 dios=0

On the "PH" node, elastic-agent fails to read io.stat. I assume this is due to the two MAJ:MIN numbers on line 4 of the output:

# cat /hostfs/sys/fs/cgroup/io.stat
7:8 rbytes=14336 wbytes=0 rios=11 wios=0 dbytes=0 dios=0
253:0 rbytes=1634749952 wbytes=5449465344 rios=38749 wios=399082 dbytes=0 dios=0
9:126 rbytes=16306176 wbytes=0 rios=724 wios=0 dbytes=0 dios=0
9:127 8:16 rbytes=18655744 wbytes=0 rios=967 wios=0 dbytes=0 dios=0
8:0 rbytes=1650753024 wbytes=5449609728 rios=39161 wios=373122 dbytes=0 dios=0
11:0 rbytes=0 wbytes=0 rios=3 wios=0 dbytes=0 dios=0
7:7 rbytes=92527616 wbytes=0 rios=2418 wios=0 dbytes=0 dios=0
7:6 rbytes=484352 wbytes=0 rios=42 wios=0 dbytes=0 dios=0
7:5 rbytes=1259520 wbytes=0 rios=68 wios=0 dbytes=0 dios=0
7:4 rbytes=1225728 wbytes=0 rios=54 wios=0 dbytes=0 dios=0
7:3 rbytes=2302976 wbytes=0 rios=238 wios=0 dbytes=0 dios=0
7:2 rbytes=483328 wbytes=0 rios=42 wios=0 dbytes=0 dios=0
7:1 rbytes=485376 wbytes=0 rios=44 wios=0 dbytes=0 dios=0
7:0 rbytes=499712 wbytes=0 rios=49 wios=0 dbytes=0 dios=0

I assume the issue is caused by the multipath option on the attached storage. If my assumption is correct, could you please enhance metricbeat to accept multiple MAJ:MIN device numbers on a single line of io.stat?
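
To illustrate the kind of tolerance I am asking for, here is a minimal sketch in Python. It is not metricbeat's code (metricbeat is written in Go), and the names `parse_io_stat_line` and `DEVICE_RE` are made up for this example. It assumes every token on a line is either a MAJ:MIN device number or a key=value counter, and it accepts any number of leading device numbers instead of exactly one. How the counters should be attributed when a line carries several device numbers is something I do not know, so the sketch simply returns all of them together.

```python
import re

# Hypothetical helper: a token that looks like a MAJ:MIN device number.
DEVICE_RE = re.compile(r"^\d+:\d+$")


def parse_io_stat_line(line):
    """Split one io.stat line into (device_ids, counters).

    Accepts one or more MAJ:MIN tokens followed by key=value counters,
    rather than assuming a fixed "MAJ:MIN key=value ..." layout.
    """
    devices = []
    counters = {}
    for token in line.split():
        if "=" in token:
            # Counter such as rbytes=18655744
            key, _, value = token.partition("=")
            counters[key] = int(value)
        elif DEVICE_RE.match(token):
            # Device number such as 9:127
            devices.append(token)
        else:
            raise ValueError(f"unexpected token in io.stat line: {token!r}")
    if not devices:
        raise ValueError(f"no MAJ:MIN device number in line: {line!r}")
    return devices, counters


if __name__ == "__main__":
    # The line from the "PH" node that carries two device numbers.
    line = "9:127 8:16 rbytes=18655744 wbytes=0 rios=967 wios=0 dbytes=0 dios=0"
    devices, counters = parse_io_stat_line(line)
    print(devices)
    print(counters)
```

With a parser along these lines, the line from the "PH" node would no longer be rejected outright; the caller could then decide whether to report the counters for one of the listed devices or for all of them.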
