I can't get system.cpu.total.pct metrics from my metricbeats

shell · July 26, 2021, 3:32pm

I'm getting into ELK stack and now I can't figure out how to get and visualize in Kibana metrics of CPU load (in any form, but I suppose the best choice is system.cpu.total.pct ) from my metricbeats. I can do this for memory usage, so I'm kinda getting the idea and the connections beat => ES => Kibana is OK. From what I can see those cpu metrics are not even sent by metricbeat.

Here is my current configuration of metricbeat's system module that I've done basing on the official documentation and issues found on StackOverflow etc. :

  4 - module: system
  5   period: 10s
  6   metricsets:
  7     - cpu
  8     - load
  9     - memory
 10     - network
 11     - process
 12     - process_summary
 13     - socket_summary
 14     #- entropy
 15     #- core
 16     #- diskio
 17     #- socket
 18     #- service
 19     #- users
 20   process.include_top_n:
 21     by_cpu: 5      # include top 5 processes by CPU
 22     by_memory: 5   # include top 5 processes by memory
 23   # Configure the mount point of the host’s filesystem for use in monitoring a host from within a container
 24   #system.hostfs: "/hostfs"
 25   cpu.metrics:  ["percentages","normalized_percentages"]
 26   #core.metrics: ["percentages"]

What am I doing wrong and where could I find some more good references concerning system monitoring with ELK stack please ?

stephenb · July 26, 2021, 5:14pm

Hi @shell Welcome to the Community

So when you say you can't get... Did you try a simple Lens?

If you just go to Lens pick metricbeat-* and drag system.cpu.total.pct and system.load.1 into the middle of the graph area? Then you can break down by host.

shell · July 26, 2021, 6:17pm

Hello Stephen,
thanks for your feedback.

This is actually what I'm trying to do, but for system.cpu.total.pct I have no information obtained from my metricbeats and for system.load.1 it doesn't seem to be accurate and tends to stick to round numbers as 300%, 250% etc (cf. screenshots).

This is why I suppose it is something related to the beat configuration and it must be a really silly thing that I'm missing out, but I can't figure out what it is.

Will appreciate any help

PS. greyed out hostnames as these are internal servers.

stephenb · July 26, 2021, 6:59pm

First I would not use Median I would use Average sorry I used Median... I wasn't paying attention.

Load is not a Percent it is a Float number so Load of 300% is actually load of 3 .... 2.5, 3, 10 are very normal load numbers... so that looks like it is working.

Try Average / Max on the CPU lets see what is there....

You can also go to discover and see if you see the cpu metricset and see that the values are.

Assuming your config is the same as above it looks good.

What kind of host is this running on? Linux, Docker,...?

What Version Beats / Stack? DOH I see 7.13.4 Latest .. Nice!

shell · July 26, 2021, 7:26pm

Sure, I've tried Average also and I'm aware of different types (Percent, Float etc.). However, the total absence of not round values like 2.41 etc (I have the format set to 2 decimals) seems to be incorrect if we take into account that only values of type 2.00, 2.50 or 3.00 are present.

The thing is that I wanted to use total load or total cpu used normalized per core in the range 1-100% to have a simple look on how heavily my servers are loaded, but I have a lack of entry data.

The main problem is that I don't see any entries from my metricbeat for cpu.used when at the same time I see other metrics so I'm sure the beat is working correctly. When I use Kibana in Discover mode and try to see top 5 values for system.cpu.total.pct - it's 0 in 100%. It's true for almost all metrics except system.process.cpu.total.value (system.process.cpu.total.pct is also 0). (cf. screenshots)

Any permissions issue may be ?

I'm using CentOS 7, all elements of ELK stack installed from tarballs.

stephenb · July 26, 2021, 7:37pm

Yup... Something is not right... this should all be OOTB straight forward.

Is the system.yml above still exactly what you are running?

Did you make any other changes to the metricbeat config?

I assume the user you are running as has access to "Top" etc normal commands (even though metricbeat uses underlying calls though)

What user are you running metricbeat as?... normal use should be fine unless your system is really locked down... its possible.

Have you stopped and restarted metricbeat?

have you looked for any error in the metricbeat logs?

You could try installing the package you will need sudo for that.

shell · July 26, 2021, 8:34pm

Here is my entire system.yml :

# Module: system
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/7.13/metricbeat-module-system.html

- module: system
  period: 10s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - socket_summary
    #- entropy
    - core
    #- diskio
    #- socket
    #- service
    #- users
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory
  # Configure the mount point of the host’s filesystem for use in monitoring a host from within a container
  #system.hostfs: "/hostfs"
  cpu.metrics:  [percentages, normalized_percentages]
  #core.metrics: ["percentages"]

- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'

- module: system
  period: 15m
  metricsets:
    - uptime

#- module: system
#  period: 5m
#  metricsets:
#    - raid
#  raid.mount_point: '/'

And metricbeat.yml (the part that concerns modules)

metricbeat.config.modules:
  # Glob pattern for configuration loading
  path: /opt/elastic/metricbeat/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: true

  # Period on which files under path should be checked for changes
  reload.period: 10s

My current user can use top and other commands, I also have root access on the server and created the user myself. I suppose I have no specific restrictions, but I tried to run the beat as root just to check whether it will get any more information about CPU load - nothing. Restarted metricbeat multiple times before and right now and I don't see any errors in logs since the last restart.

What do you mean exactly by "You could try installing the package you will need sudo for that." ?
Yes, I will need sudo for this and I can do this with sudo.

I will probably reinstall it all from scratch just to see if I did anything wrong, but if you have any other clue I would appreciate it

Thank you !

stephenb · July 26, 2021, 9:00pm

The config looks pretty good core is going to give you a ton of stuff (assuming it works)
I like to enable diskio I find it very usefull

I mean if you install as a package you need root / sudo to install a package for that

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.13.4-amd64.deb
sudo dpkg -i metricbeat-7.13.4-amd64.deb

Current install, You can start this way an it will do deep logging perhaps you will see something.

metricbeat -e -d "*"

Did you run setup as shown here?

You would run for the tar.gz

./metricbeat setup -e

No I do not have any quick suggestions why it is not working... this is normally 2 min setup.

shell · July 27, 2021, 12:31pm

Hello Stephen,

I have reinstalled once again and now I can see all CPU metrics, so for sure there must have been a mistake during configuration. BTW now I've done this omitting logstash as initially there also was a logstash instance and I suppose the initial beat setup went wrong on that stage. And now I have some questions :

is logstash of use at all for metricbeats (or heartbeats) ? for filebeats I do understand the use case, but not for these two;
if I have multiple metricbeats on different machines (let's say 10), I do ./metricbeat setup -e only once when setting up the first beat as it only communicates to ES instance how to interpret my beat data, is that correct ? Suppose the same will be for other beats ?

Thank you a lot for your help once more, I have a better understanding of these things now.

stephenb · July 27, 2021, 2:01pm

Ok Good to hear...

Lets tackle this one first.

Yes you only need to run setup once per modules....
So you setup 10 hosts with system modules you only need to run setup once
Then you add an 11th system and enable mysql module you should run setup again to make sure all the assets for that module are loaded.

Make Sense

In general Logstash is not of great use for metricbeat (not to say there are no any edge cases) but there is an Architecture pattern using Logstash as a Collect and Forward so that only 1 egress IP is needed to say forward metric from many 100s or 1000s of hosts to Elastic Cloud, we see this in enterprises for example. This way not every host has to have access to the internet.

Filebeat with Logstash can be helpful for parsing custom logs etc which you could also do in Ingest Pipelines.

system · August 24, 2021, 4:01pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Metricbeat CPU information is wrong Beats metricbeat	4	4759	June 14, 2018
Solo me aparecen algunas metricas Elastic en Español	15	2005	July 6, 2018
Unable to Visualise Metricbeat system.process.CPU data Beats metricbeat	1	2365	November 21, 2017
Metricbeat is not capturing CPU and DISK information Beats metricbeat	21	4844	December 17, 2018
System.cpu.total.norm.pct different of the sum system.process.cpu.total.norm.pct Beats metricbeat	4	3483	February 15, 2021

I can't get system.cpu.total.pct metrics from my metricbeats

Related topics