Metricbeat on FreeBSD using a lot of resources


(Michbsd) #1

Hi,

I am running Metricbeat on FreeBSD.

❯ ./metricbeat -version [3:42:34 PM]
metricbeat version 5.0.0-alpha4 (amd64), libbeat 5.0.0-alpha4

It seems to use a whole lot of resources for a simple monitoring tool:

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
12583 root            17  21    0 99776K 22784K uwait   6  40:20  41.26% metricbeat
  785 root             1  52    0 61204K  6696K select  6   5:05   0.00% sshd
32240 root             1  20    0 25448K  4960K select  6   4:21   0.00% ntpd
27606 root             3  40    0   736M   714M uwait   1   3:33   0.00% redis-serv
27608 root             3  20    0   696M   672M uwait   5   3:17   0.00% redis-serv

Here's my conf:

❯ grep -v '^#' metricbeat.yml [3:44:15 PM]

    metricbeat.modules:
    - module: system
      metricsets:
        - cpu
        #- core
        - diskio
        - filesystem
        #- fsstat
        - memory
        - network
        - process
      enabled: true
      period: 10s
      processes: ['.*']

      # if true, exports the CPU usage in ticks, together with the percentage values
      cpu_ticks: false

    - module: apache
      metricsets: ["status"]
      enabled: true
      period: 10s
      hosts: ["http://172.16.0.70"]

    - module: redis
      metricsets: ["info"]
      enabled: true
      period: 10s
      hosts: ["172.16.0.70:7000"]



    output.elasticsearch:
      hosts: ["172.16.0.0:9200"]
      template.name: "metricbeat"
      template.path: "metricbeat.template-es2x.json"
      template.overwrite: false

Any ideas?


(ruflin) #2

Could you remove your filter for processes, as it seems to match all processes anyway? Comment out or remove the line processes: ['.*']. It could be that the regexp causes the CPU load.
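
As a rough illustration of why the filter is redundant, here is a minimal sketch. It assumes grep's extended regex syntax as a stand-in for Metricbeat's own matching, and the process names are simply copied from the top output above. The pattern `.*` matches every name, so dropping the line should not change which processes are collected.

```shell
#!/bin/sh
# Sketch only: grep -E stands in for Metricbeat's regexp matching.
# The names are sample process names taken from the top output in this thread.
names="metricbeat sshd ntpd redis-serv"

matched=0
total=0
for n in $names; do
  total=$((total + 1))
  # '.*' matches any string, including an empty one
  if printf '%s\n' "$n" | grep -Eq '.*'; then
    matched=$((matched + 1))
  fi
done

echo "$matched of $total process names match '.*'"
```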


(Michbsd) #3

Will give it a try, and report back


(Michbsd) #4

Still the same issue.
I started the metricbeat process (with the new config) at 4:39 PM.

Please see attached graphs of the spike in load.

Any ideas?

Thanks,

Mike


(ruflin) #5

Could you try the nightly instead of alpha4? https://beats-nightlies.s3.amazonaws.com/index.html?prefix=metricbeat/ It should not make a difference, but we recently made some changes to the FreeBSD support in gosigar. Be aware that FreeBSD is not "officially" supported yet, but we are trying to get it working: https://github.com/elastic/beats/issues?utf8=✓&q=freebsd%20is%3Aissue%20is%3Aopen%20

Could you try to enable one metricset after the other to see which one causes the issue?
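
One way to script that isolation, as a minimal sketch: write one small config per metricset and start Metricbeat against each in turn while watching top. The output directory, the file names, and the ES_HOST default are assumptions made for this example (the host is copied from the config posted above), so adjust them to your setup.

```shell
#!/bin/sh
# Sketch: generate one minimal Metricbeat config per system metricset.
# outdir and the metricbeat-<name>.yml naming are made up for this example.
outdir="${OUTDIR:-./metricset-configs}"
# Assumption: Elasticsearch host as it appears in the config posted above.
ES_HOST="${ES_HOST:-172.16.0.0:9200}"

mkdir -p "$outdir"

for ms in cpu diskio filesystem memory network process; do
  cat > "$outdir/metricbeat-$ms.yml" <<EOF
metricbeat.modules:
- module: system
  metricsets:
    - $ms
  enabled: true
  period: 10s

output.elasticsearch:
  hosts: ["$ES_HOST"]
EOF
  echo "wrote $outdir/metricbeat-$ms.yml"
done
```

Each generated file can then be tried on its own, for example with `./metricbeat -e -c metricset-configs/metricbeat-process.yml`, and the CPU usage compared between runs.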

@andrewkroh Perhaps you know more here?


(Michbsd) #6

I will give it a go, and report back.
Is there a special branch to check out from GitHub for the nightly build? (master?)


(Michbsd) #7

It seems to be the "process" metricset.
When removing it from the configuration, CPU usage seems to be much more reasonable:

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
49691 root            10  20    0 55840K 18328K uwait   5   0:01   0.29% metricbeat
  785 root             1  21    0 61204K  6696K select  6   5:06   0.00% sshd
32240 root             1  20    0 25448K  4960K select  6   4:25   0.00% ntpd
27606 root             3  40    0   736M   714M uwait   1   4:15   0.00% redis-serv
27608 root             3  20    0   696M   672M uwait   5   3:58   0.00% redis-serv
  426 unbound          1  20    0 45296K 21168K select  1   2:15   0.00% unbound 

Thanks,


(Andrew Kroh) #8

There was another report of high CPU usage here. Maybe they are related.


(Michbsd) #9

Hmm.. I see the CPU usage spike straight away.


(ruflin) #10

Yes, best is to build your binaries off master. Sorry about the link to the nightlies as these are not available for freebsd yet.


(Michbsd) #11

Right. I am already running latest build


(ruflin) #12

Ok, that is great.

Just as a quick summary: the minimal config that causes the issue on FreeBSD is:

metricbeat.modules:
- module: system
  metricsets:
    - process
  period: 10s

And the spike happens immediately after starting metricbeat. Correct?

How many processes do you have?


(Michbsd) #13

Hi,

Yes. Here is my current config:

❯ grep -v '^#' metricbeat.yml | cat -s [10:23:23 AM]

metricbeat.modules:
- module: system
  metricsets:
    - process
  enabled: true
  period: 10s

  # if true, exports the CPU usage in ticks, together with the percentage values

output.elasticsearch:
  hosts: ["172.16.0.140:9200"]
  template.name: "metricbeat"
  template.path: "metricbeat.template-es2x.json"
  template.overwrite: false

The spike happens immediately after launching the process (and remains).

 ❯ ps -ax | wc -l                                                        [10:26:27 AM]
      81

(ruflin) #14

Could you try to follow the debugging steps recommended by @andrewkroh in the "Metricbeat CPU" thread?


(Michbsd) #15

Please find the pprof file here:

https://dl.dropboxusercontent.com/u/42812/Temp/pprof.localhost%3A6060.samples.cpu.001.pb.gz
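
For anyone who wants to inspect the profile locally, a minimal sketch follows. It assumes the Go toolchain is installed, that the file was saved under the name in the URL above, and that the metricbeat binary that produced it is in the current directory. PROFILE, BINARY, and DRY_RUN are placeholder variables for this example; by default the script only prints the command.

```shell
#!/bin/sh
# Sketch: list the top CPU consumers recorded in a pprof CPU profile.
# PROFILE and BINARY are placeholders; point them at your own files.
PROFILE="${PROFILE:-pprof.localhost:6060.samples.cpu.001.pb.gz}"
BINARY="${BINARY:-./metricbeat}"
# DRY_RUN=1 (the default) only prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

cmd="go tool pprof -top $BINARY $PROFILE"

if [ "$DRY_RUN" = "1" ]; then
  echo "$cmd"
else
  $cmd
fi
```

Run it with `DRY_RUN=0` to execute the command; the functions at the top of the listing are the ones that consumed the most sampled CPU time.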


(ruflin) #16

@michbsd Thanks for sharing. We will have a look to see if it helps the investigation.


(Michbsd) #17

Great thanks.

Let me know if I can assist in any way

Mike


(ruflin) #18

@michbsd There is no real progress here yet. Nothing very obvious stood out in the data you provided. We once had a similar issue on OS X but couldn't reproduce it. The problem is that I currently don't have a FreeBSD machine at hand to test it. My current assumption is that this is more likely a gosigar issue than a problem in metricbeat itself: https://github.com/elastic/gosigar There were other FreeBSD challenges in the past.


(Michbsd) #19

Hi @ruflin

Thanks for the feedback.
Do you suggest I open a ticket with gosigar?

thanks,


(ruflin) #20

@michbsd Yeah, it probably makes sense to open an issue in gosigar: https://github.com/elastic/gosigar This also brings visibility to the other FreeBSD users. Make sure to link this discussion.