michbsd
(Michbsd)
July 13, 2016, 1:45pm
#1
Hi,
I am running Metricbeat on FreeBSD.
❯ ./metricbeat -version [3:42:34 PM]
metricbeat version 5.0.0-alpha4 (amd64), libbeat 5.0.0-alpha4
It seems to use a whole lot of resources for a simple monitoring tool:
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12583 root 17 21 0 99776K 22784K uwait 6 40:20 41.26% metricbeat
785 root 1 52 0 61204K 6696K select 6 5:05 0.00% sshd
32240 root 1 20 0 25448K 4960K select 6 4:21 0.00% ntpd
27606 root 3 40 0 736M 714M uwait 1 3:33 0.00% redis-serv
27608 root 3 20 0 696M 672M uwait 5 3:17 0.00% redis-serv
Here's my conf:
❯ grep -v '^#' metricbeat.yml [3:44:15 PM]
metricbeat.modules:
- module: system
  metricsets:
    - cpu
    #- core
    - diskio
    - filesystem
    #- fsstat
    - memory
    - network
    - process
  enabled: true
  period: 10s
  processes: ['.*']
  # if true, exports the CPU usage in ticks, together with the percentage values
  cpu_ticks: false
- module: apache
  metricsets: ["status"]
  enabled: true
  period: 10s
  hosts: ["http://172.16.0.70"]
- module: redis
  metricsets: ["info"]
  enabled: true
  period: 10s
  hosts: ["172.16.0.70:7000"]
output.elasticsearch:
  hosts: ["172.16.0.0:9200"]
  template.name: "metricbeat"
  template.path: "metricbeat.template-es2x.json"
  template.overwrite: false
Any ideas?
ruflin
(ruflin)
July 13, 2016, 6:44pm
#2
Could you remove your filter for processes, as it seems to match all processes anyway? Comment out or remove the line processes: ['.*']. It could be that the regexp causes the CPU load.
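To illustrate why the filter is redundant, here is a minimal sketch in Python. It is only a stand-in: Metricbeat is written in Go, and how it applies the pattern internally is not shown in this thread. The process names are taken from the top output above; the helper `matches_filter` is hypothetical.

```python
import re

# The pattern from the posted config: processes: ['.*']
PATTERN = re.compile(".*")

def matches_filter(name):
    """Hypothetical helper: True if the pattern matches the process name."""
    return PATTERN.search(name) is not None

# Process names from the top output posted above.
names = ["metricbeat", "sshd", "ntpd", "redis-serv"]

# '.*' matches any string, including the empty one, so nothing is filtered out.
print(all(matches_filter(name) for name in names))
print(matches_filter(""))
```

Since every name matches, dropping the line should not change which processes are reported; it only removes the per-process regexp evaluation as a possible cost.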
michbsd
(Michbsd)
July 14, 2016, 2:39pm
#3
Will give it a try, and report back
michbsd
(Michbsd)
July 14, 2016, 2:55pm
#4
Still the same issue.
I started the metricbeat process (with the new config) at 4.39pm.
Please see attached graphs of the spike in load.
Any ideas?
Thanks,
Mike
ruflin
(ruflin)
July 18, 2016, 8:36am
#5
Could you try the nightly instead of alpha4? https://beats-nightlies.s3.amazonaws.com/index.html?prefix=metricbeat/ It should not make a difference, but we recently made some changes to the FreeBSD support in gosigar. Be aware that FreeBSD is not "officially" supported yet, but we are trying to get it working: https://github.com/elastic/beats/issues?utf8=✓&q=freebsd%20is%3Aissue%20is%3Aopen%20
Could you try to enable one metricset after the other to see which one causes the issue?
@andrewkroh Perhaps you know more here?
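One way to test the metricsets one at a time is to generate a separate minimal config for each and run Metricbeat against each in turn. The following is a rough sketch, not an official tool: the metricset names come from the config posted above, while the file names, the `build_configs` helper, and the `localhost:9200` output host are placeholders to adjust.

```python
# Metricsets enabled in the system module of the config posted above.
METRICSETS = ["cpu", "diskio", "filesystem", "memory", "network", "process"]

# Minimal config with a single metricset; the output host is a placeholder.
TEMPLATE = """metricbeat.modules:
- module: system
  metricsets:
    - {metricset}
  enabled: true
  period: 10s
output.elasticsearch:
  hosts: ["{es_host}"]
"""

def build_configs(metricsets, es_host):
    """Return a dict of {file name: YAML text}, one config per metricset."""
    return {
        f"metricbeat-{name}.yml": TEMPLATE.format(metricset=name, es_host=es_host)
        for name in metricsets
    }

if __name__ == "__main__":
    configs = build_configs(METRICSETS, "localhost:9200")
    for filename in configs:
        # Save each text under its file name, then start Metricbeat with it.
        print(f"./metricbeat -c {filename}")
```

Running each generated config for a minute or two while watching top should show which metricset drives the CPU usage.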
michbsd
(Michbsd)
July 18, 2016, 10:33am
#6
I will give it a go, and report back.
Is there a special branch to check out from GitHub for the nightly build? (master?)
michbsd
(Michbsd)
July 18, 2016, 10:58am
#7
It seems to be the "process" metricset.
When removing it from the configuration, CPU usage seems to be much more reasonable:
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
49691 root 10 20 0 55840K 18328K uwait 5 0:01 0.29% metricbeat
785 root 1 21 0 61204K 6696K select 6 5:06 0.00% sshd
32240 root 1 20 0 25448K 4960K select 6 4:25 0.00% ntpd
27606 root 3 40 0 736M 714M uwait 1 4:15 0.00% redis-serv
27608 root 3 20 0 696M 672M uwait 5 3:58 0.00% redis-serv
426 unbound 1 20 0 45296K 21168K select 1 2:15 0.00% unbound
Thanks,
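For comparing runs without reading the top output by eye, a small parser can pull out the WCPU column. This is only a sketch: it assumes whitespace-separated columns with WCPU second to last and a command name without spaces, as in the output pasted in this thread. The function name `wcpu_for` is made up for illustration.

```python
def wcpu_for(top_output, command):
    """Return the WCPU percentages of all rows whose COMMAND equals `command`."""
    values = []
    for line in top_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric PID and end with the command name.
        if len(fields) >= 2 and fields[0].isdigit() and fields[-1] == command:
            values.append(float(fields[-2].rstrip("%")))
    return values

# Rows copied from the two top outputs posted in this thread.
WITH_PROCESS = """PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12583 root 17 21 0 99776K 22784K uwait 6 40:20 41.26% metricbeat
785 root 1 52 0 61204K 6696K select 6 5:05 0.00% sshd"""

WITHOUT_PROCESS = """PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
49691 root 10 20 0 55840K 18328K uwait 5 0:01 0.29% metricbeat
785 root 1 21 0 61204K 6696K select 6 5:06 0.00% sshd"""

print(wcpu_for(WITH_PROCESS, "metricbeat"))     # [41.26]
print(wcpu_for(WITHOUT_PROCESS, "metricbeat"))  # [0.29]
```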
There was another report of high CPU usage here. Maybe they are related.
michbsd
(Michbsd)
July 18, 2016, 2:48pm
#9
Hmm.. I see the CPU usage spike straight away.
ruflin
(ruflin)
July 19, 2016, 7:13am
#10
Yes, it is best to build your binaries off master. Sorry about the link to the nightlies, as these are not available for FreeBSD yet.
michbsd
(Michbsd)
July 19, 2016, 7:37am
#11
Right. I am already running the latest build.
ruflin
(ruflin)
July 19, 2016, 7:43am
#12
Ok, that is great.
Just as a quick summary. The minimal config that causes the issue on FreeBSD is:
metricbeat.modules:
- module: system
  metricsets:
    - process
  period: 10s
And the spike happens immediately after starting metricbeat. Correct?
How many processes do you have?
michbsd
(Michbsd)
July 19, 2016, 8:27am
#13
Hi,
Yes. Here is my current config:
❯ grep -v '^#' metricbeat.yml | cat -s [10:23:23 AM]
metricbeat.modules:
- module: system
  metricsets:
    - process
  enabled: true
  period: 10s
  # if true, exports the CPU usage in ticks, together with the percentage values
output.elasticsearch:
  hosts: ["172.16.0.140:9200"]
  template.name: "metricbeat"
  template.path: "metricbeat.template-es2x.json"
  template.overwrite: false
The spike happens immediately after launching the process (and remains).
❯ ps -ax | wc -l [10:26:27 AM]
81
ruflin
(ruflin)
July 20, 2016, 8:59am
#14
Could you try to follow the debugging recommended by @andrewkroh here? Metricbeat CPU
michbsd
(Michbsd)
July 20, 2016, 9:22am
#15
ruflin
(ruflin)
July 25, 2016, 6:26am
#16
@michbsd Thanks for sharing. We will have a look to see if it helps in the investigations.
michbsd
(Michbsd)
July 25, 2016, 9:21am
#17
Great thanks.
Let me know if I can assist in any way
Mike
ruflin
(ruflin)
August 2, 2016, 7:06am
#18
@michbsd There is no real progress here yet. There was nothing obvious in the data you provided. We once had a similar issue on OS X but couldn't reproduce it. The problem is that I currently don't have a FreeBSD machine at hand to test it. My current assumption is that it is more likely a gosigar issue than a problem in metricbeat itself: https://github.com/elastic/gosigar There were other FreeBSD challenges in the past.
michbsd
(Michbsd)
August 2, 2016, 12:31pm
#19
Hi @ruflin
Thanks for the feedback.
Do you suggest I open a ticket with gosigar?
thanks,
ruflin
(ruflin)
August 2, 2016, 2:32pm
#20
@michbsd Yeah, it probably makes sense to open an issue in gosigar: https://github.com/elastic/gosigar This also brings visibility to the other FreeBSD users. Make sure to link this discussion.