The top command shows topbeat consuming very high CPU on an HAProxy server that handles around 400 req/sec.
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
13770 root     15  0  761m 378m 5904 R 66.2  4.7 41:23.84 topbeat
 2427 haproxy  15  0 45988 2676  840 S 11.3  0.0  2:42.13 haproxy
 2424 haproxy  15  0 46008 2780  920 S 10.0  0.0  2:38.81 haproxy
 2426 haproxy  15  0 46104 2732  796 S  8.3  0.0  2:49.87 haproxy
 2425 haproxy  16  0 45848 2584  796 S  7.3  0.0  2:42.12 haproxy
 2368 root     18  0  192m 1880 1052 S  7.0  0.0  0:53.25 rsyslogd
14152 root     15  0  118m  16m 4504 S  5.3  0.2  0:23.72 filebeat
66% CPU is taken by topbeat.
Nothing in topbeat log file.
Both the amount of CPU and memory seem to be unusually high.
- What version of topbeat are you running?
- How many processes are running on this system? Are there a lot of new processes being created?
- Can you post your config?
I am using Topbeat 1.0. The latest one.
My topbeat.yml has nothing special, the only things I have changed are:
The server does not start or stop processes. Only HAProxy is running.
I have a similar topbeat deployment (except I am using the elasticsearch output) and mine has remained constantly at about 15MB of memory and almost no CPU usage.
Would you be willing to collect some profiling information and attach it in this thread (or email it to me - andrew dot kroh at elastic dot co)? It would help us figure out where all those CPU cycles and memory are being used.
To do so, stop the service and run topbeat with the profiling flags, as in the following command:
topbeat -c /etc/topbeat/topbeat.yml -cpuprofile topbeat-1.0.1-cpu.prof -memprofile topbeat-1.0.1-mem.prof -e -v
Let it run for a while, then stop the process. It will write out two .prof files, which we will look at with go tool pprof.
Since I restarted topbeat, it is no longer using high CPU or memory. It probably gets stuck when there is an issue with the ES cluster. I will wait for it to happen again and collect the profile then.
High memory usage is a sign that topbeat is queuing up data while trying to reconnect and send to Elasticsearch. The high CPU usage is still weird, though.
Which topbeat version have you installed? 1.0.0 or 1.0.1?
You can reduce memory usage by decreasing 'bulk_max_size' in the logstash output (in 1.0.1 the value was reduced to 200 to decrease memory usage).
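To illustrate why a smaller bulk size limits memory, here is a rough sketch of splitting queued events into batches capped at a maximum size. It is only an illustration of the idea, not the actual output code in topbeat; the function name splitBatches and the event count of 450 are made up for the example.

```go
package main

import "fmt"

// splitBatches (hypothetical helper) cuts events into consecutive batches
// of at most maxSize items each. The batches share the backing array of
// events, so nothing is copied.
func splitBatches(events []string, maxSize int) [][]string {
	if maxSize <= 0 {
		return nil
	}
	var batches [][]string
	for len(events) > 0 {
		n := maxSize
		if len(events) < n {
			n = len(events)
		}
		batches = append(batches, events[:n])
		events = events[n:]
	}
	return batches
}

func main() {
	// 450 queued events is an arbitrary number chosen for the example.
	events := make([]string, 450)
	for i := range events {
		events[i] = fmt.Sprintf("event-%d", i)
	}
	for i, b := range splitBatches(events, 200) {
		fmt.Printf("batch %d: %d events\n", i, len(b))
	}
}
```

With a cap of 200, no single batch holds more than 200 events, so the data tied up in a request that is being retried stays bounded by that setting.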