Metricbeat CPU

I've been running the 5.0.0a5 daily snapshot for the past 24 hours, and metricbeat.exe has started using a lot more CPU than I would like.

I've been collecting the following:

- module: system
  period: 20s
  processes: ['filebeat.exe', 'metricbeat.exe']
  cpu_ticks: false

- module: system
  metricsets:
    - filesystem
  enabled: true
  period: 300s

- module: system
  metricsets:
    - memory
  enabled: true
  period: 60s

- drop_fields:
    fields: ["system.process.cmdline", "system.process.cpu.start_time", "metricset.rtt"]

The output goes to Elasticsearch.

This is the CPU usage of metricbeat.exe (graph not shown):

Looking at the log file around the times of the three largest spikes, I see the following:

 2016-07-07T12:04:24-04:00 INFO Non-zero metrics in the last 30s: libbeat.publisher.published_events=2 fetches.system-cpu.success=2 libbeat.publisher.messages_in_worker_queues=2
 2016-07-07T12:04:54-04:00 INFO Non-zero metrics in the last 30s: fetches.system-process.success=3 libbeat.publisher.published_events=95 fetches.system-diskio.success=3 fetches.system-memory.success=1 fetches.system-cpu.success=1 libbeat.publisher.messages_in_worker_queues=95

 2016-07-07T12:21:24-04:00 INFO Non-zero metrics in the last 30s: libbeat.publisher.messages_in_worker_queues=34 fetches.system-cpu.success=2 fetches.system-process.success=1 fetches.system-diskio.success=2 libbeat.publisher.published_events=34
 2016-07-07T12:21:54-04:00 INFO Non-zero metrics in the last 30s: libbeat.publisher.published_events=63 fetches.system-process.success=2 fetches.system-cpu.success=1 fetches.system-diskio.success=2 fetches.system-memory.success=1 libbeat.publisher.messages_in_worker_queues=63

 2016-07-07T12:27:24-04:00 INFO Non-zero metrics in the last 30s: libbeat.publisher.published_events=33 fetches.system-cpu.success=2 fetches.system-process.success=1 fetches.system-diskio.success=1 libbeat.publisher.messages_in_worker_queues=33
 2016-07-07T12:27:54-04:00 INFO Non-zero metrics in the last 30s: fetches.system-memory.success=1 fetches.system-diskio.success=1 fetches.system-process.success=2 libbeat.publisher.messages_in_worker_queues=63 libbeat.publisher.published_events=63 fetches.system-cpu.success=1
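
To line these counters up against the spikes, it may help to pull the key=value pairs out of each "Non-zero metrics" line and compare them window by window. Below is a rough Go sketch of that; parseMetrics is a name I made up for the example, and the format it expects is only what is visible in the lines above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMetrics extracts the key=value counters from one
// "Non-zero metrics in the last 30s:" log line.
// It returns an empty map when the line has no such marker.
func parseMetrics(line string) map[string]int {
	const marker = "Non-zero metrics in the last 30s:"
	out := map[string]int{}
	idx := strings.Index(line, marker)
	if idx < 0 {
		return out
	}
	// Everything after the marker is a space-separated list of key=value pairs.
	for _, field := range strings.Fields(line[idx+len(marker):]) {
		key, val, ok := strings.Cut(field, "=")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(val)
		if err != nil {
			continue // skip values that are not plain integers
		}
		out[key] = n
	}
	return out
}

func main() {
	// One of the lines from the log excerpt above.
	line := "2016-07-07T12:04:54-04:00 INFO Non-zero metrics in the last 30s: " +
		"fetches.system-process.success=3 libbeat.publisher.published_events=95 " +
		"fetches.system-diskio.success=3 fetches.system-memory.success=1 " +
		"fetches.system-cpu.success=1 libbeat.publisher.messages_in_worker_queues=95"
	m := parseMetrics(line)
	fmt.Println("process fetches:", m["fetches.system-process.success"])
	fmt.Println("published events:", m["libbeat.publisher.published_events"])
}
```

Running this over the whole log and printing published_events next to the per-metricset fetch counts should make it easier to see which windows stand out.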

Is there anything else I can check to see why the CPU is increasing?


I haven't seen this on my machine, but I'm running the alpha4 release on Linux, so I wonder if it is specific to Windows. I'll install the snapshot release and see if I get similar results on Linux.

You could connect to the live process with a profiler, or you could configure the process to dump CPU profiling information to a file. I prefer live profiling. To expose the HTTP profiling endpoint, add this CLI flag when starting the Beat: -httpprof "localhost:6060". Then connect with the pprof tool: go tool pprof http://localhost:6060/debug/pprof/profile. Even with a profile, it might be difficult to find the cause. This is a good howto:

Another approach might be to shorten the period so the problem shows up sooner, and then run each metricset on its own (or remove one metricset at a time) to see if you can isolate the issue to a single metricset. Just throwing out some ideas.

From past experience, my first suspect would be the system/process metricset, since you are on Windows.
