CPU usage does not match the CPU usage in Task Manager


(zaee) #1

Hi,

I have just downloaded Metricbeat and am successfully shipping its stats to a third-party ELK SaaS provider. I deliberately spiked my CPU with the CpuStress tool, and it stays constantly in the 85%-100% range, but the shipped stats do not reflect this.

system.cpu.system.norm.pct shows a different value
system.cpu.total.norm.pct shows a different value

I have looked everywhere and cannot find a solution.
Ideally I need to build a gauge on top of this, but the CPU stats I see in Task Manager are not reflected in Kibana.
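For anyone comparing the two kinds of field when building a gauge: per the Metricbeat field documentation, the `norm` variants are the raw percentages divided by the number of cores, and the values are stored as fractions (0 to 1 for normalized fields) rather than as 0 to 100. The sketch below only illustrates that arithmetic; the 4-core machine and the sample value are hypothetical and are not taken from this thread.

```python
def normalize_pct(total_pct, cores):
    """Convert a non-normalized CPU fraction (can reach `cores`) to a 0..1 fraction."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_pct / cores


def to_gauge_percent(norm_pct):
    """Scale a 0..1 fraction to the 0..100 figure a Task Manager style gauge shows."""
    return round(norm_pct * 100, 1)


# Hypothetical example: a 4-core machine whose non-normalized total is 3.6
# (i.e. 360% summed across cores).
norm = normalize_pct(3.6, cores=4)
print(norm, to_gauge_percent(norm))
```

A gauge meant to match Task Manager's overall figure would normally be built on the normalized field and formatted as a percentage.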



system.yml file below

# Module: system
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/6.5/metricbeat-module-system.html

- module: system
  period: 1s
  metricsets:
    - cpu
    #- load
    #- memory
    #- network
    #- process
    #- process_summary
    #- core
    #- diskio
    #- socket
  cpu.metrics: ["percentages", "normalized_percentages"]  # The other available options are normalized_percentages and ticks.
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory

#- module: system
#  period: 1m
#  metricsets:
#    - filesystem
#    - fsstat
#  processors:
#  - drop_event.when.regexp:
#      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'

#- module: system
#  period: 15m
#  metricsets:
#    - uptime

#- module: system
#  period: 5m
#  metricsets:
#    - raid
#  raid.mount_point: '/'

Sample Log File below

2019-01-10T18:08:33.122Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":4000},"total":{"ticks":94156,"time":{"ms":469},"value":94156},"user":{"ticks":90156,"time":{"ms":469}}},"handles":{"open":252},"info":{"ephemeral_id":"b824d11a-326a-46d4-82ed-a482c65045f6","uptime":{"ms":9123793}},"memstats":{"gc_next":8069712,"memory_alloc":4053768,"memory_total":9019312424}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":29,"batches":17,"total":29},"read":{"bytes":595},"write":{"bytes":10163}},"pipeline":{"clients":2,"events":{"active":1,"published":30,"total":30},"queue":{"acked":29}}},"metricbeat":{"system":{"cpu":{"events":30,"success":30}}}}}}
2019-01-10T18:09:03.125Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":4015,"time":{"ms":15}},"total":{"ticks":94390,"time":{"ms":234},"value":94390},"user":{"ticks":90375,"time":{"ms":219}}},"handles":{"open":254},"info":{"ephemeral_id":"b824d11a-326a-46d4-82ed-a482c65045f6","uptime":{"ms":9153794}},"memstats":{"gc_next":8070288,"memory_alloc":5598792,"memory_total":9044014912}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":30,"batches":16,"total":30},"read":{"bytes":560},"write":{"bytes":9932}},"pipeline":{"clients":2,"events":{"active":1,"published":30,"total":30},"queue":{"acked":30}}},"metricbeat":{"system":{"cpu":{"events":30,"success":30}}}}}}
2019-01-10T18:09:33.125Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":4015},"total":{"ticks":94671,"time":{"ms":281},"value":94671},"user":{"ticks":90656,"time":{"ms":281}}},"handles":{"open":254},"info":{"ephemeral_id":"b824d11a-326a-46d4-82ed-a482c65045f6","uptime":{"ms":9183794}},"memstats":{"gc_next":5119392,"memory_alloc":7105288,"memory_total":9070225936,"rss":4096}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":31,"batches":17,"total":31},"read":{"bytes":595},"write":{"bytes":10610}},"pipeline":{"clients":2,"events":{"active":0,"published":30,"total":30},"queue":{"acked":31}}},"metricbeat":{"system":{"cpu":{"events":30,"success":30}}}}}}
2019-01-10T18:10:03.122Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":4015},"total":{"ticks":94686,"time":{"ms":15},"value":94686},"user":{"ticks":90671,"time":{"ms":15}}},"handles":{"open":252},"info":{"ephemeral_id":"b824d11a-326a-46d4-82ed-a482c65045f6","uptime":{"ms":9213792}},"memstats":{"gc_next":7309472,"memory_alloc":4081336,"memory_total":9096439616}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":30,"batches":17,"total":30},"read":{"bytes":595},"write":{"bytes":10358}},"pipeline":{"clients":2,"events":{"active":0,"published":30,"total":30},"queue":{"acked":30}}},"metricbeat":{"system":{"cpu":{"events":30,"success":30}}}}}}
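As a reading aid for these log lines: the `beat.cpu` section appears to describe the CPU time consumed by the Metricbeat process itself over the 30s window, not the host's CPU load, while `metricbeat.system.cpu.events` counts the cpu metricset events produced in that window. The sketch below parses one such line under that assumption. `SAMPLE` is an abbreviated copy of the 18:09:03 line, and the helper names are made up for illustration.

```python
import json

# Abbreviated copy of the 18:09:03 log line (only a few fields kept).
SAMPLE = (
    '2019-01-10T18:09:03.125Z INFO [monitoring] log/log.go:144 '
    'Non-zero metrics in the last 30s '
    '{"monitoring": {"metrics": {'
    '"beat":{"cpu":{"total":{"ticks":94390,"time":{"ms":234}}}},'
    '"metricbeat":{"system":{"cpu":{"events":30,"success":30}}}'
    '}}}'
)


def parse_monitoring_line(line):
    """Return the 'metrics' dict from a 'Non-zero metrics' log line."""
    payload = line[line.index("{"):]  # the JSON payload starts at the first brace
    return json.loads(payload)["monitoring"]["metrics"]


def summarize(line, window_s=30):
    """Summarize Metricbeat's own CPU cost and its cpu event count for one window."""
    metrics = parse_monitoring_line(line)
    beat_cpu_ms = metrics["beat"]["cpu"]["total"]["time"]["ms"]
    cpu_events = metrics["metricbeat"]["system"]["cpu"]["events"]
    return {
        # Fraction of one core used by the Metricbeat process (assumed meaning).
        "beat_cpu_fraction": beat_cpu_ms / (window_s * 1000),
        "cpu_events": cpu_events,
    }


print(summarize(SAMPLE))
```

Thirty cpu events per 30s window is consistent with the `period: 1s` setting, so the metricset is running on schedule; these lines do not say anything about the host's CPU percentage.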

(Maddin2016) #2

The Metricbeat system module should use GetSystemTimes instead of NtQuerySystemInformation to calculate CPU usage, or use performance counters. As a workaround, you could add the following windows module configuration:

- module: windows
  metricsets: ["perfmon"]
  enabled: true
  period: 1s
  perfmon.counters:
  - instance_label: processor.time
    measurement_label: processor.time.total.pct
    query: '\Processor(_Total)\% Processor Time'

and check whether the value matches the one shown in Task Manager.
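To illustrate what a GetSystemTimes-based calculation looks like, here is a minimal sketch, not Metricbeat's actual code. It relies on the documented Win32 behaviour that the kernel time returned by GetSystemTimes includes idle time and that all three values are summed across processors. The function names are made up for this example, and `read_system_times` only works on Windows.

```python
def cpu_busy_fraction(prev, curr):
    """Busy fraction (0..1) between two (idle, kernel, user) samples.

    Kernel time includes idle time, so total elapsed CPU time is
    kernel + user and busy time is total minus idle.
    """
    idle = curr[0] - prev[0]
    kernel = curr[1] - prev[1]
    user = curr[2] - prev[2]
    total = kernel + user
    if total <= 0:
        return 0.0
    return (total - idle) / total


def read_system_times():
    """Return (idle, kernel, user) in 100 ns units. Windows only."""
    import ctypes
    from ctypes import wintypes

    idle, kernel, user = wintypes.FILETIME(), wintypes.FILETIME(), wintypes.FILETIME()
    ok = ctypes.windll.kernel32.GetSystemTimes(
        ctypes.byref(idle), ctypes.byref(kernel), ctypes.byref(user)
    )
    if not ok:
        raise ctypes.WinError()

    def to_int(ft):
        # Combine the two 32-bit halves of a FILETIME into one integer.
        return (ft.dwHighDateTime << 32) | ft.dwLowDateTime

    return to_int(idle), to_int(kernel), to_int(user)


# Hypothetical sample values (not measured): 90% busy over the interval.
print(cpu_busy_fraction((0, 0, 0), (100, 400, 600)))
```

On Windows, calling `read_system_times()` twice about a second apart and passing both results to `cpu_busy_fraction` gives a figure that can be compared against Task Manager while the stress tool is running.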

