Filebeat is using too much CPU

I didn't set the scan_frequency option (so it stays at the default of 10s). Why does Filebeat still use that much CPU?

Here is the log from filebeat:

2019-06-28T14:30:52.034+0800	INFO	[monitoring]	log/log.go:144	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":47740,"time":{"ms":1782}},"total":{"ticks":800400,"time":{"ms":29664},"value":800400},"user":{"ticks":752660,"time":{"ms":27882}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":46},"info":{"ephemeral_id":"89e3b738-7016-49fa-af1c-3b7b560ea1b3","uptime":{"ms":810044}},"memstats":{"gc_next":67501680,"memory_alloc":46935904,"memory_total":150198960864}},"filebeat":{"events":{"active":107,"added":120138,"done":120031},"harvester":{"closed":1,"open_files":37,"running":37,"started":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":120026,"batches":59,"total":120026}},"outputs":{"kafka":{"bytes_write":21317697}},"pipeline":{"clients":1,"events":{"active":1163,"filtered":5,"published":120184,"total":120188},"queue":{"acked":120026}}},"registrar":{"states":{"cleanup":1,"current":47,"update":120031},"writes":{"success":59,"total":59}},"system":{"load":{"1":10.6,"15":10.48,"5":10.4,"norm":{"1":0.6625,"15":0.655,"5":0.65}}}}}}

I'm using Filebeat 7.1, and here is my filebeat.yml. Any ideas?

filebeat.inputs:
- type: log
  paths:
    - /var/lib/docker/containers/*/*.log
  tail_files: true
  close_inactive: 5m
  close_timeout: 5m

processors:
#- add_cloud_metadata: ~
- add_locale: ~
- add_docker_metadata:
    host: "unix:///var/run/docker.sock"
- add_host_metadata:
    netinfo.enabled: true
    cache.ttl: 5m
- drop_event:
    when:
      or:
       - equals:
           container.labels.io_kubernetes_container_name: kube-system
       - equals:
           container.labels.io_kubernetes_container_name: default
       - equals:
           container.labels.io_kubernetes_container_name: kube-public
- decode_json_fields:
    fields: ["message"]
    process_array: false
    max_depth: 1
    overwrite_keys: false
- include_fields:
    fields: ["host.hostname", "container", "message", "event"]
#------------------------------- Kafka output ----------------------------------
output.kafka:
  versions: 2.1.0
  hosts: ${KAFKA_HOSTS:?environment variable KAFKA_HOSTS not found}
  topic: ${KAFKA_TOPIC:?environment variable KAFKA_TOPIC not found}
  worker: 1
  keep-alive: 600
  required_acks: 0
  compression: gzip
  max_message_bytes: 52428800
  channel_buffer_size: 8192
  bulk_max_size: 2048
  compression_level: 4

Filebeat was designed to be lightweight and to use as few resources as possible. To achieve that, much of the processing that Logstash can do was left out and had to be performed elsewhere, e.g. in Logstash or through an ingest pipeline. Filebeat is now gaining additional capabilities, such as adding metadata and parsing JSON payloads, and all of this requires CPU cycles. When all of these features are used, it will therefore consume more resources than older, less capable versions did.

You seem to be doing a bit of processing. How large are your events? How many events per second does Filebeat process when using that amount of CPU?
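For a rough estimate, the monitoring line posted in the question can be reduced to a few rates. The sketch below is only an illustration: it copies four counters from that log line, assumes that `time.ms` is the CPU time consumed within the 30s reporting window, and uses a hypothetical helper name (`summarize`). Note that `bytes_write` is measured after gzip compression, so it understates the raw event size.

```python
import json

# Trimmed copy of the "monitoring" payload from the log line in the question;
# only the fields used in the calculation are kept.
snapshot = json.loads("""
{"monitoring": {"metrics": {
  "beat": {"cpu": {"total": {"time": {"ms": 29664}}}},
  "filebeat": {"events": {"done": 120031}},
  "libbeat": {"output": {"events": {"acked": 120026}},
              "outputs": {"kafka": {"bytes_write": 21317697}}}
}}}
""")

WINDOW_S = 30  # reporting period stated in the log message


def summarize(snap, window_s=WINDOW_S):
    """Derive per-second and per-event figures from one metrics snapshot."""
    m = snap["monitoring"]["metrics"]
    cpu_ms = m["beat"]["cpu"]["total"]["time"]["ms"]
    done = m["filebeat"]["events"]["done"]
    acked = m["libbeat"]["output"]["events"]["acked"]
    written = m["libbeat"]["outputs"]["kafka"]["bytes_write"]
    return {
        # Assumption: cpu_ms is the CPU time used inside the window.
        "cpu_cores": cpu_ms / (window_s * 1000),
        "events_per_s": done / window_s,
        "cpu_us_per_event": cpu_ms * 1000 / done,
        # Compressed bytes on the wire, not the raw event size.
        "kafka_bytes_per_event": written / acked,
    }


summary = summarize(snapshot)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions, this one window works out to roughly one fully busy core at about 4,000 events per second, or around 250 microseconds of CPU per event. It covers a single 30s sample only, so the sustained rate may differ.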

Got it. I have 10k events per second with an average event size of about 500 B, so the CPU usage sounds reasonable now. I will try removing the JSON decode processor.
I understand that parsing JSON payloads requires CPU cycles, but why is adding metadata so CPU-intensive?

That I cannot tell, as I am not familiar with how metadata is added and cached.

This seems to be related to this issue,
