Filebeat consuming very high CPU

Hello Team,

This is my configuration: Filebeat 5.0.0-1 on CentOS 7, publishing directly to Kafka.

These are the metrics reported by Filebeat:

INFO Non-zero metrics in the last 30s: registrar.states.update=8935 publish.events=8935 libbeat.kafka.call_count.PublishEvents=6 libbeat.kafka.published_and_acked_events=8935 libbeat.publisher.published_events=8935 registrar.writes=6
INFO Non-zero metrics in the last 30s: registrar.states.update=8495 registrar.writes=5 libbeat.kafka.call_count.PublishEvents=6 libbeat.publisher.published_events=10372 libbeat.kafka.published_and_acked_events=8495 publish.events=8495
INFO Non-zero metrics in the last 30s: registrar.writes=11 registrar.states.update=22357 libbeat.kafka.call_count.PublishEvents=10 libbeat.publisher.published_events=20480 libbeat.kafka.published_and_acked_events=22357 publish.events=22357
INFO Non-zero metrics in the last 30s: publish.events=15430 libbeat.kafka.published_and_acked_events=15430 registrar.writes=8 libbeat.kafka.call_count.PublishEvents=9 libbeat.publisher.published_events=17478 registrar.states.update=15430
INFO Non-zero metrics in the last 30s: registrar.writes=12 publish.events=24576 libbeat.kafka.call_count.PublishEvents=11 libbeat.publisher.published_events=22528 libbeat.kafka.published_and_acked_events=24576 registrar.states.update=24576

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/1.log
  document_type: app1
- input_type: log
  paths:
    - /var/log/nginx/2.log
  document_type: app2
- input_type: log
  paths:
    - /var/log/nginx/3.log
  document_type: app3
- input_type: log
  paths:
    - /var/log/nginx/4.log
  document_type: app4
- input_type: log
  paths:
    - /var/log/nginx/5.log
  document_type: app5
- input_type: log
  paths:
    - /var/log/nginx/6.log
  document_type: app6
- input_type: log
  paths:
    - /var/log/nginx/7.log
  document_type: app7
- input_type: log
  paths:
    - /var/log/nginx/8.log
  document_type: app8
- input_type: log
  paths:
    - /var/log/nginx/9.log
  document_type: app9
- input_type: log
  paths:
    - /var/log/nginx/10.log
  document_type: app10
- input_type: log
  paths:
    - /var/log/nginx/11.log
  document_type: app11
- input_type: log
  paths:
    - /var/log/nginx/12.log
  document_type: app12
- input_type: log
  paths:
    - /var/log/nginx/13.log
  document_type: app13
  close_inactive: 48h

output.kafka:
  enabled: true
  hosts: ["1.1.1.1:9092","2.2.2.2:9092","3.3.3.3:9092","4.4.4.4:9092","5.5.5.5:9092","6.6.6.6:9092","7.7.7.7:9092","8.8.8.8:9092","9.9.9.9:9092","10.10.10.10:9092"]
  topic: '%{[type]}'
  partition.round_robin:

logging.to_files: true
logging.files:
  path: /var/log/filebeat

CPU: usage overshoots to more than 100%, and the process seems to have affinity to only a single CPU.

  1. CPU affinity: same CPU or same core? This is largely down to the OS/kernel scheduling the OS threads, and that scheduling is not always optimal (on Linux, tools like taskset help; see the example after this list). On the other hand, with a multi-core processor I would actually favor the process staying on the same CPU, to reduce the communication required when data is passed to a worker running on another CPU.

  2. These are not that many events being published (the busiest 30s window above shows publish.events=24576, i.e. roughly 800 events per second). This makes me wonder about potential network/Kafka problems forcing the client to retry sending; at worst, those retries also generate extra garbage to be collected.

  3. Without profiling I cannot really tell whether there are any hotspots. In case you don't have (or don't want) a Go development environment installed, you can easily get a profile via HTTP. Just start filebeat with -httpprof 127.0.0.1:6060. This starts an HTTP server (reachable from localhost only), and you can use your browser or curl/wget to sample a full stack trace every 10 seconds from http://localhost:6060/debug/pprof/profile?debug=3, or use wget http://localhost:6060/debug/pprof/trace?seconds=20 to collect a trace over the course of 20 seconds (see the commands after this list).
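For point 1, a minimal sketch of how to check (and, if needed, change) the affinity on Linux; looking the process up with pgrep is an assumption about how filebeat runs on your host:

    # Show which CPUs the filebeat process is currently allowed to run on
    taskset -cp $(pgrep -x filebeat)

    # Show per-thread CPU usage, to see whether a single OS thread is the hot one
    top -H -p $(pgrep -x filebeat)

    # Optionally pin filebeat to a set of CPUs, e.g. CPUs 0-3
    sudo taskset -cp 0-3 $(pgrep -x filebeat)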
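For point 3, the profiling steps put together; the config path and output file names are just examples:

    # Start filebeat with the profiling HTTP endpoint bound to localhost
    filebeat -c /etc/filebeat/filebeat.yml -httpprof 127.0.0.1:6060

    # From another shell: sample the profile endpoint mentioned above
    curl -o profile.out 'http://localhost:6060/debug/pprof/profile?debug=3'

    # Or collect a 20 second execution trace
    wget -O trace.out 'http://localhost:6060/debug/pprof/trace?seconds=20'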
