Filebeat queue.disk keeps piling up even when Logstash persisted queue remains relatively empty

Hi!

So we are using the following chain:

Filebeat instances running in a K8s cluster (one Filebeat per k8s worker node) ->
2 Logstash nodes behind an AWS ALB ->
Elasticsearch cluster

Everything works pretty well, but when log volume is high, files in the Filebeat queue.disk folder quite often start piling up until we hit queue.disk.max_size.

We don't see any CPU iowait on the K8s nodes, and there is plenty of CPU available. There are no CPU/memory limits on the Filebeat pods.
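For reference, "no limits" means the Filebeat DaemonSet pod spec only sets resource requests; the relevant fragment looks roughly like this (a minimal sketch, the request values are illustrative rather than our exact ones):

  containers:
    - name: filebeat
      resources:
        requests:
          cpu: 100m        # request only, so the pod can burst to whatever CPU the node has free
          memory: 200Mi
        # no "limits" block, i.e. no CPU throttling and no memory cap from Kubernetes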

At the same time, the persisted queue on both Logstash nodes remains relatively empty, and the Logstash nodes are not under load (CPU is mostly half idle).

logstash.yml

path.data: /var/lib/logstash
pipeline.workers: 33
# we played with different batch size values here
pipeline.batch.size: 131072
path.config: /etc/logstash/conf.d
queue.type: persisted
queue.max_bytes: 310gb
dead_letter_queue.enable: true
dead_letter_queue.max_bytes: 35gb
path.dead_letter_queue: /var/lib/logstash/dead_letter_queue
path.logs: /var/log/logstash
log.level: info
http.host: 0.0.0.0
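For context on those numbers: with 33 workers and a batch size of 131072, up to 33 × 131072 ≈ 4.3 million events can be in flight in Logstash at once; assuming a rough average of ~1 KB per event (an estimate, not a measured figure), that is on the order of 4 GB of events held in the heap at a time.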

Logstash jvm.options:

-Xms60g
-Xmx60g
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djruby.compile.invokedynamic=true
-Djruby.jit.threshold=0
-Djruby.regexp.interruptible=true
-XX:+HeapDumpOnOutOfMemoryError
-Djava.security.egd=file:/dev/urandom

filebeat.yml

filebeat.inputs:
- type: container
  stream: all
  paths:
    - "/var/log/containers/*.log"
  multiline.type: pattern
  multiline.pattern: '^(\d{4})'
  multiline.negate: true
  multiline.match: after

processors:
- add_kubernetes_metadata:
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - container:
    matchers:
      - logs_path:
          logs_path: '/var/log/containers/'
          resource_type: 'container'

- drop_event:
    when:
      not:
        has_fields: ['kubernetes.labels.log-format']

output.logstash:
  hosts: ["logstash-nlb:5044"]
  loadbalance: false
  compression_level: 0
  pipelining: 5
  # tried different values from 256 to 8192
  bulk_max_size: 1024
  slow_start: false
  # tried different values from 1 to 6
  workers: 6

queue.disk:
  max_size: 25GB
  path: /usr/share/filebeat/data/queue/
  segment_size: 1MB

http.enabled: true
http.host: 0.0.0.0
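A few notes on that config, in case they matter:

The multiline settings mean that any line which does not start with four digits (i.e. the year of a timestamp) is appended to the preceding line that does, so stack traces and other continuation lines get folded into the event that starts with their timestamp.

The drop_event processor keeps only events from pods that carry a log-format label; a pod whose logs we want to keep is labelled roughly like this (illustrative only, has_fields checks the field's presence, not its value):

  metadata:
    labels:
      log-format: json

With a 1MB segment_size and a 25GB max_size, the disk queue directory can end up holding on the order of 25,000 segment files when it is full. And since the HTTP endpoint is enabled, the libbeat output and queue counters are available on Filebeat's stats endpoint (port 5066 by default).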

I noticed that each Filebeat instance sends its log stream to the Logstash nodes at around 5-15 MB/sec at most, which in our case does not look to be enough to keep up with our apps' logs.
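To put that in event terms: assuming an average event size of around 1 KB (a rough guess for our log lines), 5-15 MB/sec is only about 5,000-15,000 events/sec per Filebeat, which is apparently less than what the pods on a busy node produce during peaks.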

Can anyone please help me understand where to look for the bottleneck? I would expect the Filebeats to ship their buffers to the Logstash nodes as fast as possible and let Logstash do all the heavy lifting with queuing, parsing, etc.

Thanks!
