Filebeat keeps files open, S3 upload fails


We have Filebeat instances on a huge number of servers, but a couple of them are problematic. The configuration looks like this:

  • Filebeat harvests files from the folder /mnt/log/*.log and sends them to an ES cluster.
  • There is a lot of traffic in this folder. In peak hours new files are created every couple of seconds and can grow to over 10 MB each.
  • Every 30 minutes a Python backup script uploads all files to AWS S3. It checks 'lsof' first to find out whether the files are still in use.
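The in-use check in the backup script isn't shown, but a minimal sketch of such an 'lsof' probe might look like this (the function name and the fallback when 'lsof' is missing are my assumptions, not the actual script):

```python
import subprocess

def is_file_open(path):
    """Return True if some process still holds `path` open.

    `lsof` exits with status 0 when at least one process has the file
    open, and non-zero otherwise (including when the file does not exist).
    """
    try:
        result = subprocess.run(
            ["lsof", "--", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Assumption: treat a missing lsof binary as "file not in use".
        return False
    return result.returncode == 0
```

A script like this skips (or retries later) any file for which the check returns True — which is exactly what breaks here, because Filebeat holds the files open.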

The backup script fails every time because Filebeat keeps all the log files open. Eventually the disk fills up, and we have to stop Filebeat and rerun the backup script to clear it.
Yesterday I turned on debug logging for Filebeat and saw that it needs over 30 minutes to harvest one file.
So I think it's not keeping up: not all files are harvested within the 30-minute window, new files keep being created, files are not uploaded to S3, /mnt reaches 100% usage, and everything is stuck. Events are still published to the ES cluster, with some delay, until we stop Filebeat.
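To confirm which log files the Filebeat process is still holding open, the kernel's fd table can be inspected directly. This is a Linux-only sketch; the PID lookup and the /mnt/log prefix are assumptions:

```python
import glob
import os

def open_files_under(pid, prefix="/mnt/log/"):
    """List the files under `prefix` that process `pid` holds open,
    by resolving the symlinks in /proc/<pid>/fd (Linux only)."""
    held = []
    for fd_link in glob.glob("/proc/%d/fd/*" % pid):
        try:
            target = os.readlink(fd_link)
        except OSError:
            continue  # fd was closed while we were scanning
        if target.startswith(prefix):
            held.append(target)
    return held
```

Run against the Filebeat PID, this shows exactly which harvested files block the backup script at any moment.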

We are using Filebeat 5.1.1 and have a few prospectors configured like this on each server (sorry for the XXXs, I cannot share much more detail):

    - input_type: log
      paths:
        - /mnt/log/XXX*
      exclude_files: [".gz$"]
      document_type: vom_urs
      multiline:
        pattern: ^@?[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}
        negate: true
        match: after

      ignore_older: 30m

      close_inactive: 10s
      close_renamed: true
      close_removed: true

      clean_inactive: 10m
      clean_removed: true

      scan_frequency: 1s

The output is set to ES and looks like this:

    hosts: ["http://X.X.X.X:9200", "http://X.X.X.X:9200", "http://X.X.X.X:9200", "http://X.X.X.X:9200"]
    bulk_max_size: 4096
    loadbalance: true
    worker: 4

Is there a way to speed up or optimize the harvesting process in such a situation?

Any suggestions, ideas?

I measured the throughput through a pipe and it seems very slow:

    /usr/bin/ -e -c /etc/filebeat/filebeat.yml 2>/dev/null | /usr/bin/pv > /dev/null
    86.7MiB 0:04:10 [ 763kiB/s]
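A back-of-envelope comparison of that measurement against the ingest rate the folder demands. The ~10 MB file every ~5 seconds is assumed from the peak-hour description above; the exact interval isn't stated:

```python
# Required throughput: one ~10 MB file every ~5 s (assumed from
# "new files ... every couple of seconds ... over 10 MB" above).
file_size_mb = 10
seconds_per_file = 5
required_mb_s = file_size_mb / seconds_per_file   # 2.0 MB/s to keep up

# Measured throughput from the pv run: 763 KiB/s.
measured_mb_s = 763 * 1024 / 1_000_000            # ~0.78 MB/s

print(f"required ~{required_mb_s:.1f} MB/s, measured ~{measured_mb_s:.2f} MB/s")
```

Even in this rough model, Filebeat would need to run roughly 2–3x faster just to keep up during peak hours.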
