Filebeat low throughput and slow log file reads

Filebeat log transfer latency

We have observed high latency (more than 2 hours) on Filebeat while publishing logs to Logstash. We are using the setup below for log transfer.

Filebeat (11 nodes) -> Logstash (3 nodes) -> ES

  • Filebeat runs on 11 nodes, each hosting a Storm application that generates logs at roughly 10 GB/hour per node.

  • Filebeat shows more than 40 files being harvested at a time, although only 3 or 4 log files are actively being written; the remaining files have not been updated for around 12 hours but are still being harvested because Filebeat's read throughput is so slow.

  • Because Filebeat is reading so slowly, the open file count keeps increasing.

  • This results in low Filebeat throughput and very slow log streaming through the pipeline (a possible harvester cap is sketched after this list).
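For context, here is a minimal sketch of how the number of concurrent harvesters could be capped per input. We have not applied this; the harvester_limit value of 8 is only an illustrative assumption (the default of 0 means unlimited):

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - <log path>
  # Assumption for illustration: cap concurrent harvesters so stale files
  # do not keep file handles open while the actively written files are starved.
  harvester_limit: 8
  # Same as our current setting: close handles on files idle for 5 minutes.
  close_inactive: 5m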

Filebeat configuration:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - <log path>
  ignore_older: 24h
  close_inactive: 5m
  close_removed: true
  clean_removed: true

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["<logstash1>","<logstash2>","<logstash3>"]
  loadbalance: true
  compression_level: 1
  bulk_max_size: 1600
  worker: 4
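For reference, we have not overridden Filebeat's internal memory queue, so publishing is bounded by the queue defaults. A sketch of the relevant settings follows; the values are our understanding of the 7.x defaults, not something we have set explicitly:

# Internal memory queue (not present in our filebeat.yml; the values below are
# assumed 7.x defaults, shown only to make the batching behaviour explicit)
queue.mem:
  events: 4096             # maximum events buffered in memory
  flush.min_events: 2048   # publish a batch once this many events are queued
  flush.timeout: 1s        # or after this timeout, whichever comes first

If these defaults apply, the queue holds at most 4096 in-flight events per Filebeat instance, which may be relevant alongside bulk_max_size: 1600 and worker: 4.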

With the above configuration we see very high lag. There are no errors from Logstash and no backpressure from Logstash; we have tested and verified that Logstash is running fine.

Refer to the Filebeat stats below:

"harvester": { 

  "closed": 0, 

  "files": { 

    "0535ba6f-8098-4214-83aa-ce330d5dedc8": { 

      "last_event_published_time": "2021-03-11T08:10:13.906Z", 

      "last_event_timestamp": "2021-03-11T08:10:13.906Z", 

      "name": "/storm/apache-storm-2.1.0/logs/workers-artifacts/topology-xxxx/6701/14_worker.log", 

      "read_offset": 1856738732, 

      "size": 2147579881, 

      "start_time": "2021-03-11T07:57:05.940Z" 

    } 

    .. 

    .. 

    .. 

     

  }, 

  "open_files": 176, 

  "running": 176, 

  "skipped": 0, 

  "started": 176 

}, 

“output": { 

      "events": { 

        "acked": 517952, 

        "active": 2048, 

        "batches": 546, 

        "dropped": 0, 

        "duplicates": 0, 

        "failed": 0, 

        "toomany": 0, 

        "total": 520000 

      }, 

      "read": { 

        "bytes": 4548, 

        "errors": 0 

      }, 

      "type": "logstash", 

      "write": { 

        "bytes": 6772441478, 

        "errors": 0 

      } 

    }, 

    "pipeline": { 

      "clients": 1, 

      "events": { 

        "active": 8021, 

        "dropped": 0, 

        "failed": 0, 

        "filtered": 1974106, 

        "published": 525916, 

        "retry": 3, 

        "total": 2500023 

      }, 

      "queue": { 

        "acked": 517896 

      } 

    } 

  } 
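For completeness, stats like the above can be collected from Filebeat's HTTP monitoring endpoint when it is enabled. This is only a sketch assuming http.enabled and the default port 5066; it may differ from how the snapshot above was actually captured:

# filebeat.yml — assumption for illustration: expose the local stats endpoint
http.enabled: true
http.host: localhost
http.port: 5066

The stats can then be fetched from http://localhost:5066/stats on the Filebeat host.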

We have tested with Filebeat 7.1 and 7.9, but the issue remains the same.

Please suggest how to fix this, as it is impacting our business.
