Filebeat log transfer latency
We have observed high latency (more than 2 hours) while Filebeat publishes logs to Logstash. We are using the following setup for log transfer:
Filebeat (11 nodes) -> Logstash (3 nodes) -> ES
- Filebeat runs on 11 nodes, each hosting a Storm application that generates logs at roughly 10 GB/hour per node.
- Filebeat reports more than 40 files being harvested at a time, even though only 3 or 4 log files are actively being written; the remaining files have not been updated for about 12 hours but are still being harvested because Filebeat's throughput is so low (see the input sketch after this list).
- Because Filebeat reads so slowly, the open file count keeps increasing.
- This results in low Filebeat throughput and very slow log streaming through the pipeline.
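For reference, harvester_limit is not set in our configuration, so the number of concurrent harvesters per input is unlimited. A minimal sketch of where such a cap would go is shown below; the values are illustrative only, not what we actually run:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - <log path>
  # Cap how many harvesters this input may run in parallel
  # (default 0 = unlimited). Illustrative value, not our setting.
  harvester_limit: 8
  # Release file handles for files that stop changing so the
  # open file count does not keep growing.
  close_inactive: 5m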
Filebeat configuration:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - <log path>
  ignore_older: 24h
  close_inactive: 5m
  close_removed: true
  clean_removed: true

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["<logstash1>", "<logstash2>", "<logstash3>"]
  loadbalance: true
  compression_level: 1
  bulk_max_size: 1600
  worker: 4
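The internal memory queue is not set in this configuration, so it runs with the defaults. For context, the relevant knobs look roughly like this; the values below are, to our understanding, the 7.x defaults rather than anything we have tuned:

# Internal memory queue (left at defaults in our setup; shown for reference only).
queue.mem:
  events: 4096             # maximum events buffered between inputs and the output
  flush.min_events: 2048   # minimum batch size handed to the output
  flush.timeout: 1s        # maximum wait before a smaller batch is flushed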
With the above configuration we are seeing very high lag. There are no errors from Logstash and no backpressure from Logstash; we have tested and verified that Logstash itself is running fine.
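For reference, the stats below are Filebeat's own monitoring metrics. One way to expose them is the local HTTP endpoint, which, as far as we understand, is enabled like this (5066 is the default port):

# Expose local monitoring stats at http://localhost:5066/stats
# (optional; shown only to indicate where the numbers below come from).
http.enabled: true
http.host: localhost
http.port: 5066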
Refer to the Filebeat stats below:
"harvester": {
"closed": 0,
"files": {
"0535ba6f-8098-4214-83aa-ce330d5dedc8": {
"last_event_published_time": "2021-03-11T08:10:13.906Z",
"last_event_timestamp": "2021-03-11T08:10:13.906Z",
"name": "/storm/apache-storm-2.1.0/logs/workers-artifacts/topology-xxxx/6701/14_worker.log",
"read_offset": 1856738732,
"size": 2147579881,
"start_time": "2021-03-11T07:57:05.940Z"
}
..
..
..
},
"open_files": 176,
"running": 176,
"skipped": 0,
"started": 176
},
“output": {
"events": {
"acked": 517952,
"active": 2048,
"batches": 546,
"dropped": 0,
"duplicates": 0,
"failed": 0,
"toomany": 0,
"total": 520000
},
"read": {
"bytes": 4548,
"errors": 0
},
"type": "logstash",
"write": {
"bytes": 6772441478,
"errors": 0
}
},
"pipeline": {
"clients": 1,
"events": {
"active": 8021,
"dropped": 0,
"failed": 0,
"filtered": 1974106,
"published": 525916,
"retry": 3,
"total": 2500023
},
"queue": {
"acked": 517896
}
}
}
We have tested with Filebeat 7.1 and 7.9, but the issue remains the same.
Please suggest how to fix this, as it is impacting our business.