Hello
I am currently using the ELK stack v5.5 and I have a fairly big issue:
Every time I get a TCP write or read error, I have to restart Filebeat because it stops sending messages.
2017-07-25T11:14:46+02:00 ERR Failed to publish events caused by: write tcp 163.172.15.176:53430->163.172.99.57:5000: write: connection reset by peer
2017-07-25T11:14:46+02:00 ERR Failed to publish events caused by: write tcp 163.172.15.176:53428->163.172.99.57:5000: write: connection reset by peer
2017-07-25T11:14:47+02:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=9 libbeat.logstash.publish.read_bytes=350 libbeat.logstash.publish.write_bytes=5351704 libbeat.logstash.publish.write_errors=4 libbeat.logstash.published_and_acked_events=30181 libbeat.logstash.published_but_not_acked_events=12220 libbeat.publisher.published_events=30901 publish.events=30181 registrar.states.update=30181 registrar.writes=5
2017-07-25T11:15:17+02:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=4408 libbeat.logstash.publish.write_bytes=1727247 libbeat.logstash.published_and_acked_events=24497 libbeat.publisher.published_events=13739 publish.events=12277 registrar.states.update=12277 registrar.writes=2
2017-07-25T11:15:47+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:16:17+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:16:47+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:17:17+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:17:47+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:18:17+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:18:47+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:19:17+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:19:47+02:00 INFO No non-zero metrics in the last 30s
2017-07-25T11:20:17+02:00 INFO No non-zero metrics in the last 30s
When I restart Filebeat, it pushes the missing messages as well as the new ones.
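To make the symptom concrete, here is a minimal sketch of how the stall pattern in the log above could be detected automatically. It is only an illustration, not part of my setup: the function name `find_stall` and the `idle_threshold` parameter are hypothetical, and it assumes the log line format shown above (timestamp, level, message).

```python
import re

# Timestamp, level and message of a Filebeat 5.x log line, as in the excerpt above.
LINE_RE = re.compile(r"^(\S+)\s+(ERR|INFO|WARN|DBG)\s+(.*)$")


def find_stall(lines, idle_threshold=4):
    """Return the timestamp of the last publish error before a stall, or None.

    Assumption: a "stall" is a "Failed to publish events" error followed later
    by at least `idle_threshold` consecutive "No non-zero metrics" reports.
    """
    last_error_ts = None
    idle_run = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like Filebeat log lines
        ts, level, msg = match.groups()
        if level == "ERR" and msg.startswith("Failed to publish events"):
            last_error_ts = ts
            idle_run = 0
        elif msg.startswith("No non-zero metrics"):
            idle_run += 1
            if last_error_ts is not None and idle_run >= idle_threshold:
                return last_error_ts
        elif msg.startswith("Non-zero metrics"):
            idle_run = 0  # events are still flowing, so reset the idle counter
    return None
```

Note that this cannot tell a real stall from a host that simply has nothing to log after an earlier error, so the threshold would need tuning per platform.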
Here is my Filebeat config:
filebeat:
  name: "host7"
  spool_size: 16384
  prospectors:
    -
      paths:
        - /var/log/varnish/varnish.log
      input_type: log
      fields_under_root: true
      fields:
        tags: ['json', 'varnish']
        platform: boxes
      document_type: varnish-logs
      close_inactive: 5m
output.logstash:
  hosts: ["ls1:5000","ls2:5000"]
  loadbalance: true
  pipelining: 5
  worker: 2
  bulk_max_size: 8192
  ssl:
    certificate_authorities: ["/etc/filebeat/wildcard.ls.dev.logstash.crt"]
I have to send the logs to a remote datacenter. My Logstash usually receives 12k messages/s, and I have the same problem on 5 different platforms (especially the ones that don't send many messages).
I started using Filebeat extensively (and started load balancing) when I migrated from 5.4 to 5.5, so I am not sure whether the problem appeared with the 5.5 migration or whether it would also occur in 5.4.
Thanks!