To try Filebeat in production, I launched about 1,100 Filebeat instances on my production boxes. Each one monitors a couple of files on its box and sends data to 3 central Logstash servers, which push the data to an Elasticsearch cluster after some modifications and drops. Since the day I launched them, I have seen a pattern: every day some of the Filebeat instances drop their connections to Logstash. For example, on the first day 1100 Filebeats were sending data to Logstash, on the second day 1070, on the third day 1030, and within a month the number has fallen to about half. All my Filebeats run on FreeBSD boxes, and I got the Filebeat binary from https://beats-nightlies.s3.amazonaws.com/index.html?prefix=jenkins/filebeat/
I went to the affected boxes and found the following errors in filebeat.log:
2016-07-18T03:31:00-07:00 ERR SSL client failed to connect with: read tcp 126.96.36.199:48835->188.8.131.52:443: read: connection reset by peer
2016-07-18T03:32:13-07:00 ERR SSL client failed to connect with: read tcp 184.108.40.206:59477->220.127.116.11:443: read: connection reset by peer
2016-07-18T03:33:43-07:00 ERR SSL client failed to connect with: read tcp 18.104.22.168:56951->22.214.171.124:443: i/o timeout
2016-07-18T03:35:14-07:00 ERR SSL client failed to connect with: read tcp 126.96.36.199:15389->188.8.131.52:443: i/o timeout
2016-07-18T03:36:44-07:00 ERR SSL client failed to connect with: read tcp 184.108.40.206:52429->220.127.116.11:443: i/o timeout.
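To see whether the failures on a box are mostly resets or mostly timeouts, the error lines can be tallied by reason. The following is only a rough sketch in Python, assuming the log lines have exactly the shape shown above; `tally_errors` and `LINE_RE` are names made up for this illustration and are not part of Filebeat.

```python
import re
from collections import Counter

# Assumed line shape (taken from the excerpt above):
#   <timestamp> ERR SSL client failed to connect with: read tcp <src>-><dst>: <reason>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) ERR SSL client failed to connect with: "
    r"read tcp (?P<src>\S+)->(?P<dst>\S+?): (?P<reason>.+?)\.?\s*$"
)


def tally_errors(lines):
    """Count SSL connect failures by reason (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # A trailing period, if any, is stripped by the pattern.
            counts[match.group("reason")] += 1
    return counts
```

It could be run over the log with something like `tally_errors(open("/var/log/filebeat/filebeat.log"))` and the counts compared between boxes that still ship logs and boxes that stopped.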
This cannot be an SSL or firewall issue, because until a couple of days earlier this particular Filebeat was sending logs to Logstash, and when I tried connecting with curl and telnet I could connect to Logstash. It seems to me that it is somehow a Filebeat issue. Below is a sample of my Filebeat configuration file, which is adjusted for each instance.
filebeat:
  prospectors:
    -
      paths:
        - /sc/log/setcainfo.log
      fields:
        hostip: "localipaddress"
      document_type: setcainfo_Etc/GMT:timeadjust

    -
      paths:
        - /var/log/messages
      fields:
        hostip: "localipaddress"
      document_type: messages_Etc/GMT:timeadjust

    -
      paths:
        - /root/.bash_history
      fields:
        hostip: "localipaddress"
      document_type: bashhistory_Etc/GMT:timeadjust


output:
  logstash:
    hosts: ["18.104.22.168:443"]
    tls:
      certificate_authorities: ["/sc/filebeat/logstash-forwarder.crt"]

logging:
  to_syslog: false
  to_files: true

  files:
    path: /var/log/filebeat
    name: filebeat.log
    rotateeverybytes: 10485760
    keepfiles: 7
  level: debug
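A telnet connection only shows that the TCP port is reachable, so a check that also completes the TLS handshake against the same CA file may be closer to what Filebeat does when it connects. Below is a minimal sketch in Python using only the standard library. `probe_tls` is a hypothetical helper name, and turning off hostname checking is an assumption on my part (the certificate may not carry a name matching the host being probed); it is not a statement about how Filebeat validates the certificate.

```python
import socket
import ssl


def probe_tls(host, port, ca_file=None, timeout=10.0):
    """Try a TCP connect plus TLS handshake; return (ok, detail).

    Hypothetical helper for illustration only.
    """
    try:
        # Trust the given CA file, or the system defaults if ca_file is None.
        ctx = ssl.create_default_context(cafile=ca_file)
        # Assumption: skip hostname matching, but still verify the chain.
        ctx.check_hostname = False
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw) as tls:
                # Handshake finished; report the negotiated protocol version.
                return True, tls.version()
    except OSError as exc:
        # Covers refused connections, resets, timeouts, a missing CA file,
        # and ssl.SSLError (a subclass of OSError).
        return False, f"{type(exc).__name__}: {exc}"
```

With the values from the configuration above, the call would be `probe_tls("18.104.22.168", 443, "/sc/filebeat/logstash-forwarder.crt")`. Running it in a loop from one of the affected boxes would show whether the reset or timeout can be reproduced outside Filebeat.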