Hey All, very weird issue. Hoping for some guidance/assistance here.
Issue: Filebeat 5.1.1 running on CentOS Linux release 7.3.1611 (Core). The server lost power when its host failed, so the shutdown was ungraceful; the VM runs off a remote QNAP via iSCSI. The VM recovers fine but Filebeat doesn't start. Error message below:
2017-02-05T19:21:25+13:00 INFO Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2017-02-05T19:21:25+13:00 INFO Setup Beat: filebeat; Version: 5.1.1
2017-02-05T19:21:25+13:00 INFO Max Retries set to: 3
2017-02-05T19:21:25+13:00 INFO Activated logstash as output plugin.
2017-02-05T19:21:25+13:00 INFO Publisher name: mon.testdomain.com
2017-02-05T19:21:25+13:00 INFO Flush Interval set to: 1s
2017-02-05T19:21:25+13:00 INFO Max Bulk Size set to: 2048
2017-02-05T19:21:25+13:00 INFO filebeat start running.
2017-02-05T19:21:25+13:00 INFO Registry file set to: /var/lib/filebeat/registry
2017-02-05T19:21:25+13:00 INFO Loading registrar data from /var/lib/filebeat/registry
2017-02-05T19:21:25+13:00 INFO Total non-zero values:
2017-02-05T19:21:25+13:00 INFO Uptime: 9.9703ms
2017-02-05T19:21:25+13:00 INFO filebeat stopped.
2017-02-05T19:21:25+13:00 CRIT Exiting: Could not start registrar: Error loading state: Error decoding states: EOF
/var/lib/filebeat/registry is empty after the power failure and filebeat does not recover.
systemctl start filebeat yields the same error.
I'm assuming Filebeat fails to start because the registry file is empty, hence the EOF error when it tries to decode the states?
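To sanity-check that assumption, here is a minimal Python sketch (not Filebeat's own code) that tests whether a registry file is empty or unparseable and, if so, moves it aside. The function names are made up for illustration, and it assumes the 5.x registry is a JSON document, so a zero-byte file can never decode.

```python
import json
import os


def registry_looks_corrupt(path):
    """Return True if the file is empty or does not parse as JSON."""
    if os.path.getsize(path) == 0:
        return True
    try:
        with open(path, "r", encoding="utf-8") as fh:
            json.load(fh)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return True
    return False


def quarantine_registry(path):
    """Move a corrupt registry aside.

    Returns the backup path, or None if the file parsed fine and was left alone.
    """
    if not registry_looks_corrupt(path):
        return None
    backup = path + ".corrupt"  # hypothetical naming, purely for illustration
    os.replace(path, backup)
    return backup
```

The idea would be to stop the service, call quarantine_registry("/var/lib/filebeat/registry"), then start it again. This assumes Filebeat starts with fresh state when no registry file exists, which would mean it re-reads the log files from the beginning and may ship duplicates.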
Question: How do you remediate this issue and ensure that the registry file stays intact in the event of an ungraceful failure? Is this a bug in the Filebeat agent? Is it worth raising a ticket on GitHub? Both servers that have Filebeat enabled experienced the same failure, and neither agent starts at boot or after a manual start.
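On the "stays intact" part: my understanding is that the usual way to make a state file survive a crash is to write to a temp file, fsync it, then rename it over the old one. Below is a minimal Python sketch of that pattern. It is not Filebeat's implementation (I haven't checked how the registrar actually writes), and the function name and the example state fields are hypothetical.

```python
import json
import os
import tempfile


def write_state_atomically(path, states):
    """Replace `path` with the JSON-encoded `states` without ever leaving it empty."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".registry.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(states, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # atomic swap on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is durable (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


# Illustrative call with made-up state entries:
# write_state_atomically("/tmp/registry", [{"source": "/var/log/messages", "offset": 1024}])
```

With this approach a crash should leave either the old file or the new one on disk, never a zero-byte file, provided the storage layer (here the iSCSI-backed disk) honours fsync.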
Thoughts appreciated, no idea where to look next.
Cheers
Andy