We are currently using GELF to ship logs from Docker containers to RabbitMQ via Logstash. Messages are then read from RabbitMQ into Elasticsearch by another Logstash instance.
I am worried that Logstash/GELF may drop messages if it loses its connection to the RabbitMQ cluster. Would it be better to configure Docker to write logs to files and then read those files with Beats? Or are there other ways to guarantee message safety in case of a sudden network failure?
Also, Beats doesn't support RabbitMQ directly, so we would have to switch to Kafka or Redis. But if I'm already storing messages in files, couldn't I just send them straight to Elasticsearch with Beats?
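Something like this is what I have in mind for the Docker side (a rough sketch; the service name, image and rotation limits are just placeholders):

```yaml
# Switch the service from the gelf log driver to json-file so the logs
# land on the host's disk, where Beats can read them.
services:
  app:
    image: my-app:latest        # placeholder image
    logging:
      driver: json-file
      options:
        max-size: "50m"         # rotate so the log files don't grow unbounded
        max-file: "5"
```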
I had the same concerns you describe here and ended up combining Docker's native JSON log storage with a log shipper deployed on each Docker host that pushed entries directly to Elasticsearch. This worked quite well and had the advantage that it could even pull in logs retroactively. I would recommend taking a look at Filebeat's decode_json_fields processor, which should be able to parse Docker's JSON objects. You could also perform that processing in an Elasticsearch ingest pipeline instead. See this example of how to configure Filebeat for that.
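As a minimal sketch (paths, hosts and options are assumptions; the exact input type and settings depend on your Filebeat version), the Filebeat side could look something like this:

```yaml
# Read Docker's json-file logs straight from disk and ship them to Elasticsearch.
filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*-json.log

processors:
  - decode_json_fields:
      fields: ["message"]   # the json-file driver writes one JSON object per line
      target: ""            # merge the decoded fields (log, stream, time) into the event root
      overwrite_keys: true

output.elasticsearch:
  hosts: ["http://localhost:9200"]   # assumption: a local Elasticsearch endpoint
```

Because Filebeat keeps its own registry of read offsets and retries on output errors, a temporary Elasticsearch outage just pauses shipping rather than losing the entries, which addresses the message-safety concern in the question.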