Parsing Python Logs - Use Logstash or not

My current logging infrastructure uses just Filebeat and Elasticsearch.
I would like to improve the quality of the logs by parsing out individual fields.

Looking for recommendations on the best practice to do this.

  1. Add an extra Logstash layer.
  2. Format the logs as JSON but still send them directly to Elasticsearch. Because the output is JSON, it does not seem as if Logstash is necessary.
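For option 2, something like this standard-library-only formatter would emit one JSON object per log line (the field names here are just illustrative, not a fixed schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Fold the traceback into one field so the log line stays single-line.
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("user %s logged in", "alice")
```

Folding tracebacks into one JSON field would also sidestep part of the multiline problem, since each record stays on a single line.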

In particular, one problem I am struggling with is multiline output.

Thanks for the help.

The multiline problem will be fixed in one of the next versions. Follow the ticket here:

Filebeat currently does not have a JSON input; it reads log files line by line, so you will still end up with strings. If you want to use grok, for example to extract the timestamp, adding Logstash is your best option.
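As a sketch of what that Logstash layer would do, a grok filter could split each line into fields (the pattern below assumes a hypothetical log format and would need adjusting to your actual one):

```
filter {
  grok {
    # Hypothetical pattern for lines like:
    # 2016-01-01 12:00:00,123 INFO mymodule: something happened
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{DATA:logger}: %{GREEDYDATA:msg}" }
  }
  date {
    # Use the parsed timestamp as the event time instead of ingest time.
    match => [ "timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss,SSS" ]
  }
}
```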

About the Python-to-JSON logging: if you send the logs directly to Elasticsearch without Filebeat, this should work, but it means you have to handle multiple servers and failed sends yourself.
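To give an idea of what "handling failed sends yourself" involves, here is a minimal retry sketch; the endpoint URL is a placeholder, and the retry wrapper is exactly the bookkeeping Filebeat would otherwise do for you:

```python
import json
import time
import urllib.request

def post_doc(url, doc):
    """POST one document to an Elasticsearch index endpoint,
    e.g. http://localhost:9200/logs/event (placeholder URL)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

def send_with_retry(send, doc, retries=3, backoff=1.0):
    """Call `send(doc)`, retrying with exponential backoff on I/O errors."""
    for attempt in range(retries):
        try:
            return send(doc)
        except OSError:
            if attempt == retries - 1:
                raise  # out of attempts; the document is lost unless buffered elsewhere
            time.sleep(backoff * (2 ** attempt))
```

Even this leaves out buffering across restarts and load-balancing over several Elasticsearch nodes, which is why keeping Filebeat (plus Logstash for parsing) is usually the simpler path.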