Parsing Python Logs - Use Logstash or not

My current logging infrastructure uses just Filebeat and Elasticsearch.
I would like to improve the quality of the logs by parsing individual fields out of each message.

Looking for a recommendation on the best practice for doing this:

  1. Add an extra Logstash layer.
  2. Format the logs with https://pypi.python.org/pypi/logstash_formatter, but still send them directly to Elasticsearch. Because the output is already JSON, Logstash does not seem necessary (a minimal sketch follows this list).
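
To be concrete about option 2, this is roughly what I have in mind (the class name is taken from the logstash_formatter package as I understand its docs, and the log path is just an example):

```python
import logging

# Assumes the logstash_formatter package from PyPI; the class name
# LogstashFormatterV1 may differ between package versions.
from logstash_formatter import LogstashFormatterV1

handler = logging.FileHandler("/var/log/myapp/app.json")  # example path
handler.setFormatter(LogstashFormatterV1())

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

# Each record becomes a single JSON line, including any extra fields,
# which Filebeat (or anything else) can then ship as-is.
logger.info("user logged in", extra={"user_id": 42, "duration_ms": 17})
```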

In particular, one problem I am struggling with is multiline output (e.g. tracebacks).
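
To show what I mean, a single logging call with a traceback ends up as several physical lines in the file (the file name is just an example):

```python
import logging

logging.basicConfig(filename="app.log", level=logging.INFO)
log = logging.getLogger("myapp")

try:
    1 / 0
except ZeroDivisionError:
    # log.exception appends the full traceback, so this one event spans
    # several lines in app.log; a shipper that reads line by line then
    # turns each traceback line into its own document.
    log.exception("division failed")
```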

Thanks for the help.

The multiline problem will be fixed in one of the next Filebeat versions. You can follow the ticket here: https://github.com/elastic/filebeat/issues/89

Filebeat currently does not have a JSON input; it reads log files line by line, so you will still end up with plain strings. If you want to use grok, for example to extract the timestamp, adding Logstash is your best option.
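
To illustrate the kind of field extraction a grok filter would do on the Logstash side, here is the equivalent parse written out in Python (the log format, pattern, and field names are made up for the example):

```python
import re

# A raw line as Filebeat would ship it (format is hypothetical).
line = "2015-10-07 12:00:01,123 ERROR myapp Something broke"

# Roughly what a grok pattern such as
#   %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{WORD:logger} %{GREEDYDATA:message}
# would extract in a Logstash filter.
pattern = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>\w+) (?P<logger>\S+) (?P<message>.*)"
)

match = pattern.match(line)
if match:
    print(match.groupdict())
    # {'timestamp': '2015-10-07 12:00:01,123', 'level': 'ERROR',
    #  'logger': 'myapp', 'message': 'Something broke'}
```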

About logging JSON from Python directly: if you send it straight to Elasticsearch without Filebeat, this should work, but it means you have to deal with multiple servers and with failed sends (e.g. when Elasticsearch is unreachable) yourself.
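
To make that trade-off concrete, here is a rough sketch of a handler that posts each record straight to Elasticsearch over HTTP; the URL, index name, and the use of the requests library are all assumptions for the example, and the interesting part is what happens when a send fails:

```python
import logging
import requests  # assumed HTTP client, not part of the stdlib

class ElasticsearchHandler(logging.Handler):
    """Hypothetical handler that indexes each record straight into ES."""

    def __init__(self, url="http://localhost:9200/python-logs/_doc"):
        # URL and endpoint path are made up; the path also depends on
        # your Elasticsearch version.
        super().__init__()
        self.url = url

    def emit(self, record):
        doc = {
            "@timestamp": record.created,   # epoch seconds; map accordingly
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        try:
            requests.post(self.url, json=doc, timeout=2)
        except requests.RequestException:
            # This is what Filebeat normally absorbs for you: if
            # Elasticsearch is unreachable, the record is lost unless
            # you add buffering/retry logic here.
            self.handleError(record)

logging.getLogger("myapp").addHandler(ElasticsearchHandler())
```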