Feed Logstash with gzipped multiline inputs

Hi All,

I'm implementing an ELK stack for indexing and analysing Java logs. At the moment it's only a proof of concept, and I cannot feed Logstash with plain-text log files: what I have are gzipped historical log files, and I want to index their content.

Given that I cannot combine multiple codecs (gzip_lines and multiline), what is the best way to index them? Should I pre-aggregate the lines and then feed the result to Logstash, or is there another way to reach my goal?

I've written a Python script that reads lines from the gzip files and feeds Logstash via the http input plugin, but I suspect it is not the best solution, judging by how long it takes to index the files.
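Roughly, the script groups lines into entries and posts them one by one, something along the lines of the sketch below. The endpoint URL and the entry-start regex are just placeholders for my real settings (it assumes the http input listens on localhost:8080 and that every entry starts with a timestamp):

```python
import gzip
import re
import requests

# Assumptions: Logstash http input listening on localhost:8080, and each new
# log entry starting with a timestamp such as "2017-01-01 12:00:00,123".
LOGSTASH_URL = "http://localhost:8080"
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def post_entry(entry_lines):
    """Send one (possibly multiline) log entry to Logstash as a JSON document."""
    if entry_lines:
        requests.post(LOGSTASH_URL, json={"message": "\n".join(entry_lines)})

def feed(path):
    current = []
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if ENTRY_START.match(line):     # a new entry begins here
                post_entry(current)
                current = [line]
            else:                           # continuation line, e.g. a stack trace
                current.append(line)
    post_entry(current)                     # flush the last entry

feed("app.log.2016-12-31.gz")
```

Each entry becomes a separate HTTP request, which is probably a large part of the slowness.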

Thank you in advance for suggestions.

I suggest that you use your Python script to unzip the files into another folder and have Filebeat read the unzipped files and send them to the beats input. Filebeat has support for multiline.
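The unzipping part can stay very small; something like the sketch below, where both paths are just placeholders. Filebeat then tails the uncompressed files, and its multiline settings join stack-trace lines into single events before they reach the beats input.

```python
import gzip
import shutil
from pathlib import Path

# Placeholder paths: the archive folder and the folder Filebeat is watching.
SRC_DIR = Path("/var/log/archive")
DST_DIR = Path("/var/log/unzipped")

DST_DIR.mkdir(parents=True, exist_ok=True)
for gz_path in SRC_DIR.glob("*.gz"):
    out_path = DST_DIR / gz_path.stem           # app.log.gz -> app.log
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)            # stream-decompress, no full file in memory
```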

Thank you very much for the suggestion.

Meanwhile I've implemented a Java program that extracts log entries (taking multiline entries into account as well) and pushes them into a Redis list. Logstash then reads those entries from Redis and indexes them into Elasticsearch.
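The push side looks roughly like the Python sketch below (my actual program is in Java); the grouping loop is the same as in my earlier script, only the output changes from an HTTP POST to an RPUSH. The key name is a placeholder and has to match whatever the Logstash redis input (data_type => "list") is configured to read:

```python
import json
import redis

# Assumption: a local Redis and a list key ("java_logs") that the Logstash
# redis input reads with data_type => "list"; both names are placeholders.
r = redis.Redis(host="localhost", port=6379)

def push_entry(entry_lines):
    """Push one complete (possibly multiline) log entry onto the Redis list."""
    if entry_lines:
        r.rpush("java_logs", json.dumps({"message": "\n".join(entry_lines)}))
```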

It works, but I haven't measured performance yet.

This way I know when all lines of a gzip file have been processed. With the solution you suggested, is there a way to know when a file has been fully processed? The goal is to remove the uncompressed file as soon as its content has been loaded somewhere and is ready to be indexed.

Your solution is fine.

If you need more performance you can push alternate files to two Redis instances and use two Logstash instances, one reading from each Redis, both outputting to the same ES index.
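A rough sketch of the alternating part, with placeholder hosts, ports and key name (the multiline grouping from your loader is left out here for brevity):

```python
import gzip
import itertools
import json
import redis

# Placeholder connection details: two Redis instances, each read by its own
# Logstash instance (redis input, data_type => "list") writing to the same ES index.
REDIS_NODES = [
    redis.Redis(host="localhost", port=6379),
    redis.Redis(host="localhost", port=6380),
]
KEY = "java_logs"

def push_file(path, node):
    """Push every line of one gzipped file onto the given Redis instance."""
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            node.rpush(KEY, json.dumps({"message": line.rstrip("\n")}))

gz_files = ["app.log.2016-12-29.gz", "app.log.2016-12-30.gz", "app.log.2016-12-31.gz"]

# Alternate whole files between the two instances so both pipelines stay busy.
for node, path in zip(itertools.cycle(REDIS_NODES), gz_files):
    push_file(path, node)
```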

I will try with both one and two Redis instances to evaluate the performance. If the topic is still open, I'll post the benchmark results here.

Thank you for your help.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.