How can I create a Logstash conf filter with different patterns for different lines of an input file?

Stella_Martin · October 3, 2015, 10:26pm

I have the following file format:

HeadingString1<\tab>HeadingString2<\tab>......<\tab>HeadingStringN
Col11_1<\tab>Col11_2<\tab>Col21_1<\tab>Col21_2<\tab>...<\tab>ColN1_1<\tab>ColN1_2
Col12_1<\tab>Col12_2<\tab>Col22_1<\tab>Col22_2<\tab>...<\tab>ColN2_1<\tab>ColN2_2

............

I want the data to be indexed as follows:

HeadingString1 { Col11_1: Col11_2, Col12_1: Col12_2, ... }
HeadingString2 { Col21_1: Col21_2, Col22_1: Col22_2, ... }
....
HeadingStringN { ColN1_1: ColN1_2, ColN2_1: ColN2_2, ... }

Note: <\tab> means \t (tab) character

warkolm · October 4, 2015, 12:15am

What have you tried?

Stella_Martin · October 4, 2015, 2:24am

I have tried to use the CSV filter as follows

filter {
  csv {
    columns => ["HeadingString1", "HeadingString2", "HeadingString3"]
    separator => " "
  }
}

but I don't have any idea how to get the output I described.

magnusbaeck · October 4, 2015, 4:13pm

Logstash is not a very good fit for processing this kind of data. You could do it with a custom plugin or maybe with a complicated ruby filter, but I don't think it's worth it. You'll have to read the whole file in one swoop and the file input isn't built for that use case. I suggest you write a custom script. It's probably 10–15 lines of e.g. Python or Perl.
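As a rough illustration of the kind of script meant here, the following Python sketch reads the whole file and pivots it into one dictionary per heading. It assumes the layout described in the question (a tab-separated header row, then rows holding one key/value pair per heading); the `pivot` function name and the sample data are made up for the example, and sending the result to Elasticsearch is left out.

```python
import csv
import io
import json


def pivot(lines):
    """Turn a header row plus key/value-pair rows into one dict per heading."""
    reader = csv.reader(lines, delimiter="\t")
    headings = next(reader)
    result = {heading: {} for heading in headings}
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Columns 2*i and 2*i+1 hold the key and the value for heading i.
        for i, heading in enumerate(headings):
            key, value = row[2 * i], row[2 * i + 1]
            result[heading][key] = value
    return result


# Hypothetical sample data: two headings, two data rows.
sample = (
    "H1\tH2\n"
    "a\t1\tb\t2\n"
    "c\t3\td\t4\n"
)
pivoted = pivot(io.StringIO(sample))
print(json.dumps(pivoted, indent=2))
```

For a real file, `pivot(open(path, newline=""))` would take the place of the `io.StringIO` sample. Each entry in the result then corresponds to one document to index.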

Stella_Martin · October 4, 2015, 6:58pm

Thanks Magnus,

I think I will go with the scripting approach as you suggested.

Just for future reference, I was trying to use Logstash for the following reasons:

scale -> I can just add another instance of Logstash to scale out, but is there a simple way to run the script in a distributed environment?
streaming data -> Logstash handles logs being written to a file really well, but with a script I will have to develop my own way of keeping track of how much data has been read.

Thanks again.

magnusbaeck · October 4, 2015, 8:10pm

That's too general a question given the information you've provided, but if you can just add additional Logstash instances to read a particular set of files (without conflicts between instances trying to read the same files), I don't see why it would be impossible to do the same thing with a separate script.
Maybe I'm misunderstanding your data format, but how are you supposed to support streamed data when the contents of the very first emitted message (corresponding to the line beginning with HeadingString1 in your example) contain Col1N_1 and Col1N_2? You basically want to pivot a table. That's doable for reasonably small N and when the data can be read in one chunk, but for streamed data that's no longer fun.
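To make the streaming difficulty concrete, here is a minimal Python sketch of one possible way to follow a growing file: remember a byte offset and fold each newly appended row into per-heading dictionaries kept in memory. The `fold_new_rows` function, the shape of the `state` dictionary, and the sample data are all assumptions for illustration. Note that every new row changes every heading's document, so each one would have to be re-indexed after each update.

```python
import os
import tempfile


def fold_new_rows(path, state):
    """Read rows appended since the last call and merge them into state."""
    with open(path, "rb") as f:
        f.seek(state["offset"])
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; retry on next call
            state["offset"] += len(raw)
            fields = raw.decode("utf-8").rstrip("\r\n").split("\t")
            if fields == [""]:
                continue  # blank line
            if state["headings"] is None:
                # The first complete line is the header row.
                state["headings"] = fields
                state["docs"] = {heading: {} for heading in fields}
                continue
            # Columns 2*i and 2*i+1 hold the key and the value for heading i.
            for i, heading in enumerate(state["headings"]):
                state["docs"][heading][fields[2 * i]] = fields[2 * i + 1]
    return state


# Hypothetical demonstration: write a header and one row, then append another.
state = {"offset": 0, "headings": None, "docs": {}}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.tsv")
    with open(path, "w", newline="") as f:
        f.write("H1\tH2\na\t1\tb\t2\n")
    fold_new_rows(path, state)
    with open(path, "a", newline="") as f:
        f.write("c\t3\td\t4\n")
    fold_new_rows(path, state)
print(state["docs"])
```

The state lives only in memory here; a real script would also need to persist the offset and the accumulated documents across restarts, which is part of what makes the streamed case unpleasant.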
