Multiline codec in filter

Can I use a multiline codec in the filter?

I have more than 10k files on an S3 bucket, containing logs in 3 different patterns, that will be fetched using an input plugin.

One of the log patterns requires a multiline codec before the logs can be parsed. I want to use it like this:

filter {
  if [message] =~ /this regex/ {
    # use multiline codec here ...
    # parse the logs
  }
  else if [message] =~ /and this regex/ {
    # do this and that
  }
}

I'm not able to use the multiline codec in a filter, but when I use it in the input it works fine. Please guide me.

No. Codecs are called by inputs, not filters.

OK, but I have to get the files from an S3 bucket. Can I use the multiline codec on the S3 input too, or is it specific to the file input?

Maybe. You would have to try it. There is a comment in the code that "We are making an assumption concerning cloudfront log format, the user will use the plain or the line codec", which suggests you cannot, but also a comment "ensure any stateful codecs (such as multi-line ) are flushed" which suggests you can.
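A minimal sketch of what such a test could look like. The bucket name, region, prefix, and multiline pattern are all hypothetical placeholders, and whether the s3 input handles the codec correctly is exactly what would need to be verified:

```
input {
  s3 {
    bucket => "my-log-bucket"     # hypothetical bucket name
    region => "us-east-1"         # hypothetical region
    prefix => "app-logs/"         # hypothetical key prefix
    # Assumption: every log record starts with an ISO8601 timestamp;
    # lines that do not are appended to the previous line.
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
output {
  # Print events so you can check whether lines were joined as expected
  stdout { codec => rubydebug }
}
```

Running this against a small prefix containing only multiline files should show quickly whether the events come out joined.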

Sir, we have somewhere between 10K and 50K+ files in S3 buckets, and some log patterns require a multiline codec to assemble complete log events before any filter is applied. To do that I would first have to separate the multiline logs from the single-line logs in S3, which is nearly impossible for me at this time. Is there a way to fetch all the log types together and use something in the filter to separate one from the other?

Something like: if [message] =~ /this pattern/ { do this } else if [message] =~ /next pattern/ { do that }

thank you

If you need to make the use of a codec conditional on the contents, you could use a tcp output and a tcp input to connect two pipelines and conditionally send the events to the second pipeline, which could use a multiline codec on its tcp input.

You would need event order to be preserved, so you have to set pipeline.workers to 1 and possibly set pipeline.ordered.

Sir, please correct me here:

taking_input.yml

input { fetching 50k+ logs files from S3 buckets}
filter {}
output { pipeline { send_to => "parsing_listener" } }

parse.yml

input {pipeline { address => "parsing_listener"} }
filter {
  if [message] =~ /regex to detect single line data/ {
    mutate { add_tag => ["single_line_data"] }
  }
  else if [message] =~ /regex to detect multiline data/ {
    mutate { add_tag => ["multiline_data"] }
  }
}
output {
  if "multiline_data" in [tags] {
    pipeline { send_to => "multiline_listener" }
  }
}

multiline.yml

input {
  pipeline { address => "multiline_listener" }
  codec => multiline { blah blah .. }
}
filter {will parse here}
output {will send output to the elasticsearch etc...}

My doubt here is about the multiline.yml file. I'm adding two things in the input: the first is the input listener and the second is the codec. I don't know whether it will work or not.

Pipeline-to-pipeline links ignore the codec, since they communicate using the pipeline bus. That is why I said to use tcp.

If you could please add an example I would be grateful.

There is an example here.
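For reference, a minimal sketch of the tcp-linked layout described earlier in the thread. The file names, bucket, port, Elasticsearch host, and multiline pattern are hypothetical placeholders, not values from this discussion. Both pipelines should run with pipeline.workers set to 1 so that line order is preserved.

```
# --- upstream.conf (hypothetical name): fetch everything, route by content ---
input {
  s3 {
    bucket => "my-log-bucket"     # hypothetical bucket name
    region => "us-east-1"         # hypothetical region
  }
}
output {
  # Note: this condition must also match the continuation lines of a
  # multiline record, otherwise they never reach the codec downstream.
  if [message] =~ /regex to detect multiline data/ {
    tcp {
      host => "127.0.0.1"
      port => 9900                # hypothetical port
      # Send only the raw line, not the JSON-encoded event
      codec => line { format => "%{message}" }
    }
  } else {
    # Single-line events can be shipped directly
    elasticsearch { hosts => ["http://localhost:9200"] }
  }
}

# --- multiline.conf (hypothetical name): reassemble, then parse ---
input {
  tcp {
    port => 9900
    # Assumption: each record starts with an ISO8601 timestamp
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
filter {
  # parse the reassembled events here
}
output {
  elasticsearch { hosts => ["http://localhost:9200"] }
}
```

The codec works here because it sits on a tcp input, which decodes incoming bytes through its codec, unlike the pipeline input, which receives already-formed events over the pipeline bus.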
