Elapsed filter with multiple workers: does it work or not?

Hi,

I tried searching but couldn't find a definitive answer.

The documentation for the aggregate filter clearly states:
"You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise events may be processed out of sequence and unexpected results will occur."

But the same is not mentioned for the elapsed filter. Fundamentally, I would expect the same limitation to apply to the elapsed filter as well.

So is my assumption correct? (Q1)
If so, how can we reconcile the conflicting goals of faster log ingestion and use of the elapsed filter plugin? (Q2)
Also, does using multiple threads in Filebeat matter at all? (Q3)
Right now I have set "pipeline.workers: 8" in logstash.yml. The help says this defaults to the number of cores.
I haven't tested the elapsed filter with 8 workers, but with the default value of 4 (the number of cores on my machine) it did seem to work fine.

-Thanks
Nikhil

I did some testing and found that with 8 worker threads in Logstash and 3 load-balanced Logstash nodes, I miss about 15-20% of events.
I am now checking whether I can do something with the data that is already sitting in ES.

-Nikhil

Perhaps the best option for me is to route all events that need multiline, aggregate, or elapsed filter processing to a separate Logstash instance that runs only 1 worker thread. Since these will be a fraction of the overall events, I should be able to meet both of my objectives.

Comments?

-Nikhil

You should do multiline handling in Filebeat, i.e. as close to the source as possible.

We are moving more and more towards stateless plugins (no local state). We are also building a shared-state solution that stores the state in Elasticsearch, so that plugins can share state across LS instances as well as worker threads. This solution will not be ready for some time.

If possible, you should try using Elasticsearch to do the aggregations and elapsed-time calculations: reread the events from Elasticsearch with a suitable query and the elasticsearch input.
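As a rough illustration of why rereading stored events avoids the ordering problem, here is a minimal Python sketch that pairs start and end events by ID after the fact. It assumes the documents have already been fetched from Elasticsearch by some means; the field names `task_id`, `tag`, and `ts` and the function name `elapsed_by_id` are hypothetical and not taken from any plugin.

```python
def elapsed_by_id(docs, id_field="task_id"):
    """Pair BEGIN/END documents by ID, regardless of the order they arrive in.

    Each doc is a dict with hypothetical keys: id_field, "tag", and "ts"
    (timestamp in epoch seconds). Returns {id: elapsed_seconds} for every ID
    that has both a BEGIN and an END document.
    """
    starts, ends = {}, {}
    for doc in docs:
        key = doc[id_field]
        if doc["tag"] == "BEGIN":
            # Keep the earliest BEGIN seen for this ID.
            starts[key] = min(doc["ts"], starts.get(key, doc["ts"]))
        elif doc["tag"] == "END":
            # Keep the latest END seen for this ID.
            ends[key] = max(doc["ts"], ends.get(key, doc["ts"]))
    return {key: ends[key] - starts[key] for key in starts if key in ends}


# Documents deliberately listed out of order, as they might be after
# passing through several workers or nodes.
docs = [
    {"task_id": "a", "tag": "END", "ts": 105.0},
    {"task_id": "b", "tag": "BEGIN", "ts": 200.0},
    {"task_id": "a", "tag": "BEGIN", "ts": 100.0},
    {"task_id": "c", "tag": "END", "ts": 300.0},  # no BEGIN stored for "c"
    {"task_id": "b", "tag": "END", "ts": 212.0},
]
result = elapsed_by_id(docs)
```

Because all events are already stored, the pairing does not depend on which worker or node handled each event, or in what order.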

Thank you, Guy, for your response.

-Regards
Nikhil

Is this a suggestion to not use the elapsed filter?

I have noticed that my elapsed plugin works most of the time. However, when handling several thousand rapidly logged BEGIN/END tag pairs, it can miss 10% of the pairs and just tag the END event with elapsed_end_without_start.
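One way to picture how an END event can end up without its START is a toy Python simulation in which each node keeps its own private table of start events, as a stateful filter with no shared state would. This is only a sketch under stated assumptions: the round-robin distribution is a deliberately worst-case stand-in for load balancing, the function `run_elapsed` and the event tuples are hypothetical, and it does not model worker-thread reordering inside a single node or reproduce the percentages reported in this thread.

```python
def run_elapsed(events, node_count):
    """Send events round-robin to nodes that each keep private start state.

    events: list of (event_id, tag, ts) tuples, tag being "BEGIN" or "END".
    Returns (matched, orphaned): matched is a list of (event_id, elapsed),
    orphaned is a list of IDs whose END arrived at a node with no BEGIN.
    """
    node_state = [dict() for _ in range(node_count)]  # one table per node
    matched, orphaned = [], []
    for i, (event_id, tag, ts) in enumerate(events):
        starts = node_state[i % node_count]  # node that receives this event
        if tag == "BEGIN":
            starts[event_id] = ts
        elif tag == "END":
            if event_id in starts:
                matched.append((event_id, ts - starts.pop(event_id)))
            else:
                # Comparable to an event tagged elapsed_end_without_start.
                orphaned.append(event_id)
    return matched, orphaned


# Four BEGIN/END pairs, each END logged 2.5 seconds after its BEGIN.
events = []
for n in range(4):
    events.append((f"job-{n}", "BEGIN", 10.0 * n))
    events.append((f"job-{n}", "END", 10.0 * n + 2.5))

single_matched, single_orphaned = run_elapsed(events, node_count=1)
multi_matched, multi_orphaned = run_elapsed(events, node_count=3)
```

With one node every pair is matched; with three nodes and this distribution every END lands on a node that never saw its BEGIN. Real load balancing is less regular, so only some fraction of pairs would be affected.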

Same here, John. In my case about 15-20% of the events were missed. Given that limitation, each person has to decide whether that is acceptable or not.

Cheers
Nikhil


Reducing the worker threads to 1 seems to work for some people. I may try that option. Thanks for the response.
