How to use aggregate filter with multiple workers

Manoj_Hettiarachchi · November 21, 2018, 4:07am

I am using the Logstash aggregate filter to aggregate two log lines with the same "uuid".
However, the documentation says: "You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise events may be processed out of sequence and unexpected results will occur."

Since my system has considerable traffic, I am using the default number of workers (the "number of the host’s CPU cores").

Because of this, I found that most of the logs were not aggregated properly.

Is there an alternative way to perform the aggregation while keeping multiple workers?

Please advise.

Christian_Dahlqvist · November 21, 2018, 6:54am

The aggregate filter does indeed have this limitation, which limits performance considerably and prevents scaling to multiple threads and Logstash instances. To get a solution that scales, it is probably better not to rely on the ingest layer to handle this.

One option could be to have a batch process that periodically queries new data and updates documents where needed. This would typically run externally to Elasticsearch and be implemented using one of the language clients.

You could also create an entity-centric index where you store a single document per UUID (and use the UUID as the document ID). When you find a document that should be aggregated, you update this entity document (the first time, it would be indexed) while at the same time writing the original document to the standard index.
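To illustrate why the entity-centric approach does not depend on worker count, here is a minimal sketch in Python. It is only an illustration under stated assumptions: a plain in-memory dict stands in for the entity-centric index, and the helper name `upsert` and the field names are hypothetical. Against a real cluster, the equivalent step would be an Elasticsearch update request with `doc_as_upsert` enabled, using the UUID as the document ID.

```python
from typing import Any, Dict

# In-memory stand-in for the entity-centric index: one document per UUID.
# (Assumption for illustration; a real setup would use an Elasticsearch index.)
entity_index: Dict[str, Dict[str, Any]] = {}


def upsert(uuid: str, fields: Dict[str, Any]) -> Dict[str, Any]:
    """Create the document for this UUID on first sight, otherwise merge into it."""
    doc = entity_index.setdefault(uuid, {"uuid": uuid})
    doc.update(fields)
    return doc


# Two log lines sharing a UUID (hypothetical field names). They arrive here
# "out of order", response first, as could happen with multiple workers.
upsert("abc-123", {"response_status": 200})
upsert("abc-123", {"request_path": "/login"})
```

Because each line only adds its own fields to the document keyed by the UUID, the merged result is the same whichever line is processed first, as long as the two lines do not write the same field.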

Manoj_Hettiarachchi · November 21, 2018, 7:08am

@Christian_Dahlqvist,

Thanks for the quick response.
I will try the options that you have mentioned and update my results here.

Manoj_Hettiarachchi · December 19, 2018, 5:36am

I could not find a better solution with multiple workers, so I moved to a multiple-pipeline solution in which each pipeline has a single worker. This system is working very well under high transaction load.
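The post does not say how events were divided among the pipelines. Whatever the split, both log lines for a given uuid have to reach the same single-worker pipeline for the aggregate filter to see them together. A minimal sketch of one way to get that property, assuming a hash-based split (an assumption, not something stated above; `pipeline_for` is a hypothetical helper):

```python
import zlib


def pipeline_for(uuid: str, pipeline_count: int) -> int:
    """Map a UUID to a stable pipeline number in the range [0, pipeline_count)."""
    # crc32 is deterministic across processes, unlike Python's built-in hash()
    # for strings, so every producer routes a given UUID the same way.
    return zlib.crc32(uuid.encode("utf-8")) % pipeline_count


# Both log lines of one transaction carry the same uuid, so they are routed
# to the same pipeline, where a single worker processes them in sequence.
first_line_target = pipeline_for("abc-123", 4)
second_line_target = pipeline_for("abc-123", 4)
```

If the logs are already separated by source (for example, one application per pipeline) and a uuid never spans sources, that split would satisfy the same requirement without any hashing.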

system · January 16, 2019, 5:39am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
