Pipeline.workers configuration and aggregation filter

Roberto_B · September 17, 2021, 8:27am

Hi all,

I'm going to use a brand-new server with 4 vCPUs and 16 GB of RAM. I have a number of pipelines (60+) and I'll run several of them together (e.g. 1 pipeline for 10 "easy pipelines", 8 for "medium pipelines", and so on).

Some of these pipelines use the aggregate filter. I want to apply the best configuration in order to speed up processing.

If I set pipeline.workers to 2, will each *.conf file use 1 worker, or do the pipelines share the 2 workers? And what about the aggregate filter? I read that a pipeline that uses an aggregate filter must run with only 1 worker. Is that correct?

Kind Regards

Roberto

Badger · September 17, 2021, 3:24pm

Yes. In order to aggregate events those events must pass through the same instance of the aggregate filter, so you can only have one instance, which means one worker thread.

If you have expensive processing before the aggregate (e.g. dns or geoip lookups, an elasticsearch, http or jdbc_streaming filter) it is possible to use pipeline-to-pipeline communication to use multiple workers for the initial processing and then merge them into a single worker for the aggregate. Of course that will not retain the order of events.
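As a rough sketch of that layout, assuming a pipelines.yml that uses inline `config.string` definitions: the pipeline ids, the `aggregate_in` address, the beats port, the `task_id` field and the aggregate code below are all hypothetical placeholders, not taken from the configuration discussed in this thread.

```yaml
# pipelines.yml (illustrative sketch; all names and values are assumptions)
- pipeline.id: checksum_enrich
  pipeline.workers: 3          # several workers for the expensive filters
  config.string: |
    input { beats { port => 5044 } }
    filter {
      # expensive lookups go here (dns, geoip, jdbc_streaming, ...)
    }
    output { pipeline { send_to => ["aggregate_in"] } }

- pipeline.id: checksum_aggregate
  pipeline.workers: 1          # a single worker, so every event meets the same aggregate instance
  config.string: |
    input { pipeline { address => "aggregate_in" } }
    filter {
      aggregate {
        task_id => "%{task_id}"
        code => "map['count'] ||= 0; map['count'] += 1"
      }
    }
    output { stdout { codec => rubydebug } }
```

The upstream pipeline hands events to the downstream one through the `pipeline` output and input plugins. Because several workers feed the single-worker pipeline, events can arrive there in a different order than they were read.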

Roberto_B · September 17, 2021, 3:59pm

Let me explain. If I run Logstash with this pipeline defined (with the aggregate filter):

- pipeline.id: checksum
  path.config: "/etc/logstash/conf.d/checksum_*.conf"
  pipeline.workers: 2

Will each conf file take 1 worker?

KR

Roberto

Badger · September 17, 2021, 4:03pm

No, Logstash will combine all the files that match that glob pattern into a single configuration, and two worker threads will run that combined configuration.

Roberto_B · September 17, 2021, 4:20pm

And what if I have the following and run Logstash without any options (and without aggregation)?

- pipeline.id: checksum_1
  path.config: "/etc/logstash/conf.d/checksum_1.conf" 
  pipeline.workers: 2
- pipeline.id: checksum_2 
  path.config: "/etc/logstash/conf.d/checksum_2.conf" 
  pipeline.workers: 2

Will Logstash use 2 workers per configuration (4 in total)?

KR

Roberto

Badger · September 17, 2021, 4:28pm

Correct.

Roberto_B · September 17, 2021, 4:53pm

And just to be clear: if I do the same with 1 worker per aggregation configuration, specifying the name of each conf file, will I speed up processing?

- pipeline.id: checksum_1_aggr
  path.config: "/etc/logstash/conf.d/checksum_1.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_2_aggr
  path.config: "/etc/logstash/conf.d/checksum_2.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_3_aggr
  path.config: "/etc/logstash/conf.d/checksum_3.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_4_aggr
  path.config: "/etc/logstash/conf.d/checksum_4.conf" 
  pipeline.workers: 1

Badger · September 17, 2021, 6:20pm

I do not know what you mean by that.

Roberto_B · September 17, 2021, 8:04pm

Sorry, I forgot to add the batch.size configuration. But now it's clear: I must configure only 1 worker per aggregation pipeline, and to increase performance I can increase the batch size, if I understood correctly.
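For illustration, a single-worker aggregation pipeline with a larger batch might look like the entry below. The value 500 is only an assumed example, not a recommendation; the Logstash default for `pipeline.batch.size` is 125, and larger batches generally cost more heap memory.

```yaml
# pipelines.yml (illustrative sketch; the batch size value is an assumption)
- pipeline.id: checksum_1_aggr
  path.config: "/etc/logstash/conf.d/checksum_1.conf"
  pipeline.workers: 1        # required so the aggregate filter sees every event
  pipeline.batch.size: 500   # events each worker collects before running filters and outputs
```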
