Pipeline.workers configuration and aggregation filter

Hi all,

I'm going to use a brand new server with 4 vCPUs and 16 GB of RAM. I have 60+ pipelines and I'll run several of them grouped together (e.g. 1 pipeline for 10 "easy pipelines", 8 for "medium pipelines", and so on).

Some of these pipelines use the aggregate filter. I want to apply the best configuration in order to speed up processing in the pipelines.

If I set pipeline.workers to 2, will each *.conf file use 1 worker, or will the pipelines share the 2 workers? And what about the aggregate filter? I read that a pipeline that uses an aggregate filter must run with only 1 worker. Is that correct?

Kind Regards

Roberto

Yes. In order to aggregate events those events must pass through the same instance of the aggregate filter, so you can only have one instance, which means one worker thread.

If you have expensive processing before the aggregate (e.g. dns or geoip lookups, an elasticsearch, http or jdbc_streaming filter) it is possible to use pipeline-to-pipeline communication to use multiple workers for the initial processing and then merge them into a single worker for the aggregate. Of course that will not retain the order of events.
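As a rough illustration of that split, here is a minimal pipelines.yml sketch. The pipeline ids, the virtual address "checksum_aggr", the beats input, the field names and the filter settings are all placeholders, not taken from the original configuration; only the shape matters: several workers for the expensive part, one worker for the aggregate.

```yaml
# Sketch only: ids, address, port and field names are hypothetical.
- pipeline.id: checksum_enrich
  pipeline.workers: 4            # expensive lookups can run in parallel
  config.string: |
    input { beats { port => 5044 } }
    filter {
      # expensive per-event work goes here (dns, geoip, jdbc_streaming, ...)
      dns { reverse => ["client_ip"] action => "replace" }
    }
    output { pipeline { send_to => ["checksum_aggr"] } }

- pipeline.id: checksum_aggregate
  pipeline.workers: 1            # a single aggregate instance sees every event
  config.string: |
    input { pipeline { address => "checksum_aggr" } }
    filter {
      aggregate {
        task_id => "%{checksum}"
        code => "map['events'] ||= 0; map['events'] += 1"
        push_map_as_event_on_timeout => true
        timeout => 60
      }
    }
    output { stdout { codec => rubydebug } }
```

Because the first pipeline runs several workers, events can reach the second pipeline in a different order than they were read, so this layout only suits aggregations that do not depend on event order.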

Let me explain. If I run Logstash with this pipeline (which uses the aggregate filter):

- pipeline.id: checksum
  path.config: "/etc/logstash/conf.d/checksum_*.conf"
  pipeline.workers: 2

Will each conf file get 1 worker?

KR

Roberto

No, Logstash will combine all the files that match that glob pattern into a single configuration, and two worker threads will run that combined configuration.

And what if I have the following and run Logstash without any options (and without aggregation)?

- pipeline.id: checksum_1
  path.config: "/etc/logstash/conf.d/checksum_1.conf" 
  pipeline.workers: 2
- pipeline.id: checksum_2 
  path.config: "/etc/logstash/conf.d/checksum_2.conf" 
  pipeline.workers: 2

Will Logstash use 2 workers per configuration (4 in total)?

KR

Roberto

Correct.

And just to be clear: if I do the same with 1 worker per aggregation configuration, specifying the name of each conf file, will I speed up processing?

- pipeline.id: checksum_1_aggr
  path.config: "/etc/logstash/conf.d/checksum_1.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_2_aggr
  path.config: "/etc/logstash/conf.d/checksum_2.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_3_aggr
  path.config: "/etc/logstash/conf.d/checksum_3.conf" 
  pipeline.workers: 1
- pipeline.id: checksum_4_aggr
  path.config: "/etc/logstash/conf.d/checksum_4.conf" 
  pipeline.workers: 1

I do not know what you mean by that.

Sorry, I forgot to add the batch.size configuration. But now it's clear: I must configure only 1 worker per aggregation pipeline, and to increase performance I can increase the batch size, if I understood correctly.
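For reference, a minimal sketch of what that could look like for one of the entries above. The setting name is pipeline.batch.size (default 125); the value 500 is purely illustrative, not a recommendation, and larger batches come at the cost of more memory, so the JVM heap may need to grow with it.

```yaml
# Sketch only: the batch size value is an assumption to be tuned by measurement.
- pipeline.id: checksum_1_aggr
  path.config: "/etc/logstash/conf.d/checksum_1.conf"
  pipeline.workers: 1        # required so all events hit the same aggregate instance
  pipeline.batch.size: 500   # events a worker collects before running filters and outputs
```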
