I'm going to use a brand-new server with 4 vCPUs and 16 GB RAM. I have 60+ pipeline configurations and I'll run them as multiple pipelines (e.g. 1 pipeline for 10 "easy pipelines", 8 for "medium pipelines", and so on).
Some of these pipelines use the aggregate filter. I want to apply the best configuration to speed up processing in the pipelines.
If I set pipeline.workers to 2, will each *.conf use 1 worker, or do the pipelines "share" the 2 workers? And what about the aggregate filter? I read that a pipeline that uses an aggregate filter must run with only 1 worker. Is that correct?
Yes. In order to aggregate events those events must pass through the same instance of the aggregate filter, so you can only have one instance, which means one worker thread.
If you have expensive processing before the aggregate (e.g. dns or geoip lookups, an elasticsearch, http or jdbc_streaming filter) it is possible to use pipeline-to-pipeline communication to use multiple workers for the initial processing and then merge them into a single worker for the aggregate. Of course that will not retain the order of events.
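A rough sketch of that pipeline-to-pipeline layout in pipelines.yml might look like the following. The pipeline ids, the virtual address `aggregate_stage`, the beats input, and the field names `client_ip` and `session_id` are all made up for illustration; only the `pipeline` input/output plugins and the worker settings are the point.

```yaml
# pipelines.yml (hypothetical ids, port, and field names)
- pipeline.id: enrich
  pipeline.workers: 4            # expensive lookups can run in parallel here
  config.string: |
    input { beats { port => 5044 } }
    filter { dns { reverse => ["client_ip"] action => "replace" } }
    output { pipeline { send_to => ["aggregate_stage"] } }

- pipeline.id: aggregate
  pipeline.workers: 1            # one worker, so every event hits the same aggregate instance
  config.string: |
    input { pipeline { address => "aggregate_stage" } }
    filter {
      aggregate {
        task_id => "%{session_id}"
        code => "map['count'] ||= 0; map['count'] += 1"
      }
    }
    output { stdout { } }
```

Because the four upstream workers finish their batches independently, events can reach the aggregate pipeline in a different order than they were read.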
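No. Logstash will combine all the files that match that path pattern (e.g. *.conf) into a single configuration, and the two worker threads will run that combined configuration. The workers are shared across the combined configuration, not assigned one per file.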
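As an illustration, a pipelines.yml entry along these lines would behave that way. The pipeline id and directory are assumptions, not taken from the original setup.

```yaml
# pipelines.yml (hypothetical id and path)
- pipeline.id: easy
  # Every file matching the glob is concatenated into ONE pipeline configuration.
  path.config: "/etc/logstash/conf.d/easy/*.conf"
  # Both workers run the whole combined configuration; they are not split per file.
  pipeline.workers: 2
```

Since the files are merged, an event from any input passes through every filter and output in every matching file unless conditionals restrict it.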
Sorry, I forgot to mention the batch.size setting. But now it's clear: I must configure only 1 worker per aggregate pipeline, and to improve performance I can increase the batch size, if I understood correctly.
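For reference, that combination could be written as below. The id, path, and the value 500 are only placeholders; whether a larger batch actually helps depends on the workload and should be measured, and larger batches also use more heap.

```yaml
# pipelines.yml (hypothetical id, path, and values)
- pipeline.id: with-aggregate
  path.config: "/etc/logstash/conf.d/aggregate/*.conf"
  pipeline.workers: 1        # required so all events share one aggregate filter instance
  pipeline.batch.size: 500   # default is 125; events collected per worker before filters run
  pipeline.batch.delay: 50   # default; ms to wait for more events before flushing an undersized batch
```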