Understanding Logstash pipeline to pipeline communication

LogstashQuestions · April 12, 2025, 3:54am

I'm trying to fix some performance issues in a Logstash setup that uses pipeline-to-pipeline communication, but I'm not sure the way it was set up makes sense. There are 2 pipelines defined: the first takes input, filters, and then sends events to a second pipeline. This second pipeline is solely responsible for outputting events to Elasticsearch.

I'm having trouble understanding whether there are any benefits to doing this vs just having a single pipeline that handles input -> filter -> output. From reading the documentation, multiple pipelines are typically used when you have several different filters to run, or multiple outputs. In this case though, there is just one (fairly small) filter and one single output.

So, is there any reason to have a second pipeline that only has an output to Elasticsearch instead of just putting that in the first pipeline?
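To make the question concrete, the single-pipeline alternative would look roughly like the `pipelines.yml` sketch below. The pipeline id, the `mutate` filter, and the Elasticsearch host are placeholders I made up for illustration, and the s3snssqs options are left out.

```yaml
# Sketch only: one pipeline doing input -> filter -> output.
# "main", the mutate filter, and the hosts value are hypothetical placeholders.
- pipeline.id: main
  config.string: |
    input {
      # s3snssqs input goes here (options omitted)
    }
    filter {
      # stand-in for the one small filter
      mutate { add_tag => ["filtered"] }
    }
    output {
      elasticsearch { hosts => ["http://localhost:9200"] }
    }
```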

Badger · April 12, 2025, 4:13am

Nothing in your description suggests a reason for splitting this into two pipelines. That said, it is unclear whether removing the pipeline-to-pipeline link will improve performance much.

Perhaps the original design expected there would be a second consumer pipeline?

If you provide more detail we can probably provide better advice.

LogstashQuestions · April 12, 2025, 5:44am

There isn't a ton more detail, as overall the pipelines are pretty simple. The first pipeline uses the s3snssqs input plugin to get events from S3, does some pretty basic filtering, then sends events to the second pipeline, which is just an output that sends events to Elasticsearch.

The main difference between the 2 pipelines seems to be in the batch_size. The first pipeline uses the default batch_size of 125 and the second pipeline has a batch_size of 2000.
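Roughly, the layout looks like the `pipelines.yml` sketch below. I'm assuming the batch sizes are set through `pipeline.batch.size`; the pipeline ids, the `es_out` address, the `mutate` filter, and the Elasticsearch host are placeholders rather than the real values, and the s3snssqs options are omitted.

```yaml
# Sketch only: ids, address, filter, and hosts are hypothetical placeholders.
- pipeline.id: ingest
  pipeline.batch.size: 125        # the default, shown explicitly
  config.string: |
    input {
      # s3snssqs input goes here (options omitted)
    }
    filter {
      # stand-in for the basic filtering
      mutate { add_tag => ["filtered"] }
    }
    output {
      # hand events to the second pipeline
      pipeline { send_to => ["es_out"] }
    }

- pipeline.id: es_output
  pipeline.batch.size: 2000
  config.string: |
    input {
      pipeline { address => "es_out" }
    }
    output {
      elasticsearch { hosts => ["http://localhost:9200"] }
    }
```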

Unfortunately this was set up years ago and I'm now inheriting it. I have some ideas for fixing the performance issues, but I was mainly curious whether there is a situation in which having a second pipeline that only handles output to Elasticsearch makes any difference. Do the different batch_size values explain why this pattern would be better?
