Pipeline-to-pipeline communication with single input and output

Imagine the following setup:

filebeats --> logstash concentrator --> kafka --> logstash ingest --> elasticsearch

On the 'logstash ingest' side, I am using pipeline-to-pipeline communication to run several pipelines for various log formats (using the "distributor pattern" as described here). A single input reads JSON-formatted events from a Kafka topic and routes them to the various pipelines. This works like a charm.
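For context, a distributor setup like the one described might look roughly like the sketch below. It is only an illustration: the broker address, topic name, pipeline IDs, virtual addresses and the `[log_format]` routing field are all assumptions, not taken from my actual config.

```yaml
# config/pipelines.yml (sketch; all names and addresses are hypothetical)
- pipeline.id: kafka-distributor
  config.string: |
    input {
      kafka {
        bootstrap_servers => "kafka:9092"   # assumed broker address
        topics => ["logs"]                  # assumed topic name
        codec => json                       # events arrive as JSON
      }
    }
    output {
      # Route on a hypothetical field that identifies the log format
      if [log_format] == "apache" {
        pipeline { send_to => ["apache"] }
      } else if [log_format] == "syslog" {
        pipeline { send_to => ["syslog"] }
      } else {
        pipeline { send_to => ["fallback"] }
      }
    }
```

Each downstream pipeline would then start with `input { pipeline { address => ... } }` using the matching address.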

From here, I want all of the parsed results to go into the same Elasticsearch instance.

What would be the best approach here?

  • Define an identical output in every individual pipeline config, so each pipeline has its own output but effectively writes to exactly the same cluster/index
  • Use the "collector pattern" as described here to funnel all pipelines back into a single output

Personally I would go with the collector pattern.

A separate output in each pipeline (the first option) is useful if the data is going to different indices, as each bulk request will then target a smaller number of shards.

The collector pattern (the second option) is probably the preferred choice if all data is going to the same index.
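A minimal sketch of the collector side, assuming two parsing pipelines fed by the distributor. The pipeline IDs, the `apache`, `syslog` and `es-out` addresses, the Elasticsearch host and the index name are all made up for illustration, and the filters are left as placeholders.

```yaml
# config/pipelines.yml (sketch; all names and addresses are hypothetical)
- pipeline.id: apache-parsing
  config.string: |
    input { pipeline { address => "apache" } }
    filter {
      # format-specific parsing goes here
    }
    output { pipeline { send_to => ["es-out"] } }
- pipeline.id: syslog-parsing
  config.string: |
    input { pipeline { address => "syslog" } }
    filter {
      # format-specific parsing goes here
    }
    output { pipeline { send_to => ["es-out"] } }
- pipeline.id: es-collector
  config.string: |
    # Single place where the Elasticsearch output is defined
    input { pipeline { address => "es-out" } }
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]   # assumed cluster address
        index => "logs-%{+YYYY.MM.dd}"       # assumed index name
      }
    }
```

With this layout the output settings live in one pipeline, so a change to the hosts or index only has to be made once. If the formats later need different indices, each parsing pipeline can get its own `elasticsearch` output instead.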

