Pipeline-to-pipeline communication with single input and output

Imagine the following setup:

filebeat --> logstash concentrator --> kafka --> logstash ingest --> elasticsearch

On the 'logstash ingest' side, I am using pipeline-to-pipeline communication to start several pipelines for the various log formats (using the "distributor pattern" as described here). Using a single input, I read JSON-formatted events from a Kafka topic and route them to the various pipelines. This works like a charm.
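
For reference, here is roughly what that distributor setup looks like in pipelines.yml. This is a minimal sketch; the topic name, the [type] values, and the pipeline ids are made-up placeholders:

```
# pipelines.yml -- a single Kafka input fanning out to per-format pipelines
- pipeline.id: distributor
  config.string: |
    input { kafka { topics => ["logs"] codec => json } }
    output {
      if [type] == "apache" {
        pipeline { send_to => ["apache-parsing"] }
      } else {
        pipeline { send_to => ["syslog-parsing"] }
      }
    }
- pipeline.id: apache
  path.config: "/etc/logstash/conf.d/apache.conf"
- pipeline.id: syslog
  path.config: "/etc/logstash/conf.d/syslog.conf"
```

Each downstream pipeline then starts with a matching pipeline input, e.g. input { pipeline { address => "apache-parsing" } }.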

Now from here, I want the parsed results to all go into the same Elasticsearch instance.

What would be the best approach here?

  • Defining an identical output in every individual pipeline config, so that every pipeline has its own output but effectively writes to exactly the same cluster/index (sketched right after this list)
  • Using the "collector pattern" as described here to concentrate all pipelines into a single output again
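
For the first option, that would mean repeating the same elasticsearch block at the end of every per-format pipeline config. A minimal sketch; the host and index name are placeholders:

```
# identical output block appended to every per-format pipeline
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```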

Personally I would go with the collector pattern.

The first option (a separate but identical output in every pipeline) is useful if the data is going to different indices, as each bulk request will then target a smaller number of shards.

The collector pattern is probably the preferred option if all data is going to the same index.
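
With the collector pattern, each per-format pipeline sends its parsed events to a shared virtual address, and only the collector pipeline talks to Elasticsearch. A minimal sketch in pipelines.yml, again with placeholder names and hosts (one per-format pipeline shown; the others look the same):

```
- pipeline.id: apache
  config.string: |
    input { pipeline { address => "apache-parsing" } }
    # apache-specific filters go here
    output { pipeline { send_to => ["collector"] } }
- pipeline.id: collector
  config.string: |
    input { pipeline { address => "collector" } }
    output { elasticsearch { hosts => ["http://localhost:9200"] } }
```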
