Questions about multiple pipelines and whether events are processed by all or only one

I would like help understanding how multiple pipelines defined in pipelines.yml work together. I originally had a single pipeline defined in the commands-pipeline.conf file. I then introduced a new pipeline in commands-raw-pipeline.conf. I expected events to be processed by both pipelines, but it seems that each event is processed by one pipeline or the other, not both. Is this expected?

Some details:

  • Each pipeline listens on a different port, and the single Filebeat instance is configured to ship to both ports
  • Each pipeline processes the same data in similar ways: the first aggregates events and the second does not; otherwise they set some of the same fields and some different fields
  • Both pipelines output to the same index with the same parameters, except for a different data_stream_dataset
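For reference, a minimal sketch of what each pipeline's output section might look like (the host is a placeholder, and the dataset names are illustrative; the data_stream_* options are from the Logstash elasticsearch output plugin):

```
output {
  elasticsearch {
    hosts => ["https://my-es.demo.com:9200"]
    data_stream => "true"
    data_stream_type => "logs"
    data_stream_dataset => "commands.aggregated"   # "commands.raw" in the other pipeline
    data_stream_namespace => "default"
  }
}
```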

Here's my pipelines.yml:

- pipeline.id: commands-aggregated
  path.config: "/etc/logstash/conf.d/commands-pipeline.conf"
  pipeline.workers: 1
- pipeline.id: commands-raw
  path.config: "/etc/logstash/conf.d/commands-raw-pipeline.conf"

Going forward I plan to route the new pipeline to a new index and perhaps deprecate the original pipeline, but I still want to understand this behavior.

Yes, it is expected.

If you want data processed by multiple pipelines, use pipeline-to-pipeline communication with the forked-path pattern.
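As a sketch of that layout, a single intake pipeline can receive the events once and fork them to both processing pipelines via the pipeline output and input plugins. This assumes the pipeline IDs and paths from your pipelines.yml, with the port being illustrative; each downstream conf would replace its beats input with a pipeline input listening on its virtual address:

```
# pipelines.yml
- pipeline.id: intake
  config.string: |
    input { beats { port => 5044 } }
    output {
      pipeline { send_to => ["commands-aggregated", "commands-raw"] }
    }
- pipeline.id: commands-aggregated
  path.config: "/etc/logstash/conf.d/commands-pipeline.conf"
- pipeline.id: commands-raw
  path.config: "/etc/logstash/conf.d/commands-raw-pipeline.conf"
```

In commands-pipeline.conf, the input becomes `input { pipeline { address => "commands-aggregated" } }`, and likewise for the raw pipeline. Every event entering the intake pipeline is then copied to both downstream pipelines.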

Thank you, Badger, that documentation is useful.

I think the other piece I misunderstood was my Filebeat output configuration. As I mentioned, I had each pipeline listening on a different port and Filebeat was sending to both ports. Well, according to the documentation, Filebeat doesn't send each event to both hosts. The hosts list is treated as a pool: with loadbalance enabled, events are distributed across the hosts, and otherwise Filebeat picks one host and keeps the others as failover. Either way, each event is delivered to exactly one host, which explains why each event was processed by only one of my two pipelines.
Example of my Filebeat output config:

output.logstash:
  hosts: ["my-es.demo.com:5044", "my-es.demo.com:5045"]
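If the intent is instead to receive events once in Logstash and do any fan-out there, the Filebeat output would list just a single Logstash endpoint (host and port as in my example above):

```
output.logstash:
  hosts: ["my-es.demo.com:5044"]
```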