Controlling the sequence of Logstash pipelines

I want to write Logstash pipelines to process CSV files and an intermediate Elasticsearch index. Below is the exact use case:

  1. Pipeline 1 starts automatically when a file is detected in a specified folder. It should start only after the file has been completely copied to that folder, and it should not process the same file again.
    1.1 If multiple files are being copied, it should wait for all of them and trigger only once, instead of triggering once per file.
  2. Pipeline 1 generates an index in Elasticsearch which is the input to pipeline 2.
  3. Pipeline 2 should start only when pipeline 1 has finished processing the file.
  4. Similarly, pipeline 3 should start only after pipeline 2 has finished processing.
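For reference, the three stages could be declared in pipelines.yml like this (a minimal sketch; the pipeline ids and config paths are hypothetical). Note that Logstash starts all pipelines defined here concurrently, which is why pipelines.yml alone does not provide the ordering described above:

```yaml
# Hypothetical pipelines.yml -- ids and paths are made up for illustration.
# Logstash runs these pipelines in parallel; it does not sequence them.
- pipeline.id: csv-ingest          # step 1: read CSV files from a folder
  path.config: "/etc/logstash/conf.d/pipeline1.conf"
- pipeline.id: stage2              # step 2: reads the index written by step 1
  path.config: "/etc/logstash/conf.d/pipeline2.conf"
- pipeline.id: stage3              # step 3: reads the output of step 2
  path.config: "/etc/logstash/conf.d/pipeline3.conf"
```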

I can easily achieve this using shell scripts.
However, I want to know if there is any way to achieve the same using Logstash features, Elasticsearch temporary indexes, pipelines.yml, etc.
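For comparison, here is a minimal sketch of the shell-script approach (the inbox path, config file names, and LOGSTASH_HOME are placeholders, not anything from this thread): it treats a file as fully copied once its size stops changing between polls, then runs the three Logstash stages strictly in order.

```shell
#!/bin/sh
# Sketch only: inbox path, config names and LOGSTASH_HOME are hypothetical.

# Treat a file as fully copied once its size stops changing between polls.
wait_until_stable() {
  f="$1"
  prev=-1
  size=$(wc -c < "$f")
  while [ "$size" -ne "$prev" ]; do
    prev="$size"
    sleep "${STABLE_SLEEP:-2}"
    size=$(wc -c < "$f")
  done
}

run_stages() {
  # Each stage must exit when its input is exhausted (e.g. a finite
  # Elasticsearch query as input), so '&&' gives strict sequencing.
  "$LOGSTASH_HOME/bin/logstash" -f pipeline1.conf &&
  "$LOGSTASH_HOME/bin/logstash" -f pipeline2.conf &&
  "$LOGSTASH_HOME/bin/logstash" -f pipeline3.conf
}

# Example driver: wait for every file currently in the inbox, then run once.
# for f in /data/inbox/*.csv; do wait_until_stable "$f"; done
# run_stages
```

The stability check is a heuristic; tools like inotifywait can detect the end of a copy more reliably where they are available.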

Logstash is designed to provide a continuous flow of events through pipelines. You could synchronize things between pipelines, and even with things external to Logstash, using ruby filters, but it wouldn't be pretty. You could also start a new Logstash instance for each processing step, but the startup cost of a new JVM is quite high.
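To illustrate the ruby-filter synchronization alluded to here (not a recommendation, and the sentinel path is invented): pipeline 1 could drop a sentinel file when it finishes, and pipeline 2 could block in a ruby filter until that sentinel exists.

```
# Hypothetical fragment of pipeline2's config: block each event until
# pipeline1 has written the sentinel file. As noted above, not pretty.
filter {
  ruby {
    code => '
      sleep(1) until File.exist?("/tmp/pipeline1.done")
    '
  }
}
```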

Perhaps logstash is the wrong tool for this job.


Thanks @Badger.
So there is no way in Logstash itself to sequence pipelines?

With Logstash alone, no; you would need other tools to run the Logstash pipelines in the sequence you want.
