I want to write Logstash pipelines that process CSV files and an intermediate Elasticsearch index. The exact use case is described below:
- Pipeline 1 starts automatically when a file is detected in a specified folder. It should start only after the file has been completely copied into that folder, and it must not process the same file twice (a rough config sketch for pipeline 1 follows this list).
  1.1 If multiple files are being copied, it should wait for all of them and trigger only once, rather than once per file.
- Pipeline 1 generates an index in Elasticsearch, which is the input to pipeline 2.
- Pipeline 2 should start only when pipeline 1 has finished processing the file.
- Similarly, pipeline 3 should start only after pipeline 2 has finished.
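For context, here is roughly how I picture pipeline 1. This is only a sketch with assumed paths, index name, and ES host; the `file` input's sincedb covers the "don't reprocess the same file" requirement, but as far as I can tell it solves neither "wait until the copy is complete" nor "trigger once for a whole batch of files":

```conf
# Sketch of pipeline 1 (paths, index name, and ES host are assumptions).
input {
  file {
    path => "/data/incoming/*.csv"                    # assumed watch folder
    mode => "read"                                    # read whole files instead of tailing
    sincedb_path => "/var/lib/logstash/p1.sincedb"    # remembers files already processed
    file_completed_action => "log"                    # keep finished files, just log them
    file_completed_log_path => "/var/log/logstash/p1-done.log"
  }
}
filter {
  csv {
    autodetect_column_names => true                   # assumes each CSV has a header row
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]                # assumed ES endpoint
    index => "stage1"                                 # intermediate index read by pipeline 2
  }
}
```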
I am easily able to achieve all of this using shell scripts; a sketch of that approach is below.
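This is a simplified sketch, with illustrative paths, names, and polling interval rather than my real script. It treats the folder as "fully copied" once its listing stops changing between two checks, then runs the three pipelines strictly in order:

```bash
#!/usr/bin/env bash
# Sketch of the shell-script orchestration (all paths are illustrative).
WATCH_DIR=/data/incoming

# Wait until the folder listing (sizes + timestamps) is identical across two
# consecutive checks, i.e. all files appear to have finished copying.
prev=""
while true; do
  cur=$(ls -l "$WATCH_DIR"/*.csv 2>/dev/null)
  [ -n "$cur" ] && [ "$cur" = "$prev" ] && break
  prev="$cur"
  sleep 10
done

# Run each pipeline to completion before starting the next; separate
# --path.data dirs let the standalone Logstash runs coexist.
/usr/share/logstash/bin/logstash -f /etc/logstash/pipeline1.conf --path.data /tmp/ls-p1
/usr/share/logstash/bin/logstash -f /etc/logstash/pipeline2.conf --path.data /tmp/ls-p2
/usr/share/logstash/bin/logstash -f /etc/logstash/pipeline3.conf --path.data /tmp/ls-p3
```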
However, is there any way to achieve the same thing using Logstash features, Elasticsearch temporary indices, pipelines.yml, and so on?
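For example, pipelines.yml lets me declare all three pipelines in one Logstash instance (sketch below, with illustrative ids and paths), but as far as I understand it runs them concurrently and does not express the "2 after 1, 3 after 2" ordering I need:

```yaml
# Sketch of pipelines.yml (pipeline ids and config paths are illustrative).
- pipeline.id: pipeline1
  path.config: "/etc/logstash/conf.d/pipeline1.conf"
- pipeline.id: pipeline2
  path.config: "/etc/logstash/conf.d/pipeline2.conf"
- pipeline.id: pipeline3
  path.config: "/etc/logstash/conf.d/pipeline3.conf"
```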