Hi, I have a situation where our database team exports around 100 CSV files over the course of a day, which I need to import into Elasticsearch. The files arrive over NFS, and I have Logstash running constantly, so when a new file is created it starts reading it. Because we very often have problems (because of NFS, I suppose [Logstash reading a file that is still being written to]), I want to schedule a job every morning that re-reads all the files and imports them again (I'm overriding doc_id, so there can't be duplicates).
input {
  file {
    id => "live_import_transaction_files"
    path => "/logstash/transactions_*"
    start_position => "beginning"
    sincedb_path => "/etc/logstash/sincedb_transactions"
    file_completed_action => "log"
    file_completed_log_path => "/etc/logstash/file_completed"
  }
}
Now, every day at 7:30 am I want to re-import the whole folder in case some error occurred, but my input and output are the same.
input {
  file {
    id => "daily_import_transaction_files"
    path => "/logstash/transactions_*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
  }
}
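For the daily run I was also considering the file input's read mode instead of tail mode, since (if I read the plugin docs correctly) the process can then exit once all files have been read, which suits a one-shot job. This is just a sketch; the separate file_completed_log_path here is a hypothetical path I made up, and I haven't tested exit_after_read on my plugin version:

input {
  file {
    id => "daily_import_transaction_files"
    path => "/logstash/transactions_*"
    mode => "read"                  # read mode: read existing files from start to finish
    exit_after_read => true         # let the process stop once all files are read
    sincedb_path => "/dev/null"     # don't remember positions, always re-read everything
    file_completed_action => "log"  # read mode deletes completed files by default, so log instead
    file_completed_log_path => "/etc/logstash/file_completed_daily"
  }
}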
pipelines.yml would look like:
- pipeline.id: live_import_transaction_files
  path.config: "/etc/logstash/import_csv_es.conf"
- pipeline.id: daily_import_transaction_files
  path.config: "/etc/logstash/import_daily.conf"
Is there any way to trigger this second pipeline from crontab or some scheduler every day at the same time, given that I can't run two pipelines with the same input/output? Or is there a better approach? I need to make sure that everything is in ES every morning by around 8 am.
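What I had in mind was something like a crontab entry that launches a separate one-shot Logstash process for the daily config, alongside the always-running one. The binary path below is the default for package installs and may differ on my setup, and --path.data has to point somewhere other than the main instance's data directory so the two don't clash over the lock file:

# run the daily re-import at 07:30 as a second, short-lived Logstash instance
30 7 * * * /usr/share/logstash/bin/logstash -f /etc/logstash/import_daily.conf --path.data /tmp/logstash-daily

Would that be a sane way to do it, or does running a second instance like this cause other problems?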