Read files continuously and once daily read all the files again

Hi, I have a situation where our database team exports around 100 CSV files during the day, which I need to import into Elasticsearch. The files arrive over NFS, and I have Logstash running constantly so that when a new file is created it starts reading it. Because we very often have problems (because of NFS, I suppose [reading a file that is still being written to]), I want to schedule a job every morning that runs over all the files again and re-imports them (I'm overriding the doc_id, so there can't be duplicates).

input {
    file {
        id => "live_import_transaction_files"
        path => "/logstash/transactions_*"
        start_position => "beginning"
        sincedb_path => "/etc/logstash/sincedb_transactions"
        mode => "read"    # file_completed_action/_log_path only take effect in read mode
        file_completed_action => "log"
        file_completed_log_path => "/etc/logstash/file_completed"
    }
}
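
For context, the Elasticsearch output pins document_id so that re-importing the same rows overwrites documents instead of duplicating them; a minimal sketch, where the field name transaction_id, the hosts, and the index name are assumptions rather than the actual config:

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "transactions"
        # a fixed document_id makes re-imports idempotent: re-reading a row
        # overwrites the existing document rather than creating a duplicate
        document_id => "%{transaction_id}"
    }
}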

Now, every day at 7:30 am I want to re-import the whole folder in case some error occurred, but the input and output are the same as in the live pipeline.

input {
    file {
        id => "daily_import_transaction_files"
        path => "/logstash/transactions_*"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        ignore_older => 0
    }
}
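
A variant of this daily config can run once and then exit on its own, which matters for the scheduling question below; a minimal sketch, assuming a logstash-input-file version that supports read mode's exit_after_read option:

input {
    file {
        id => "daily_import_transaction_files"
        path => "/logstash/transactions_*"
        mode => "read"              # read each matching file to completion
        exit_after_read => true     # shut the input down once every file has been read
        sincedb_path => "/dev/null" # forget positions so every run re-reads everything
    }
}

Once the input finishes, the pipeline shuts down, so a Logstash process started just for this config terminates by itself.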

pipelines.yml would look like:

- pipeline.id: live_import_transaction_files
  path.config: "/etc/logstash/import_csv_es.conf"
- pipeline.id: daily_import_transaction_files
  path.config: "/etc/logstash/import_daily.conf"

Is there any way to trigger this second pipeline from crontab or some other scheduler every day at the same time, since I can't run two pipelines with the same input/output? Or is there a better approach? I need to make sure everything is in ES every morning by around 8 am.
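
One idea would be to leave the daily config out of pipelines.yml entirely and have cron start a separate one-shot Logstash instance at 7:30; a sketch of the crontab entry, assuming a packaged install at /usr/share/logstash and the exit_after_read variant above (the separate --path.data should keep the second instance from clashing with the always-on one):

30 7 * * * /usr/share/logstash/bin/logstash -f /etc/logstash/import_daily.conf --path.data /tmp/logstash-daily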

I cannot speak to the scheduling problem, but ignore_older => 0 says to ignore any file that is more than zero seconds old, which means it will ignore all files.

No, this means it will include everything.
