File input: read only newly added files

I have a service that writes two files to the file system, a csv file and a manifest file; both have the same file name but different extensions.

I need to build a logstash config file that does the following:

  • Once the files are written, it reads both files (csv and manifest), whether they are located in the main directory or in subdirectories (nested folders).
  • It must not re-read files it has already processed; when new pairs are added anywhere under the main root, it should read only the newly added ones.

Note: the csv and manifest files must be processed together, because the manifest carries metadata I need in order to index the csv file when I push it to Elasticsearch.
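For reference, the shape I have in mind for the input section is one file block per extension, each stamping a type on its events so the filter stage can tell them apart. This is only a rough, untested sketch, and it assumes the manifest files end in .manifest:

input {
  # csv files anywhere under the root
  file {
    path => "/usr/share/input/**/*.csv"
    type => "csv"
    start_position => "beginning"
  }
  # matching manifest files (assumed extension)
  file {
    path => "/usr/share/input/**/*.manifest"
    type => "manifest"
    start_position => "beginning"
  }
}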

Question: sometimes the csv file takes about 30 seconds to be written because it is huge, so I'm wondering whether Logstash will start reading the file as soon as it is created, or only once it is closed and the service has finished writing it.

Here is the config I'm actually running; it reads a csv file fine, but I'm not sure how to extend it to handle both file types as described above.

input {
  file {
    path => "/usr/share/input/**/*.*"
    start_position => "beginning"
    # /dev/null means read positions are never persisted, so files are
    # re-read from the start every time the pipeline restarts
    sincedb_path => "/dev/null"
    discover_interval => 2
    stat_interval => "1 s"
  }
}

filter {
    # ... code goes here ...
}

output {
    stdout { codec => rubydebug }
    elasticsearch {
        index => "%{blockId}"
        hosts => ["${HOSTS}"]
    }
}

Yes, it will. Consider a typical use case for Logstash: web server logs. Logstash opens the file, seeks to EOF if configured to do so, and then tails it, reading new lines as they are written. If there are two log files, they are read independently and the data is not ordered between them; you cannot tell Logstash to process one file and then the other.
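As for reading the pair together: the file input will not join the csv and the manifest for you, but you can attach the manifest metadata to each csv event in the filter stage. Below is a rough, untested sketch using a ruby filter; it assumes the manifest sits next to the csv with a .manifest extension, that the events carry a type field as in the two-input sketch above, and that ecs_compatibility is disabled so the file input adds a top-level path field.

filter {
  if [type] == "csv" {
    ruby {
      code => '
        # Hypothetical: derive the manifest path from the csv path and read it.
        csv_path = event.get("path")
        if csv_path
          manifest_path = File.join(File.dirname(csv_path),
                                    File.basename(csv_path, ".csv") + ".manifest")
          if File.exist?(manifest_path)
            # Reading the manifest once per csv line is wasteful for huge files;
            # in practice you would cache the contents per path.
            event.set("manifest_raw", File.read(manifest_path))
          end
        end
      '
    }
  }
}

One more thing: sincedb_path => "/dev/null" tells Logstash to forget what it has already read, so every matching file is re-read whenever the pipeline restarts. If you only want newly added files to be picked up, point sincedb_path at a real, persistent file instead.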
