I am working on a project that uses Logstash to parse some CSV files. For this, the Logstash file input plugin is used: new files are read from a path and then parsed.
However, I am trying to add a validation step to this flow: whenever a new file is picked up, calculate its SHA digest, call an endpoint, and confirm that it is a valid file.
DISCLAIMER: My experience with Ruby is nil. I am mostly accustomed to Java.
I found that they call a method “subscribe” of FileWatch::Tail at
Can I get some guidance on where exactly I can add custom logic that runs whenever a new file is picked up: calculate the file digest -> call the endpoint and validate -> if valid, process the file, else raise an error?
@shabirmean
I wrote those changes to the file input.
The file input does not have a facility for this. The part you would want to hook into (but currently can't) is the file discovery loop, which scans for files and adds them to the 'to-be-processed' collection.
The only way you can do this is to script something yourself using a staging folder...
new files go into a staging folder -> staging
a script monitors staging for new files and, for each one, computes the digest and checks it.
if the file is valid, the script moves it to a valid folder -> valid
LS detects the file in the valid folder and begins tailing it.
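
As a rough illustration of the staging steps above, here is a minimal Python sketch of one pass of such a script. Everything in it is an assumption made for illustration: the function names, the choice of SHA-256, the extra rejected folder for files that fail the check, and the `is_valid` callable, which stands in for whatever call your validation endpoint needs. It is not part of Logstash or the file input.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def promote_valid_files(staging: Path, valid: Path, rejected: Path,
                        is_valid: Callable[[str], bool]) -> dict:
    """Run one pass over staging and move each file to valid or rejected.

    is_valid is a hypothetical hook: it receives the hex digest and
    returns True when the (assumed) validation endpoint accepts it.
    """
    valid.mkdir(parents=True, exist_ok=True)
    rejected.mkdir(parents=True, exist_ok=True)
    outcome = {}
    # sorted() materialises the listing before any file is moved.
    for candidate in sorted(staging.iterdir()):
        if not candidate.is_file():
            continue
        target_dir = valid if is_valid(sha256_of(candidate)) else rejected
        shutil.move(str(candidate), str(target_dir / candidate.name))
        outcome[candidate.name] = target_dir.name
    return outcome
```

You would call `promote_valid_files` in a loop or from a scheduler, and point the file input's path at the valid folder only. Keeping staging and valid on the same filesystem generally lets the move happen as a rename, so Logstash should not see a half-copied file. The sketch does not guard against files that are still being written into staging when the pass runs.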