I've got a problem with Logstash reading files. In general my input, filter and output are working. The files are generated and copied to the correct path every hour, but Logstash doesn't read them, only on (re-)start.
That is not possible; I have 10 *.csv files in that directory and 10 Logstash configs with different inputs, filters and outputs. If I changed the path to *.csv, that one config would read all CSV files, but there is just one set of filters for each file.
Besides, what would be the benefit? Logstash is reading the file, but only at startup, so in my opinion the input is basically working. I believe there is an issue with my Logstash config that prevents Logstash from reading the files permanently at runtime (or when the files are newly created).
What I noticed in my environment (not sure if it can be configured to work otherwise) is that Logstash only watches for new files, not for content changes. Therefore I created a separate folder where I put my files only after the content is fully written. The files also have a timestamp in the name, so the path looks like this: /home/mihai/file*.csv, and it's working.
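For reference, a minimal sketch of a file input using such a glob path. The path and option values here are assumptions for illustration, not taken from any config actually posted in this thread; `mode => "read"` and `file_completed_action => "delete"` would match the delete-after-read behaviour described later:

```conf
input {
  file {
    # Glob matches every timestamped CSV dropped into the folder;
    # each new name (and inode) is picked up as a new file.
    path => "/home/mihai/file*.csv"
    # "read" mode treats each file as finite input instead of tailing it
    mode => "read"
    # remove the file once it has been fully read
    file_completed_action => "delete"
  }
}
```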
Thanks for your reply. I'll test using a separate folder for the *.csv files. But I think I already have new files: every time Logstash reads these files it also deletes them (only after a restart), so any new file is indeed a new file.
Logstash tracks the file based on the inode. If the inode does not change but the file gets modified without getting bigger, then a file input will not re-read it. If you enable log.level trace, you will be able to see (voluminous) output from the filewatch module as it monitors the file for changes.
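A minimal sketch of turning on trace logging, assuming logging is configured via logstash.yml (the same setting can also be passed on the command line with `--log.level=trace`):

```conf
# logstash.yml
log.level: trace
```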
I've just changed the log level to trace and am going to watch it.
I assumed that if Logstash deletes the files after reading, newly created files would automatically get a new inode and thus appear as new files.
I think I'll try adding a datetime to the filenames.
I think I found a solution. I added a datetime to the filenames, so every file generated each hour has a new, unique name, and Logstash seems to read every new file.
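A sketch of how such hourly timestamped names could be generated; the directory, source file, and naming pattern here are hypothetical, not the poster's actual paths:

```shell
# Hypothetical: copy the hourly export under a unique, timestamped name
# so each drop is a genuinely new file for the Logstash file input.
mkdir -p /tmp/csv_demo
printf 'a,b,c\n' > /tmp/csv_demo/export.csv
ts=$(date +%Y%m%d_%H%M)
cp /tmp/csv_demo/export.csv "/tmp/csv_demo/file_${ts}.csv"
```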
It depends on the filesystem, but it is not unusual that if a file is deleted, the next file that is created will re-use the inode, since it is the next one available.
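Inode reuse can be observed directly. This small sketch (standard library only, paths are hypothetical) records a file's inode, deletes the file, creates a new one, and prints both inodes; whether they match depends on the filesystem, so no particular outcome is guaranteed:

```python
import os
import tempfile

# Work in a throwaway directory so nothing else interferes.
d = tempfile.mkdtemp()

first = os.path.join(d, "file1.csv")
with open(first, "w") as f:
    f.write("a,b,c\n")
inode_before = os.stat(first).st_ino  # the inode Logstash would track
os.remove(first)

second = os.path.join(d, "file2.csv")
with open(second, "w") as f:
    f.write("d,e,f\n")
inode_after = os.stat(second).st_ino

# On some filesystems the freed inode is handed straight to the next
# file, which would make a file input consider it already read.
print(inode_before, inode_after)
```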
Thanks for your reply and the explanation. I thought it would be unlikely that newly created files would get the same inode, but it seems they do. My solution above works.
Thanks and regards
Boris
Season’s greetings and best wishes for a healthy 2021!