With the file input, I understand that sincedb tracks whether a file has been processed or not. Suppose I have a new file that has never been processed by Logstash (so it is not in sincedb).
My input config is like this:
If I move that file to the folder Logstash is watching (/some/folder/*.log), nothing happens. Only when I edit the file and change something inside it does it get processed, which is the same behavior I would expect if the file were already in sincedb.
How do I know that it is not in sincedb? Because I run:
$ ls -i file.log
96993940 /some/folder/file.log
Then:
$ grep 96993940 /var/lib/logstash/.sincedb*
and nothing...
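The manual check above can be wrapped in a small helper. This is only a sketch: `in_sincedb` is a hypothetical name, and it assumes that each sincedb line starts with the inode number as its first whitespace-separated field. Matching on that field instead of grepping the whole line avoids false hits where the inode number happens to appear inside a byte offset.

```shell
#!/bin/sh
# in_sincedb FILE SINCEDB_FILE...
# Hypothetical helper: exits 0 if FILE's inode appears as the first field
# of any given sincedb file, 1 if it does not, 2 if the inode can't be read.
# Assumption: sincedb lines begin with the inode number.
in_sincedb() {
    file=$1
    shift
    # ls -di prints "<inode> <name>"; keep only the inode.
    inode=$(ls -di "$file" 2>/dev/null | awk '{print $1}')
    [ -n "$inode" ] || return 2
    # Compare as strings so very large inode numbers are not rounded.
    awk -v ino="$inode" '($1 "") == (ino "") { found = 1 } END { exit !found }' "$@"
}
```

For example, `in_sincedb /some/folder/file.log /var/lib/logstash/.sincedb* && echo tracked || echo "not tracked"` reports the same thing as the `ls -i` and `grep` pair.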
Any clues why this happens? I'm editing log files that are 3GB each, so it's painfully slow.
I just found out that old files are not parsed, even if Logstash never touched them before (old meaning <1 day old or so). I noticed this because I had 5 files that Logstash had never touched (i.e. they weren't in .sincedb_*). 2 of those files were new (I had created them 5 minutes earlier) and only those 2 were processed. The 3 old ones weren't.
It seems almost as if "something" scans the whole disk at some point and "tells" Logstash that those files are old and should only be touched when something new is appended to them.
Interesting... and frustrating...