I've noticed some strange behaviour. Say my original .sincedb-foo file was created by Logstash running as user A, i.e. the .sincedb-foo file is owned by user A.
Now if I stop Logstash and start it again as user B, then even if .sincedb-foo has global write permissions, Logstash starts creating "child" .sincedb files with names like .sincedb-foo.2064.18329.863496.
There can be a great many of these "child" .sincedb files (hundreds of thousands, even), depending on how many logs there are and how many rotations happen, and together they use up a lot of space (tens of GB).
I couldn't find any documentation on this and was wondering what this behaviour is.
Is there a way to configure the Logstash file input plugin not to behave like this? For example, to fail fast if the .sincedb files are owned by a different user, rather than spawning child .sincedb files that might fill up the disk.
The "child" files are an "atomic write" mechanism: we write to another file and, when done, we delete .sincedb-foo and rename the child to .sincedb-foo. Without this, a crash in the middle of a write could leave Logstash with an incomplete file. It could still crash between the delete and the rename, but this is less likely.
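For readers unfamiliar with the pattern, here is a minimal sketch of a write-then-rename in Python. It is not the plugin's code: the function name `atomic_write` is made up for illustration, and it uses `os.replace` (a single-step swap) instead of the delete-then-rename described above. It also removes the temporary file on failure, which is an assumption about what would stop leftovers from piling up.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to a sibling temp file, then rename it over path.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem)
    # as the target, otherwise the rename cannot be a simple swap.
    fd, tmp_path = tempfile.mkstemp(
        prefix=os.path.basename(path) + ".", dir=directory
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the finished temp file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Assumption: clean up so failed attempts do not accumulate.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling `atomic_write(".sincedb-foo", "...")` repeatedly should leave only `.sincedb-foo` in the directory when every rename succeeds.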
The mtime of the file listed in the config as the sincedb path should be changing about every 5 seconds.
User B will need permission to rename the "child" files.
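On POSIX systems, renaming a file is governed by the permissions on the containing directory (write and search), not by the permissions on the file itself, which is why global write permission on .sincedb-foo does not help. A rough way to check this as user B, assuming a hypothetical helper name `can_rename_in_dir_of`:

```python
import os


def can_rename_in_dir_of(sincedb_path):
    """Return True if the current user can create/rename entries in the
    directory that holds sincedb_path. Hypothetical helper."""
    directory = os.path.dirname(os.path.abspath(sincedb_path))
    # Rename needs write + search (execute) permission on the directory.
    return os.access(directory, os.W_OK | os.X_OK)
```

This is only a first check: in a directory with the sticky bit set (such as /tmp), replacing a file owned by another user can still be refused even when this returns True.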
So I guess because user B is unable to rename child files to .sincedb-foo that's why the accumulation happens.
Is there a way to make logstash fail when it can't rename child files?
Also, interestingly, I don't see any warnings or errors in the Logstash logs about being unable to rename the child files. Maybe they show up in debug mode, but I haven't checked.
We only do the atomic write when not on Windows and when the sincedb_path is not /dev/null.
We don't usually get the chance to view the way the code reacts to these scenarios - the permutations are many.
In the past, when the file input opened all the files it found, there could be no file handles left free to write/rename the sincedb file.
We only trap the EACCES error code (Permission denied) from the list of POSIX system errors, so any other error would rise up to the main error handling in the pipeline, causing the input to be restarted (continually).
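To illustrate the difference between a trapped error and one that propagates, here is a small Python sketch. It is an analogy, not the plugin's Ruby code; `try_rename` and the logger name are invented for the example.

```python
import errno
import logging
import os

log = logging.getLogger("sincedb-sketch")  # hypothetical logger name


def try_rename(tmp_path, final_path):
    """Return True on success, False when permission is denied.

    Any other OSError is re-raised to the caller.
    """
    try:
        os.rename(tmp_path, final_path)
        return True
    except OSError as exc:
        if exc.errno == errno.EACCES:
            # Trapped: the caller keeps running. Note that tmp_path is
            # still on disk, so repeated failures leave files behind.
            log.warning("cannot rename %s: %s", tmp_path, exc)
            return False
        # Everything else (ENOENT, EMFILE, ...) propagates upward.
        raise
```

With this shape, a permission failure never stops the caller, which matches the accumulation seen in this thread: each attempt leaves its temporary file behind unless something removes it.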
To handle this correctly, we need to validate the sincedb_path setting by actually completing a full atomic write with dummy data, so that Logstash halts with a Config Validation error rather than trying to continue. This is doable. First this PR must be merged. Please create an issue in the file input repo labelled as "enhancement".
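A rough sketch of what such a startup check could look like, in Python rather than the plugin's own language. Everything here is hypothetical: `ConfigValidationError` and `validate_sincedb_path` are made-up names, and rewriting the existing contents (instead of dummy data) when the file already exists is my own assumption, made so the probe does not destroy real positions.

```python
import os
import tempfile


class ConfigValidationError(Exception):
    """Hypothetical error type for this sketch."""


def validate_sincedb_path(path):
    """Attempt one full write-then-rename cycle and fail fast if it breaks."""
    directory = os.path.dirname(os.path.abspath(path))
    existed = os.path.exists(path)
    tmp_path = None
    try:
        payload = ""  # dummy data when there is no sincedb yet
        if existed:
            with open(path) as current:
                payload = current.read()
        fd, tmp_path = tempfile.mkstemp(
            prefix=os.path.basename(path) + ".", dir=directory
        )
        with os.fdopen(fd, "w") as tmp:
            tmp.write(payload)
        os.replace(tmp_path, path)
        tmp_path = None  # the temp file now lives at path
        if not existed:
            os.unlink(path)  # leave no trace of the probe
    except OSError as exc:
        if tmp_path is not None and os.path.exists(tmp_path):
            try:
                os.unlink(tmp_path)
            except OSError:
                pass
        raise ConfigValidationError(
            f"sincedb_path {path!r} cannot be written atomically: {exc}"
        ) from exc
```

Run once at startup, a check like this would turn the silent accumulation into an immediate, visible failure.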