Logstash file input plugin: missing first lines after log rotation

Hello,

I'm using Logstash with a pipeline that collects logs with the file input plugin. Everything works fine, but when log files are rotated, Logstash misses some of the first few lines of the new log file.

This is my input config:

input {
  file {
    path => [ "/logs_iib_pre/iib-*.log" ]
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

When a new log file is created by rotation, some of its first lines never reach Logstash, one line is truncated and generates a "_grokparsefailure", and the rest are collected fine.

I have seen other discussions like this, but none with a solution.

Thanks.

That sounds like inode re-use. There is no solution.

Imagine you have a file called /foo.txt ... If the file input has read the first part of /foo.txt and additional text is appended to it then we want the file input to just read the appended text. But suppose that the file gets rotated to /foo.txt.1 and a new file called /foo.txt is created. We want the file input to start reading the new /foo.txt from the beginning.

So how can the file input tell if it is a new file with the same name, or text appended to the old file? There is no solution to this, so the file input has to make assumptions, and sometimes those assumptions are wrong.

The file input could keep a checksum (or cryptographic hash) of the data read from each file, and re-read and checksum the data each time it revisits the file. This would be insanely expensive, but would almost always work.
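To make the checksum idea concrete, here is a minimal Python sketch. It is not how the file input works; the helper names fingerprint and is_same_file are made up for illustration, and it simply re-hashes everything read so far, which is exactly why this approach gets expensive on large files.

```python
import hashlib
import os
import tempfile


def fingerprint(path, length):
    """Hash the first `length` bytes of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(length)).hexdigest()


def is_same_file(path, offset, stored_digest):
    """True if the bytes already consumed are still at the start of the file."""
    if os.path.getsize(path) < offset:
        return False  # shorter than what was read: replaced or truncated
    return fingerprint(path, offset) == stored_digest


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo.txt")
    with open(path, "w") as f:
        f.write("line 1\nline 2\n")
    offset = os.path.getsize(path)       # pretend everything so far was read
    digest = fingerprint(path, offset)   # remember a hash of what was read

    # Appending leaves the already-read prefix intact.
    with open(path, "a") as f:
        f.write("line 3\n")
    appended_is_same = is_same_file(path, offset, digest)

    # Rotation: the same name now holds different content.
    os.remove(path)
    with open(path, "w") as f:
        f.write("new 1\nnew 2\nnew 3\n")
    rotated_is_same = is_same_file(path, offset, digest)

print(appended_is_same, rotated_is_same)
```

The appended file is recognised as the same file, while the replacement is not, regardless of which inode it received.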

What the file input does is keep track of the inode associated with the file name. This is very cheap and usually works. If /foo.txt is renamed to /foo.txt.1 it will have the same inode (so the file input will not see /foo.txt.1 as a new file) and a new /foo.txt will have a different inode (so the file input will think it is a new file).
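If you want to see this for yourself, the following Python sketch (assuming a typical POSIX filesystem; the file names are just the ones from the example above, created in a temporary directory) renames a file and then creates a new one under the old name while the renamed one still exists.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    current = os.path.join(d, "foo.txt")
    rotated = os.path.join(d, "foo.txt.1")

    with open(current, "w") as f:
        f.write("old contents\n")
    inode_before = os.stat(current).st_ino

    # Rotate by rename: the data and its inode move to the new name.
    os.rename(current, rotated)
    inode_rotated = os.stat(rotated).st_ino

    # Create a fresh file under the original name while the old one still exists.
    with open(current, "w") as f:
        f.write("new contents\n")
    inode_new = os.stat(current).st_ino

print(inode_before == inode_rotated)  # rename kept the inode
print(inode_new != inode_rotated)     # coexisting files have distinct inodes
```

Both comparisons should hold here because the rotated file still exists, so its inode cannot be handed out again. The trouble starts when the old file is deleted rather than renamed.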

Some UNIX filesystems will use the first available inode when a file is created. That means that if /foo.txt has inode 12937, and it is deleted, then when a new /foo.txt is created it will also have inode 12937. The file input will treat it as the same file and ignore the text it has already read from it. I think this is what you are seeing.
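Here is a toy Python model of how that goes wrong. It is a simplification and not the plugin's actual code: resume_offset is a hypothetical function, the byte counts are invented, and I am assuming the tracker stores one read offset per inode and restarts from zero only when the file is smaller than that offset.

```python
def resume_offset(sincedb, inode, size):
    """Return the byte offset an inode-keyed tracker would resume from."""
    stored = sincedb.get(inode)
    if stored is None:
        return 0        # inode never seen: new file, read from the start
    if size < stored:
        return 0        # assumed shrink check: treat as truncated, start over
    return stored       # same inode, not smaller: assume data was appended


sincedb = {12937: 120}  # old /foo.txt was read up to byte 120 (made-up value)

# New /foo.txt reuses inode 12937 and has already grown to 300 bytes.
skipped = resume_offset(sincedb, 12937, 300)

# Same reuse, but the new file is only 40 bytes when it is checked.
restarted = resume_offset(sincedb, 12937, 40)

# A new file that happened to receive a different inode.
fresh = resume_offset(sincedb, 20001, 300)

print(skipped, restarted, fresh)
```

In the first case the reader resumes at byte 120 of a file it has never read, so the earlier lines are skipped and the first line it emits is likely to start mid-line, which would match a "_grokparsefailure" on a truncated event.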

Some filesystems use versioned inode numbers, so that the inode number changes each time it is used, but many do not.

There are many issues on GitHub related to this. This may be a good place to start exploring if you are interested.

Thank you Badger.

Although it will reuse the inode, it should detect a smaller size and reset the position to zero as the plugin documentation says:

If there is no solution, I think I will mitigate this issue by rotating files less often (increasing the file size or the time between rotations) and by restarting Logstash occasionally to re-read the files and collect the missing lines.
The "sincedb_clean_after" option should also help with this problem.
