How does sincedb_clean_after in file{} work?

espenk · December 19, 2018, 8:27am

Hi. I have been reading https://www.elastic.co/guide/en/logstash/master/plugins-inputs-file.html for a while and came across the following statement: "Sincedb records can now be expired meaning that read positions of older files will not be remembered after a certain time period." I'm aware that this refers to the sincedb_clean_after option made available in March/April 2018. However, I want to confirm that it actually works on my system. Is it expected that inode records get deleted from the sincedb file, or does Logstash check whether the inode entries have expired at each discovery interval?

What is the expected behavior of this option? See my reply below for why I'm looking into this.

espenk · December 19, 2018, 6:01pm

Background on why I'm looking into this:
I believe a byte offset is still being set incorrectly in my test environment. Occasionally, log lines are not read from their beginning. E.g. "2018-12-24 host1 application2: did something" would be read into Logstash as "-24 host1 application2: did something".

The folder which Logstash watches is subject to the following:

Rsyslog receives data and saves it as logtype.YYYY-MM-DD-HH:MM.log
- Files grow over time, and a new file is generated each minute.
- File size is roughly 500 MB during working hours and 100 MB at night.
Files get gzipped if they are older than 6 hours
- .gz files are excluded
If the folder size exceeds 40% of the partition's size, files are removed

During my testing, the files that showed this behavior were still available, and I confirmed that the log line looks fine on disk. In other words, the files that were read incorrectly had not been deleted or gzipped. File deletions are expected to occur after roughly 7 hours.

Options set in file{}:
sincedb_clean_after => 3.2 hours
ignore_older => 3 hours
close_inactive => 1 minute
mode => tail
exclude => .gz
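
For reference, the options above would look roughly like this in a pipeline config. This is only a sketch, not a copy of the config from my test environment: the path is a hypothetical placeholder, and the quoted duration strings and the "*.gz" glob are assumptions about how the values listed above are written out.

```
input {
  file {
    # Hypothetical path; the real folder is not named in this post.
    path => "/var/log/remote/*.log"
    mode => "tail"
    # Assumed glob form of the ".gz" exclusion listed above.
    exclude => "*.gz"
    # Durations written as strings, assuming the plugin's duration-string syntax.
    ignore_older => "3 hours"
    close_inactive => "1 minute"
    sincedb_clean_after => "3.2 hours"
  }
}
```

Note that sincedb_clean_after is set only slightly above ignore_older here, so a sincedb record is meant to outlive the window in which the file can still be picked up.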

Logstash version: 6.5.2
CentOS 7

I have already verified that the logs do not contain any well-hidden 0x0a/0x0d chars that could be misinterpreted as newline.

espenk · December 21, 2018, 3:41pm

I think I may have found the issue, as I have not had any parse failures since. Increasing close_inactive seems to be the fix for me. I wonder if it may be due to files being initially opened, but Logstash being unable to finish reading and close the file handle within the expected 1 minute. Thus, files were perhaps only partially read?

This seems like a likely cause, as files may stop growing before Logstash closes the file handle. In that case there is no further size increase, so the file won't be detected again within the ignore_older window.

system · January 18, 2019, 3:41pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
