Logstash rotating log - sincedb issue on Windows

(Krishna Chaitanya) #1

I am running Logstash 5.2 on Windows 7 with a rotating-log use case: I maintain a file, batchlog.txt, every day, read it into Logstash, and output to ES (daily indices).

My file input config:

input {
  file {
    codec => json
    path => "..\batchlog.txt"
    sincedb_path => "..\.sincedb_test"
  }
}

Every night, I rotate this log file by renaming it with the previous day's date, and then I create a new log file with the same name.

Example: on 02-08-2017 at 00:00:00, I rename batchlog.txt to batchlog-02-07-2017.txt.
Then I create a new batchlog.txt and start writing logs to it.
So, with my configuration, Logstash reads only one file (batchlog.txt), which is what I need.

During rotation, the offset recorded in my sincedb file for the rotated file changes.
At 11:58 PM, the offset for batchlog.txt was 5536732; after the file was renamed, the offset changed to 108.

State of sincedb at 11:58 PM:

384971208-64889-15073280 0 0 5536732

State of sincedb at 12:10 AM the next day:

384971208-64889-15073280 0 0 108     <-- This is the previous batchlog.txt; now renamed
384971208-1497637-3735552 0 0 20456    <-- This is the new batchlog.txt

This offset change also causes unexpected behavior during indexing: during rotation, some of the logs from the previous day's file are duplicated into the current day's index.

Am I following the right procedure? Does renaming the file cause this sincedb behavior and the duplicated logs? If so, how should I rotate the files instead?

(Krishna Chaitanya) #2

Is the offset change in the sincedb file caused by a race condition between renaming the old file and creating the new one?

This behavior did not occur when the file was much smaller (~4 KB).

(Krishna Chaitanya) #3

It happened again. The rotated file's sincedb offset is being changed during the rotation.

Before rotation:

  384971208-64889-15073280 0 0 108     
  384971208-1497637-3735552 0 0 5317059         <-- See offset here

After rotation:

  384971208-64889-15073280 0 0 108     
  384971208-1497637-3735552 0 0 354      <-- This is the previous batchlog.txt; now renamed
  384971208-75829-114884608 0 0 43549    <-- This is the new batchlog.txt

Any fix for this?

(Krishna Chaitanya) #4

Any help here?

(Krishna Chaitanya) #5

When I changed my file input to path => batchlog.txt*, the sincedb offset of the rotated file stopped changing. It now stays the same before and after the rename.

I have 90 days' worth of files (90 files) matching that pattern. Does Logstash keep track of each file? Does that make Logstash contend for resources? Does it try to read each file sequentially?

I am never going to write to an old file anyway. Can I optimize this so that Logstash does not read the old files?
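
One possible way to keep the wildcard pattern while skipping old files is the file input's ignore_older and close_older settings, both given in seconds. The following is a minimal sketch, not a confirmed fix for the offset issue: the paths are left elided as in the post, and the 86400 and 3600 values are assumed thresholds chosen for a daily rotation.

```
input {
  file {
    codec => json
    # Elided paths kept exactly as in the original post.
    path => "..\batchlog.txt*"
    sincedb_path => "..\.sincedb_test"
    # Assumed threshold: ignore files last modified more than one day ago.
    ignore_older => 86400
    # Assumed threshold: close file handles not read from in the last hour.
    close_older => 3600
  }
}
```

With ignore_older set, a file that has not been modified within the window is skipped when it is discovered, and it is picked up again only if it is modified later. Every matched file still gets its own sincedb entry.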
