Sincedb isn't updated until tail reaches the end of the file


It seems that the sincedb file is not updated until the tail reaches the end of the file.

With this configuration:

input {
    file {
        path => "/home1/.logstash/json_input/deco.json.*"
        codec => "json"
        start_position => "beginning"
        add_field => ["[@metadata][input_id]", "deco_json"]
        sincedb_path => "/home1/.logstash/sincedb/deco_json.db2"
    }
}

Every day I move the previous day's log file into "/home1/.logstash/json_input/deco.json.YYYY.MM.DD".
Its size is approximately 1 GB, and it usually takes 20-30 minutes to process all lines of the file.

After yesterday's log file is moved into the Logstash input directory, it is tailed properly and the output is processed into Elasticsearch.

One difficulty I have with this situation:
if Logstash or Elasticsearch breaks while processing some line of the large file,
Logstash has no sincedb position data. Because the tail never reached EOF, the sincedb position was never written.
The configuration above also has no effect on this.


In that case I don't know from which line I should reprocess the log data that was not processed.
So there is no way to fail over.
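For reference, the file input's sincedb_write_interval setting (seconds between sincedb writes) is the option one would normally expect to control this; a minimal sketch of the relevant part of the config (the interval value is illustrative):

```
file {
    path => "/home1/.logstash/json_input/deco.json.*"
    codec => "json"
    start_position => "beginning"
    sincedb_path => "/home1/.logstash/sincedb/deco_json.db2"
    # seconds between sincedb writes (illustrative value)
    sincedb_write_interval => 15
}
```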

Any advice?
Is this the expected behavior?

I am using Logstash 1.5.x.


I think your observation is correct. The underlying filewatch library will read a file in a tight loop until it hits EOF, and only afterwards will it consider updating the sincedb. This seems like a bug to me: why not consider updating the sincedb inside the read-32-kB-and-yield loop? Could you please file an issue for the bug? Here's the relevant code:

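Roughly, the behavior described above can be sketched like this (a schematic in Ruby, not the actual filewatch source; names and structure are illustrative):

```ruby
require "stringio"

CHUNK_SIZE = 32 * 1024  # filewatch reads in 32 kB chunks

# Schematic of the observed behavior: chunks are read and yielded in a
# tight loop, and the sincedb position is only persisted after EOF.
def tail_until_eof(io, sincedb)
  position = 0
  loop do
    chunk = io.read(CHUNK_SIZE)
    break if chunk.nil?            # EOF reached
    position += chunk.bytesize
    # chunk would be yielded to the codec/pipeline here...
    # Note: no sincedb update happens inside this loop.
  end
  sincedb[:position] = position    # only persisted after EOF
end

# A fix along the lines suggested above: also persist progress inside
# the read loop, here after every flush_every_bytes bytes.
def tail_with_periodic_flush(io, sincedb, flush_every_bytes)
  position = 0
  unflushed = 0
  loop do
    chunk = io.read(CHUNK_SIZE)
    break if chunk.nil?
    position += chunk.bytesize
    unflushed += chunk.bytesize
    if unflushed >= flush_every_bytes
      sincedb[:position] = position  # persist progress mid-file
      unflushed = 0
    end
  end
  sincedb[:position] = position
end
```

With the second variant, a crash mid-file would lose at most flush_every_bytes worth of progress instead of the whole file.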
Thank you for the interest.

Okay, thanks. I read a few posts and added the sincedb path:

        sincedb_path => "/persistent/log"
        sincedb_write_interval => 10

I restarted my Logstash and it started loading data, as seen in the attachment. However, it again gets stuck, and Elasticsearch doesn't have the data.