I've been using Logstash for quite some time. I tried using a custom delimiter in the File plugin while reading a static file. As far as I can see, the File plugin reads 32 KB of data at a time and passes it to the tokenizer, which splits it on the delimiter:
    data = watched_file.file_read(32768)
    changed = true
    watched_file.buffer_extract(data).each do |line|
      listener.accept(line)
      @sincedb[watched_file.inode] += (line.bytesize + @delimiter_byte_size)
    end
What happens when the last byte of a chunk is not a newline (i.e. the chunk ends partway through a line)?
I have a custom gzip plugin similar to the File plugin. My regex fails to match the partial line, so that line is skipped and I lose an event. A sample of my code:
    Zlib::GzipReader.open(gzfile) do |stream|
      while line = stream.read(1024) do
        offset += line.unpack("C*").size
        lineno += 1
        @buffer.extract(line).each do |l|
          if !exclude_patterns.nil? and l.match(excludeRegex)
            next
          end
          if l.match(includeRegex)
            event = LogStash::Event.new("message" => @converter.convert(l))
            totevents += 1
            event["file"] = gzfile
            event["offset"] = offset
            event["lineno"] = lineno
            decorate(event)
            queue << event
          end
        end
      end
    end
I have seen this with a custom delimiter, but I believe it can happen with the default \n delimiter as well. Please enlighten me.
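
To make the question concrete, here is a minimal sketch of the behaviour I would expect from a buffering tokenizer: a trailing partial line is held back until the next chunk completes it, and whatever is left at end of file is emitted by an explicit flush. `SketchTokenizer` and its `extract`/`flush` methods are hypothetical names written for this illustration, not the Logstash implementation, and I am only assuming that `@buffer` in my plugin is meant to work along these lines.

```ruby
# Hypothetical stand-in for a buffering tokenizer (illustration only,
# not the class Logstash ships).
class SketchTokenizer
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @pending = String.new # bytes seen so far that are not yet terminated
  end

  # Append a chunk and return only the tokens that are complete.
  # The unterminated tail stays in @pending for the next call.
  # Caveat: a delimiter of exactly one space would trigger Ruby's
  # whitespace-splitting special case in String#split.
  def extract(data)
    @pending << data
    parts = @pending.split(@delimiter, -1) # -1 keeps a trailing empty string
    @pending = parts.pop || String.new
    parts
  end

  # Return whatever is still buffered (call once at end of input).
  def flush
    rest = @pending
    @pending = String.new
    rest
  end
end

# Chunks deliberately cut in the middle of lines, like fixed-size reads.
tok = SketchTokenizer.new("\n")
lines = []
["first li", "ne\nsecond line\nthi", "rd line"].each do |chunk|
  lines.concat(tok.extract(chunk))
end
tail = tok.flush
lines << tail unless tail.empty?
```

With this sketch no line is lost across chunk boundaries, and the final line without a trailing delimiter only appears because of the `flush` after the loop. If the real tokenizer behaves the same way, a read loop that never flushes at end of file would drop that last partial line.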