Logstash File plugin custom delimiter

I've been using Logstash for quite a while, and I tried using a custom delimiter in the File plugin while reading a static file. I see the File plugin reads 32 KB of data and passes it to the tokenizer for splitting by the delimiter:

 data = watched_file.file_read(32768)
 changed = true
 watched_file.buffer_extract(data).each do |line|
   @sincedb[watched_file.inode] += (line.bytesize + @delimiter_byte_size)

What happens when the last byte of the chunk is not a newline (i.e., the read ends partway through a line)?
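For reference, here is a minimal sketch of how I understand a buffered tokenizer is supposed to handle a chunk that ends mid-line (my own simplified model, not the actual Logstash source): the trailing partial stays in the buffer until the next chunk completes it.

```ruby
class SimpleTokenizer
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @input = []  # holds the partial tail between extract calls
  end

  # Rejoin any leftover with the new data, split, and keep the
  # (possibly empty) trailing partial for the next call.
  def extract(data)
    data = @input.join + data
    @input.clear
    entities = data.split(@delimiter, -1)
    @input << entities.pop
    entities
  end

  # Return whatever is still buffered (e.g. at end of file).
  def flush
    tail = @input.join
    @input.clear
    tail
  end
end

tok = SimpleTokenizer.new
tok.extract("line1\nline2\npart")  # => ["line1", "line2"]  ("part" stays buffered)
tok.extract("ial\n")               # => ["partial"]
```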

I have a custom gzip plugin, similar to the File plugin, and my regex fails on the partial line, so that line is skipped and I lose an event. A sample of my code:

 Zlib::GzipReader.open(gzfile) do |stream|
   while line = stream.read(1024) do
     offset += line.bytesize
     lineno += 1
     @buffer.extract(line).each do |l|
       # skip lines matching the exclude pattern
       next if !exclude_patterns.nil? && l.match(excludeRegex)
       if l.match(includeRegex)
         event = LogStash::Event.new("message" => @converter.convert(l))
         totevents += 1
         event["file"] = gzfile
         event["offset"] = offset
         event["lineno"] = lineno
         queue << event
       end
     end
   end
 end

I have seen this with a custom delimiter, and it could happen with the \n delimiter as well. Please enlighten me.

There is a bug in BufferedTokenizer when a custom (multi-byte) delimiter is used.
In the extract method, the leftover part of the previous data chunk is prepended to the entities array only after the new data has been split by the delimiter.
As a result, when a data chunk contains part of a delimiter, the rejoined delimiter is never recognized, and the text on either side of it gets merged into a single event instead of being split.
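A simplified model of what I mean (an illustration only, not the actual Logstash source) shows the failure with a two-byte delimiter straddling a chunk boundary:

```ruby
# Simplified model of the buggy extract: the leftover is prepended
# AFTER the split, so a delimiter split across chunks is never seen.
class BuggyTokenizer
  def initialize(delimiter)
    @delimiter = delimiter
    @input = []
  end

  def extract(data)
    entities = data.split(@delimiter, -1)
    entities.first.prepend(@input.join)  # leftover merged after splitting
    @input.clear
    @input << entities.pop               # keep the new partial tail
    entities
  end
end

tok = BuggyTokenizer.new("||")
p tok.extract("a||b|")  # => ["a"]      ("b|" is buffered)
p tok.extract("|c||d")  # => ["b||c"]   "b" and "c" fused; the rebuilt "||" is missed
```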

def extract(data)
    data = @input.join + data # added
    @input.clear  # added

    entities = data.split @delimiter, -1

Adding the first two lines (marked # added) fixes the issue.
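With those two lines in place, the same simplified model (again, an illustration rather than the real source) re-splits the rejoined data, so a delimiter split across chunks is handled correctly:

```ruby
class FixedTokenizer
  def initialize(delimiter)
    @delimiter = delimiter
    @input = []
  end

  def extract(data)
    data = @input.join + data  # added: rejoin leftover BEFORE splitting
    @input.clear               # added
    entities = data.split(@delimiter, -1)
    @input << entities.pop     # keep the new partial tail
    entities
  end
end

tok = FixedTokenizer.new("||")
p tok.extract("a||b|")  # => ["a"]
p tok.extract("|c||d")  # => ["b", "c"]  the straddling "||" is now recognized
```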