Logstash 6.2.4 read_to_eof: no delimiter found in current chunk


#1

Hi guys,

I am using the new file input plugin 4.1.1 and keep seeing the warning below logged numerous times. What does it mean?

[WARN ][filewatch.tailmode.handlers.grow] read_to_eof: no delimiter found in current chunk

Cheers,


(Guy Boertje) #2

That error was fixed just today; please update to v4.1.2. Apologies.


#3

Thanks for the quick reply.

One question unrelated to the topic (maybe I should create another thread): is there a problem in 4.1.1 that causes the same log events to be sent repeatedly, resulting in duplicate docs in Elasticsearch? I am seeing numerous duplicate docs after the upgrade.


(Guy Boertje) #4

The error caused the same piece of content to continually be reprocessed - I assume that is where the duplicates come from.

Are you using /dev/null for your sincedb path?

I suggest you start from scratch: re-read the files into a new index and then check for duplicates. If you still see duplicates after this, please open a new topic titled "File input v4.1.2 tailing - duplicate docs ingested".
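For context on why the sincedb question matters: pointing `sincedb_path` at `/dev/null` discards the recorded read offsets, so every Logstash restart re-reads the files from the beginning, which by itself produces duplicate docs. A minimal sketch (the path glob is hypothetical, based on the log path mentioned later in this thread):

```conf
input {
  file {
    path => "/foo/bar/*.log"
    # /dev/null means no offsets survive a restart, so the files
    # are re-read from the start each time -> duplicate docs.
    sincedb_path => "/dev/null"
  }
}
```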


#5

Let me open a new topic


(Guy Boertje) #6

No. Create a new topic only after you confirm that duplicates still occur with the new version, v4.1.2.


#7

Ah, ok, roger.

P.S.: my sincedb is not /dev/null and I've started from scratch.


#8

Related to the main topic: I'm getting the message below after upgrading to 4.1.2. Should I take any action, or is this a bug?

[2018-05-04T09:52:50,706][INFO ][filewatch.tailmode.handlers.grow] buffer_extract: a delimiter can't be found in current chunk, maybe there are no more delimiters or the delimiter is incorrect or the text before the delimiter, a 'line', is very large, if this message is logged often try increasing the `file_chunk_size` setting. {"delimiter"=>"\n", "read_position"=>827326, "bytes_read_count"=>66, "last_known_file_size"=>827392, "file_path"=>"/foo/bar/my.log"}
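The message itself suggests raising `file_chunk_size` when it is logged often, i.e. when a single line can be longer than one chunk. A hypothetical sketch (path and value are assumptions, not from this thread):

```conf
input {
  file {
    path => "/foo/bar/*.log"
    mode => "tail"
    # The default chunk is 32768 bytes; raise it if single lines
    # can exceed one chunk, which is the condition this INFO
    # message warns about.
    file_chunk_size => 131072
  }
}
```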

FYI, I got this error while doing some testing in relation to this topic: Logstash 6.2.1 Big .since-db file causes OutOfMemory - and added a reply there.


(Guy Boertje) #9

@ld_pvl

This one is relevant and informative, but whether it is important depends on how the tailed files are being filled.

I'll break it down.

  • filewatch.tailmode.handlers.grow - the code that executes in tail mode when a file is seen to have grown since the last check.
  • "delimiter"=>"\n" - the standard newline delimiter.
  • "read_position"=>827326 - the offset in the file that the "chunk" was read from.
  • "bytes_read_count"=>66 - the number of bytes in the "chunk"; normally this is 32768 (32K). Since it is smaller than 32K, we are reading the last few bytes of the file as it stood at that moment.
  • "last_known_file_size"=>827392 - the size of the file when we last checked (maybe one second, the scan_interval, ago).

827326 + 66 = 827392 - yep, we read to the end of the file as we saw it at that time.

Interpretation:

  1. The system will write the rest of the line later. There was no newline character in those 66 bytes because the system writing the file may not yet have written/flushed the rest of the line and its closing newline. If this is the case, the message can be ignored; it is INFO, after all.
  2. The system will never write the rest of the line. In that case this message is important: those 66 bytes are lost, stuck in the buffer (each file has its own buffer), and no more content will arrive to unstick them. Remember that tailing is really an endless stream of content, so there is no point at which we can know the buffer may safely be flushed artificially.

Read mode to the rescue. In read mode we assume the file is a fixed-length stream, so in this case we can artificially flush the buffer when we reach the end of the stream.
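A minimal read-mode sketch (the path glob is an assumption for illustration):

```conf
input {
  file {
    path => "/foo/bar/*.log"
    # Treat each discovered file as a fixed-length stream: the
    # buffer is flushed at EOF, so a trailing line without a final
    # newline is still emitted as an event.
    mode => "read"
  }
}
```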

In 4.1.2 we have a limitation: discovered files should not grow while (or after) we read them, or the additional content may not be read. This means you should do an atomic write or copy - that is, write or copy to a folder outside of the path glob and then rename the file so it becomes discoverable. We have plans to fix this.
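The write-then-rename pattern can be sketched as follows (a minimal illustration, assuming the watched glob matches only `*.log` so the staging `.tmp` file is never discovered; the function name is made up for the example):

```python
import os
import tempfile

def atomic_publish(content: str, dest_path: str) -> None:
    """Stage the file outside the watched glob, then rename it into
    place so it appears to the file input fully formed."""
    dest_dir = os.path.dirname(dest_path) or "."
    # Stage in the same directory (same filesystem) so the final
    # rename is atomic; the .tmp suffix keeps it out of a *.log glob.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())   # ensure bytes hit disk before publishing
        os.rename(tmp_path, dest_path)  # atomic on POSIX within one filesystem
    except Exception:
        os.unlink(tmp_path)
        raise
```

The key point is that the reader never sees a half-written file: it either does not match the glob yet, or it is already complete.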

Please tell me whether interpretation 1 or 2 applies to your case.


#10

Thanks a lot


(Guy Boertje) #11

See this post for continuation.


(system) #12

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.