Filebeat loops while processing corrupted logs

Hello! This issue can be related to:

Filebeat 6.4.2 and 6.5.1: Read line error: "parsing CRI timestamp" and "invalid CRI log format", and many other situations where container log files are corrupted. There is a known issue about it: Error grabbing logs: invalid character '\x00' looking for beginning of value · Issue #140 · docker/for-linux · GitHub. In my opinion it can happen when the system was powered off abruptly.
What happens? When Filebeat tries to parse a corrupted file, it repeatedly logs identical error messages such as "ERROR log/harvester.go:297 Read line error: invalid CRI log format; File:" or "Read line error: parsing CRI timestamp: parsing time", and it stops processing the log. Why does this happen? Because if an error occurs here https://github.com/elastic/beats/blob/3451d06cc192ca1077149f2e709ec248b6b6613f/filebeat/input/log/harvester.go#L263 the harvester does not save state.Offset, and after a restart it tries to continue from the same point, hitting the same error again. It can be fixed by: fix infinite loop when input files is corrupted · yverbin/beats@5f7614f · GitHub. The idea behind the fix is sketched below.
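For illustration, here is a minimal, self-contained Go sketch of that idea (this is not the actual harvester code; the dockerLine type, the readFrom function, and the file path are made up for the example): on a parse error it skips past the unparsable bytes and still advances the offset it would persist, so a restart does not re-read the same corrupted data forever. In Filebeat itself the same principle would mean saving state.Offset even when the read loop bails out with an error.

```go
// Simplified sketch of the failure mode and the fix idea, not Filebeat code:
// if a parse error leaves the saved offset unchanged, every restart re-reads
// the same corrupted bytes forever. Advancing the offset past the bad bytes
// breaks the loop.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

type dockerLine struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

// readFrom reads the file starting at offset and returns the new offset.
// On a parse error it still advances past the bad line instead of keeping
// the old offset, which is what prevents the infinite loop.
func readFrom(path string, offset int64) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return offset, err
	}
	defer f.Close()
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return offset, err
	}

	r := bufio.NewReader(f)
	for {
		line, err := r.ReadBytes('\n')
		if len(line) > 0 {
			// Strip NUL padding left behind by an abrupt power-off.
			clean := bytes.TrimSpace(bytes.Trim(line, "\x00"))
			var entry dockerLine
			if jsonErr := json.Unmarshal(clean, &entry); jsonErr != nil {
				fmt.Printf("skipping %d unparsable bytes: %v\n", len(line), jsonErr)
			} else {
				fmt.Printf("event: %s", entry.Log)
			}
			// Advance the offset whether or not the line parsed.
			offset += int64(len(line))
		}
		if err != nil {
			return offset, nil // EOF: report how far we got
		}
	}
}

func main() {
	off, err := readFrom("corrupted-json.log", 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("saved offset:", off)
}
```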
I'd like to create a PR for this.

Thanks for the report and suggested fix! Could you give a little more info? It looks like the issue you reference was believed fixed a while ago. What version are you seeing it on, and do you have a reproducible failure? It would be helpful to see examples of logs that cause this failure in current versions so we can reopen that issue or create a new one.

The issue is reproducible on the 6.x, 7.0.1, and 7.1.0 tags, and on the master branch too. For example, the corrupted part of a file looks like:

{"log":"127.0.0.1 - - [28/May/2019:08:34:22 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.52.1" "-"\n","stream":"stdout","time":"2019-05-28T08:55:22.377165166Z"}
{"log":"127.0.0.1 - - [28/May/2019:08:34:22 +0000] "GET /beforeindex.html HTTP/1.1" 200 612 "-" "curl/7.52.1" "-"\n","stream":"stdout","time":"2019-05-55T08:47:22.978902019Z"}
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@{"log":"127.0.0.1 - - [28/May/2019:08:34:23 +0000] "GET /afterindex.html HTTP/1.1" 200 612 "-" "curl/7.52.1" "-"\n","stream":"stdout","time":"2019-05-28T08:50:23.32390429Z"}
{"log":"127.0.0.1 - - [28/May/2019:08:34:23 +0000] "GET /newindex.html HTTP/1.1" 200 612 "-" "curl/7.52.1" "-"\n","stream":"stdout","time":"2019-05-28T08:55:23.624256261Z"}

where ^@ is the NUL byte (\x00).

The issue can be reproduced by writing NUL bytes directly into a container log file:
echo -n -e '\x00\x00\x00\x00\x00\x00\x00\x00\x00' >> /var/lib/docker/containers/{containerid}/{containerid}-json.log
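Alternatively, to avoid touching a real container's log, here is a small Go sketch (the output file name, byte count, and log contents are made up for the example) that generates a docker-style JSON log file with NUL padding injected before one of the lines; pointing a Filebeat docker/container input at such a file triggers the same read errors:

```go
// Writes a docker-style JSON log file where one line is prefixed with NUL
// padding, simulating the corruption left by an abrupt power-off.
package main

import (
	"bytes"
	"os"
)

func main() {
	var buf bytes.Buffer
	buf.WriteString(`{"log":"before corruption\n","stream":"stdout","time":"2019-05-28T08:55:22.377165166Z"}` + "\n")
	buf.Write(bytes.Repeat([]byte{0x00}, 38)) // NUL padding on the start of the next line
	buf.WriteString(`{"log":"after corruption\n","stream":"stdout","time":"2019-05-28T08:55:23.624256261Z"}` + "\n")

	if err := os.WriteFile("corrupted-json.log", buf.Bytes(), 0o644); err != nil {
		panic(err)
	}
}
```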


I'm facing the same problem on the latest version, 7.1.0, too; it is very annoying :frowning:
It would be great if this problem could be solved...
Thank you!


I have the same problem. I haven't found a solution yet.
