Harvester.go infinitely loops on decoding JSON, fails to provide error context

The GitHub new bug report template said I should come here first, so I'm following the process.

There are two bad things happening with filebeat-oss:7.1.0:

  • it appears to be ignoring json.ignore_decoding_error: true and continually trying to read the same offset over and over (which, of course, deterministically fails)
  • it fails to log the offset of the error, forcing one to go spelunking through data/registry/filebeat/data.json to find the offending file entry and grab its offset value; it would be far, far better if harvester.go emitted the offset, since it presumably has it in hand at the time
ERROR        log/harvester.go:281        Read line error: decoding docker JSON: invalid character 'l' after object key:value pair; File: /var/lib/docker/containers/4a0ec7b116d9ff05a006c2084de8d652c452bccf7ad4a42d1a636d678e5360ca/4a0ec7b116d9ff05a006c2084de8d652c452bccf7ad4a42d1a636d678e5360ca-json.log

By sniffing out the offending offset from data.json, we can see it is an overwritten line starting at offset 0x1ae14000:

$ xxd -s 0x1ae13fb2 -l 128 $the_file
1ae13fb2: 7b22 6c6f 6722 3a22 3230 3139 2d30 322d  {"log":"2019-02-
1ae13fc2: 3133 5f31 303a 3033 3a32 332e 3633 3031  13_10:03:23.6301
1ae13fd2: 3720 7469 6d65 3d5c 2232 3031 392d 3032  7 time=\"2019-02
1ae13fe2: 2d31 3354 3130 3a30 333a 3233 5a5c 2220  -13T10:03:23Z\"
1ae13ff2: 6c65 7665 6c3d 6572 726f 7220 6d73 7b22  level=error ms{"
1ae14002: 6c6f 6722 3a22 3230 3139 2d30 322d 3133  log":"2019-02-13
1ae14012: 5f31 303a 3033 3a33 372e 3831 3435 3720  _10:03:37.81457
1ae14022: 7469 6d65 3d5c 2232 3031 392d 3032 2d31  time=\"2019-02-1

If filebeat would just advance the file pointer one character at a time, I would consider that a much better outcome than retrying the same offset and failing forever. As it stands, it is stuck in an infinite loop until someone stops filebeat, manually adjusts the offset in the registry, and starts it back up again.

It looks like a fix for this probably went in a few days ago, so hopefully this should be fixed by building from the current repo, or it will be in the next regular release.

Fantastic, thank you. I wish I had known about that PR before it landed, since I would have suggested logp.Info("Skipping unparsable line in file: %v at offset: %d", h.state.Source, h.state.Offset), but coulda-shoulda-woulda.

I look forward to seeing that in an upcoming release, and I appreciate you pointing me to the PR.

This PR does not prevent potential infinite loops in filebeat when parsing corrupted logs. It doesn't work in my case: filebeat still loops while processing corrupted logs.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.