Filebeat cuts off first part of log event

I'm using Beats, Logstash, and Elasticsearch to read log info in JSON, add some fields, and index it. Beats is set up to treat a single line in the logs as an event, and each JSON object should be on a single line. I noticed that some documents stored in Elasticsearch didn't have the right info, and after investigating I found documents that looked like this...

  {
    "_index": "testresults",
    "_type": "doc",
    "_id": "ziVky2YBWK0nRFtEhD4U",
    "_score": 1,
    "_source": {
      "tags": [
        "_jsonparsefailure",
        "_elasticsearch_lookup_failure"
      ],
      "runNumber": 1,
      "firstFail": null,
      "message": """.4651157-04:00","jobID":"PL622"}"""
    }
  }
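
The _jsonparsefailure tag comes from Logstash's JSON parsing, in my case a json filter that looks roughly like this (a sketch; the elasticsearch lookup filter that adds _elasticsearch_lookup_failure is omitted):

  filter {
    json {
      # Parse the line Filebeat shipped in the "message" field.
      # On a parse error, Logstash keeps the event and tags it
      # with _jsonparsefailure instead of dropping it.
      source => "message"
    }
  }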

So Logstash is telling me that the message Filebeat sent it is not valid JSON, which it isn't. At this point I assumed that the logs had carriage returns in the middle of the JSON object, so I turned on debug logging in Filebeat and started watching for a bad message. And I found one...

2018-10-31T15:15:36.979-0400 DEBUG [publish] pipeline/processor.go:275 Publish event: {
  "@timestamp": "2018-10-31T19:15:36.979Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.2.4"
  },
  "offset": 10316,
  "message": "}",
  "source": "d:\autotestdata\filebeat\PL622_tsrx-29.json",
  "prospector": {
    "type": "log"
  },
  "beat": {
    "name": "jarvis",
    "hostname": "jarvis",
    "version": "6.2.4"
  }
}

In this case the entire JSON object except for the final "}" has been cut off. So I looked at the contents of PL622_tsrx-29.json and found that the whole object was on a single line, as expected. I then copied that line to a second line in the file and saved it; Filebeat saw that the file had changed and handled it correctly. So I conclude that there is nothing wrong with the logs or with how I've set up Filebeat, yet for some reason Filebeat occasionally mangles the message. Here's how I have Filebeat configured...

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - d:\autotestdata\filebeat\*.json

filebeat.config.modules:
  path: C:\Program Files\Filebeat>*.yml
  reload.enabled: false

output.logstash:
  hosts: ["localhost:5044"]

logging.level: debug

I've looked at my data in Elasticsearch, and it appears that about 1% of my log info is being dropped because of this problem. Given its indeterminate nature, I don't believe I can provide a test case that reliably reproduces it. Is this a known issue? Can someone provide guidance on how to work around it?

Thanks,
Chris

Hi @hyattcs, and welcome 🙂

If the log files contain one JSON object per line, they can be parsed directly by Filebeat; take a look at the json options.
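
Something along these lines in the prospector config should work (an untested sketch for 6.x, reusing your paths):

  filebeat.prospectors:
  - type: log
    enabled: true
    paths:
      - d:\autotestdata\filebeat\*.json
    # Decode each line as a JSON object in Filebeat itself
    json.keys_under_root: true
    # On a decode failure, add an error key instead of silently passing the raw line
    json.add_error_key: true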

In any case, it is weird to see it collecting incomplete lines. How are you writing to these files? Is it possible that something truncates the file from time to time?

Hi Jaime, thanks for getting back to me. I didn't know about the JSON parsing options in Filebeat; I'll look into that.

There's a separate process running on another machine that's writing the files. The file is not being truncated. I've traced several of the failures back to their origin file, and the data in the file is complete.

I've done an internet search, and it looks like this is not an isolated incident. Others have seen the same problem...

I've been thinking about the possibility of a race condition between the file being written and Filebeat reading it. Is there a way in the Filebeat configuration to set a delay between file change detection and the initial read? I just went through the prospector settings, and I didn't see anything obvious.
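
The closest knobs I've found are the prospector's scan and backoff settings, which control how often Filebeat checks for changed files and how long a harvester waits at end-of-file before reading again. Neither is a true post-detection delay, but for reference (values illustrative, based on my reading of the 6.x docs):

  filebeat.prospectors:
  - type: log
    paths:
      - d:\autotestdata\filebeat\*.json
    # How often the prospector checks for new or changed files (default 10s)
    scan_frequency: 15s
    # How long a harvester waits after reaching EOF before it retries (default 1s)
    backoff: 5s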

I've been able to determine that this is not a problem with Filebeat. It turns out we were creating multiple log files with the same name. The first file would get processed without issue; then the second file would overwrite the first. If the second file was bigger, Filebeat would pick up where it left off in the first, which has the effect of cutting off the first half of the second file.
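
In hindsight that matches how Filebeat tracks state: its registry stores a read offset per file, so an entry like the one below (illustrative and trimmed to the relevant fields, with the offset from the debug log above) survives the overwrite, and a larger replacement file just gets read from the stored offset onward:

  [
    {
      "source": "d:\\autotestdata\\filebeat\\PL622_tsrx-29.json",
      "offset": 10316
    }
  ]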
