Filebeat skips fields for the first line in UTF8 with BOM encoding

Hi, I'm using Filebeat to ship logs to the ELK stack. The log files consist of JSON lines. I have also added custom fields to the output through Filebeat.

When I save the log files as UTF-8 with a BOM signature, Filebeat doesn't add the custom fields to the first line of each log file. If I save the log files as UTF-8 without the signature, it runs without any problem.

I think there is a bug in how the UTF-8 signature is handled.

My Filebeat config:

filebeat.inputs:
- type: log
  enabled: true
  encoding: utf-8
  harvester_buffer_size: 1024
  json.keys_under_root: false

  close_removed: true
  clean_removed: true
  close_inactive: 5m
  close_timeout: 5m
  
  paths:
    - C:\Prod\*.log

  fields:
    env: prod
    source: integration
    project: testproject

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

output.logstash:
  hosts: ["10.10.10.11:5001"]

logging.level: info

Hi @alatas and welcome :slight_smile:

Indeed, this doesn't look good. Could you check the Filebeat logs for any error related to JSON parsing?

Yes, there is actually an error line in the log. After the error line, it wrote the event to the debug log with the custom fields, as it should. But it didn't output this event with the custom fields to Logstash.

It also drops the other additional fields, such as source, beat, and offset. I've added the Logstash output below.

2018-08-08T22:39:17.167+0300	ERROR	reader/json.go:33	Error decoding JSON: invalid character 'ï' looking for beginning of value
2018-08-08T22:39:17.168+0300	DEBUG	[publish]	pipeline/processor.go:291	Publish event: {
  "@timestamp": "2018-08-08T19:39:17.167Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.3.2"
  },
  "offset": 0,
  "json": {},
  "message": "{\"Test\":\"TestData\",\"Data\":{\"NestedField1\":\"DataField1\",\"NestedField2\":\"DataField2\"}}",
  "source": "/Users/sukru/Downloads/filebeat-6.3.2-darwin-x86_64/test/test1.log",
  "input": {
    "type": "log"
  },
  "fields": {
    "review": 1,
    "level": "debug"
  },
  ...

Logstash output:

{
          "Data" => {
        "NestedField2" => "DataField2",
        "NestedField1" => "DataField1"
    },
      "@version" => "1",
          "Test" => "TestData",
    "@timestamp" => 2018-08-08T19:40:18.187Z
}
