Filebeat decode_json_fields Can not index event


#1

I have a log event where the field "log" has the value (JSON):

{“caller”:“server.go:53",“msg”:“server starting”,“port”:8080}

In our filebeat configuration, we have the config:

processors:
  - decode_json_fields:
      fields: ["log"]
      target:
      overwrite_keys: true

But we see errors in the filebeat log:

WARN Can not index event (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [log]","caused_by":{"type":"illegal_argument_exception","reason":"unknown property [caller]"}}

I don't know what this "unknown property" refers to. Any ideas?

Jeff


(Andrew Kroh) #2

Please run filebeat with debug enabled so that we can see exactly what the JSON event looks like that Filebeat is publishing to Elasticsearch. You can put logging.level: debug in your config file then check the logs in /var/log/filebeat/filebeat.
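A logging section along these lines in filebeat.yml should do it (a sketch; the path and file name shown are the usual Linux package defaults, adjust for your install):

```yaml
# Sketch: enable debug logging to files in filebeat.yml.
# path and name below are the common Linux package defaults (an assumption).
logging.level: debug
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
```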


#3

Hi Andrew,

With logging.level: debug and:

processors:
  - decode_json_fields:
      fields: ["log"]
      target:
      overwrite_keys: true

2017-05-26T09:35:48Z DBG Publish: {
"@timestamp": "2017-05-26T09:35:38.411Z",
"beat": {
"hostname": "ip-172-29-194-48",
"name": "ip-172-29-194-48",
"version": "5.4.0"
},
"fields": {
"availability_zone": "us-east-1a",
"instance_id": "i-033293a15813aa811"
},
"input_type": "log",
"log": {
"caller": "server.go:53",
"msg": "target blank",
"port": 8080
},
"offset": 2081,
"source": "/var/log/test.log",
"stream": "stdout",
"time": "2017-05-26T09:06:13.565020858Z",
"type": "log"
}
2017-05-26T09:35:48Z WARN Can not index event (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [log]","caused_by":{"type":"illegal_argument_exception","reason":"unknown property [caller]"}}

If I then change the filebeat config to have the target as the empty string, it seems to work:

processors:
  - decode_json_fields:
      fields: ["log"]
      target: ""
      overwrite_keys: true

2017-05-26T09:40:31Z DBG Publish: {
"@timestamp": "2017-05-26T09:40:26.728Z",
"beat": {
"hostname": "ip-172-29-194-48",
"name": "ip-172-29-194-48",
"version": "5.4.0"
},
"caller": "server.go:53",
"fields": {
"availability_zone": "us-east-1a",
"instance_id": "i-033293a15813aa811"
},
"input_type": "log",
"log": "{"caller":"server.go:53","msg":"target empty string","port":8080}",
"msg": "target empty string",
"offset": 2227,
"port": 8080,
"source": "/var/log/test.log",
"stream": "stdout",
"time": "2017-05-26T09:06:13.565020858Z",
"type": "log"
}
2017-05-26T09:40:31Z DBG output worker: publish 1 events
2017-05-26T09:40:31Z DBG Ping status code: 200
2017-05-26T09:40:31Z INFO Connected to Elasticsearch version 2.3.2
2017-05-26T09:40:31Z DBG PublishEvents: 1 events have been published to elasticsearch in 10.912823ms.
2017-05-26T09:40:31Z DBG send completed
2017-05-26T09:40:31Z DBG Events sent: 1

However, if we set the target to "log", we see errors again (looks the same as the blank case):

processors:
  - decode_json_fields:
      fields: ["log"]
      target: "log"
      overwrite_keys: true

Log:

2017-05-26T09:29:31Z DBG Publish: {
"@timestamp": "2017-05-26T09:29:21.640Z",
"beat": {
"hostname": "ip-172-29-194-48",
"name": "ip-172-29-194-48",
"version": "5.4.0"
},
"fields": {
"availability_zone": "us-east-1a",
"instance_id": "i-033293a15813aa811"
},
"input_type": "log",
"log": {
"caller": "server.go:53",
"msg": "target log 2",
"port": 8080
},
"offset": 1621,
"source": "/var/log/test.log",
"stream": "stdout",
"time": "2017-05-26T09:06:13.565020858Z",
"type": "log"
}
2017-05-26T09:29:31Z WARN Can not index event (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [log]","caused_by":{"type":"illegal_argument_exception","reason":"unknown property [caller]"}}

Setting the target to any other field name also seems to work:

Config:

processors:
  - decode_json_fields:
      fields: ["log"]
      target: "random"
      overwrite_keys: true

Log:

2017-05-26T09:44:17Z DBG Publish: {
"@timestamp": "2017-05-26T09:44:07.657Z",
"beat": {
"hostname": "ip-172-29-194-48",
"name": "ip-172-29-194-48",
"version": "5.4.0"
},
"fields": {
"availability_zone": "us-east-1a",
"instance_id": "i-033293a15813aa811"
},
"input_type": "log",
"log": "{"caller":"server.go:53","msg":"target random","port":8080}",
"offset": 2367,
"random": {
"caller": "server.go:53",
"msg": "target random",
"port": 8080
},
"source": "/var/log/test.log",
"stream": "stdout",
"time": "2017-05-26T09:06:13.565020858Z",
"type": "log"
}
2017-05-26T09:44:17Z DBG output worker: publish 1 events
2017-05-26T09:44:17Z DBG Run prospector
2017-05-26T09:44:17Z DBG Start next scan
2017-05-26T09:44:17Z DBG Check file for harvesting: /var/log/test.log
2017-05-26T09:44:17Z DBG Update existing file for harvesting: /var/log/test.log, offset: 2367
2017-05-26T09:44:17Z DBG Harvester for file is still running: /var/log/test.log
2017-05-26T09:44:17Z DBG Prospector states cleaned up. Before: 1, After: 1
2017-05-26T09:44:17Z DBG Ping status code: 200
2017-05-26T09:44:17Z INFO Connected to Elasticsearch version 2.3.2
2017-05-26T09:44:17Z DBG PublishEvents: 1 events have been published to elasticsearch in 23.16683ms.
2017-05-26T09:44:17Z DBG send completed

In summary:

To merge the decoded JSON fields into the root of the event, target must be set explicitly to an empty string (target: ""). Leaving target blank (as shown in the docs) behaves the same as omitting it entirely, which means the decoded JSON object replaces the string field it was read from.

We see the illegal_argument_exception, unknown property error whenever the decoded JSON object replaces the field it was read from; it doesn't matter whether that happens because target was left blank or because it was explicitly set to the source field name (in our case "log"). Presumably the index mapping already has "log" typed as a string, so Elasticsearch rejects an object value for it. Writing the decoded object to the root of the event or to a different field publishes successfully.
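For anyone else hitting this, the behavior above can be modeled in a few lines of Python (a simplified sketch of how decode_json_fields appears to behave based on these tests, not the actual Beats implementation):

```python
import json

def decode_json_fields(event, fields, target=None, overwrite_keys=False):
    """Simplified model of Filebeat's decode_json_fields processor."""
    for field in fields:
        decoded = json.loads(event[field])
        if target is None:
            # Blank/omitted target: the decoded object replaces the source field.
            event[field] = decoded
        elif target == "":
            # Empty-string target: merge decoded keys into the event root;
            # the original string field is left in place.
            for key, value in decoded.items():
                if overwrite_keys or key not in event:
                    event[key] = value
        else:
            # Named target: store the decoded object under that key
            # (if target equals the source field, the string is replaced).
            event[target] = decoded
    return event

event = {"log": '{"caller":"server.go:53","msg":"server starting","port":8080}'}
print(decode_json_fields(dict(event), ["log"], target="", overwrite_keys=True))
```

With target="" the merged event keeps the raw "log" string alongside "caller", "msg", and "port" at the root, which matches the debug output above; target=None or target="log" turns "log" into an object and triggers the mapping error.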

Jeff


(system) #4

This topic was automatically closed after 21 days. New replies are no longer allowed.