Hello!
I am having a hard time finding the right Filebeat configuration to correctly parse nested JSON log lines. The top level is parsed without any issues, but nothing I have tried manages to decode the JSON string nested inside it.
Here is the current configuration I have:
# filebeat-inputs
apiVersion: v1
data:
  kubernetes.yml: |-
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/server.log
      exclude_files: ['\.gz$|gc.log']
      json:
        add_error_key: true
        # message_key: msg
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/io.log
      json:
        add_error_key: true
        # message_key: msg
kind: ConfigMap
# filebeat-config
apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: true
        reload.period: 10s
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    processors:
      #- decode_json_fields:
      #    fields: ["msg"]
      #    target: ""
      #    max_depth: 5
      - add_cloud_metadata:
      - add_kubernetes_metadata:
          in_cluster: true
          default_indexers.enabled: false
          default_matchers.enabled: false
          include_pod_uid: true
          indexers:
            - pod_uid: ~
          matchers:
            - logs_path:
                logs_path: /var/lib/kubelet/pods/
                resource_type: pod
    output.elasticsearch:
      ...
    setup.template.name: "filebeat-%{[beat.version]}"
    setup.template.pattern: "filebeat-%{[beat.version]}-*"
kind: ConfigMap
This is a sample log output I can see in my pod:
{"msg":"{\"id\":\"G-A_API_Client_Request_Received\",\"version\":\"1.0\",\"timestamp\":1547451531,\"metadata\":null,\"payload\":\"API received params {input=China,sh, components=country:CN, language=en}. Request Id 4c565eeb-53ba-4def-acd4-8090f0319836\",\"uuid\":\"056e7ef1-0ed3-4399-8fff-d6897a9181c0\"}","level":"INFO","logger":"c.f.e.r.d.DefaultEventLogger","thread":"dw-2993 - GET /api/v1/test/sample?searchTerm=sh&filters=country:CN&language=en","timestamp":1547451531386}
And this is the result I can see in Kibana:
{
  "_index": "filebeat-6.4.0-test-2019.03",
  "_type": "doc",
  "_id": "OptMS2gBA7l4X6dkiYEK",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-01-14T07:38:52.418Z",
    "prospector": {
      "type": "log"
    },
    "beat": {
      "version": "6.4.0",
      "name": "filebeat-zg5np",
      "hostname": "filebeat-zg5np"
    },
    "host": {
      "name": "filebeat-zg5np"
    },
    "kubernetes": {
      ...
    },
    "source": "/var/lib/kubelet/pods/365220c3-15d4-11e9-b099-42010a840057/volumes/kubernetes.io~empty-dir/applogs/server.log",
    "offset": 227623,
    "meta": {
      "cloud": {
        ...
      }
    },
    "json": {
      "msg": "{\"id\":\"G-A_API_Client_Request_Received\",\"version\":\"1.0\",\"timestamp\":1547451531,\"metadata\":null,\"payload\":\"API received params {input=China,sh, components=country:CN, language=en}. Request Id 4c565eeb-53ba-4def-acd4-8090f0319836\",\"uuid\":\"056e7ef1-0ed3-4399-8fff-d6897a9181c0\"}",
      "level": "INFO",
      "logger": "c.f.e.r.d.DefaultEventLogger",
      "thread": "dw-2993 - GET /api/v1/test/sample?searchTerm=sh&filters=country:CN&language=en",
      "timestamp": 1547451531386
    },
    "input": {
      "type": "log"
    }
  },
  "fields": {
    "@timestamp": [
      "2019-01-14T07:38:52.418Z"
    ]
  },
  "sort": [
    -9223372036854776000
  ]
}
As you can see, json.msg is still a single escaped string instead of a parsed object. Based on what I have read, the single json.add_error_key: true option should be enough to do all the parsing of the logs, nested levels included. I also tried setting json.message_key in combination with the decode_json_fields processor (commented out in the config I posted above; an untested variant of it is sketched right below), but that did not work either. I even renamed the original field, which the logger called message, to msg, in case there was a conflict of some kind because of it.
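For completeness, this is the uncommented variant of that processor I experimented with. The fields value here is only a guess on my part: since I am not setting json.keys_under_root, the input-level decoding leaves the escaped string under json.msg, so perhaps the processor has to target that path rather than a root-level msg:

processors:
  - decode_json_fields:
      # Guess: point at the nested field under the "json" key that the
      # input-level decoding produces, instead of a root-level "msg".
      fields: ["json.msg"]
      target: ""
      max_depth: 5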
Based on a few other posts around, I also double-checked that the fields it is trying to parse do not already exist with other types in the index mapping, and did not find anything. I suppose that since they do not sit at the root level but inside the json field, that kind of conflict is ruled out anyway. Is that a correct assumption?
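For reference, this is roughly how I inspected the mapping from the Kibana Dev Tools console (the filter_path parameter just narrows the response down to the json object of the doc type):

GET filebeat-6.4.0-test-2019.03/_mapping?filter_path=*.mappings.doc.properties.json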
Any hints in the right direction would be great.
Thanks!