Filebeat - proper configuration to parse nested JSON


(Mario Mechoulam) #1

Hello!
I am having a hard time finding the right Filebeat configuration to correctly parse nested JSON log lines. The top level is parsed without any issues, but nothing I have tried works for the nested JSON inside it.

Here is the current configuration I have:

# filebeat-inputs
apiVersion: v1
data:
  kubernetes.yml: |-
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/server.log
      exclude_files: ['\.gz$|gc.log']
      json:
        add_error_key: true
        # message_key: msg
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/io.log
      json:
        add_error_key: true
        # message_key: msg
kind: ConfigMap
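
One variant I have been wondering about (but have not confirmed helps here): the input-level json options also support keys_under_root, which as I understand it lifts the decoded fields to the event root instead of nesting them under a json object. A sketch of what that input would look like:

```yaml
    - type: log
      paths:
        - /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/server.log
      exclude_files: ['\.gz$|gc.log']
      json:
        add_error_key: true
        # place decoded fields at the event root instead of under "json"
        keys_under_root: true
        # allow decoded fields to overwrite Filebeat's own fields on conflict
        overwrite_keys: true
```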

# filebeat-config
apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: true
        reload.period: 10s
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    processors:
      #- decode_json_fields:
      #    fields: ["msg"]
      #    target: ""
      #    max_depth: 5
      - add_cloud_metadata:
      - add_kubernetes_metadata:
          in_cluster: true
          default_indexers.enabled: false
          default_matchers.enabled: false
          include_pod_uid: true
          indexers:
            - pod_uid: ~
          matchers:
            - logs_path:
                logs_path: /var/lib/kubelet/pods/
                resource_type: pod

    output.elasticsearch:
      ...

    setup.template.name: "filebeat-%{[beat.version]}"
    setup.template.pattern: "filebeat-%{[beat.version]}-*"

kind: ConfigMap

This is a sample log output I can see in my pod:

{"msg":"{\"id\":\"G-A_API_Client_Request_Received\",\"version\":\"1.0\",\"timestamp\":1547451531,\"metadata\":null,\"payload\":\"API received params {input=China,sh, components=country:CN, language=en}. Request Id 4c565eeb-53ba-4def-acd4-8090f0319836\",\"uuid\":\"056e7ef1-0ed3-4399-8fff-d6897a9181c0\"}","level":"INFO","logger":"c.f.e.r.d.DefaultEventLogger","thread":"dw-2993 - GET /api/v1/test/sample?searchTerm=sh&filters=country:CN&language=en","timestamp":1547451531386}

And this is the result I can see in Kibana:

{
  "_index": "filebeat-6.4.0-test-2019.03",
  "_type": "doc",
  "_id": "OptMS2gBA7l4X6dkiYEK",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-01-14T07:38:52.418Z",
    
    "prospector": {
      "type": "log"
    },
    "beat": {
      "version": "6.4.0",
      "name": "filebeat-zg5np",
      "hostname": "filebeat-zg5np"
    },
    "host": {
      "name": "filebeat-zg5np"
    },
    "kubernetes": {
      ...
    },
    "source": "/var/lib/kubelet/pods/365220c3-15d4-11e9-b099-42010a840057/volumes/kubernetes.io~empty-dir/applogs/server.log",
    "offset": 227623,
    "meta": {
      "cloud": {
        ...
      }
    },
    "json": {
      "msg": "{\"id\":\"G-A_API_Client_Request_Received\",\"version\":\"1.0\",\"timestamp\":1547451531,\"metadata\":null,\"payload\":\"API received params {input=China,sh, components=country:CN, language=en}. Request Id 4c565eeb-53ba-4def-acd4-8090f0319836\",\"uuid\":\"056e7ef1-0ed3-4399-8fff-d6897a9181c0\"}",
      "level": "INFO",
      "logger": "c.f.e.r.d.DefaultEventLogger",
      "thread": "dw-2993 - GET /api/v1/test/sample?searchTerm=sh&filters=country:CN&language=en",
      "timestamp": 1547451531386
    },
    "input": {
      "type": "log"
    }
  },
  "fields": {
    "@timestamp": [
      "2019-01-14T07:38:52.418Z"
    ]
  },
  "sort": [
    -9223372036854776000
  ]
}

Based on what I have read, the single json.add_error_key: true setting should be enough for Filebeat to do all the parsing of the logs, nested levels included. I also tried setting json.message_key in combination with the decode_json_fields processor (commented out in the config I posted above), but that did not work either.
I even renamed the original field, which the logger called message, to msg, in case there was some conflict because of the name.
Based on a few other posts, I also double-checked that the fields being parsed do not already exist with other types in the index mapping, and found nothing. I suppose that is ruled out anyway, since the fields end up inside the json object rather than at the root level. Is that a correct assumption?
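
For completeness, here is roughly the decode_json_fields variant I tried. Since the input-level json parsing leaves everything under a json object, my suspicion (unconfirmed, so this is just an assumption) is that the processor would need to reference json.msg rather than msg:

```yaml
    processors:
      - decode_json_fields:
          # assumption: after input-level JSON parsing, the nested string
          # lives at json.msg, not at msg
          fields: ["json.msg"]
          # decode in place at the event root
          target: ""
          max_depth: 5
          overwrite_keys: true
```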

Any hints in the right direction would be great :slight_smile:
Thanks!


(system) closed #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.