Condition with decode_json_fields processor


(Jochenchrist) #1

Hi,

I'm trying to collect Docker logs with Filebeat 6.1.

The application logs are written as JSON, which I want to decode with the decode_json_fields processor.
Spring Boot's bootstrapping also writes some plain-text log messages, so I need to apply decode_json_fields conditionally.

Example:

{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:18.6298524Z"}
{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:19.2633836Z"}
{"log":"  .   ____          _            __ _ _\n","stream":"stdout","time":"2018-01-11T08:12:19.2636263Z"}
{"log":" /\\\\ / ___'_ __ _ _(_)_ __  __ _ \\ \\ \\ \\\n","stream":"stdout","time":"2018-01-11T08:12:19.2636359Z"}
{"log":"( ( )\\___ | '_ | '_| | '_ \\/ _` | \\ \\ \\ \\\n","stream":"stdout","time":"2018-01-11T08:12:19.2638339Z"}
{"log":" \\\\/  ___)| |_)| | | | | || (_| |  ) ) ) )\n","stream":"stdout","time":"2018-01-11T08:12:19.2640029Z"}
{"log":"  '  |____| .__|_| |_|_| |_\\__, | / / / /\n","stream":"stdout","time":"2018-01-11T08:12:19.2640115Z"}
{"log":" =========|_|==============|___/=/_/_/_/\n","stream":"stdout","time":"2018-01-11T08:12:19.2641995Z"}
{"log":" :: Spring Boot ::        (v1.5.9.RELEASE)\n","stream":"stdout","time":"2018-01-11T08:12:19.273083Z"}
{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:19.2731073Z"}
{"log":"{\"@timestamp\":\"2018-01-11T08:12:19.584+00:00\",\"@version\":1,\"message\":\"Starting DemoApplication v0.0.1-SNAPSHOT on caa07cb53010 with PID 1 (/app.jar started by root in /)\",\"logger_name\":\"com.example.demo.DemoApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\n","stream":"stdout","time":"2018-01-11T08:12:19.6096127Z"}
{"log":"{\"@timestamp\":\"2018-01-11T08:12:19.619+00:00\",\"@version\":1,\"message\":\"No active profile set, falling back to default profiles: default\",\"logger_name\":\"com.example.demo.DemoApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\n","stream":"stdout","time":"2018-01-11T08:12:19.6199775Z"}
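The conditional decoding described above can be sketched outside Filebeat. This is a minimal Python illustration, not Filebeat's implementation; the function name decode_docker_line and the shortened sample lines are assumptions made for the example. It unwraps one Docker json-file line and expands the inner payload only when that payload looks like a JSON object.

```python
import json


def decode_docker_line(raw_line):
    """Unwrap one Docker json-file line and expand an inner JSON object if present."""
    event = json.loads(raw_line)  # outer wrapper with "log", "stream", "time"
    payload = event.get("log", "")
    # The condition: only attempt a second decode when the payload starts like an object.
    if payload.startswith("{"):
        try:
            inner = json.loads(payload)
        except ValueError:
            return event  # looked like JSON but was not; keep the line as plain text
        if isinstance(inner, dict):
            event.update(inner)  # comparable to target: "" with overwrite_keys: true
    return event


# Shortened versions of the sample lines above (hypothetical test data).
banner_line = r'{"log":" :: Spring Boot ::        (v1.5.9.RELEASE)\n","stream":"stdout"}'
app_line = r'{"log":"{\"level\":\"INFO\",\"thread_name\":\"main\"}\n","stream":"stdout"}'

banner_event = decode_docker_line(banner_line)
app_event = decode_docker_line(app_line)
print(sorted(banner_event))  # ['log', 'stream']
print(sorted(app_event))     # ['level', 'log', 'stream', 'thread_name']
```

The banner line passes through untouched, while the application line gains the decoded keys.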

This is my current config:

filebeat.prospectors:
- type: docker
  paths:
   - '/var/lib/docker/containers/*/*.log'
  containers.ids: '*'
  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true

processors:
  - decode_json_fields:
      when: 
        regexp:
          log: "{\\\".*"
      fields: ["log"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

Based on the documentation at https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html, which states that the when condition is available for all processors, I'd expect the regexp condition to check whether the log field contains encoded JSON.

However, in the filebeat log I get some errors:

2018/01/11 12:44:52.762759 json.go:32: ERR Error decoding JSON: json: cannot unmarshal number into Go value of type map[string]interface {}
2018/01/11 12:44:52.762968 json.go:32: ERR Error decoding JSON: EOF
2018/01/11 12:44:52.763130 json.go:32: ERR Error decoding JSON: EOF
2018/01/11 12:44:52.763379 json.go:32: ERR Error decoding JSON: invalid character '.' looking for beginning of value
2018/01/11 12:44:52.763506 json.go:32: ERR Error decoding JSON: invalid character '/' looking for beginning of value
2018/01/11 12:44:52.763685 json.go:32: ERR Error decoding JSON: invalid character '(' looking for beginning of value
2018/01/11 12:44:52.763867 json.go:32: ERR Error decoding JSON: invalid character '\\' looking for beginning of value
2018/01/11 12:44:52.764031 json.go:32: ERR Error decoding JSON: invalid character '\'' looking for beginning of value
2018/01/11 12:44:52.764168 json.go:32: ERR Error decoding JSON: invalid character '=' looking for beginning of value
2018/01/11 12:44:52.764302 json.go:32: ERR Error decoding JSON: invalid character ':' looking for beginning of value
2018/01/11 12:44:52.764403 json.go:32: ERR Error decoding JSON: EOF

So I am wondering why Filebeat tries to decode these entries at all.
Does decode_json_fields respect the when condition?


(Andrew Kroh) #2

Try using single quotes around your regex so that you don't have to escape.


(Jochenchrist) #3

Good advice, but no change with:

processors:
  - decode_json_fields:
      when: 
        regexp:
          log: '{\".*'
      fields: ["log"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~

(Andrew Kroh) #4

Try disabling the decode_json_fields processor entirely. It seems to me that Filebeat is failing on the first JSON unmarshaling step despite the JSON appearing to be entirely valid.


(Jochenchrist) #5

Well, the decode_json_fields processor actually works great for the valid JSON log entries (the last two in the example above).

My concern is the error messages logged when a non-JSON log entry is processed.

The question is whether decode_json_fields respects the when condition at all.


(Andrew Kroh) #6

Oh, I didn't notice you were using type: docker. I believe that automatically sets up the JSON parsing for you (per the docs), so I think you are actually running JSON parsing three times. I believe these settings are redundant and should be removed:

  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true

(Jochenchrist) #7

Thanks Andrew for your replies.

I ran some tests, and I think this simple config should work now:

filebeat.prospectors:
- type: docker
  paths:
   - '/var/lib/docker/containers/*/*.log'
  containers.ids: '*'

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

I'm still not fully sure about the original problem (probably the trick was fields: ["message"]), but this now works reliably, without the when condition and without any errors logged by Filebeat.
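One possible reading of why fields: ["message"] matters, stated as an assumption rather than confirmed behaviour: if the docker prospector has already unwrapped the line and stored the payload under message, the event has no log field, and a condition written against log can never match. The Python sketch below only illustrates that idea; the event shape and the helper field_matches are hypothetical, and it assumes a condition on a missing field evaluates to false.

```python
import re


def field_matches(event, field, pattern):
    """Return True only if the field exists, is a string, and matches the pattern."""
    value = event.get(field)
    if not isinstance(value, str):
        return False  # assumed behaviour: a missing field never satisfies the condition
    return re.search(pattern, value) is not None


# Assumed event shape after the docker prospector has unwrapped the json-file line.
event = {"message": '{"level":"INFO","thread_name":"main"}', "stream": "stdout"}

print(field_matches(event, "log", r"^\{"))      # False: there is no "log" field
print(field_matches(event, "message", r"^\{"))  # True: the payload starts with "{"
```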


(Asher Shoshan) #8

Hi,
Have you accomplished what you wanted, i.e., is the "message" field parsed, and can you see the different fields in Kibana?


(Jochenchrist) #9

Hi!

Well, I have a setup / workaround that works for my use case without error entries in the filebeat log.

However, the actual question of this topic is still unanswered:

Does the decode_json_fields processor respect the when condition?

In my experiments it did not.

But I'd expect it to, based on the documentation at https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html


(Andrew Kroh) #10

Yes, it does. The conditions are implemented outside of the processors. The individual processors don't get a "say" in whether or not they respect the conditions.

Here's a simple example.

$ cat input.json 
{"log": "hello world!"}
{"log": "hello world!", "counter": 42}
$ cat filebeat.json.yml 
filebeat.prospectors:
- paths:
  - 'input.json'

processors:
- decode_json_fields:
    when.regexp.message: '^{'
    fields: ["message"]
    target: ""
    overwrite_keys: true

output.file:
  path: 'out'
  filename: filebeat.json
$ cat out/filebeat.json  | jq .
{
  "@timestamp": "2018-01-16T18:16:15.520Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.1.1"
  },
  "source": "/Users/akroh/Downloads/filebeat-6.1.1-darwin-x86_64/input.json",
  "offset": 24,
  "message": "{\"log\": \"hello world!\"}",
  "beat": {
    "version": "6.1.1",
    "name": "x",
    "hostname": "x"
  },
  "log": "hello world!"
}
{
  "@timestamp": "2018-01-16T18:16:15.521Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.1.1"
  },
  "beat": {
    "name": "x",
    "hostname": "x",
    "version": "6.1.1"
  },
  "log": "hello world!",
  "counter": 42,
  "source": "/Users/akroh/Downloads/filebeat-6.1.1-darwin-x86_64/input.json",
  "offset": 63,
  "message": "{\"log\": \"hello world!\", \"counter\": 42}"
}
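The design described at the top of this reply, with the condition evaluated outside the processor, can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Beats code; with_condition, regexp, and decode_json_field are hypothetical names chosen for the example.

```python
import json
import re


def decode_json_field(event, field="message"):
    """Stand-in processor: merge the JSON object stored in event[field] into the event."""
    decoded = json.loads(event[field])  # raises ValueError on non-JSON input
    merged = dict(event)
    if isinstance(decoded, dict):
        merged.update(decoded)
    return merged


def regexp(field, pattern):
    """Build a condition that is true when event[field] is a string matching pattern."""
    compiled = re.compile(pattern)

    def check(event):
        value = event.get(field)
        return isinstance(value, str) and compiled.search(value) is not None

    return check


def with_condition(condition, processor):
    """Wrap a processor; the wrapper, not the processor, decides whether it runs."""
    def run(event):
        if not condition(event):
            return event  # processor is skipped entirely
        return processor(event)

    return run


guarded = with_condition(regexp("message", r"^\{"), decode_json_field)

plain = {"message": " :: Spring Boot ::"}
structured = {"message": '{"log": "hello world!", "counter": 42}'}

print(guarded(plain))                  # unchanged; decode_json_field is never called
print(guarded(structured)["counter"])  # 42
```

Because the check lives in the wrapper, the stand-in processor never sees the plain-text event and therefore cannot raise a decoding error for it.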
