Condition with decode_json_fields processor

Hi,

I'm trying to collect Docker logs with Filebeat 6.1.

The application logs are written as JSON, which I want to decode with the decode_json_fields processor.
Spring Boot's bootstrapping also writes some plain-text log messages, so I need to apply decode_json_fields conditionally.

Example:

{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:18.6298524Z"}
{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:19.2633836Z"}
{"log":"  .   ____          _            __ _ _\n","stream":"stdout","time":"2018-01-11T08:12:19.2636263Z"}
{"log":" /\\\\ / ___'_ __ _ _(_)_ __  __ _ \\ \\ \\ \\\n","stream":"stdout","time":"2018-01-11T08:12:19.2636359Z"}
{"log":"( ( )\\___ | '_ | '_| | '_ \\/ _` | \\ \\ \\ \\\n","stream":"stdout","time":"2018-01-11T08:12:19.2638339Z"}
{"log":" \\\\/  ___)| |_)| | | | | || (_| |  ) ) ) )\n","stream":"stdout","time":"2018-01-11T08:12:19.2640029Z"}
{"log":"  '  |____| .__|_| |_|_| |_\\__, | / / / /\n","stream":"stdout","time":"2018-01-11T08:12:19.2640115Z"}
{"log":" =========|_|==============|___/=/_/_/_/\n","stream":"stdout","time":"2018-01-11T08:12:19.2641995Z"}
{"log":" :: Spring Boot ::        (v1.5.9.RELEASE)\n","stream":"stdout","time":"2018-01-11T08:12:19.273083Z"}
{"log":"\n","stream":"stdout","time":"2018-01-11T08:12:19.2731073Z"}
{"log":"{\"@timestamp\":\"2018-01-11T08:12:19.584+00:00\",\"@version\":1,\"message\":\"Starting DemoApplication v0.0.1-SNAPSHOT on caa07cb53010 with PID 1 (/app.jar started by root in /)\",\"logger_name\":\"com.example.demo.DemoApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\n","stream":"stdout","time":"2018-01-11T08:12:19.6096127Z"}
{"log":"{\"@timestamp\":\"2018-01-11T08:12:19.619+00:00\",\"@version\":1,\"message\":\"No active profile set, falling back to default profiles: default\",\"logger_name\":\"com.example.demo.DemoApplication\",\"thread_name\":\"main\",\"level\":\"INFO\",\"level_value\":20000}\n","stream":"stdout","time":"2018-01-11T08:12:19.6199775Z"}
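
To make the goal concrete, here is a minimal Python sketch of the two decoding layers involved: Docker's json-file wrapper on the outside, and the application's JSON inside the log field. It only illustrates the intended behavior and is not how Filebeat implements it. The sample lines are shortened from the example above, and decode_line is a hypothetical helper name.

```python
import json

# Two raw lines in Docker's json-file format, shortened from the example above.
RAW_LINES = [
    r'{"log":" :: Spring Boot ::        (v1.5.9.RELEASE)\n","stream":"stdout","time":"2018-01-11T08:12:19.273083Z"}',
    r'{"log":"{\"@timestamp\":\"2018-01-11T08:12:19.584+00:00\",\"message\":\"Starting DemoApplication\",\"level\":\"INFO\"}\n","stream":"stdout","time":"2018-01-11T08:12:19.6096127Z"}',
]


def decode_line(raw):
    """Decode the Docker wrapper, then decode `log` only if it holds a JSON object."""
    event = json.loads(raw)            # outer layer: Docker's wrapper
    text = event["log"].strip()
    if text.startswith("{"):           # cheap guard, the condition I am after
        try:
            inner = json.loads(text)   # inner layer: the application's JSON
        except ValueError:
            inner = None
        if isinstance(inner, dict):
            event.update(inner)        # merge decoded keys into the event root
    return event


events = [decode_line(line) for line in RAW_LINES]
for event in events:
    print(sorted(event))
```

The banner line keeps only the wrapper keys (log, stream, time), while the application line additionally gets @timestamp, message and level at the root.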

This is my current config:

filebeat.prospectors:
- type: docker
  paths:
   - '/var/lib/docker/containers/*/*.log'
  containers.ids: '*'
  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true

processors:
  - decode_json_fields:
      when: 
        regexp:
          log: "{\\\".*"
      fields: ["log"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

I'd expect the regexp when condition to check whether the log field contains encoded JSON. This is based on the documentation at https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html, which states that the when condition is available for all processors.

However, the Filebeat log shows some errors:

2018/01/11 12:44:52.762759 json.go:32: ERR Error decoding JSON: json: cannot unmarshal number into Go value of type map[string]interface {}
2018/01/11 12:44:52.762968 json.go:32: ERR Error decoding JSON: EOF
2018/01/11 12:44:52.763130 json.go:32: ERR Error decoding JSON: EOF
2018/01/11 12:44:52.763379 json.go:32: ERR Error decoding JSON: invalid character '.' looking for beginning of value
2018/01/11 12:44:52.763506 json.go:32: ERR Error decoding JSON: invalid character '/' looking for beginning of value
2018/01/11 12:44:52.763685 json.go:32: ERR Error decoding JSON: invalid character '(' looking for beginning of value
2018/01/11 12:44:52.763867 json.go:32: ERR Error decoding JSON: invalid character '\\' looking for beginning of value
2018/01/11 12:44:52.764031 json.go:32: ERR Error decoding JSON: invalid character '\'' looking for beginning of value
2018/01/11 12:44:52.764168 json.go:32: ERR Error decoding JSON: invalid character '=' looking for beginning of value
2018/01/11 12:44:52.764302 json.go:32: ERR Error decoding JSON: invalid character ':' looking for beginning of value
2018/01/11 12:44:52.764403 json.go:32: ERR Error decoding JSON: EOF

So I am wondering why Filebeat tries to decode these entries at all.
Does decode_json_fields respect the when condition?

Try using single quotes around your regex so that you don't have to escape.

Good advice, but no change with:

processors:
  - decode_json_fields:
      when: 
        regexp:
          log: '{\".*'
      fields: ["log"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~
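
That the quoting style makes no difference is expected: after YAML unquoting, the double-quoted "{\\\".*" and the single-quoted '{\".*' are the same regular expression, {\".*. The sketch below checks that pattern against a few log values from the example. It uses Python's re module as a stand-in for the Go regexp engine that Filebeat uses, and it assumes the condition performs an unanchored search. For a pattern this simple the two engines should agree, but treat that as an assumption.

```python
import re

# Both config spellings resolve to this same pattern after YAML unquoting.
PATTERN = re.compile(r'{\".*')

# Values of the `log` field, taken (and shortened) from the example above.
SAMPLES = [
    "\n",
    "  .   ____          _            __ _ _\n",
    " :: Spring Boot ::        (v1.5.9.RELEASE)\n",
    '{"@timestamp":"2018-01-11T08:12:19.584+00:00","level":"INFO"}\n',
]

# Unanchored search: the pattern may match anywhere in the value.
matches = [bool(PATTERN.search(sample)) for sample in SAMPLES]
for sample, hit in zip(SAMPLES, matches):
    print(hit, repr(sample[:30]))
```

Only the last sample matches, so the pattern itself separates the banner lines from the JSON lines, provided it is evaluated against a field that actually holds these values.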

Try disabling the decode_json_fields processor entirely. It seems to me that Filebeat is failing on the first JSON unmarshaling step despite the JSON appearing to be entirely valid.

Well, the decode_json_fields processor actually works great for the valid JSON log entries (the last two in the example above).

My concern is the error messages logged when a non-JSON log entry is processed.

The question is whether decode_json_fields respects the when condition at all.

Oh, I didn't notice you were using type: docker. I believe that automatically sets up the JSON parsing for you (per the docs), so I think you are actually running JSON parsing three times: in the docker prospector, in the json.* settings, and in decode_json_fields. I believe these settings are redundant and should be removed:

  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true

Thanks Andrew for your replies.

I ran some tests, and I think this simple config should work now:

filebeat.prospectors:
- type: docker
  paths:
   - '/var/lib/docker/containers/*/*.log'
  containers.ids: '*'

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

I'm still not entirely sure what caused the original problem (probably the trick was fields: ["message"]), but this now runs reliably, without the when condition and without any errors logged by Filebeat.

Hi,
Have you accomplished what you wanted, i.e., is the "message" field parsed, and can you see the different fields in Kibana?

Hi!

Well, I have a setup / workaround that works for my use case without error entries in the filebeat log.

However, the actual question of this topic is still unanswered:

Does the decode_json_fields processor respect the when condition?

In my experiments it did not.

But I'd expect it to, based on the "Define processors" page of the Filebeat Reference.

Yes, it does. The conditions are implemented outside of the processors. The individual processors don't get a "say" in whether or not they respect the conditions.

Here's a simple example.

$ cat input.json 
{"log": "hello world!"}
{"log": "hello world!", "counter": 42}
$ cat filebeat.json.yml 
filebeat.prospectors:
- paths:
  - 'input.json'

processors:
- decode_json_fields:
    when.regexp.message: '^{'
    fields: ["message"]
    target: ""
    overwrite_keys: true

output.file:
  path: 'out'
  filename: filebeat.json
$ cat out/filebeat.json  | jq .
{
  "@timestamp": "2018-01-16T18:16:15.520Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.1.1"
  },
  "source": "/Users/akroh/Downloads/filebeat-6.1.1-darwin-x86_64/input.json",
  "offset": 24,
  "message": "{\"log\": \"hello world!\"}",
  "beat": {
    "version": "6.1.1",
    "name": "x",
    "hostname": "x"
  },
  "log": "hello world!"
}
{
  "@timestamp": "2018-01-16T18:16:15.521Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.1.1"
  },
  "beat": {
    "name": "x",
    "hostname": "x",
    "version": "6.1.1"
  },
  "log": "hello world!",
  "counter": 42,
  "source": "/Users/akroh/Downloads/filebeat-6.1.1-darwin-x86_64/input.json",
  "offset": 63,
  "message": "{\"log\": \"hello world!\", \"counter\": 42}"
}
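
The statement that conditions live outside the processors can be modeled in a few lines of Python. This is a simplified illustration of the idea, not libbeat's actual code: with_condition and starts_with_brace are hypothetical names, and decode_json_fields here is a stand-in that only merges a decoded JSON object into the event root.

```python
import json
import re


def decode_json_fields(event, field="message"):
    """Stand-in processor: merge the JSON object stored in `field` into the event."""
    decoded = json.loads(event[field])
    return {**event, **decoded}


def with_condition(processor, condition):
    """Wrap a processor so it only runs on events for which `condition` is true."""
    def run(event):
        return processor(event) if condition(event) else event
    return run


def starts_with_brace(event):
    # Models `when.regexp.message: '^{'` from the config above.
    return re.search(r"^\{", event.get("message", "")) is not None


pipeline = with_condition(decode_json_fields, starts_with_brace)

json_event = pipeline({"message": '{"log": "hello world!", "counter": 42}'})
plain_event = pipeline({"message": " :: Spring Boot ::"})
print(json_event)
print(plain_event)
```

Because the wrapper decides whether the processor is called at all, the stand-in never sees the plain-text event and cannot raise a decoding error for it.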