Unable to parse docker json-file


(Asaf Shabat) #1

Hello,

I'm trying to parse Docker's json-file log output into Logstash using Filebeat, and to break the message log into fields.
Currently, this is what I can parse into Kibana:

{
  "_index": "filebeat-2017.12.19",
  "_type": "doc",
  "_id": "pr50bmABIujUTb2Ryv8M",
  "_version": 1,
  "_score": null,
  "_source": {
    "offset": 6761353,
    "log": "\u001b[0m\u001b[0m11:07:12.951 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 54) WFLYUT0021: Registered web context: /rest\n",
    "prospector": {
      "type": "log"
    },
    "source": "/var/lib/docker/containers/7688c378a4de513a8c5e587843512476ea996700e93a792e71c5962a190bb779/7688c378a4de513a8c5e587843512476ea996700e93a792e71c5962a190bb779-json.log",
    "message": "{\"log\":\"\\u001b[0m\\u001b[0m11:07:12.951 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 54) WFLYUT0021: Registered web context: /rest\\n\",\"stream\":\"stdout\",\"time\":\"2017-12-19T11:07:12.952311089Z\"}",
    "docker": {
      "container": {
        "name": "wildfly01",
        "image": "dockerepos.dom.local:5000/wildfly:10.0.0.Final",
        "id": "7688c378a4de513a8c5e587843512476ea996700e93a792e71c5962a190bb779",
        "labels": {
          "license": "GPLv2",
          "build-date": "20170801",
          "vendor": "CentOS"
        }
      }
    },
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "_jsonparsefailure"
    ],
    "@message": {
      "time": "2017-12-19T11:07:12.952311089Z",
      "log": "\u001b[0m\u001b[0m11:07:12.951 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 54) WFLYUT0021: Registered web context: /rest\n",
      "stream": "stdout"
    },
    "@timestamp": "2017-12-19T11:07:14.787Z",
    "stream": "stdout",
    "@version": "1",
    "beat": {
      "name": "server_devenv01",
      "hostname": "server_devenv01",
      "version": "6.0.1"
    },
    "host": "server_devenv01",
    "topic": "Local-Dev-wildfly",
    "time": "2017-12-19T11:07:12.952311089Z"
  },
  "fields": {
    "@message.time": [
      "2017-12-19T11:07:12.952Z"
    ],
    "@timestamp": [
      "2017-12-19T11:07:14.787Z"
    ]
  },
  "sort": [
    1513681634787
  ]
}

I want to break the @message field into separate fields using the following pattern:
%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel}%{SPACE} \[(?<logger>[^\]]+)\] \((?<thread>[^\]]+)\)%{SPACE} %{GREEDYDATA:message}

Currently I have the following configurations:

filebeat.yml:
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - '/var/lib/docker/containers/*/*.log'
  processors:
  - add_docker_metadata: ~
  fields:
    topic: Local-Dev-wildfly
  fields_under_root: true
  multiline.pattern: '^\[[:space:]]+|]$'
  multiline.match: after

/etc/logstash/conf.d/10-filter.conf

filter {
  json {
    source => "message"
    target => "@message"
  }
  json {
    source => "message"
  }
  grok {
    match => { '@message' => '%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel}%{SPACE} \[(?<logger>[^\]]+)\] \((?<thread>[^\]]+)\)%{SPACE} %{GREEDYDATA:message}' }
  }
}

How can I break the @message log into specific fields (LOGLEVEL, LOGGER, THREAD, etc.)?

Thanks!


(Carlos Pérez Aradros) #2

We introduced the docker prospector in Filebeat 6.1, specifically for this use case: https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-filebeat-options.html#config-containers

The configuration to use it looks like:

- type: docker
  containers.ids:
    - '*'
  processors:
  - add_docker_metadata: ~

Then you can do the rest of the processing in Logstash, starting from a cleaner original message.

Best regards


(Asaf Shabat) #3

Wow! I'm surprised that you're supporting Docker logs OOTB now!
But back to the main question: how can I parse the message log after it is transferred from Filebeat to Logstash?
I get the following message in Kibana:

e[0me[0m13:01:45.773 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 10.0.0.Final (WildFly Core 2.0.10.Final) started in 23791ms - Started 794 of 1135 services (487 services are lazy, passive or on-demand)

What should I write in the logstash-filter.conf in order to parse the message?
Currently I have this configuration in there:

filter {
  grok {
    match => { 'message' => '%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel}%{SPACE} \[(?<logger>[^\]]+)\] \((?<thread>[^\]]+)\)%{SPACE} %{GREEDYDATA:message}' }
  }
}

and I'm still unable to break the message above into fields like LOGLEVEL, LOGGER, etc.


(Pier-Hugues Pellerin) #4

Hello,
If I look at the message, I see color escape characters (\u001b[0m\u001b[0m) in the log. These are probably breaking your grok patterns; I would check your configuration to see whether they can be removed from the logs.

After that, I would either use https://grokdebug.herokuapp.com/ to build the grok patterns, or
switch to the dissect filter, which I believe should work well in your case; it's much easier to work with and faster than grok.

Thanks
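As a quick illustration of the escape-stripping idea (a sketch, not part of the original thread): the ANSI color codes such as \u001b[0m can be removed with a small regex, and the same expression could be used in a Logstash mutate/gsub filter before grok runs.

```python
import re

# ANSI color/formatting escapes look like ESC [ ... m.
# The same regex could be used in a Logstash mutate { gsub => [...] } filter.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")

# Sample raw log line from the thread, with the leading color escapes.
raw = ("\x1b[0m\x1b[0m11:07:12.951 INFO  [org.wildfly.extension.undertow] "
       "(ServerService Thread Pool -- 54) WFLYUT0021: Registered web context: /rest")

clean = ANSI_RE.sub("", raw)
print(clean)  # starts with "11:07:12.951 INFO  ..."
```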


(Asaf Shabat) #5

Thank you very much!!
I changed the filter plugin from grok to dissect and it works well!

This is the new filter pattern:

filter {
  dissect {
    mapping => {
      "message" => "%{time} %{loglevel} [%{logger}] (%{thread}) %{message}"
    }
  }
}
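For readers who want to sanity-check the mapping outside Logstash, here is a rough regex equivalent in Python (an illustration only; dissect itself splits on literal delimiters, not regex):

```python
import re

# Regex approximation of the dissect mapping
# "%{time} %{loglevel} [%{logger}] (%{thread}) %{message}"
PATTERN = re.compile(
    r"^(?P<time>\S+)\s+(?P<loglevel>\S+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?P<message>.*)$"
)

# Sample line from the thread (ANSI escapes already stripped).
line = ("13:01:45.773 INFO [org.jboss.as] (Controller Boot Thread) "
        "WFLYSRV0025: WildFly Full 10.0.0.Final started in 23791ms")

fields = PATTERN.match(line).groupdict()
print(fields["loglevel"], fields["logger"], fields["thread"])
```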


(Asaf Shabat) #6

Exekias,
Does the docker prospector type have the ability to compress the data?
Currently, I'm using "compression_level: 9" in the Logstash output section of my filebeat.yml configuration file.

Does compression_level apply to the docker prospector as well?


(Carlos Pérez Aradros) #7

Yes, the compression_level parameter affects all output, including events from the docker prospector.
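For reference, compression_level is configured on the output rather than on any prospector, which is why it applies regardless of prospector type. A minimal filebeat.yml sketch (the hostname is a placeholder):

```yaml
# compression_level lives under the Logstash output, so it applies to
# events from every prospector, including type: docker.
output.logstash:
  hosts: ["logstash.example.local:5044"]  # placeholder host
  compression_level: 9
```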


(Asaf Shabat) #8

Thanks.


(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.