Docker logs to ES with FileBeat


I need help parsing my Docker logs into ES with Filebeat, without using Logstash.
The main issue is that my "log" line is sent to ES as a string instead of being parsed.

FileBeat - 5.1.1
ES - 2.3

My filebeat.yml is configured as below:

    filebeat.prospectors:
      - paths: ["/tmp/**/*-json.log"]
        json.message_key: log
        json.keys_under_root: true
        json.add_error_key: true

    output.elasticsearch:
      hosts: ["ES_URL:PORT"]
      index: "docker-swarm"

Every new line in the Docker JSON logs looks like this:

{"log":"{\"name\":\"test\",\"hostname\":\"4e7c4d8ef9ce\",\"pid\":16,\"level\":30,\"msg\":\"got health request\",\"time\":\"2016-12-26T10:58:05.221Z\",\"src\":{\"file\":\"/usr/src/app/src/index.js\",\"line\":42,\"func\":\"health\"},\"v\":0}\n","stream":"stdout","time":"2016-12-26T10:58:05.222365772Z"}

The template that I uploaded to ES is:

  {
    "template": "docker-swarm",
    "settings": {},
    "mappings": {
      "docker-swarm": {
        "properties": {
          "name": {
            "index": "not_analyzed",
            "type": "string"
          },
          "hostname": {
            "index": "not_analyzed",
            "type": "string"
          },
          "msg": {
            "index": "not_analyzed",
            "type": "string"
          },
          "time": {
            "type": "date",
            "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
          }
        }
      }
    }
  }

What I'm getting in ES is that the "log" key is a string and not parsed:

  {
    "took" : 14,
    "timed_out" : false,
    "_shards" : {
      "total" : 5,
      "successful" : 5,
      "failed" : 0
    },
    "hits" : {
      "total" : 1,
      "max_score" : 1.0,
      "hits" : [ {
        "_index" : "docker-swarm",
        "_type" : "log",
        "_id" : "AVk7Su1ZT58pSSDcWEiy",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2016-12-26T13:21:20.309Z",
          "beat" : {
            "hostname" : "b22567b66c53",
            "name" : "b22567b66c53",
            "version" : "5.1.1"
          },
          "input_type" : "log",
          "log" : "{\"name\":\"test\",\"hostname\":\"4e7c4d8ef9ce\",\"pid\":16,\"level\":30,\"msg\":\"got health request\",\"time\":\"2016-12-26T10:58:05.221Z\",\"src\":{\"file\":\"/usr/src/app/src/index.js\",\"line\":42,\"func\":\"health\"},\"v\":0}",
          "offset" : 327,
          "source" : "/tmp/log/gil-json.log",
          "stream" : "stdout",
          "time" : "2016-12-26T10:58:05.222365772Z",
          "type" : "log"
        }
      } ]
    }
  }

Thanks in advance,

You have several options for parsing the second level JSON:

  • Configure the decode_json_fields processor in Filebeat
  • Use the Ingest Node of Elasticsearch, which also has a JSON decoder processor
  • Use Logstash, which can do that and much more

I suspect the easiest for you would be the first option. Let us know if you have issues with the processor (currently marked experimental).
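For reference, a minimal decode_json_fields sketch in filebeat.yml might look like the following (untested here; field names match the Docker json-file driver output shown above, and the exact options may vary by Filebeat version):

```yaml
processors:
  - decode_json_fields:
      fields: ["log"]   # the field that holds the escaped JSON string
      target: ""        # "" merges the decoded keys into the event root
      max_depth: 1
```

With `target: ""` the inner keys (name, msg, level, ...) land at the top level of the event, which matches the mapping in the template above.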


Exactly what I've been looking for!
Many thanks!!!

Another question related to this; maybe you can help me here as well.

Let's say that my "log" key can be either an object (like in my first example) or a string.

  1. How can I set up my filebeat.yml file to support both cases?
  2. If I can't, how can I use drop_event (or any other processor) to ignore the {"log": "string"} lines?
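One possible sketch for the second case (an assumption on my part, not a confirmed answer: Beats processors accept `when` conditions, and a regexp on the raw "log" field should distinguish JSON objects from plain strings before decoding):

```yaml
processors:
  - drop_event:
      when:
        regexp:
          log: "^[^{]"   # drop events whose "log" does not start with "{"
  - decode_json_fields:
      fields: ["log"]
      target: ""
```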


I've figured out the problem.

The application logs are being exported as objects (with 'log' as the key; this is done by Docker's file logging driver together with bunyan as the logging Node module), while other logs that are not directly from my application are being exported as strings.

To add two more questions to Gil's:

  1. Is there any way I could drop all events that have a string as the 'log' value?
  2. Does 'decode_json_fields' support the contains condition?

Could you please share a code snippet? Thanks.
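On the contains question, my understanding (worth verifying against the Filebeat docs for your version) is that conditions like `contains` and `regexp` are generic and can guard any processor, including decode_json_fields. A hedged sketch:

```yaml
processors:
  - decode_json_fields:
      fields: ["log"]
      target: ""
      when:
        contains:
          log: "\"name\":"   # only decode lines that look like the app's JSON
```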

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.