Filebeat multiline docker logs


(Meril) #1

Hi

In our production system we use Filebeat 6.1 to manage Docker log files with a format like:
...
{"log":"2017-11-26T16:59:56.912-0000 - INFO - blahblah","stream":"stdout","time":"2017-11-26T16:59:56.91609507Z"}
...

Conf is like this:

filebeat.prospectors:
- type: log
  document_type: xxx
  paths:
  - '/var/lib/docker/containers/*/*.log'
  fields_under_root: true
  fields:
    source_system: filebeat_docker
  processors:
  - add_docker_metadata: ~
  - decode_json_fields:
      fields: ['message']
      target: messagejson
output:
  logstash:

Then in Logstash we take the log part as the message and apply grok filters:

...
mutate {
  replace => { "message" => "%{[messagejson][log]}" }
}
...

This works great for "normal" lines but when there are multiline logs like the following:

{"log":"2017-11-26T16:59:56.912-0000 - ERROR - Error: Bad Request\n","stream":"stdout","time":"2017-11-26T16:59:56.91609507Z"}
{"log":" at xxxx \n","stream":"stdout","time":"2017-11-26T16:59:56.916118577Z"}
{"log":" at yyyy \n","stream":"stdout","time":"2017-11-26T16:59:56.916122447Z"}

the problem arises.
How could I merge those JSON rows into a single line to be processed by Logstash?

Many thanks
Maril Fleen


(Noémi Ványi) #2

You can set up multiline handling by setting the appropriate options on your prospector:

  multiline.pattern: '^{'
  multiline.negate: true
  multiline.match:  after

In your case:

filebeat.prospectors:
- type: log
  document_type: xxx
  paths:
  - '/var/lib/docker/containers/*/*.log'
  # multiline settings start
  multiline.pattern: '^{'
  multiline.negate: true
  multiline.match:  after
  # multiline settings end
  fields_under_root: true
  fields:
    source_system: filebeat_docker
  processors:
  - add_docker_metadata: ~
  - decode_json_fields:
      fields: ['message']
      target: messagejson

More on multiline of Beats: https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html


(Meril) #3

Thanks Noémi for your very quick answer.
But... forgive me if I still don't understand...
If I use multiline (as you suggested), I suppose it will process those lines
and "merge" all of them into a single string message.

Basically from 3 messages

1 {"log":"2017-11-26T16:59:56.912-0000 - ERROR - Error: Bad Request\n","stream":"stdout","time":"2017-11-26T16:59:56.91609507Z"}
2 {"log":" at xxxx \n","stream":"stdout","time":"2017-11-26T16:59:56.916118577Z"}
3 {"log":" at yyyy \n","stream":"stdout","time":"2017-11-26T16:59:56.916122447Z"}

I obtain a single multiline message

1 {"log":"2017-11-26T16:59:56.912-0000 - ERROR - Error: Bad Request\n","stream":"stdout","time":"2017-11-26T16:59:56.91609507Z"}
{"log":" at xxxx \n","stream":"stdout","time":"2017-11-26T16:59:56.916118577Z"}
{"log":" at yyyy \n","stream":"stdout","time":"2017-11-26T16:59:56.916122447Z"}

That's a good starting point. But I'm missing how to transform those lines into a single message like:

2017-11-26T16:59:56.912-0000 - ERROR - Error: Bad Request\n
at xxxx \n
at yyyy \n

to be processed by the grok parser in Logstash.
Basically, how can I remove all the Docker JSON wrapping? The message is no longer JSON but a string with three JSON rows attached...
Should I do that in Logstash? How?
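
One idea I had (just an untested sketch on my side) is a ruby filter in Logstash that splits the merged multiline event back into the original Docker JSON rows, keeps only the log field of each, and rebuilds a plain-text message:

filter {
  ruby {
    init => 'require "json"'
    code => '
      # Split the merged multiline event into the original Docker JSON rows,
      # keep only the "log" field of each, and rebuild a plain-text message.
      lines = event.get("message").split("\n")
      logs = lines.map do |line|
        begin
          JSON.parse(line)["log"]
        rescue StandardError
          line  # keep the line as-is if it is not valid JSON
        end
      end
      event.set("message", logs.join)
    '
  }
}

(Each log value already ends in \n, so joining without a separator should preserve the line breaks.) But maybe there is a cleaner way?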

regards
maril


(Carlos Pérez Aradros) #4

Hi @meril,

We implemented the docker prospector exactly for your use case. You can use it like this:

filebeat.prospectors:
- type: docker
  containers.ids:
  - '*'
  processors:
  - add_docker_metadata: ~

It will take care of the Docker JSON format, filling the message, @timestamp and stream fields from it.

Then you can use the rest of the parameters, like multiline, on the result.
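
For example, something like this (a sketch; the whitespace pattern is an assumption about how your stack-trace lines look, so adjust it to your logs):

filebeat.prospectors:
- type: docker
  containers.ids:
  - '*'
  # Join continuation lines (e.g. "  at xxxx") to the preceding event.
  # Assumes continuation lines start with whitespace.
  multiline.pattern: '^[[:space:]]'
  multiline.negate: false
  multiline.match: after
  processors:
  - add_docker_metadata: ~

Here multiline is applied to the decoded log contents, not to the raw Docker JSON lines.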

Best regards


(Meril) #5

Some more info:
I think there is a misunderstanding.
My problem is not merging JSON log entries that are split across physical lines, like this:

{"log":" blablabla blablabla blablabla blablabla blablabla","stream":
"stdout","time":"2017-11-26T16:59:56.916118577Z"}

This can be done with Noémi's suggestion (multiline).

My problem is having multiline log messages INSIDE the Docker JSON logs, like this:

{"log":"2017-11-26 - ERROR - blabla \n","stream":"stdout",...}
{"log":"      at xxxx \n","stream":"stdout",...}
{"log":"      at yyyy \n","stream":"stdout",...}

I could use a multiline pattern like:
'^{"log":"[0-9]{4}-[0-9]{2}-[0-9]{2}'

but this would join all the JSON rows into a single string, and I cannot figure out how to process that string to obtain a final message like:
2017-11-26T16:59:56.912-0000 - ERROR - Error: Bad Request\n
at xxxx\n
at yyyy\n

to be processed with grok

regards
Meril


(Meril) #6

Oops, we crossed posts...

Wow that sounds good.
Thank you Carlos.
I'll try your solution (but first I have to upgrade to 6.1).
Many Thanks


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.