Multiline stderr logs mixed with stout logs

I am having an issue collecting docker containers logs.

The issue is that, for example, when a Java Stacktrace gets logged on multiple lines on stderr stream, it could be that other logs of the stdout stream are logged in between, resulting in this kind of docker log files:

{"log":"1st line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302079084Z"}
{"log":"A Simple oneline log here","stream":"stdout","time":"2020-05-20T11:04:44.302098452Z"}
{"log":"A Simple oneline log here","stream":"stdout","time":"2020-05-20T11:04:44.302102025Z"}
{"log":"A Simple oneline log here","stream":"stdout","time":"2020-05-20T11:04:44.302299019Z"}
{"log":"A Simple oneline log here","stream":"stdout","time":"2020-05-20T11:04:44.302304733Z"}
{"log":"A Simple oneline log here","stream":"stdout","time":"2020-05-20T11:04:44.302309811Z"}
{"log":"2nd line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.30225182Z"}
{"log":"3rd line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302319095Z"}
{"log":"4th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302321649Z"}
{"log":"5th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302323951Z"}
{"log":"6th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302326262Z"}
{"log":"7th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302333692Z"}
{"log":"8th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.30233636Z"}
{"log":"9th line of multiline log","stream":"stderr","time":"2020-05-20T11:04:44.302338783Z"}

As you can see, there are some stdout logs mixed in between of a multiline stderr log.

Now, configuring the input as follows:

    - type: container
      paths:
    - '/var/lib/docker/containers/*/*.log*'
      multiline:
    pattern: '^\*{5} ERROR \*{5}$|^\*{5} WARNING \*{5}$|^\*{17}$|^\[?[0-9]{4}[-.][0-9]{2}[-.][0-9]{2}[ T][0-9]{2}:[0-9]{2}:[0-9]{2}|^\[?[0-9]{2}[-.][0-9]{2}[-.][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
    negate: true
    match: after
    max_lines: 1000
    timeout: 7s
      stream: all
      max_bytes: 100000
      close_inactive: 2m
      ignore_older: 24h
      fields_under_root: true
      fields:
    source_system: filebeat_docker 

The result is the separation of the 1st line of multiline log and the other lines in two separate documents in Elasticsearch.

First attempt of resolution
To solve the problem I defined two inputs on the same file, one that collects stdout logs, and the other one that collects stderr logs:

- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log*'
  multiline:
    pattern: '^\*{5} ERROR \*{5}$|^\*{5} WARNING \*{5}$|^\*{17}$|^\[?[0-9]{4}[-.][0-9]{2}[-.][0-9]{2}[ T][0-9]{2}:[0-9]{2}:[0-9]{2}|^\[?[0-9]{2}[-.][0-9]{2}[-.][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
    negate: true
    match: after
    max_lines: 1000
    timeout: 7s
  stream: stdout
  max_bytes: 100000
  close_inactive: 2m
  ignore_older: 24h
  fields_under_root: true
  fields:
    source_system: filebeat_docker
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log*'
  multiline:
    pattern: '^\*{5} ERROR \*{5}$|^\*{5} WARNING \*{5}$|^\*{17}$|^\[?[0-9]{4}[-.][0-9]{2}[-.][0-9]{2}[ T][0-9]{2}:[0-9]{2}:[0-9]{2}|^\[?[0-9]{2}[-.][0-9]{2}[-.][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
    negate: true
    match: after
    max_lines: 1000
    timeout: 7s
  stream: stderr
  max_bytes: 100000
  close_inactive: 2m
  ignore_older: 24h
  fields_under_root: true
  fields:
    source_system: filebeat_docker

Unfortunately, as mentioned in the Filebeat documentation, defining two inputs on the same file is not recommended, and could cause unpredictable behaviors.
However, it works, the multiline log is created as a single document and the stdout logs too.

...BUT the CPU resources that Filebeat uses increased from 1% average to 30% average.

Do you have any other suggestions on how to solve this problem?

Thank you!

We struggle the same issue when Docker input multiline mix stderr and stdout.
IMHO it's not right and should be fixed but I haven't been able to find such issue on GitHub.

Thanks for your reply @Alexander_A

I did not find any issue on GitHub too...
I was hoping someone had a better approach to solve this problem.

Another alternative is to start two separate Filebeat processes (or containers), one which read stdout logs, and the other one that reads stderr logs. This could result in lower CPU Usage, but I am unsure about that because of the open file handles that will double, and there will be more I/O concurrency on the same file from different processes.

I have not tested this yet though.

We have tried to separate configs for stderr and stdout but non of these inputs worked.
I want to report a bug on Github but it says:

Please post all questions and issues on https://discuss.elastic.co/c/beats
before opening a Github Issue. Your questions will reach a wider audience there,
and if we confirm that there is a bug, then you can open a new issue.

If I go to https://discuss.elastic.co/c/beats I get message "Oops! That page doesn’t exist or is private."
I suppose if this topic will be closed with no reply I will report a bug.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.