Collecting Java logs from a dockerised application?

I currently have a Java application which generates a set of Log4j2 log files (RollingFile appender) which is harvested by Filebeat (and sent on to Logstash and ES in the usual way). This all works fine.

But now there is a desire to run the Java application in a Docker container (using docker-compose, at least for the time being).

What is the best approach? One option, which sounds the least disruptive to the existing code and architecture, would be

  • use a volume to write the application's log files to a host directory, continuing to use log4j2 and the RollingFile appender
  • run Filebeat on the host in the usual way (nobody has told me that Filebeat has to run under Docker)

but there are no doubt many others, so I'm interested in how others have done this.
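
A minimal docker-compose sketch of the volume option described above. The service name (myapp), image (myorg/myapp:latest) and both log paths are hypothetical placeholders, not taken from the real application:

```yaml
# docker-compose.yml (sketch; service, image and paths are hypothetical)
version: "3"
services:
  myapp:
    image: myorg/myapp:latest
    volumes:
      # host directory : container directory the RollingFile appender writes to
      - /var/log/myapp:/app/logs
```

Filebeat on the host could then keep its existing configuration, pointed at the host directory (/var/log/myapp in this sketch).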

(I am, I hope, reasonably familiar with Java, log4j2 and the Elastic stack. I know very little about docker and docker-compose.)

Oh, and a related question (which isn't really an Elastic question): it seems reasonable to want to include the log4j2.xml file in the container (for ease of deployment in the usual case), but be able to override it at run time (eg to wind up the logging level for debugging). Is there a conventional best practice way of doing this?
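
One possible way to do this, sketched under assumptions: keep a default log4j2.xml inside the image and bind-mount a host file over it when a different configuration is needed. The service name, image, host file name and container path below are all hypothetical; log4j.configurationFile is Log4j2's system property for the configuration location, and JAVA_TOOL_OPTIONS is read by the JVM at startup.

```yaml
# docker-compose.yml (sketch; names and paths are hypothetical)
version: "3"
services:
  myapp:
    image: myorg/myapp:latest
    volumes:
      # Bind-mount a host copy over the log4j2.xml baked into the image
      - ./log4j2-debug.xml:/app/config/log4j2.xml:ro
    environment:
      # Point Log4j2 at that path explicitly; assumes the image does not already set this
      JAVA_TOOL_OPTIONS: "-Dlog4j.configurationFile=/app/config/log4j2.xml"
```

Removing the volume entry falls back to the file shipped in the image, so the usual deployment needs no extra files.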

Hi @TimWard,

The recommended way to do logging in Docker is to send your logs to stdout, then Docker takes care of handling them.

Filebeat understands Docker's default json-file format. In a nutshell: Docker writes logs for all running containers under /var/lib/docker/containers/. You can use these Filebeat settings to fetch and process them:

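filebeat.inputs:
  # Read the json-file logs that Docker writes for each container
  - type: docker
    containers.ids:
      - '*'   # match every container

processors:
  # Enrich each event with container metadata (name, image, labels)
  - add_docker_metadata: ~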

You can run Filebeat as a Docker container if that makes things easier; you just need to make sure you bind mount the following paths into the container:

  • /var/lib/docker/containers to allow Filebeat to access the logs.
  • /var/run/docker.sock to give access to the Docker daemon (for metadata fetching).

More info about docker input can be found here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#config-containers.
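
As a rough sketch, a docker-compose service for Filebeat with those two bind mounts might look like the following. The image tag and the location of filebeat.yml on the host are assumptions; adjust them to your version and layout.

```yaml
# docker-compose.yml (sketch; image tag and host paths are assumptions)
version: "3"
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.3.0
    user: root   # so Filebeat can read the container logs and the Docker socket
    volumes:
      # Filebeat configuration containing the docker input settings
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      # Container log files written by the json-file driver
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      # Docker daemon socket, used by add_docker_metadata
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

Mounting everything read-only keeps the Filebeat container from modifying the host's Docker state.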


Ta. I'd come across this suggestion, but isn't the problem that other stuff might go to stdout as well as my logs? Which could make processing the logs further downstream something of a pain (eg having to handle grok parse filters in logstash) and, I gather, would make putting multiline log entries back together (specifically Java stack traces) also something of a pain?

Or is this not a problem in practice?

@TimWard with the json-file log driver, Docker logs everything written to stdout, and does so in JSON format, so it's easy to ingest into Elasticsearch, as @exekias explained.
The multiline problem for e.g. Java exceptions is also easy to fix; it's described in the docs:
https://www.elastic.co/guide/en/beats/filebeat/current/_examples_of_multiline_configuration.html
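
For illustration, here is how the Java stack trace pattern from that documentation page could be attached to the docker input. This is a sketch: it assumes your Filebeat version applies multiline options to the docker input, which is worth verifying before relying on it.

```yaml
filebeat.inputs:
  - type: docker
    containers.ids:
      - '*'
    # Lines starting with whitespace + "at" / "..." or with "Caused by:"
    # are appended to the preceding line, rebuilding the stack trace
    multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
    multiline.negate: false
    multiline.match: after
```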

This is where the metadata attached by add_docker_metadata helps you: you can decide to use different pipelines based on docker.container.name, for instance. Every log line coming from Filebeat will carry this information; see: https://www.elastic.co/guide/en/beats/filebeat/current/add-docker-metadata.html

As for Java and multiline, you could move to a more advanced use case and do log retrieval based on autodiscover; have a look at https://www.elastic.co/guide/en/beats/filebeat/6.2/configuration-autodiscover.html#_docker_2

You can match image names to decide on multiline settings or, for instance, use Docker labels to inject your multiline settings into Filebeat. We plan to make this a feature in 6.3.
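
A sketch of the image-name approach, assuming a hypothetical image called my-java-app. The condition syntax has changed between Filebeat releases, so check the autodiscover documentation for the version you run.

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Apply this config only to containers whose image name matches
        - condition:
            contains:
              docker.container.image: my-java-app   # hypothetical image name
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
              # Java stack trace handling, only for this image
              multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
              multiline.negate: false
              multiline.match: after
```

Containers from other images are simply not matched by this template, so their output is not run through the Java multiline rules.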

Thanks all.

My concern over multiline handling (which I've got working fine for the non-dockerised version of the application) was this: what if multiple things within the same container write to stdout, interleaving their lines, some in my standard Java log format and others in whatever random format someone else chose to write to stdout or stderr? There's surely no way the Filebeat multiline handling can cope with that. (Having said which, I haven't seen any instances of this happening yet.)

I'm aware of the Docker metadata features, which I will investigate in due course, but I was under the impression they were a 6.x feature and I haven't yet found the time to upgrade from 5.6.

I've now looked at having the application log to a Console appender, and then picking up the Docker log from outside the container with Filebeat.

But what I have observed so far is that when the container terminates the directory

/var/lib/docker/containers/

which contains the log file(s) vanishes. Doesn't this mean that there's a fair chance that the log file will be deleted before Filebeat has read the last few lines of the file which are the ones which tell you why the container crashed in the first place? So at first sight this approach doesn't look useful - what am I missing?

Ah, OK, that was because I stopped the container with "docker-compose down". If I just change the application to crash with an NPE the log doesn't get deleted so that looks OK.
