Hello there, a quick question: we have an app that uses Filebeat to ship logs. Currently we install the Filebeat package inside the app container itself, but due to a redesign we're toying with the idea of removing the package and using the official Filebeat container instead.
The simple case is a single-node deployment where the app and Filebeat containers share a logs volume: the app writes and Filebeat reads. To get this working we had to change the app's GID to match the Filebeat GID (1000). So far so good; the trick works.
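For reference, the shared-volume setup described above could be sketched as a Docker Compose file. The service names, image names, and paths here are illustrative assumptions, not our actual config:

```yaml
# Hypothetical sketch of the single-node setup: the app writes to a shared
# volume and a Filebeat container reads from it.
version: "3.8"
services:
  app:
    image: example/app:latest          # placeholder image name
    # Add Filebeat's GID (1000) as a supplementary group so the files
    # the app writes are readable by the filebeat user.
    group_add:
      - "1000"
    volumes:
      - app-logs:/var/log/app
  filebeat:
    image: docker.elastic.co/beats/filebeat:7.7.1
    volumes:
      - app-logs:/var/log/app:ro       # read-only is enough for shipping
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
volumes:
  app-logs:
```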
The trouble comes with a cluster deployment (our app running as a cluster), which would require a Filebeat container for every worker. I know the Pod concept solves this on k8s, but Docker-based deployments would be much more verbose.
The current approach of shipping our Docker image with the filebeat package works fine for single and cluster deployments.
I read a post about using the Filebeat container to ship logs from the host itself through a mounted log file, but my use case is a little more complex.
Is it good practice to use the Filebeat container like this to ship logs from another container, or should we just keep running Filebeat inside our own container?
Have you considered using Filebeat's autodiscover mechanism? You can configure Filebeat with autodiscover and the container input to read the logs of your application's Docker container. Then you'd mount the Docker container logs path when running your Filebeat container so that Filebeat can access your application's Docker container logs.
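A minimal `filebeat.yml` sketch of that autodiscover setup might look like this. The image name in the condition and the output section are assumptions; adjust them to your stack:

```yaml
# Hypothetical autodiscover config: watch Docker events and collect the
# container logs of any container whose image matches the app's image.
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: "example/app"   # assumed image name
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log

output.elasticsearch:
  hosts: ["elasticsearch:9200"]   # placeholder output
```

For this to work, the Filebeat container would need the Docker logs path and socket mounted, e.g. `-v /var/lib/docker/containers:/var/lib/docker/containers:ro -v /var/run/docker.sock:/var/run/docker.sock:ro`.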
Thanks for the reply @shaunak! I did not know about autodiscover; we're on Filebeat 7.7.1 at the moment.
Actually our container is a "fat" one: we use s6-overlay to start a few processes, so the logs are mixed, not the pristine one-process-per-container setup you'd expect. That's why we chose to install Filebeat in the same container at first; now we wonder whether a separate Filebeat container is a plausible idea or whether we should keep doing it that way.
I don't know that there's a simple answer to this.
An advantage of bundling your Filebeat process alongside your application processes in the same container is a bit more resiliency: if an application container dies for some reason, the Filebeat process dies with it, but log collection from the other application containers is unaffected. On the other hand, an advantage of extracting Filebeat into its own container is that a single Filebeat container can ingest logs from multiple application containers.
So it really depends on your preference, I think. And as you said, once you throw an orchestration system like k8s into the mix, things become a bit simpler (the Pod concept replaces/emulates a "fat" container) and more resilient (you can run Filebeat in a pod and k8s will ensure it stays up or comes back up).
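On k8s, the Pod-based equivalent of the shared-volume trick could be sketched roughly like this sidecar layout (names, images, and paths are illustrative assumptions):

```yaml
# Hypothetical Pod running the app with a Filebeat sidecar; both containers
# share an emptyDir volume, so no GID juggling across hosts is needed.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-filebeat
spec:
  containers:
    - name: app
      image: example/app:latest        # placeholder image
      volumeMounts:
        - name: logs
          mountPath: /var/log/app      # the app writes its logs here
    - name: filebeat
      image: docker.elastic.co/beats/filebeat:7.7.1
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true               # Filebeat only needs to read
  volumes:
    - name: logs
      emptyDir: {}
```

In a real deployment this would typically be a Deployment or DaemonSet rather than a bare Pod, but the container/volume layout is the same.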