The "add_kubernetes_metadata" processor works only if logs are read from /var/lib/docker/containers/*/*.log; it doesn't work with logs from /var/log/containers/*.log.
This is caused by the way the container ID is extracted from the path in the processor.
exekias from the Elastic team points out that the files in /var/log/containers/*.log are just symlinks to /var/lib/docker/containers/*/*.log. Of course he's right, and reading the logs directly from /var/lib/docker/containers/*/*.log makes it possible to extract the container ID and therefore to enrich the logs with Kubernetes metadata.
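To make the problem concrete, here's a rough sketch of what path-based ID extraction looks like. This is not the processor's actual code (Beats is written in Go and I haven't copied anything from it); the function name container_id_from_docker_path and the constants are made up for illustration, and the directory prefix is simply the Docker path mentioned above.

```python
from typing import Optional

# Assumption: the Docker log directory discussed in this thread.
DOCKER_LOGS_PREFIX = "/var/lib/docker/containers/"
# Full Docker container IDs are 64 hex characters.
CONTAINER_ID_LEN = 64


def container_id_from_docker_path(path: str) -> Optional[str]:
    """Return the container ID if `path` lies under the Docker log directory.

    Hypothetical helper: it takes the first path segment after the prefix
    and accepts it only if it has the length of a full container ID.
    """
    if not path.startswith(DOCKER_LOGS_PREFIX):
        return None
    rest = path[len(DOCKER_LOGS_PREFIX):]
    candidate = rest.split("/", 1)[0]
    if len(candidate) != CONTAINER_ID_LEN:
        return None
    return candidate
```

With an approach like this, a path under /var/log/containers yields nothing, because the ID is not a directory segment there; it only appears at the end of the file name.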
However, there are two reasons why the processor should also work with /var/log/containers/*.log:
You may want to exclude log files from certain pods, e.g. the Filebeat pod itself, using the exclude_files: ['filebeat-*.log'] option. That can work only in /var/log/containers, as only the symlinks there contain the pod name.
You may want to read only the log files of Docker containers used by active Kubernetes pods, not those of any other Docker containers running on the system. That also works only by following the symlinks in /var/log/containers.
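Both points come down to selecting files by their symlink name and then following the link. Here's a small sketch of that idea, just to show the principle; resolve_pod_logs and its exclude_prefixes parameter are hypothetical names of mine, not Filebeat options, and the prefix match is a simplification of what exclude_files does.

```python
import glob
import os
from typing import Dict, Iterable


def resolve_pod_logs(symlink_dir: str,
                     exclude_prefixes: Iterable[str] = ()) -> Dict[str, str]:
    """Map each *.log entry in `symlink_dir` to the file it finally points to.

    Entries whose file name starts with one of `exclude_prefixes` are
    skipped, which is only possible because the names carry the pod name.
    """
    excluded = tuple(exclude_prefixes)
    resolved = {}
    for link in sorted(glob.glob(os.path.join(symlink_dir, "*.log"))):
        name = os.path.basename(link)
        if name.startswith(excluded):
            continue
        # realpath follows the whole symlink chain to the real log file.
        resolved[link] = os.path.realpath(link)
    return resolved
```

Calling resolve_pod_logs("/var/log/containers", exclude_prefixes=["filebeat-"]) on a node would return only the logs that belong to pods, minus the excluded ones, together with their real locations.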
Are there any plans to change this before the 6.0.0 release?
I just found a third reason while analyzing my logs:
The "source" field in the log documents would be much more informative if it contained a value like /var/log/containers/kube-proxy-4d7nt_kube-system_kube-proxy-1bddb0001161285462528b7170a53d13dfe4e17b541319485b9020eef5433266.log instead of /var/lib/docker/containers/1bddb0001161285462528b7170a53d13dfe4e17b541319485b9020eef5433266/1bddb0001161285462528b7170a53d13dfe4e17b541319485b9020eef5433266-json.log
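The symlink name also carries everything needed to find the container ID. As a sketch only, assuming the name always has the shape pod_namespace_container-id.log seen in the example above (I'm not claiming this is how any of the pull requests do it), it could be taken apart like this. PodLogName and parse_symlink_name are hypothetical names.

```python
import os
from typing import NamedTuple, Optional


class PodLogName(NamedTuple):
    pod: str
    namespace: str
    container: str
    container_id: str


def parse_symlink_name(path: str) -> Optional[PodLogName]:
    """Split a /var/log/containers file name into its parts.

    Assumes the layout <pod>_<namespace>_<container>-<id>.log, where the
    ID is the 64-character string after the last hyphen.
    """
    name = os.path.basename(path)
    if not name.endswith(".log"):
        return None
    stem = name[:-len(".log")]
    parts = stem.split("_")
    if len(parts) != 3:
        return None
    pod, namespace, tail = parts
    # Container names may contain hyphens, so split on the last one only.
    container, sep, container_id = tail.rpartition("-")
    if not sep or len(container_id) != 64:
        return None
    return PodLogName(pod, namespace, container, container_id)
```

For the kube-proxy path above, this gives the pod kube-proxy-4d7nt, the namespace kube-system, the container kube-proxy and the same ID that appears in the Docker path.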
Here's a third pull request that solves the issue without requiring a processor configuration and without regular expressions (more details in the PR): https://github.com/elastic/beats/pull/5011