Filebeat and Kubernetes: excluding log files

Hi there,

I'm having trouble configuring filebeat on Kubernetes.
Let's say you want Filebeat to collect the container logs from Kubernetes, but you would like to exclude some files (for example, because you don't want to collect the logs of Filebeat itself, which is also running as a pod on Kubernetes).

I thought this prospector config would be right, but no luck so far:

- type: docker
  containers:
    ids:
      - "*"
    path: "/var/log/containers"
  exclude_files:
    - "/var/log/containers/filebeat*.log"
    - "/var/log/containers/logstash*.log"
  processors:
    - add_kubernetes_metadata:
        in_cluster: true
        default_matchers.enabled: false
        matchers:
        - logs_path:
            logs_path: /var/log/containers/

Am I doing something wrong, or is it just not possible at the moment?
This seems to have been made possible by this PR: https://github.com/elastic/beats/pull/4981 .
I'm using filebeat:6.1.3, by the way.

Many thanks,
Jeremie

Hi @jeremievallee,

The exclude_files parameter expects a list of regular expressions, but you wrote glob patterns. Try something like:

- '/var/log/containers/filebeat.*'
- '/var/log/containers/logstash.*'
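In context, the prospector from above would then look something like this (an untested sketch based on the config you posted, with only the exclude_files entries changed to regular expressions):

```yaml
- type: docker
  containers:
    ids:
      - "*"
    path: "/var/log/containers"
  # exclude_files takes regular expressions, not globs
  exclude_files:
    - '/var/log/containers/filebeat.*'
    - '/var/log/containers/logstash.*'
  processors:
    - add_kubernetes_metadata:
        in_cluster: true
        default_matchers.enabled: false
        matchers:
        - logs_path:
            logs_path: /var/log/containers/
```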

Also, if you don't mind reading those files and then discarding the output, you could use the drop_event processor: https://www.elastic.co/guide/en/beats/filebeat/current/drop-event.html

Hi @exekias, thanks for the response. I tried with the single quotes, but it's still not working unfortunately. Actually, even without any exclude_files key at all it doesn't work:

- type: docker
  containers:
    ids:
      - '*'
    path: "/var/log/containers"
  processors:
    - add_kubernetes_metadata:
        in_cluster: true
        default_matchers.enabled: false
        matchers:
        - logs_path:
            logs_path: /var/log/containers/

So perhaps even this config ^ is wrong? I know that by default Filebeat reads logs from /var/lib/docker/containers. However, since the files in that folder are named by container id rather than by name, it's impossible for me to exclude the ones I want, as I can't know the ids in advance.

My understanding from that merged PR (https://github.com/elastic/beats/pull/4981) was that Filebeat could be configured to read the logs from the /var/log/containers directory instead, where the file names do contain the application names. Is that true? Do you have an example for this use case?

Surely I can't be the only one needing to remove some of these logs (I hope :smiley: )

Thanks,
Jeremie

By the way, I did try the drop_event processor and it works. However, the filtering becomes heavy as the list of containers to blacklist grows, which seems very inefficient to me.

I think that config should work. Could you explain the current behavior?

Also, for your use case we have been working on Kubernetes autodiscover, but it hasn't been released yet (it will be available with 6.2): https://github.com/elastic/beats/pull/6055
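For reference, an autodiscover configuration along the lines of that feature might look something like this (a sketch only; the feature was unreleased at the time, so the exact syntax may differ):

```yaml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        # Only start a prospector for containers we care about;
        # the condition here is illustrative.
        - condition:
            not:
              equals:
                kubernetes.container.name: "filebeat"
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
```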

The current behaviour is that Filebeat starts on every node but does not pick up any files at all. I did mount the /var/log/containers directory into the Filebeat container, and I checked that it can access the files inside that folder.

I will investigate more.

In most cases /var/log/containers contains symlinks to /var/lib/docker/containers/. Is that the case for you? You would need to mount both.

Yes, I am mounting both. Actually, the files in /var/log/containers are symlinks to files in /var/log/pods, which are in turn symlinks to /var/lib/docker/containers.
I started by mounting these three folders, but that didn't help.
I'm now testing mounting /var/log entirely, to see if that works.

No luck, unfortunately.
I also tried using the log type instead of docker:

    - type: log
      paths:
        - /var/log/containers/*.log
      json.message_key: log
      json.keys_under_root: true
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
            default_matchers.enabled: false
            matchers:
            - logs_path:
                logs_path: /var/log/containers/

But that didn't help; Filebeat doesn't pick up anything. Perhaps @Sven_Woltmann would know?

Worst case scenario, I know that reading the files and using drop_event works, with the following config:

    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
        - drop_event:
            when:
              equals:
                kubernetes.container.name: "filebeat"
        - drop_event:
            when:
              equals:
                kubernetes.container.name: "logstash"

It's just that I don't think this is really efficient or scalable. But perhaps I'm wrong?
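As an aside, the two drop_event processors above could likely be merged into a single one using an or condition (an untested sketch; it still reads every file, so it doesn't address the efficiency concern):

```yaml
processors:
  - drop_event:
      when:
        or:
          - equals:
              kubernetes.container.name: "filebeat"
          - equals:
              kubernetes.container.name: "logstash"
```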

Many thanks,
Jeremie

Hi @jeremievallee,

it seems that the format of your configuration file is wrong: for example, processors should not be on the same level as type and paths, but on the same level as filebeat.prospectors (the parent of type and paths).

Here's the config file I'm running on our production cluster. However, I have not yet updated to 6.1 and am still using 6.0 beta plus my own changes, so my config might not be fully up to date.

filebeat.prospectors:
  - type: log
    paths:
      - "/var/log/containers/*.log"

    # Don't read my own logs, and some others:
    exclude_files:
      - filebeat-.*\.log
      - default-http-backend-.*\.log
      - nginx-ingress-controller-.*\.log

    # Keys are copied top level in the output document:
    json.keys_under_root: true

    # Filebeat adds an "error.message" and "error.type: json" key in case of JSON unmarshalling errors:
    json.add_error_key: true

    # Allow Filebeat to harvest symlinks in addition to regular files:
    symlinks: true

filebeat.shutdown_timeout: 5s

filebeat.registry_file: /var/log/containers/filebeat_registry

name: ${NODE_NAME}

processors:
  # In logs from our microservices, "log" contains a JSON object.
  # In logs from Kubernetes services, "log" contains the log message.
  # Fortunately, Filebeat detects the difference and decodes "log" only when it contains an escaped JSON string.
  - decode_json_fields:
      fields: ["log"]
      # Merge the decoded JSON fields into the root of the event:
      target: ""
  - add_kubernetes_metadata:
      in_cluster: true

output.elasticsearch:
  hosts:
    - xxxxxxxx
    - xxxxxxxx
    - xxxxxxxx
  username: xxxxxxxx
  password: xxxxxxxx
  index: "filebeat-%{[beat.version]}-kube-prod-%{+yyyy.MM.dd}"
  bulk_max_size: 2500

# These are required since 6.0.0-beta2 if output.elasticsearch.index is defined
setup.template.name: "filebeat-%{[beat.version]}"
setup.template.pattern: "filebeat-%{[beat.version]}-*"

Hope that helps.

Sven

I forgot to mention that I'm mounting these three folders into the Filebeat container:

  • /var/log/containers (files here are symlinks to /var/log/pods/...)
  • /var/log/pods (files here are symlinks to /var/lib/docker/containers/...)
  • /var/lib/docker/containers
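In a Filebeat DaemonSet manifest, those mounts might look like this (a sketch; the volume names are illustrative, not from my actual manifest):

```yaml
# Excerpt from a DaemonSet pod spec
containers:
  - name: filebeat
    # ...image, args, etc. omitted...
    volumeMounts:
      - name: varlogcontainers
        mountPath: /var/log/containers
        readOnly: true
      - name: varlogpods
        mountPath: /var/log/pods
        readOnly: true
      - name: varlibdockercontainers
        mountPath: /var/lib/docker/containers
        readOnly: true
volumes:
  - name: varlogcontainers
    hostPath:
      path: /var/log/containers
  - name: varlogpods
    hostPath:
      path: /var/log/pods
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
```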

Hi @Sven_Woltmann! Thanks a lot for your messages, I finally got it working! :tada:

You're right, the indentation was off, and I also didn't have the symlinks: true option set. Here's my final config, and I can confirm it works with filebeat:6.1.3:

filebeat.prospectors:
  - type: log
    paths:
      - "/var/log/containers/*.log"
    exclude_files:
      - filebeat-.*\.log
      - logstash-.*\.log
    json.message_key: log
    json.add_error_key: true
    json.keys_under_root: true
    symlinks: true
    tail_files: true

processors:
  - add_kubernetes_metadata:
      in_cluster: true
      default_matchers.enabled: false
      matchers:
      - logs_path:
          logs_path: /var/log/containers/

filebeat.shutdown_timeout: 5s

output.logstash:
  hosts: ["logstash:5044"]

Also worth mentioning that I'm mounting:

  • /var/lib/docker/containers
  • /var/log/pods
  • /var/log/containers

All three in readOnly mode.

Thanks a lot @exekias and @Sven_Woltmann for your help :slight_smile:

Jeremie
