Filebeat Kubernetes metadata - Add namespace labels to Kubernetes metadata

As operators of a multi-tenant Kubernetes cluster and of the Elastic Stack, we want to be able to drop logs by namespace when log rates exceed a certain threshold. We'd like to control this by adding labels to a namespace and putting the drop logic in Logstash.

Our logs flow like this:
Filebeat -> Kafka -> Logstash -> Elasticsearch
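
For context, our Filebeat configuration looks roughly like this (the paths and the Kafka endpoint are illustrative, not our real values); add_kubernetes_metadata enriches each event with pod-level fields such as kubernetes.pod.name, kubernetes.namespace, and kubernetes.labels:

    filebeat.inputs:
      - type: container
        paths:
          - /var/log/containers/*.log

    processors:
      # Adds pod-level Kubernetes metadata to every event
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

    output.kafka:
      hosts: ["kafka:9092"]
      topic: "filebeat"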

The pod's metadata is added to each log, but it would be nice to also include the namespace's metadata (namespace labels in particular).
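
The drop logic in Logstash would then be something like the filter below, assuming a hypothetical kubernetes.namespace_labels field carrying the namespace's labels (the log-drop label key is also made up for illustration):

    filter {
      # Hypothetical field: Filebeat does not currently export the labels
      # set on the namespace object itself.
      if [kubernetes][namespace_labels][log-drop] == "true" {
        drop { }
      }
    }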

Downsides to applying the label directly to pods:

  • We would be touching workloads we may not own. The application owner owns their deployments, replicasets, and pods.
  • Pods are redeployed frequently, which would clear out the label.

Pros of applying the label to the namespace:

  • It's easier to label an entire namespace than to chase individual noisy pods (a single command, as sketched below this list).
  • Kubernetes resource quotas are scoped to a namespace. It makes sense to follow the same pattern for logging quotas too.
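
For example, throttling a noisy tenant would be a single, non-disruptive command (the namespace name and label key are made up):

    kubectl label namespace tenant-a log-drop=true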

Any thoughts on this? I was planning to open a GitHub issue, but the contribution guidelines suggest opening a discussion here first.

References:
https://www.elastic.co/guide/en/beats/filebeat/master/add-kubernetes-metadata.html
https://www.elastic.co/guide/en/beats/filebeat/master/exported-fields-kubernetes-processor.html

Hi @jeffspahr,

Would PodPresets help here?
They are alpha, so not a definitive solution, but they might work as a workaround.

The proposal doesn't sound bad, but I'd like to ask: assuming the namespace contains a limited set of parent objects (ReplicaSets, StatefulSets, DaemonSets, Jobs), wouldn't it be more straightforward to label those parents?

If we were to add that extra information to the Kubernetes metadata, namespace labels would sit in a somewhat different layer than the pod. We would need to discuss it, but most probably it would be an opt-in feature; a configuration item like add_namespace_labels: true under the already existing add_kubernetes_metadata processor feels like it complicates the configuration.
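
Just to illustrate what I mean, such a hypothetical opt-in might look like this (again, this option does not exist in Filebeat today):

    processors:
      - add_kubernetes_metadata:
          # Hypothetical option, shown only for discussion
          add_namespace_labels: true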

If parent labeling is not the solution you are looking for, would you mind opening an issue at https://github.com/elastic/beats so we can continue there?

Thanks for the reply @pmercado.

PodPresets look interesting, but I don't think they solve the issue I'm chasing.

You definitely can add labels to a parent object like a Deployment and have them propagate down to the pods. That works well if the app owner knows ahead of time that they do not want to send log data to Elasticsearch. But if the label needs to be added after the fact to enforce a logging quota, that would be done by an automated quota-enforcement system or by an operator of the Elastic Stack. It wouldn't be appropriate for that person or system to add a label to the pod spec of a Deployment they don't own, and doing so would cause the pods to be redeployed (as sketched below).
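
To make that concrete, here's a rough example (the Deployment and namespace names are made up): adding a label to the pod template of someone else's Deployment changes the pod spec, so the Deployment controller rolls every pod, while labeling the namespace restarts nothing.

    # Disruptive, and touches a workload we don't own: the changed pod
    # template triggers a full rollout of the Deployment's pods.
    kubectl patch deployment payments-api -n tenant-a --type merge \
      -p '{"spec":{"template":{"metadata":{"labels":{"log-drop":"true"}}}}}'

    # Non-disruptive, and needs no access to the workload itself.
    kubectl label namespace tenant-a log-drop=true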

I opened a GitHub issue here: https://github.com/elastic/beats/issues/13873.

Thanks!
Jeff
