Metricbeat Kubernetes metadata processor

I'm wondering if anyone has come up with a working configuration for the add_kubernetes_metadata processor for Metricbeat. The documentation for this feature is quite sparse, so I've been struggling to find a working configuration. The processors configuration I currently have in my Metricbeat configuration file for the kubernetes module is:

    processors:
    - add_kubernetes_metadata:
        in_cluster: true
        default_indexers.enabled: false
        default_matchers.enabled: false
        indexers:
        - pod_name:
        matchers:
          - fields:
              lookup_fields: ["kubernetes.pod.name"]

The Metricbeat log files do not indicate any issue with this configuration, but the processor is also not adding the expected metadata to the records being written into Elasticsearch. I've also tried a similar configuration where the default indexer(s) and matcher(s) are used, but the results were the same.

At this point I suspect the issue is related to the indexers and matchers, as I had similar issues with the Filebeat add_kubernetes_metadata processor; once I disabled the default matcher and provided an alternate matcher configuration, everything worked fine.

Thanks,

Dave

How did you change the filebeat matcher configuration?

Have you tried to adapt the matcher configuration in metricbeat?

Can you post the complete configuration of your module? Beats use YAML and are sensitive to proper indentation (we normally use 2 spaces, never tabs).

My responses to your questions:

How did you change the filebeat matcher configuration?

Here is the add_kubernetes_metadata processor configuration for Filebeat:

  processors:
  - add_kubernetes_metadata:
      in_cluster: true
      default_matchers.enabled: false
      matchers:
      - logs_path:
          logs_path: /var/log/containers/
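For context, the way I understand the logs_path matcher to behave (an assumption based on its documented defaults, not the Beats source) is that it derives a container ID lookup key from each event's log file path. A rough sketch of that idea:

```python
# Illustrative sketch (an assumption, not the Beats implementation): how a
# logs_path-style matcher can derive a container ID lookup key from a
# /var/log/containers/ file path.
import os

LOGS_PATH = "/var/log/containers/"

def lookup_key(source_path):
    """Extract the container ID encoded in a /var/log/containers/ file name."""
    if not source_path.startswith(LOGS_PATH):
        return None  # event did not come from the watched directory
    name = os.path.basename(source_path)
    if name.endswith(".log"):
        name = name[:-len(".log")]
    # kubelet names these files <pod>_<namespace>_<container>-<cid>.log
    return name.rsplit("-", 1)[-1]

print(lookup_key("/var/log/containers/mypod_default_app-abc123.log"))  # abc123
```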

Have you tried to adapt the matcher configuration in metricbeat?

Only in the manner shown in my original post. By reading the Beats code I've determined what the indexer and matcher options are, but I'm still fuzzy on how to configure them properly so the metadata gets added to the stored metrics. In the matcher configuration above I replaced the default matcher, which uses the metricset.host field, with one that uses the kubernetes.pod.name field, since metricset.host is set to the exact same value ("127.0.0.1:10250") on every node for the kubernetes module. I've wondered if that is part of the issue, but I haven't been able to track down where the metricset.host value gets set for the Kubernetes module to determine whether there is some way to configure the processor so this value is unique.
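To show why the constant metricset.host value concerns me (a simplified sketch of the lookup, not the actual Beats code): if every event on a node carries the same kubelet address, any matcher keyed on that field maps all events to at most one cache entry, so per-pod metadata can never be distinguished.

```python
# Sketch: why a matcher keyed on metricset.host can't distinguish pods here.
# Every event from the kubernetes module carries the same kubelet address,
# so a host-based lookup collapses all events onto a single key.
events = [
    {"metricset.host": "127.0.0.1:10250", "kubernetes.pod.name": "pod-a"},
    {"metricset.host": "127.0.0.1:10250", "kubernetes.pod.name": "pod-b"},
]
keys = {e["metricset.host"] for e in events}
print(keys)  # {'127.0.0.1:10250'} -- one key for every pod's events
```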

Can you post the complete configuration of you module? Beats use YAML and are sensitive to proper indentation (we normally use 2 spaces, never use tabs).

I've run the contents of the entire metricbeat.yml file through yamllint.com, which says it is valid YAML, and there are no errors reported in the Metricbeat log files, as I would generally see if there were a YAML or other formatting issue in the configuration file. In any case, what follows is our Metricbeat configuration, with sensitive fields redacted as '********':

############################### Metricbeat general config #######################################

name: ${SERVER_NODE}

fields_under_root: true
fields:
  metric_topic: "metricsets"

metricbeat.config.modules:
  path: /etc/metricbeat/metricbeat.yml
  reload.enabled: true
  reload.period: 10s

metricbeat.modules:

  # ------------------------------- System Module -------------------------------
  - module: system
    metricsets:
      # CPU stats
      - cpu

      # System Load stats
      # - load

      # Per CPU core stats
      #- core

      # IO stats
      #- diskio

      # Per filesystem stats
      - filesystem

      # File system summary stats
      # - fsstat

      # Memory stats
      - memory

      # Network stats
      - network

      # Per process stats
      - process

      # Sockets (linux only)
      #- socket

      # System Uptime
      - uptime

    enabled: true
    # Setting the period value fairly long until we can ensure that Kafka is
    # stable and can handle a higher volume of data
    period: 60s
    processes: ['abrtd', 'auditd', 'dockerd', 'kubelet', 'syslogd', 'systemd']
    process.cgroups.enabled: false
    process.include_top_n.by_cpu: 5
    process.include_top_n.by_memory: 5

    # Filter out the data for the reported filesystems that don't need to be
    # monitored so we don't fill up the monitoring indices with unneeded data
    processors:
      # Filters for the Metricbeat container's filesystem
      - drop_event.when.regexp.system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|run)($|/)'
      # Filters for the host/server filesystem
      - drop_event.when.regexp.system.filesystem.mount_point: '^/hostfs/(sys|cgroup|proc|dev|etc|host|var|run)($|/)'

  # ------------------------------- Kubernetes Module -------------------------------
  - module: kubernetes

    metricsets:
      # Kubernetes Node MetricSet
      - node

      # Kubernetes Pod MetricSet
      - pod

      # Kubernetes System MetricSet
      - system

      # Kubernetes Container MetricSet
      - container

      # Kubernetes Event MetricSet
      - event

      # Kubernetes Volume MetricSet
      - volume

    period: 60s

    enabled: true

    # Queries the kubelet port
    hosts: ["https://127.0.0.1:10250"]
    ssl.verification_mode: none
    ssl.certificate_authorities: ["/etc/kubelet/ca.crt"]
    ssl.certificate: "/etc/kubelet/kubelet.crt"
    ssl.key: "/etc/kubelet/kubelet.key"

    processors:
    - add_kubernetes_metadata:
        in_cluster: true
        namespace: cne-logging
        default_indexers.enabled: false
        default_matchers.enabled: false
        indexers:
        - pod_name:
        matchers:
          - fields:
              lookup_fields: ["kubernetes.pod.name"]

############################# Console output ##########################################

output.console:
  # Boolean flag to enable or disable the output module.
  enabled: false

  # Pretty print json event
  pretty: false

############################# Kafka output ##########################################

output.kafka:
  # initial brokers for reading cluster metadata
  hosts: ["kafka-headless:9092"]

  # The username for connecting to Kafka. If username is configured, the
  # password must be configured as well.
  # Only SASL/PLAIN is supported.
  username: ********

  # The password for connecting to Kafka
  password: ********

  #message topic selection + partitioning
  topic: '%{[metric_topic]}'

  partition.round_robin:
    reachable_only: true

  required_acks: -1
  compression: gzip
  max_message_bytes: 1000000

############################# Logging #########################################

# There are three options for the log output: syslog, file, stderr.
logging:
  # Minimum log level. One of: critical, error, warning, info, debug
  level: info

  # Disable syslog
  to_syslog: false

  # Disable file logging
  to_files: false


Thanks for any assistance you can provide.

For anyone looking for an answer to this question/issue, here is the K8s metadata processor configuration that ultimately worked for me, where the SERVER_NODE environment variable is set to the host name of the node/server on which the Metricbeat instance is running:

    processors:
    - add_kubernetes_metadata:
        in_cluster: true
        host: ${SERVER_NODE}
        default_indexers.enabled: false
        default_matchers.enabled: false
        indexers:
        - pod_name:
        matchers:
        - field_format:
            format: "%{[kubernetes.namespace]}/%{[kubernetes.pod.name]}"
