High CPU usage after updating filebeat from 7.12.0 to 8.6.2

After updating Filebeat to 8.6.2 I observed an increase in CPU usage. I also tested on 8.6.1, same thing, and went back to 8.0.0 where I could also observe an increase, although less than on 8.6.2 and 8.6.1. Is there anything that can explain that?

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
      add_resource_metadata:
        cronjob: false
        deployment: false
        namespace:
          enabled: true
fields_under_root: true
fields:
  kubernetes.cluster: {{ .Values.name }}
  kubernetes.stage: {{ (split "-" .Values.name)._1 }}
processors:
  - add_host_metadata:
      netinfo.enabled: false
      when.not.equals.kubernetes.namespace_labels.namespace-type: application
  - drop_fields:
      fields: ['ecs.version', 'kubernetes.namespace_uid']
      when.not.equals.kubernetes.namespace_labels.namespace-type: application
  - drop_fields:
      fields: ['kubernetes.node.uid', 'kubernetes.pod.ip', '/^kubernetes.node.labels.*/']
  # the "index-name" field is used by ELK to determine the effective index
  # the effective index is "index-name" suffixed by the current day
  - copy_fields:
      fields:
        - from: kubernetes.labels.logging_acc_k8s_zone/index-name
          to: index-name
      fail_on_error: false
      ignore_missing: true
      when.not.has_fields: ['index-name']
  # all applications in our namespaces will use the acccps-k8s-logs index, if not overwritten by a label
  - add_fields:
      target: ''
      fields:
        index-name: acccps-k8s-logs
      when:
        and:
          - not.has_fields: ['index-name']
          - or:
              - equals.kubernetes.namespace_labels.namespace-type: shared
              - equals.kubernetes.namespace_labels.namespace-type: helper
  - add_fields:
      fields:
        agent.hostname: ${HOSTNAME}
      target: ""
  - copy_fields:
      fields:
        - from: container.image.name
          to: kubernetes.container.image
      target: "kubernetes"
  - decode_json_fields:
      fields: ['message']
      overwrite_keys: true
      target: ""
  # the "tenant" field is just for convenience
  - copy_fields:
      fields:
        - from: kubernetes.namespace_labels.tenant
          to: tenant
      fail_on_error: false
      ignore_missing: true
      when.not.has_fields: ['tenant']
  # drop events without index-name, because ELK can't handle them anyway
  - drop_event:
      when.not.has_fields: ['index-name']
output.logstash:
  hosts:
    - {{ printf "%s:%d" .Values.log_sink.address (.Values.log_sink.port | int) }}
  ssl:
    certificate_authorities:
      - "/etc/puki-certs/pukirootca1.pem"

Above is my config file. When updating to 8.6.2 I dropped some fields, added some and copied some, see the changes below:

- drop_fields:
    fields: ['kubernetes.node.uid', 'kubernetes.pod.ip', '/^kubernetes.node.labels.*/']
- add_fields:
    fields:
      agent.hostname: ${HOSTNAME}
- copy_fields:
    fields:
      - from: container.image.name
        to: kubernetes.container.image
    target: "kubernetes"

I tried commenting out those changes to see if they were the root cause, but it did not help; the CPU usage was still high.

Any idea why this is happening?

I've noticed as well that Filebeat has slowly increased in CPU usage over time. I'm not 100% sure what causes it, but one of the things I've been looking at recently is switching the default_config from type: container to type: filestream, as I think part of the "issue" is that in larger-scale deployments the total number of files has an impact on Filebeat's performance, and filestream is in theory supposed to be a much more efficient input type.

I haven't actually had a chance to test this theory yet, so I'm not 100% sure it will make a difference. This comment shows roughly how to implement the filestream type, but it is missing the id value; I think you can do something like:

filebeat.autodiscover:
     providers:
       - type: kubernetes
         node: ${NODE_NAME}
         hints.enabled: true
         hints.default_config:
           type: filestream
           id: kubernetes-container-logs-${kubernetes.pod.name}-${kubernetes.container.id}
           paths:
             - /var/log/containers/*${data.kubernetes.container.id}.log
           parsers:
             - container:
               stream: all
               format: auto

Note: Currently it is not possible to switch from container to filestream without reprocessing logs. That feature appears to be coming in 8.7 (https://github.com/elastic/beats/pull/34292).
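
If I'm reading that PR right, once 8.7 is out the migration should just be a matter of enabling a take-over flag on the filestream input so it claims the registry state of the old container input instead of re-reading every file. A rough sketch (the exact option name is my assumption from the PR, so double-check the 8.7 docs):

hints.default_config:
  type: filestream
  id: kubernetes-container-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
  take_over: true # assumed option from the PR above: inherit state from the old container input
  paths:
    - /var/log/containers/*${data.kubernetes.container.id}.log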

Another observation: you have some processors (e.g. drop_fields, copy_fields) which don't have ignore_missing set to true, even though they support it. I've seen in the past that this can have a negative performance impact, as Filebeat will need to deal with error handling on missing fields.
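
For example, the drop_fields and copy_fields entries from your config could be made more forgiving like this (both processors support ignore_missing, and copy_fields also has fail_on_error):

- drop_fields:
    fields: ['kubernetes.node.uid', 'kubernetes.pod.ip', '/^kubernetes.node.labels.*/']
    ignore_missing: true
- copy_fields:
    fields:
      - from: container.image.name
        to: kubernetes.container.image
    target: "kubernetes"
    fail_on_error: false
    ignore_missing: true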

Hmm, could you provide the entire config you're using, and do you see any warning or error logs from Filebeat?

Sorry, I meant could you provide the configuration you tried with type: filestream. This appears to be the config with type: container.

That config looks correct, do you see anything in the Filebeat logs that would highlight any issues?

not at all

Hmm, something I'm just noticing from the example I copied is that it has invalid YAML.

parsers:
  - container:
    stream: all
    format: auto

Should instead be:

parsers:
  - container:
      stream: all
      format: auto

Where stream and format are under container

I also didn't pay attention to that. However, it doesn't fix the issue; the logs are still the same.

Any other suggestions here?

Hmm, the only other thing I can think of is to try enabling debug logging; this will hopefully show something that might be useful for why this isn't working as intended.
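
For reference, that is just the logging settings in filebeat.yml, something like:

logging.level: debug
# "*" enables all debug selectors; you can narrow this down if the output gets too noisy
logging.selectors: ["*"]

Running Filebeat with -e -d "*" gives the same effect temporarily without touching the config.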

I was able to get around to testing this; here is a working configuration for autodiscover:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints:
        enabled: true
        default_config:
          type: filestream
          id: kubernetes-container-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
          paths:
            - /var/log/containers/*${data.kubernetes.container.id}.log
          prospector.scanner.symlinks: true # Optional, but most people probably use symlinks
          parsers:
          - container:
              stream: all
              format: auto
