Filebeat not collecting logs of elastic pods

I am not sure since when this has been happening, but my Filebeat has stopped collecting the logs of the Elasticsearch pods that I deployed using ECK.

In total I have these 3 pods:

elastic-es-masterdata-100-0
elastic-es-masterdata-100-1
elastic-es-masterdata-100-2

kubectl logs elastic-es-masterdata-100-0

{"type": "server", "timestamp": "2020-12-28T09:55:14,039Z", "level": "INFO", "component": "o.e.x.s.a.TokenService", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "refresh keys" }
{"type": "server", "timestamp": "2020-12-28T09:55:14,304Z", "level": "INFO", "component": "o.e.x.s.a.TokenService", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "refreshed keys" }
{"type": "server", "timestamp": "2020-12-28T09:55:14,341Z", "level": "INFO", "component": "o.e.l.LicenseService", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "license [05fd3886-5ac3-4f63-af8d-98ed7adf3828] mode [basic] - valid" }
{"type": "server", "timestamp": "2020-12-28T09:55:14,342Z", "level": "INFO", "component": "o.e.x.s.s.SecurityStatusChangeListener", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "Active license is now [BASIC]; Security is enabled" }
{"type": "server", "timestamp": "2020-12-28T09:55:14,351Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "publish_address {10.101.8.112:9200}, bound_addresses {0.0.0.0:9200}", "cluster.uuid": "vJDbksYdQoqxG-J2XIxFCg", "node.id": "bmuLp9vgRISUXQ92x3SHrg"  }
{"type": "server", "timestamp": "2020-12-28T09:55:14,351Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "elastic", "node.name": "elastic-es-masterdata-100-0", "message": "started", "cluster.uuid": "vJDbksYdQoqxG-J2XIxFCg", "node.id": "bmuLp9vgRISUXQ92x3SHrg"  }

It shows a bunch of logs, so everything is fine there.
However, on the Filebeat pod that is running on the same node, and therefore should be responsible for collecting these logs, I get the following messages:

2020-12-28T09:55:00.108Z        INFO    log/harvester.go:302    Harvester started for file: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log
2020-12-28T09:55:00.108Z        INFO    log/harvester.go:302    Harvester started for file: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log
2020-12-28T09:55:00.108Z        INFO    log/harvester.go:302    Harvester started for file: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log
2020-12-28T09:55:00.108Z        INFO    log/harvester.go:302    Harvester started for file: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log
2020-12-28T09:55:00.108Z        INFO    log/harvester.go:302    Harvester started for file: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log
2020-12-28T09:55:15.185Z        ERROR   fileset/factory.go:103  Error creating input: Can only start an input when all related states are finished: {Id: native::50472987-66305, Finished: false, Fileinfo: &{elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log 0 416 {980206655 63744746098 0x64d0ce0} {66305 50472987 1 33184 0 0 0 0 0 4096 0 {1609149298 980206655} {1609149298 980206655} {1609149298 980206655} [0 0 0]}}, Source: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log, Offset: 29213, Timestamp: 2020-12-28 09:55:14.754030991 +0000 UTC m=+418.853525927, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 50472987-66305}
}', won't start runner: Can only start an input when all related states are finished: {Id: native::50472987-66305, Finished: false, Fileinfo: &{elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log 0 416 {980206655 63744746098 0x64d0ce0} {66305 50472987 1 33184 0 0 0 0 0 4096 0 {1609149298 980206655} {1609149298 980206655} {1609149298 980206655} [0 0 0]}}, Source: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log, Offset: 29213, Timestamp: 2020-12-28 09:55:14.754030991 +0000 UTC m=+418.853525927, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 50472987-66305}
2020-12-28T09:55:15.193Z        ERROR   fileset/factory.go:103  Error creating input: Can only start an input when all related states are finished: {Id: native::50472987-66305, Finished: false, Fileinfo: &{elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log 0 416 {980206655 63744746098 0x64d0ce0} {66305 50472987 1 33184 0 0 0 0 0 4096 0 {1609149298 980206655} {1609149298 980206655} {1609149298 980206655} [0 0 0]}}, Source: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log, Offset: 29213, Timestamp: 2020-12-28 09:55:14.754030991 +0000 UTC m=+418.853525927, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 50472987-66305}
}', won't start runner: Can only start an input when all related states are finished: {Id: native::50472987-66305, Finished: false, Fileinfo: &{elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log 0 416 {980206655 63744746098 0x64d0ce0} {66305 50472987 1 33184 0 0 0 0 0 4096 0 {1609149298 980206655} {1609149298 980206655} {1609149298 980206655} [0 0 0]}}, Source: /var/log/containers/elastic-es-masterdata-100-0_elastic-system_elasticsearch-e7afb8b67f88b2a036ffe91b427b6dd1c0dd3a0d0589372dba510ede9f2aa74d.log, Offset: 29213, Timestamp: 2020-12-28 09:55:14.754030991 +0000 UTC m=+418.853525927, TTL: -1ns, Type: container, Meta: map[], FileStateOS: 50472987-66305}
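For reference, a quick way to double-check that the Elasticsearch pod and a Filebeat DaemonSet pod really share a node is to compare the NODE column for both (the k8s-app=filebeat label and the logging namespace are only assumptions based on the stock Filebeat manifest, adjust to your deployment):

kubectl get pod elastic-es-masterdata-100-0 -n elastic-system -o wide
kubectl get pods -n logging -l k8s-app=filebeat -o wide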

Does anyone have an idea?
Log collection for all other pods is working fine.

Below is my Filebeat config:

  filebeat.yml: |-
    output.elasticsearch.hosts: ['https://${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    output.elasticsearch.protocol: "https"
    output.elasticsearch.ssl.verification_mode: "none"
    output.elasticsearch.username: ${ELASTICSEARCH_USERNAME}
    output.elasticsearch.password: "${ELASTICSEARCH_PASSWORD}"
    output.elasticsearch.index: "%{[kubernetes.namespace]}-filebeat-%{+xxxx.ww}"
    setup.template.enabled: true
    setup.template.name: "filebeat-%{[agent.version]}"
    setup.template.pattern: "*-filebeat-*"
    setup.template.order: 150
    setup.template.overwrite: true
    setup.ilm.enabled: false
    filebeat.autodiscover.providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        type: container
        paths: ["/var/log/containers/*-${data.kubernetes.container.id}.log"]
        multiline.pattern: '^[[:space:]]'
        multiline.negate: false
        multiline.match: after
        exclude_lines: ["^\\s+[\\-`('.|_]"]  # drop asciiart lines
    processors:
      - add_host_metadata:
          netinfo.enabled: false
      - add_cloud_metadata:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
      - decode_json_fields:
          fields: ["message"]
          process_array: true
          max_depth: 1
          target: ""
          overwrite_keys: true
          add_error_key: true
      - drop_event: #namespaces to be excluded from logging
          when.or:
          - equals.kubernetes.namespace: "test"
          - equals.kubernetes.namespace: "default"
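Since the errors above come from fileset/factory.go (the module/fileset code path) and hints.enabled is true here, I wonder whether co.elastic.logs/* annotations on the Elasticsearch pods are making autodiscover create a second input for the same log file that the default container input is already harvesting. That is only a guess; a sketch of how to check for such annotations (pod name and namespace taken from the logs above):

kubectl get pod elastic-es-masterdata-100-0 -n elastic-system -o jsonpath='{.metadata.annotations}'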

Does nobody have an idea?

The response here is very slow, if any...
My Filebeat also stopped collecting logs.
Running on ECK in EKS.

Hi @umen, any update on your side regarding this issue?

@raulgs
Yes, for me it is internal use and I don't see the point in creating a test certificate, so what I did is:

output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:https://172.10.10.108}:${ELASTICSEARCH_PORT:9200}']
      ssl.verification_mode: 'none'  

Also note that I used the IP and not the hostname, as DNS was not working.
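If you need the ClusterIP of the ECK HTTP service for that, something like this should print it (the elastic-es-http name in the elastic-system namespace matches the cluster shown later in this thread; adjust to yours):

kubectl get svc elastic-es-http -n elastic-system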
Another helpful hint I got was from running this in one of the Filebeat pods:

filebeat test output -e -d "*"

But now I have many other problems related to Kibana...

This did not work for me, unfortunately.
The elastic pod logs are still not being collected :thinking:

@raulgs
Enter one of your pods and execute:

filebeat test output -e -d "*"

Post the output here.
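If it is easier, the same check can be run from outside the pod with kubectl exec (pod name and namespace are placeholders):

kubectl exec -it <filebeat-pod> -n <namespace> -- filebeat test output -e -d "*"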

@umen this is what I get using the correct filebeat.yml

sh-4.2# filebeat test output -c filebeat.yml -e -d "*"
2021-01-08T07:26:46.874Z        INFO    instance/beat.go:645    Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2021-01-08T07:26:46.874Z        DEBUG   [beat]  instance/beat.go:697    Beat metadata path: /usr/share/filebeat/data/meta.json
2021-01-08T07:26:46.874Z        INFO    instance/beat.go:653    Beat ID: bd3fa3b0-4bd9-4fce-9000-6cc51ec44e59
2021-01-08T07:26:46.879Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:126     add_cloud_metadata: starting to fetch metadata, timeout=3s
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 escrow false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 escrow-ote false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 logging false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 monitoring false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 kubernetes-dashboard false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 test false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 default false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 ismail false}]
2021-01-08T07:26:46.880Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 sebi false}]
2021-01-08T07:26:46.881Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:162     add_cloud_metadata: received disposition for azure after 1.778347ms. result=[provider:azure, error=failed with http status code 404, metadata={}]
2021-01-08T07:26:46.881Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:162     add_cloud_metadata: received disposition for digitalocean after 1.867099ms. result=[provider:digitalocean, error=failed with http status code 404, metadata={}]
2021-01-08T07:26:46.881Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:162     add_cloud_metadata: received disposition for gcp after 1.964065ms. result=[provider:gcp, error=failed with http status code 404, metadata={}]
2021-01-08T07:26:46.881Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:162     add_cloud_metadata: received disposition for aws after 2.049167ms. result=[provider:aws, error=<nil>, metadata={"account":{"id":"118596554645"},"availability_zone":"eu-central-1a","image":{"id":"ami-05ee66d01f959b32a"},"instance":{"id":"i-0de2984e3104e8bb9"},"machine":{"type":"c5.2xlarge"},"provider":"aws","region":"eu-central-1"}]
2021-01-08T07:26:46.881Z        DEBUG   [add_cloud_metadata]    add_cloud_metadata/providers.go:129     add_cloud_metadata: fetchMetadata ran for 2.173725ms
2021-01-08T07:26:46.881Z        INFO    [add_cloud_metadata]    add_cloud_metadata/add_cloud_metadata.go:93     add_cloud_metadata: hosting provider type detected as aws, metadata={"account":{"id":"118596554645"},"availability_zone":"eu-central-1a","image":{"id":"ami-05ee66d01f959b32a"},"instance":{"id":"i-0de2984e3104e8bb9"},"machine":{"type":"c5.2xlarge"},"provider":"aws","region":"eu-central-1"}
2021-01-08T07:26:46.881Z        DEBUG   [conditions]    conditions/conditions.go:98     New condition equals: map[kubernetes.namespace:{0 escrow false}] or equals: map[kubernetes.namespace:{0 escrow-ote false}] or equals: map[kubernetes.namespace:{0 logging false}] or equals: map[kubernetes.namespace:{0 monitoring false}] or equals: map[kubernetes.namespace:{0 kubernetes-dashboard false}] or equals: map[kubernetes.namespace:{0 test false}] or equals: map[kubernetes.namespace:{0 default false}] or equals: map[kubernetes.namespace:{0 ismail false}] or equals: map[kubernetes.namespace:{0 sebi false}]
2021-01-08T07:26:46.881Z        DEBUG   [processors]    processors/processor.go:120     Generated new processors: add_host_metadata=[netinfo.enabled=[false], cache.ttl=[5m0s]], add_cloud_metadata={"account":{"id":"118596554645"},"availability_zone":"eu-central-1a","image":{"id":"ami-05ee66d01f959b32a"},"instance":{"id":"i-0de2984e3104e8bb9"},"machine":{"type":"c5.2xlarge"},"provider":"aws","region":"eu-central-1"}, add_kubernetes_metadata, decode_json_fields=message, drop_fields={"Fields":["agent.ephemeral_id","agent.hostname","agent.id","agent.type","agent.name","agent.version","ecs.version","host.id","host.name","cloud.account.id","cloud.instance.id","cloud.machine.type","cloud.project.id","input.type","log.offset"],"IgnoreMissing":true}, drop_event, condition=equals: map[kubernetes.namespace:{0 escrow false}] or equals: map[kubernetes.namespace:{0 escrow-ote false}] or equals: map[kubernetes.namespace:{0 logging false}] or equals: map[kubernetes.namespace:{0 monitoring false}] or equals: map[kubernetes.namespace:{0 kubernetes-dashboard false}] or equals: map[kubernetes.namespace:{0 test false}] or equals: map[kubernetes.namespace:{0 default false}] or equals: map[kubernetes.namespace:{0 ismail false}] or equals: map[kubernetes.namespace:{0 sebi false}]
2021-01-08T07:26:46.882Z        INFO    [index-management]      idxmgmt/std.go:184      Set output.elasticsearch.index to 'filebeat-7.10.1' as ILM is enabled.
2021-01-08T07:26:46.882Z        INFO    eslegclient/connection.go:99    elasticsearch url: https://elastic-es-http.elastic-system.svc:9200
2021-01-08T07:26:46.882Z        WARN    [tls]   tlscommon/tls_config.go:93      SSL/TLS verifications disabled.
elasticsearch: https://elastic-es-http.elastic-system.svc:9200...
  parse url... OK
  connection...
    parse host... OK
    dns lookup... OK
    addresses: 10.101.36.111
    dial up... OK
  TLS...2021-01-08T07:26:46.884Z        WARN    [tls]   tlscommon/tls_config.go:93      SSL/TLS verifications disabled.

    security... WARN server's certificate chain verification is disabled
    handshake... OK
    TLS version: TLSv1.3
    dial up... OK
2021-01-08T07:26:46.894Z        DEBUG   [esclientleg]   eslegclient/connection.go:290   ES Ping(url=https://elastic-es-http.elastic-system.svc:9200)
2021-01-08T07:26:46.894Z        WARN    [tls]   tlscommon/tls_config.go:93      SSL/TLS verifications disabled.
2021-01-08T07:26:46.895Z        INFO    add_kubernetes_metadata/kubernetes.go:71        add_kubernetes_metadata: kubernetes env detected, with version: v1.18.12
2021-01-08T07:26:46.895Z        DEBUG   [kubernetes]    add_kubernetes_metadata/matchers.go:72  logs_path matcher log path: /var/log/containers/
2021-01-08T07:26:46.895Z        DEBUG   [kubernetes]    add_kubernetes_metadata/matchers.go:73  logs_path matcher resource type: container
2021-01-08T07:26:46.895Z        DEBUG   [kubernetes]    add_kubernetes_metadata/matchers.go:72  logs_path matcher log path: /var/lib/docker/containers/
2021-01-08T07:26:46.895Z        DEBUG   [kubernetes]    add_kubernetes_metadata/matchers.go:73  logs_path matcher resource type: container
2021-01-08T07:26:46.895Z        INFO    [kubernetes]    kubernetes/util.go:99   kubernetes: Using node 5-21-282-1063-1-23b4269b provided in the config  {"libbeat.processor": "add_kubernetes_metadata"}
2021-01-08T07:26:46.895Z        DEBUG   [kubernetes]    add_kubernetes_metadata/kubernetes.go:162       Initializing a new Kubernetes watcher using host: 5-21-282-1063-1-23b4269b      {"libbeat.processor": "add_kubernetes_metadata"}
2021-01-08T07:26:46.903Z        DEBUG   [esclientleg]   eslegclient/connection.go:313   Ping status code: 200
2021-01-08T07:26:46.903Z        INFO    [esclientleg]   eslegclient/connection.go:314   Attempting to connect to Elasticsearch version 7.10.1
2021-01-08T07:26:46.903Z        DEBUG   [esclientleg]   eslegclient/connection.go:364   GET https://elastic-es-http.elastic-system.svc:9200/_license?human=false  <nil>
2021-01-08T07:26:46.917Z        DEBUG   [license]       licenser/check.go:31    Checking that license covers %sBasic
2021-01-08T07:26:46.917Z        INFO    [license]       licenser/es_callback.go:51      Elasticsearch license: Basic
  talk to server... OK
  version: 7.10.1

I can't see anything suspicious here,
but if you don't get the logs, you should see some error in one of the Elastic endpoints.
What about looking at the logs of:
- the Elasticsearch cluster master nodes
- Kibana
You can also look at the path of the logs on the nodes, /var/log/containers/.
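For example, from inside one of the Filebeat pods (assuming the DaemonSet mounts the host's /var/log/containers, as in the standard manifest; pod name and namespace are placeholders):

kubectl exec -it <filebeat-pod> -n <namespace> -- ls -l /var/log/containers/ | grep elastic-es-masterdata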

That is what I was expecting.
The Filebeat pods are collecting all logs except those of the Elasticsearch pods.
Which is kind of strange, as we haven't changed anything other than the versions of Filebeat and Elasticsearch during the last couple of months...
So I am not sure what else to check in that regard. As shown in my first message, this is the error that I get in the Filebeat logs regarding the Elasticsearch pods.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.