Metricbeat-Kubernetes startup error when upgrading from version 7.2 to 7.16.1

On a cluster recently upgraded from version 7.2 to 7.16.1, we tried to update Metricbeat for our Kubernetes pods. However, on startup we got the following error:

ERROR metrics/metrics.go:304 error determining cgroups version: error reading /proc/11483/cgroup: open /proc/11483/cgroup: no such file or directory

After some research, we suspect this may be related to the currently open issue: Monitoring: allow specifying /proc or hostfs path · Issue #23267 · elastic/beats · GitHub

We then tried to upgrade to at least version 7.13, but again got an error, pasted below:

2022-02-14T15:14:33.811Z        WARN    [elasticsearch] elasticsearch/client.go:408     Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xc07aba5664322f24, ext:123112042418, loc:(*time.Location)(0x55f6ac9f4ee0)}, Meta:null, Fields:{"agent":{"ephemera
l_id":"e1a17184-4daa-45b6-a4db-ae06a9f042b7","hostname":"gr-central-prod-backend06","id":"514bfcb6-f987-4ce9-9867-86522b6a86cd","name":"gr-central-prod-backend06","type":"metricbeat","version":"7.13.1"},"ecs":{"version":"1.9.0"},"event":{"dataset":"system.diskio","duration":610911
,"module":"system"},"fields":{"env":"production"},"host":{"disk":{"read.bytes":0,"write.bytes":388341760},"name":"gr-central-prod-backend06"},"metricset":{"name":"diskio","period":30000},"service":{"type":"system"},"tags":["backend"]}, Private:interface {}(nil), TimeSeries:true},
Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_argument_exception","reason":"Limit of total fields [1000] has been exceeded while adding new fields [2]"}}
2022-02-14T15:14:33.811Z        WARN    [elasticsearch] elasticsearch/client.go:408     Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xc07aba566444ac11, ext:123113254007, loc:(*time.Location)(0x55f6ac9f4ee0)}, Meta:null, Fields:{"agent":{"ephemera
l_id":"e1a17184-4daa-45b6-a4db-ae06a9f042b7","hostname":"gr-central-prod-backend06","id":"514bfcb6-f987-4ce9-9867-86522b6a86cd","name":"gr-central-prod-backend06","type":"metricbeat","version":"7.13.1"},"ecs":{"version":"1.9.0"},"event":{"dataset":"system.network","duration":23067
50,"module":"system"},"fields":{"env":"production"},"host":{"name":"gr-central-prod-backend06","network":{"in":{"bytes":10503604304,"packets":8045067},"out":{"bytes":10188621484,"packets":6522114}}},"metricset":{"name":"network","period":30000},"service":{"type":"system"},"tags":[
"backend"]}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_argument_exception","reason":"Limit of total fields [1000
] has been exceeded while adding new fields [2]"}}

We tried increasing index.mapping.total_fields.limit (e.g. to 2000), but this did not help.
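For reference, a sketch of how such a change can be applied to the existing index via the Dev Tools console (the index name is taken from the error output below; the exact request we used may have differed):

```
PUT metricbeat-kube-7.13.1-2022.02.14-000001/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```

Note that this only affects the existing index; the index template must also carry the setting, or the next rollover index reverts to the default limit of 1000.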

In the end, we had to revert to version 7.2...

We would appreciate your assistance.

Hello @cgnusr01 ,

What's the actual error message after increasing the value?

Is metricbeat set up to write to the default index?
Did you update the mappings at index level or at index template level?

Could you detail the setup you have?

The documents in the error seem to come from the system module rather than from the kubernetes.pod metricset.

Hello @Andrea_Spacca

What's the actual error message after increasing the value?

The actual error message in this case was:

{
  "took": 491,
  "timed_out": false,
  "_shards": {
    "total": 12,
    "successful": 11,
    "skipped": 11,
    "failed": 1,
    "failures": [
      {
        "shard": 0,
        "index": "metricbeat-kube-7.13.1-2022.02.14-000001",
        "node": "ti0MftEaQk2lV0VMglBfTA",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:100)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:28)",
            "doc['kubernetes.replicaset.replicas.desired'].empty ? false:\n    (doc['kubernetes.replicaset.replicas.ready'].empty? false: doc['kubernetes.replicaset.replicas.desired'].value!=doc['kubernetes.replicaset.replicas.ready'].value)\n    \n ",
            "    ^---- HERE"
          ],
          "script": "doc['kubernetes.replicaset.replicas.desired'].empty ? false: ...",
          "lang": "painless",
          "position": {
            "offset": 4,
            "start": 0,
            "end": 234
          },
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "No field found for [kubernetes.replicaset.replicas.desired] in mapping"
          }
        }
      }
    ]
  },
  "hits": {
    "max_score": null,
    "hits": []
  }
}

Is metricbeat set up to write to the default index?

Yes

Did you update the mappings at index level or at index template level?

No, we did not interfere with these at all.

The Kubernetes ConfigMap holding the Metricbeat properties is this:

    metricbeat.config.modules:
      path: ${path.config}/modules.d/*.yml
      reload.enabled: false
    processors:
      - add_cloud_metadata:
      - if:
          or:
            - equals.system.network.name: "ens3f2"
            - equals.system.network.name: "ens3f3"
            - equals.system.network.name: "bond1.741"
        then:
          - add_fields:
              fields:
                vlan: "741"
        else:
          - if:
              or:
                - equals.system.network.name: "ens3f4"
                - equals.system.network.name: "ens3f5"
                - equals.system.network.name: "bond2.751"
            then:
              - add_fields:
                  fields:
                    vlan: "751"
            else:
              - drop_event:
                  when:
                    has_fields: ['system.network.name']
      - drop_event:
          when:
            regexp:
              system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|hostfs|run)($|/)'
    output.elasticsearch:
      hosts: ["172.28.162.21:9200","172.28.162.22:9200","172.28.162.23:9200"]
      loadbalance: true
      protocol: "https"
      username: "${ES_USERNAME}"
      password: "${ES_PWD}"
      ssl:
       certificate_authorities: ["/etc/elasticsearch-ca.pem"]
       verification_mode: "none"
  
    setup.template:
      name: 'metricbeat-kube-%{[agent.version]}'
      pattern: 'metricbeat-kube-%{[agent.version]}*'
      enabled: false
      settings:
        index.number_of_shards: 1
        index.number_of_replicas: 1
        index.codec: best_compression
 
    setup.ilm.enabled: true
    setup.ilm.policy_name: 'metricbeat-kube-%{[agent.version]}'
    #rollover_alias does not support variables (agent.version) due to https://github.com/elastic/beats/issues/12233 so we set it explicitly
    setup.ilm.rollover_alias: 'metricbeat-kube-7.13.1'
    tags: ["backend"]
    fields:
      env: ${ENVIRONMENT}
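As an aside, the nested if/else processors above encode a small decision tree: interfaces on one bond get tagged with VLAN 741, interfaces on the other with VLAN 751, and events for any other named interface are dropped. A Python sketch of that logic, for illustration only (the function and event structure are our own, not Metricbeat code):

```python
# Illustration of the decision logic encoded by the nested `if`
# processors in the ConfigMap above.

VLAN_741_IFACES = {"ens3f2", "ens3f3", "bond1.741"}
VLAN_751_IFACES = {"ens3f4", "ens3f5", "bond2.751"}


def process(event):
    """Return the (possibly tagged) event, or None if it is dropped."""
    name = event.get("system", {}).get("network", {}).get("name")
    if name in VLAN_741_IFACES:
        # add_fields puts custom fields under "fields" by default
        event.setdefault("fields", {})["vlan"] = "741"
    elif name in VLAN_751_IFACES:
        event.setdefault("fields", {})["vlan"] = "751"
    elif name is not None:
        # drop_event: any other named interface is discarded
        return None
    # events without system.network.name pass through untouched
    return event
```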

hi @cgnusr01

this error indicates that the issue with the limit on total fields in the mapping was solved.

It is rather an error in the ingestion pipeline: I cannot find this script source in the beats default pipelines; did you create it?

You can try changing the pipeline code to use the null-safe operator (Operators: Reference | Painless Scripting Language [7.13] | Elastic):
doc.kubernetes?.replicaset?.replicas?.desired?.empty
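Note that in a search-time script `doc` is keyed by the full dotted field name, so another way to guard against an unmapped field is `doc.containsKey(...)`. A sketch of the script from the error rewritten that way (an assumption on our part, not tested against this cluster):

```painless
// Guard with containsKey so that a field absent from the mapping
// does not throw illegal_argument_exception on doc['...'] access.
doc.containsKey('kubernetes.replicaset.replicas.desired')
  && doc.containsKey('kubernetes.replicaset.replicas.ready')
  && !doc['kubernetes.replicaset.replicas.desired'].empty
  && !doc['kubernetes.replicaset.replicas.ready'].empty
  && doc['kubernetes.replicaset.replicas.desired'].value
     != doc['kubernetes.replicaset.replicas.ready'].value
```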

@Andrea_Spacca thanks so much for your prompt response. We haven't created or changed anything in the default pipeline of the Metricbeat Kubernetes module. And when we go back to 7.2, it simply works...

@cgnusr01
did you run metricbeat setup from the new version?

that should update the pipelines
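For reference, the setup step is typically run once against the cluster with the same configuration as the running Beat, along these lines (a sketch; the config path is an example, and the exact flags depend on the 7.x version in use):

```shell
# One-off run to load the index template, ILM policy, and other
# setup assets for the new Metricbeat version.
metricbeat setup --index-management -e -c /etc/metricbeat.yml
```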

@Andrea_Spacca actually we haven't. I see now that setup.template.enabled: false is set in the ConfigMap. We will try it out.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.