Metricbeat reporting all processes as "Sleeping"

Hi

I have installed Metricbeat 6.3.2 together with kube-state-metrics 1.3.1 in our OpenShift cluster, and I can see all the system metrics. The problem is that every process reported under the system.process.state metric shows as "Sleeping", which cannot be right.

Also, I cannot see the Kubernetes metric kubernetes.node.status.ready under state_node.

We are running OpenShift 3.7.

I'm starting to think it may be related to the versions we are using.

Any idea why this could be happening?

Many thanks

Hi @gonzalomk,

Could you please provide the steps you used to deploy Metricbeat? I would also like to know the version of your kube-state-metrics instance.

Best regards

Hi

We are running kube-state-metrics version 1.3.1. We followed the official documentation to deploy Metricbeat. Here are the config files:

Daemonset config

kubernetes.yml:

- module: kubernetes
  fields:
    clusterName: OCP1
  metricsets:
    - node
    - system
    - pod
    - container
    - volume
  period: 20s
  hosts: ["http://localhost:10255"]

system.yml:

- module: system
  fields:
    clusterName: OCP1
  period: 20s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - core
    - diskio
    - socket
  processes: ['.*']
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory

- module: system
  fields:
    clusterName: OCP1
  period: 20s
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'

Deployment config

kubernetes.yml:

- module: kubernetes
  fields:
    clusterName: OCP1
  enabled: true
  metricsets:
    - state_node
    - state_deployment
    - state_replicaset
    - state_pod
    - state_container
  period: 20s
  hosts: ["kube-state-metrics:8080"]
  add_metadata: true
  in_cluster: true

- module: kubernetes
  fields:
    clusterName: OCP1
  enabled: true
  metricsets:
    - event

- module: kubernetes
  fields:
    clusterName: OCP1
  enabled: true
  metricsets:
    - apiserver
  hosts: ["https://kubernetes.default.svc.cluster.local:443"]

metricbeat config:

metricbeat.yml:

metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true
  reload.period: 20s

output.kafka:
  enabled: true
  hosts: ["kafka.******.*************:****"]
  topic: 'DEV_PAASUK_PLATFORM_MET'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
  version: '0.10.2'
logging.level: debug
logging.json: true

After upgrading Metricbeat to version 3.4.0, we are able to see 'kubernetes.node.status.ready'.

Regarding the process metric problem, where we only seemed to see processes in the "Sleeping" state: we found that when we filter on the 'system.process.state' metric, some processes do show up in the running state. The problem is that we only see a few of them, far fewer than we would expect. Any idea what causes this behavior?

Thanks
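
One thing that may be worth checking is the process.include_top_n block in system.yml above (by_cpu: 5, by_memory: 5). As far as I understand it, Metricbeat then reports only the union of the top processes by CPU and the top processes by memory, so most running processes would never be shipped. The sketch below is a rough Python illustration of that kind of selection, not Metricbeat's actual code. The sample data, the field names (cpu_pct, rss_bytes) and the function names are all made up for the example.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical process record; the keys only loosely mirror Metricbeat's
# system.process.* fields and are assumptions made for this illustration.
Process = Dict[str, object]


def select_top_n(processes: List[Process], by_cpu: int, by_memory: int) -> List[Process]:
    """Return the union of the top `by_cpu` processes by CPU usage and the
    top `by_memory` processes by resident memory (deduplicated by pid)."""
    top_cpu = sorted(processes, key=lambda p: p["cpu_pct"], reverse=True)[:by_cpu]
    top_mem = sorted(processes, key=lambda p: p["rss_bytes"], reverse=True)[:by_memory]
    selected: Dict[object, Process] = {}
    for proc in top_cpu + top_mem:
        selected[proc["pid"]] = proc
    return list(selected.values())


def count_states(processes: List[Process]) -> Counter:
    """Count how many processes are in each state."""
    return Counter(p["state"] for p in processes)


def make_sample() -> List[Process]:
    """Build an invented host: many idle processes, a few large sleeping
    ones, and ten running processes with modest memory use."""
    procs: List[Process] = []
    for pid in range(1, 41):   # 40 small, idle, sleeping processes
        procs.append({"pid": pid, "state": "sleeping",
                      "cpu_pct": 0.0, "rss_bytes": 10_000_000 + pid})
    for pid in range(41, 46):  # 5 large sleeping processes
        procs.append({"pid": pid, "state": "sleeping",
                      "cpu_pct": 0.01, "rss_bytes": 2_000_000_000 + pid})
    for pid in range(46, 56):  # 10 running processes
        procs.append({"pid": pid, "state": "running",
                      "cpu_pct": 0.1 * (pid - 45), "rss_bytes": 50_000_000})
    return procs


if __name__ == "__main__":
    everything = make_sample()
    reported = select_top_n(everything, by_cpu=5, by_memory=5)
    print("all processes:     ", dict(count_states(everything)))
    print("reported processes:", dict(count_states(reported)))
```

In this invented sample only 10 of the 55 processes survive the selection, and half of the running ones are dropped. If the same thing is happening here, raising or removing the by_cpu and by_memory limits should surface more processes.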

Is there any chance you can dump the /metrics endpoint of kube-state-metrics? Metricbeat reads from there to fill this Sleeping state.

Best regards
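
In case it helps with producing that dump, here is a minimal Python sketch that fetches the endpoint and keeps only the node status samples. It assumes the endpoint is served as plain text at http://kube-state-metrics:8080/metrics (host and port taken from the deployment config above) and that the relevant metric names start with kube_node_status. Both the prefix and the function names are assumptions for illustration.

```python
import sys
import urllib.request
from typing import List


def fetch_metrics(url: str, timeout: float = 10.0) -> str:
    """Download the plain-text metrics page from the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def filter_samples(text: str, prefix: str) -> List[str]:
    """Return the sample lines whose metric name starts with `prefix`.

    Comment lines (# HELP, # TYPE) begin with '#', so they are skipped
    as long as the prefix itself does not start with '#'.
    """
    return [line for line in text.splitlines() if line.startswith(prefix)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python dump_ksm.py http://kube-state-metrics:8080/metrics
    # The prefix below is an assumption; adjust it to the metric of interest.
    for line in filter_samples(fetch_metrics(sys.argv[1]), "kube_node_status"):
        print(line)
```

The script has to run somewhere that can resolve the kube-state-metrics service name, for example from a pod in the same namespace.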
