Hi @emilioalvap
1 - The memory leak is observed even without additional monitors.
2 - The current autodiscover configuration has been modified to track 6 pods:
heartbeat.autodiscover:
  providers:
    - type: kubernetes
      resource: pod
      scope: cluster
      node: ${NODE_NAME}
      include_annotations: ["openshift.io.deployment-config.name"]
      templates:
        - condition:
            contains:
              kubernetes.annotations.openshift.io/deployment-config.name: "project-event-prod"
          config:
            - type: tcp
              id: "${data.kubernetes.container.id}"
              service.name: "project-events"
              name: "[POD][TCP Check] Project-event-prod"
              hosts: ["${data.host}:8082"]
              schedule: "@every 5s"
              timeout: 1s
              tags: ["${data.kubernetes.namespace}","${data.kubernetes.pod.name}","${data.kubernetes.container.name}"]
3 - The Heartbeat pods never reached the CPU limit; at peak they consumed no more than 15-20% of the available CPU limit:
resources:
  limits:
    cpu: 2000m
    memory: 1536Mi
  requests:
    cpu: 100m
    memory: 128Mi
4 - The total number of pods in the OpenShift cluster is 4,500 (across 66 nodes). No more than 20-30 new pods are deployed per minute, and the pods that Heartbeat tracks are updated very rarely.
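If heap profiles would help narrow down where the memory is going, the libbeat HTTP endpoint can expose pprof data from a running Heartbeat pod. A minimal sketch of the extra heartbeat.yml settings is shown below; this assumes the http.pprof.enabled option available in recent Beats versions, and the host/port values are only examples:

# Sketch: expose the Beats HTTP endpoint with pprof for heap profiling
http.enabled: true
http.host: localhost        # example value; keep it local unless the port is protected
http.port: 5066             # example port (libbeat's default stats port)
http.pprof.enabled: true    # assumption: pprof support is present in the deployed Beats version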