We have deployed a Java service to Kubernetes with a heap size of around 8 GB. We also run the Elastic APM agent alongside it, using the command below.
java -javaagent:/home/user/elastic-apm-agent-1.23.0.jar -jar -Delastic.apm.service_name=A -Delastic.apm.server_urls=http://192.168.x.x:8200 -Delastic.apm.environment=Development -Delastic.apm.profiling_inferred_spans_enabled=true -Delastic.apm.enable_log_correlation=true -Delastic.apm.profiling_inferred_spans_excluded_classes=co.elastic.* /home/user/A.war
We are using the latest APM agent version (1.23.0).
With the setup running in Kubernetes, we found that the service pod was restarted frequently under load testing, and the root cause was OOM.
Observation: This does not look like a memory leak, as we could not find a leak pattern in the long-running pod. It looks more like some processes consume a lot of heap when under load.
To debug further, we took heap dumps of the running pod and found that APM was consuming more than 40% of the retained heap. The thread dump also showed APM threads in the BLOCKED state.
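For anyone who wants to cross-check the blocked-thread observation without reading a full thread dump, here is a minimal sketch using the standard `java.lang.management` API. It only sees threads of the JVM it runs in, so it would have to be called from inside the service (for example from a temporary debug hook). The class name `BlockedThreadReport` and the thread-name prefix `elastic-apm` are assumptions made for illustration; the prefix should be adjusted to whatever names appear in the thread dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadReport {

    // Counts live threads that are in BLOCKED state and whose name
    // starts with the given prefix.
    static long countBlocked(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long blocked = 0;
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // thread terminated between the two calls
            }
            if (info.getThreadState() == Thread.State.BLOCKED
                    && info.getThreadName().startsWith(namePrefix)) {
                blocked++;
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        // "elastic-apm" is an assumed prefix, not confirmed by the dump above.
        String prefix = args.length > 0 ? args[0] : "elastic-apm";
        System.out.println("BLOCKED threads with prefix '" + prefix + "': "
                + countBlocked(prefix));
    }
}
```

Sampling this count a few times during the load test would show whether the blocked state is persistent or only momentary.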
I am attaching heap dump snapshots and the relevant details for further debugging.
113,652,168 bytes (35.77 %) of Java heap is used by 3,071 instances of java/util/concurrent/ConcurrentHashMap$Node
co/elastic/apm/agent/shaded/bytebuddy/pool/TypePool$CacheProvider$Simple at 0x7fd1b35a0
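As a rough sanity check on the figures above, the byte count and percentage can be used to estimate how much heap was in use when this dump was taken. This is a back-of-the-envelope sketch that assumes the 35.77 % is relative to the heap in use at dump time; the class and method names are made up for illustration.

```java
public class HeapShare {

    // Estimates the whole from a part and the percentage that part represents.
    static long totalFromShare(long partBytes, double percent) {
        return Math.round(partBytes / (percent / 100.0));
    }

    public static void main(String[] args) {
        long suspectBytes = 113_652_168L; // from the heap dump report above
        double suspectPercent = 35.77;    // from the heap dump report above
        long total = totalFromShare(suspectBytes, suspectPercent);
        System.out.printf("Estimated heap in use at dump time: %d bytes (~%.0f MiB)%n",
                total, total / (1024.0 * 1024.0));
    }
}
```

Under that assumption the estimate comes out at roughly 318 MB, far below the configured 8 GB, so this particular snapshot may not have been taken near the OOM point.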
JVM & OS:
Could you please help me with this? Let me know if you need any more information.