APM Java agent causes my service's memory to fill up and the service crashes


Kibana version: 7.7.0

Elasticsearch version: 7.7.0

APM Server version: 7.7.0

APM Agent language and version: Java agent, 1.16

Browser version: Chrome

APM Server config:

```yaml
  host: ""
    enabled: true
      limit: 1000
      lru_size: 100000
    allow_origins : ['*']
    enabled: "false"
    events: 4096
    flush.min_events: 2048
setup.template.enabled: true
setup.template.name: "apm-%{[observer.version]}"
setup.template.pattern: "apm-%{[observer.version]}-*"
setup.template.fields: "${path.config}/fields.yml"
    number_of_shards: 5
    codec: best_compression
    number_of_routing_shards: 30
    mapping.total_fields.limit: 2000
  hosts: [""]
  compression_level: 1
  username: "elastic"
  password: "461e.com"
  worker: 2
    - index: "apm-%{[observer.version]}-sourcemap"
        processor.event: "sourcemap"
    - index: "apm-%{[observer.version]}-error-%{+yyyy.MM.dd}"
        processor.event: "error"
    - index: "apm-%{[observer.version]}-transaction-%{+yyyy.MM.dd}"
        processor.event: "transaction"
    - index: "apm-%{[observer.version]}-span-%{+yyyy.MM.dd}"
        processor.event: "span"
    - index: "apm-%{[observer.version]}-metric-%{+yyyy.MM.dd}"
        processor.event: "metric"
    - index: "apm-%{[observer.version]}-onboarding-%{+yyyy.MM.dd}"
        processor.event: "onboarding"
  bulk_max_size: 20480
```

We have deployed APM in our production environment. Recently we found a serious problem: under heavy request load, the APM Java agent causes the server's memory and CPU usage to max out. If the agent is not deployed, the problem does not occur. I reproduced it in our development environment by hammering my endpoint with the ab command:

```shell
ab -n 200000 -c 4000 ''
```

During the test, memory usage rises until all of it is consumed, and the CPU is maxed out as well. Eventually the service crashes, and calls to the endpoint become very slow.
I used jmap to dump the heap for analysis:

```shell
jmap -dump:live,format=b,file=/home/1.hprof 1
```
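As a complement to jmap, here is a minimal sketch (the class name `HeapCheck` is just an illustration) of how the JVM's own management API can report heap usage from inside the service, which is handy for logging memory growth during a stress test:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    public static void main(String[] args) {
        // The platform MemoryMXBean exposes the same heap numbers jmap summarizes.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long usedMb = heap.getUsed() / (1024 * 1024);
        long maxMb = heap.getMax() / (1024 * 1024); // -1 if the max is undefined
        System.out.println("heap used: " + usedMb + " MiB, max: " + maxMb + " MiB");
    }
}
```

Calling this periodically (or exposing it on a health endpoint) makes it easy to see whether used heap keeps climbing under load instead of being reclaimed after GC.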

I analyzed the heap dump with MAT (Eclipse Memory Analyzer); the result is shown in the image.

Memory is used by co.elastic.apm.agent.grpc.helper.GrpcHelperImpl

More detailed information

This memory is not released even after the stress test stops; the server has to be restarted.

Thanks for reporting!

This seems to be related to a gRPC-instrumentation memory leak that we are already aware of and looking into.
To verify, please set the disable_instrumentations config option to grpc and check whether the problem still reproduces.
If it does not, please watch this GitHub issue so you get notified once it is resolved.
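For reference, one way to pass that option is as a JVM system property when attaching the agent; this is a sketch with placeholder paths and service name:

```shell
# Disable only the gRPC instrumentation; all other instrumentations stay active.
# The agent jar path and service name below are placeholders for your setup.
java -javaagent:/path/to/elastic-apm-agent-1.16.0.jar \
     -Delastic.apm.disable_instrumentations=grpc \
     -Delastic.apm.service_name=my-service \
     -jar my-service.jar
```

The same option can also be set in `elasticapm.properties` or via the `ELASTIC_APM_DISABLE_INSTRUMENTATIONS` environment variable.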

Yes, I know this is caused by the gRPC calls. But if I set

```
elastic.apm.disable_instrumentations = grpc
```

I can't collect the gRPC information from the service, and collecting my application's gRPC request data is exactly what I want.

That's understood; I wasn't suggesting you drop gRPC support. As I wrote, we are looking into this. I only asked you to verify that there is no observable issue when this instrumentation is off.

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.