APM agent suddenly stopped sending data to APM server

surya_dadi_dhamarake · October 23, 2023, 6:59am

Hi Team,

I have set up an Elastic Cloud deployment that includes an APM server and an Integrations Server.
Deployment version: 8.9.0
Kibana and Integrations Server: 1 GB RAM, up to 8.4 vCPU

I have integrated the APM agent with 10 applications running on different nodes. All of them run fine continuously except one application, which suddenly stopped sending data to the APM server. In its debug logs I found the error below.

2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Request flush because the request timeout occurred
2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.AbstractIntakeApiHandler - Flushing 2925 uncompressed 954 compressed bytes
2023-10-22 10:19:50,460 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:19:50,460 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] WARN  co.elastic.apm.agent.report.AbstractIntakeApiHandler - Response body: null
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] INFO  co.elastic.apm.agent.report.AbstractIntakeApiHandler - Backing off for 0 seconds (+/-10%)
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.AbstractIntakeApiHandler - Error sending data to APM server: Read timed out, response code is -1

Can anyone please tell me what is wrong? I am not able to find anything related to this error and response code in the docs.

Jonas_Kunz · October 23, 2023, 10:35am

Hi @surya_dadi_dhamarake ,

The log message

2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.AbstractIntakeApiHandler - Error sending data to APM server: Read timed out, response code is -1

indicates that your application did not receive a response from the APM server. The most likely root cause is a network connectivity problem.

You can enable debug logging on your agent to rule out other possible root causes, such as a bad proxy configuration.
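
One detail that can be read straight from the excerpt is how long the agent waited between the "Flushing ..." line and the "Read timed out" error. A minimal Python sketch for that follows; it is not part of the agent, and the helper names `parse_ts` and `seconds_between` are made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the agent log lines quoted in this thread.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], LOG_TS_FORMAT)


def seconds_between(first_line: str, second_line: str) -> float:
    """Elapsed seconds from the first log line to the second."""
    return (parse_ts(second_line) - parse_ts(first_line)).total_seconds()


flush = (
    "2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG "
    "co.elastic.apm.agent.report.AbstractIntakeApiHandler - "
    "Flushing 2925 uncompressed 954 compressed bytes"
)
error = (
    "2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR "
    "co.elastic.apm.agent.report.AbstractIntakeApiHandler - "
    "Error sending data to APM server: Read timed out, response code is -1"
)
gap = seconds_between(flush, error)
print(f"{gap:.3f} s between flush and read timeout")
```

For the two lines quoted above the gap is about 5 seconds. That would be consistent with the agent's server_timeout setting being at 5s, which I believe is its default (an assumption; check your own configuration).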

surya_dadi_dhamarake · October 23, 2023, 2:43pm

Hi @Jonas_Kunz ,

I have already enabled debug logging in my APM agent; the error you quoted comes from those debug logs. I couldn't find any error other than this one.

After the error log from my first message, I see multiple occurrences (maybe 50+) of the logs below:

2023-10-22 10:19:59,368 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://APMserverURL/config/v1/agents without proxy
2023-10-22 10:19:59,368 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Reloading configuration from APM Server https://APMserverURL/config/v1/agents
2023-10-22 10:19:59,493 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Configuration did not change
2023-10-22 10:19:59,493 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Scheduling next remote configuration reload in 30s
2023-10-22 10:20:20,461 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:20:20,461 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
2023-10-22 10:20:29,498 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://APMserverURL/config/v1/agents without proxy
2023-10-22 10:20:29,498 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Reloading configuration from APM Server https://APMserverURL/config/v1/agents
2023-10-22 10:20:29,638 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Configuration did not change
2023-10-22 10:20:29,638 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Scheduling next remote configuration reload in 30s
2023-10-22 10:20:50,463 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:20:50,463 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload

and then the logs below:

2023-10-22 10:48:09,615 [elastic-apm-shared] DEBUG co.elastic.apm.agent.report.ApmServerReporter - Could not add JsonWriter {"metricset":{"timestamp":1697932089615000,"tags":{"name":"PS MarkSweep"},"samples":{"jvm.gc.time":{"value":65891.0},"jvm.gc.count":{"value":11.0}}}}
 to ring buffer as no slots are available
2023-10-22 10:48:09,615 [elastic-apm-shared] DEBUG co.elastic.apm.agent.report.ApmServerReporter - Could not add JsonWriter {"metricset":{"timestamp":1697932089615000,"tags":{"name":"Compressed Class Space"},"samples":{"jvm.memory.non_heap.pool.committed":{"value":18087936.0},"jvm.memory.non_heap.pool.used":{"value":16583912.0},"jvm.memory.non_heap.pool.max":{"value":1073741824.0}}}}
 to ring buffer as no slots are available

Jonas_Kunz · October 24, 2023, 10:10am

The "Could not add ... to ring buffer as no slots are available" log message indicates that the APM agent's internal queue, which buffers data before it is sent, is filling up because no data can currently be sent.
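
To illustrate the mechanism, here is a toy Python model of a fixed-capacity buffer that rejects new events once it is full. It is only a sketch of the general idea, not the agent's actual ring buffer; the class `BoundedReporter` and its methods are hypothetical.

```python
import queue


class BoundedReporter:
    """Toy stand-in for an agent-side buffer: fixed capacity, drops when full."""

    def __init__(self, capacity: int) -> None:
        self._buffer = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def report(self, event: str) -> bool:
        """Try to enqueue without blocking the caller; count a drop if full."""
        try:
            self._buffer.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self) -> list:
        """Simulate a successful send by emptying the buffer."""
        sent = []
        while True:
            try:
                sent.append(self._buffer.get_nowait())
            except queue.Empty:
                return sent


reporter = BoundedReporter(capacity=2)
# Nothing is drained here, which mimics a server that never answers.
results = [reporter.report(f"metricset-{i}") for i in range(4)]
print(results, "dropped:", reporter.dropped)
```

As long as nothing drains the buffer, every event beyond the capacity is rejected. That is the situation the "no slots are available" message describes: the rejected metricsets are a symptom of the failed sends, not a separate problem.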

It might also be the case that the error you are seeing is caused by an overloaded APM server. Could you try disabling all other APM-agents sending to that server and check whether the error still persists?

surya_dadi_dhamarake · October 24, 2023, 1:21pm

Hi @Jonas_Kunz , I have even tried with only 2 applications, but the issue still persists. Is there anything else we should look at?

Jonas_Kunz · October 26, 2023, 11:07am

That seems strange. Could you provide

the full APM-agent debug logs
the APM server logs

so that we can analyse further? Both log files should cover the same period of time. You can use GitHub gists to upload those logs.

surya_dadi_dhamarake · October 30, 2023, 12:13pm

Hi @Jonas_Kunz ,

I have also seen a warning message in our logs. Could this be the cause of the issue?

2023-10-20 21:14:46,962 [https-jsse-nio-443-exec-119] WARN  co.elastic.apm.agent.bci.bytebuddy.ErrorLoggingListener - org.apache.commons.httpclient.HttpMethodDirector uses an unsupported class file version (pre Java 4)) and can't be instrumented. You may try setting the 'instrument_ancient_bytecode' config option to 'true', but notice that it may cause VerificationErrors or other issues.

Jack_Shirazi · October 31, 2023, 2:27pm

That warning simply states that org.apache.commons.httpclient.HttpMethodDirector won't be instrumented. It doesn't affect anything else; in particular, APM communication continues regardless of that warning. As Jonas said, the error looks like communication with the APM server stopped, either because of network problems or APM server overload. Another possibility is that something in the app switched the JVM to using a proxy. We have some logging for proxy usage: search the agent DEBUG logs for "proxy".
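
Searching the agent DEBUG logs for proxy-related lines could be sketched in Python as below. The only proxy message wording confirmed in this thread is "without proxy", so the sketch makes no assumption about what a with-proxy line looks like and simply flags anything else that mentions a proxy for manual review. The function names are hypothetical.

```python
def proxy_lines(lines):
    """Return every log line that mentions a proxy (case-insensitive)."""
    return [line for line in lines if "proxy" in line.lower()]


def lines_needing_review(lines):
    """Proxy-related lines that do NOT say 'without proxy'."""
    return [
        line for line in proxy_lines(lines)
        if "without proxy" not in line.lower()
    ]


sample = [
    "2023-10-23 14:42:38,667 [elastic-apm-remote-config-poller] DEBUG "
    "co.elastic.apm.agent.util.UrlConnectionUtils - "
    "Opening https://apm-server-url/config/v1/agents without proxy",
    "2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR "
    "co.elastic.apm.agent.report.AbstractIntakeApiHandler - "
    "Error sending data to APM server: Read timed out, response code is -1",
]
print(len(proxy_lines(sample)), "proxy-related line(s)")
print(len(lines_needing_review(sample)), "line(s) needing review")
```

To run it against a real log file, pass the open file object (for example `proxy_lines(open("agent.log"))`, where the file name is a placeholder) instead of the sample list.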

surya_dadi_dhamarake · November 3, 2023, 4:10am

Hi @Jack_Shirazi ,

I find only one log entry related to proxy, pasted below. How can we tell whether the APM server is overloaded? Is there a limit on the number of services we can integrate? How many services can a single APM server instance with 1 GB RAM handle?

2023-10-23 14:42:38,667 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://apm-server-url/config/v1/agents without proxy

Jack_Shirazi · November 3, 2023, 10:45am

That rules out proxy issues. Check the APM server logs and CPU load. The limit is throughput, not the number of services.
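
For a rough agent-side view of throughput, the "Flushing N uncompressed M compressed bytes" DEBUG lines can be summed per agent. The Python sketch below does that. It only measures what one agent tried to send, so it is no substitute for the APM server logs and CPU metrics; the function name `flush_summary` and the second sample line are made up for illustration.

```python
import re
from datetime import datetime

FLUSH_RE = re.compile(r"Flushing (\d+) uncompressed (\d+) compressed bytes")
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def flush_summary(lines):
    """Summarise reporter flushes found in agent DEBUG log lines."""
    timestamps, uncompressed, compressed = [], 0, 0
    for line in lines:
        match = FLUSH_RE.search(line)
        if match is None:
            continue  # not a flush line
        timestamps.append(datetime.strptime(line[:23], TS_FORMAT))
        uncompressed += int(match.group(1))
        compressed += int(match.group(2))
    span = (max(timestamps) - min(timestamps)).total_seconds() if timestamps else 0.0
    return {
        "flushes": len(timestamps),
        "uncompressed_bytes": uncompressed,
        "compressed_bytes": compressed,
        "span_seconds": span,
    }


sample = [
    # Real line from this thread.
    "2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG "
    "co.elastic.apm.agent.report.AbstractIntakeApiHandler - "
    "Flushing 2925 uncompressed 954 compressed bytes",
    # Made-up second line, only to show aggregation over a time window.
    "2023-10-22 10:19:59,788 [elastic-apm-server-reporter] DEBUG "
    "co.elastic.apm.agent.report.AbstractIntakeApiHandler - "
    "Flushing 1000 uncompressed 400 compressed bytes",
]
summary = flush_summary(sample)
print(summary)
```

Dividing the byte totals by span_seconds gives an approximate send rate for one agent. Adding up the rates of all agents gives a first estimate of the load the server has to absorb.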

kusalmed · November 3, 2023, 5:45pm

Could you try disabling all other APM-agents sending to that server and check whether the error still persists?

surya_dadi_dhamarake · November 7, 2023, 10:17am

Hi @Jack_Shirazi ,

That is good to know. How much throughput can a server handle?

surya_dadi_dhamarake · November 29, 2023, 4:53am

This issue only occurs when we integrate APM with our on-prem applications. We have also deployed some applications in AWS ECS, and integrating APM with them causes no issue. So I think the problem might be the connectivity between the on-prem node and the APM server. Can anyone clarify this?
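
One way to test that hypothesis from the on-prem node is a small probe that separates "cannot connect at all" from "connected, but no response within the timeout", the latter being what the agent's "Read timed out" error describes. The Python sketch below uses only the standard library. The function name `probe` and its return labels are hypothetical. It also assumes that a plain GET on the server URL is answered at all; if your server requires authentication you may get an error status, which still proves the round trip works.

```python
import http.client
import socket
from urllib.parse import urlsplit


def probe(url: str, timeout: float = 5.0) -> str:
    """Classify one HTTP(S) round trip to the given URL."""
    parts = urlsplit(url)
    conn_cls = (
        http.client.HTTPSConnection
        if parts.scheme == "https"
        else http.client.HTTPConnection
    )
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        try:
            conn.connect()  # DNS lookup, TCP connect and, for https, TLS handshake
        except OSError:
            return "connect_failed"
        try:
            conn.request("GET", parts.path or "/")
            status = conn.getresponse().status
        except socket.timeout:
            return "read_timeout"  # connected, but no answer in time
        except (OSError, http.client.HTTPException):
            return "request_failed"
        return f"http_{status}"
    finally:
        conn.close()
```

Running `probe("https://apm-server-url/")` (with the placeholder replaced by your real server URL) from both an on-prem node and an ECS task, and comparing the results, would show whether only the on-prem network path times out.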

system · December 27, 2023, 4:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
