I have set up an Elastic Cloud deployment that includes APM Server and Integrations Server.
Deployment version: 8.9.0
Kibana and Integrations Server: 1 GB RAM, up to 8.4 vCPU
I have integrated the APM agent with 10 applications running on different nodes. All of them report continuously without problems except one, which suddenly stopped sending data to the APM server. In its debug logs I found the error below.
2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Request flush because the request timeout occurred
2023-10-22 10:19:49,788 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.AbstractIntakeApiHandler - Flushing 2925 uncompressed 954 compressed bytes
2023-10-22 10:19:50,460 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:19:50,460 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] WARN co.elastic.apm.agent.report.AbstractIntakeApiHandler - Response body: null
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] INFO co.elastic.apm.agent.report.AbstractIntakeApiHandler - Backing off for 0 seconds (+/-10%)
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.AbstractIntakeApiHandler - Error sending data to APM server: Read timed out, response code is -1
Can anyone please tell me what is wrong? I could not find anything about this error or the response code in the docs.
2023-10-22 10:19:54,790 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.AbstractIntakeApiHandler - Error sending data to APM server: Read timed out, response code is -1
indicates that your application did not receive a response from the APM server. Network connectivity problems are the most likely root cause.
You can enable debug logging on your agent to rule out other possible root causes, such as a bad proxy configuration.
I have already enabled debug logging in my APM agent; the error you quote comes from that debug log. I couldn't find any other error.
After the error in my first message, the logs below repeat many times (maybe 50+):
2023-10-22 10:19:59,368 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://APMserverURL/config/v1/agents without proxy
2023-10-22 10:19:59,368 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Reloading configuration from APM Server https://APMserverURL/config/v1/agents
2023-10-22 10:19:59,493 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Configuration did not change
2023-10-22 10:19:59,493 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Scheduling next remote configuration reload in 30s
2023-10-22 10:20:20,461 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:20:20,461 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
2023-10-22 10:20:29,498 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://APMserverURL/config/v1/agents without proxy
2023-10-22 10:20:29,498 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Reloading configuration from APM Server https://APMserverURL/config/v1/agents
2023-10-22 10:20:29,638 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Configuration did not change
2023-10-22 10:20:29,638 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.configuration.ApmServerConfigurationSource - Scheduling next remote configuration reload in 30s
2023-10-22 10:20:50,463 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
2023-10-22 10:20:50,463 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
and then the logs below:
2023-10-22 10:48:09,615 [elastic-apm-shared] DEBUG co.elastic.apm.agent.report.ApmServerReporter - Could not add JsonWriter {"metricset":{"timestamp":1697932089615000,"tags":{"name":"PS MarkSweep"},"samples":{"jvm.gc.time":{"value":65891.0},"jvm.gc.count":{"value":11.0}}}}
to ring buffer as no slots are available
2023-10-22 10:48:09,615 [elastic-apm-shared] DEBUG co.elastic.apm.agent.report.ApmServerReporter - Could not add JsonWriter {"metricset":{"timestamp":1697932089615000,"tags":{"name":"Compressed Class Space"},"samples":{"jvm.memory.non_heap.pool.committed":{"value":18087936.0},"jvm.memory.non_heap.pool.used":{"value":16583912.0},"jvm.memory.non_heap.pool.max":{"value":1073741824.0}}}}
to ring buffer as no slots are available
The "Could not add ... to ring buffer as no slots are available" log message indicates that the agent's internal queue, which buffers data before it is sent, is full because no data can currently be sent.
It might also be the case that the error you are seeing is caused by an overloaded APM server. Could you try disabling all other APM-agents sending to that server and check whether the error still persists?
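To see when sending stopped and when the buffer started overflowing, it can help to count both messages per minute in the agent log. The following is only a rough sketch: the two search strings and the timestamp layout are taken from the log lines quoted in this thread, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Search strings copied from the log lines quoted in this thread.
SEND_ERROR = "Error sending data to APM server"
BUFFER_FULL = "Could not add JsonWriter"

# Matches the leading "2023-10-22 10:19:54,790 " and captures date plus hour:minute.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} ")


def summarize(lines):
    """Count send errors and dropped events per minute.

    Returns two Counters keyed by "YYYY-MM-DD HH:MM".
    """
    errors, drops = Counter(), Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            # Continuation lines (e.g. "to ring buffer as no slots are
            # available") carry no timestamp and are skipped.
            continue
        minute = match.group(1)
        if SEND_ERROR in line:
            errors[minute] += 1
        elif BUFFER_FULL in line:
            drops[minute] += 1
    return errors, drops
```

Passing an open log file to `summarize` works because a file object iterates line by line. A run of send errors followed some time later by dropped events would match the sequence described above: first the requests fail, then the queue fills up.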
I have also seen a warning message in our logs. Could this be the cause?
2023-10-20 21:14:46,962 [https-jsse-nio-443-exec-119] WARN co.elastic.apm.agent.bci.bytebuddy.ErrorLoggingListener - org.apache.commons.httpclient.HttpMethodDirector uses an unsupported class file version (pre Java 4)) and can't be instrumented. You may try setting the 'instrument_ancient_bytecode' config option to 'true', but notice that it may cause VerificationErrors or other issues.
That warning simply states that org.apache.commons.httpclient.HttpMethodDirector won't be instrumented. It doesn't affect anything else; in particular, communication with the APM server continues regardless of that warning. As Jonas said, the error looks like communication with the APM server stopped, either because of network problems or APM server overload. Another possibility is that something in the app switched the JVM to using a proxy. We have some logging for proxy usage: search the agent DEBUG logs for "proxy".
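If the log is large, a few lines of Python can do that search. This is a minimal sketch, not an official tool: `proxy_lines` and `unexpected_proxy_lines` are hypothetical helper names, and the only proxy-related message assumed here is the "without proxy" line that appears elsewhere in this thread. Anything else that mentions a proxy is simply returned for manual inspection.

```python
def proxy_lines(lines):
    """Return every log line that mentions "proxy", case-insensitively."""
    return [line.rstrip("\n") for line in lines if "proxy" in line.lower()]


def unexpected_proxy_lines(lines):
    """Return proxy-related lines other than the "without proxy" message.

    Assumption: "Opening <url> without proxy" means a direct connection,
    so only the remaining lines need a closer look.
    """
    return [line for line in proxy_lines(lines)
            if "without proxy" not in line.lower()]
```

An empty result from `unexpected_proxy_lines` suggests the agent never logged anything about a proxy other than connecting without one.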
I get only one log line related to proxy, pasted below. How can we tell whether the APM server is overloaded? Is there a limit on the number of services we can integrate? How many services can a single APM server instance with 1 GB RAM handle?
2023-10-23 14:42:38,667 [elastic-apm-remote-config-poller] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening https://apm-server-url/config/v1/agents without proxy
This issue only occurs when we integrate APM with our on-prem applications. We have also deployed some applications in AWS ECS, and integrating APM with them causes no issue. So I think the problem may be the connectivity between the on-prem node and the APM server. Can anyone clarify this?
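One way to test that theory is to run the same small HTTP probe from the on-prem node and from an ECS task and compare the results. The sketch below is an illustration under stated assumptions, not an Elastic tool: `probe` is a made-up helper, the URL you pass is your own APM server address, and the 5 second default timeout is chosen only because the logs above show about five seconds between the flush and the "Read timed out" error.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Send one GET request and return (outcome, elapsed_seconds).

    outcome is "http <status>" if the server answered at all,
    "timeout" if no answer arrived in time, or
    "connection failed: ..." for DNS, TLS or connect errors.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            outcome = f"http {response.status}"
    except urllib.error.HTTPError as err:
        # The server replied with an error status, so the network path works.
        outcome = f"http {err.code}"
    except urllib.error.URLError as err:
        if isinstance(err.reason, socket.timeout):
            outcome = "timeout"
        else:
            outcome = f"connection failed: {err.reason}"
    except socket.timeout:
        # Read timeouts while waiting for the response are raised directly.
        outcome = "timeout"
    except OSError as err:
        outcome = f"connection failed: {err}"
    return outcome, time.monotonic() - start
```

Calling `probe` with the APM server URL a few dozen times from each environment, and comparing how often "timeout" shows up and how long the calls take, would indicate whether the on-prem network path is the slow or lossy one. Any HTTP status, even 401 or 404, means the request reached the server and came back.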