Prometheus fields no longer available after enabling APM on Java process
We are using:
Kibana 7.10.1
APM Agent 1.24
Browser: Chrome
Steps to reproduce:
We have enabled APM on one of our Java processes:
-javaagent:/opt/mount1/../global_resources/elastic-apm-agent-1.24.0.jar
-Delastic.apm.service_name="Name of Service"
-Delastic.apm.verify_server_cert=false
-Delastic.apm.server_urls=https://{{ apm_server }}:8200
With the above enabled, the APM metrics are sent to the APM server and are available in Kibana in the APM dashboard.
The APM dashboards show metrics correctly for the service in question.
The problem is that some of the Prometheus fields/metrics are no longer available to be scraped into Grafana. The Prometheus exporter that makes the metrics available for scraping did not change in any way.
Once we disable APM by removing the reference to the jar, the Prometheus fields/metrics are available again.
How can the APM jar be updating/changing the metrics made available by the exporter?
MeterRegistry is used for the metrics:
import io.micrometer.core.instrument.MeterRegistry;
This exporter makes the metrics available on the node so that they can be scraped by Prometheus and displayed in Grafana.
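For context, here is a minimal sketch of this kind of exporter, assuming micrometer-registry-prometheus and the JDK's built-in HttpServer; the port, path, and meter name are illustrative rather than our actual ones:

import java.io.OutputStream;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpServer;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

public class MetricsEndpoint {
    public static void main(String[] args) throws Exception {
        // Registry that renders all meters in the Prometheus text exposition format
        PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
        registry.counter("ingester.records").increment(); // illustrative meter

        // Expose the registry over HTTP so Prometheus can scrape this node
        HttpServer server = HttpServer.create(new InetSocketAddress(9091), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = registry.scrape().getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
    }
}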
Some Prometheus fields that were available before enabling APM are no longer there. We have not seen any errors in the application; the Prometheus fields are simply missing.
Please provide some additional input so we have a better starting point for analyzing this:
So some metrics are disappearing and some are not? If yes:
Can you find a common factor between the affected and the unaffected meters? For example: meter names, meter types, or meter tags?
Please provide a couple of examples of affected and non-affected meters (metricsets).
What happens if you enable the agent and disable the collection of an affected metric through the disable_metrics config? Does it restore this metric in Prometheus?
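For example, as a system property (the metric name is a placeholder; disable_metrics takes a comma-separated list of wildcard patterns):
-Delastic.apm.disable_metrics=affected.metric.*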
We have narrowed down the issue a little. We have one scenario with a Java service where the issue does not occur and another scenario where it does occur.
In the scenario where the issue happens, the service jar is launched using the "-cp" parameter. In particular, note the "-cp /etc/hbase/conf:/etc/hadoop/conf:software-ingester.jar" portion of the Java call.
In the other scenario, where there are no missing Prometheus fields, the "-cp" parameter is not used. Should the APM jar behave any differently when the "-cp" parameter is used in the Java call? The two launch styles are sketched below.
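Roughly (the main class name is a placeholder, and we assume the second service is started with -jar):

Issue occurs (agent together with -cp):
java -javaagent:/opt/mount1/../global_resources/elastic-apm-agent-1.24.0.jar -cp /etc/hbase/conf:/etc/hadoop/conf:software-ingester.jar com.example.IngesterMain
No issue (agent without -cp):
java -javaagent:/opt/mount1/../global_resources/elastic-apm-agent-1.24.0.jar -jar other-service.jar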
Should the APM jar behave any differently when the "-cp" parameter is used in the Java call?
It shouldn't, as long as you don't attach the agent jar through -cp, which is not allowed.
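To illustrate (jar and class names are placeholders):

Not supported - agent jar on the classpath:
java -cp elastic-apm-agent-1.24.0.jar:app.jar com.example.Main
Supported - agent attached via -javaagent:
java -javaagent:elastic-apm-agent-1.24.0.jar -cp app.jar com.example.Main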
How do you attach the agent? What configurations do you apply for it?
Please answer my questions above, so we can properly assist.
Providing a debug-level log (see the log_level config option) may be useful as well; just make sure it includes the entire startup of your application. You can share it through a gist.
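For example, as a system property (DEBUG is one of the supported levels):
-Delastic.apm.log_level=DEBUG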