We originally set up the Java APM agent and configured it with the environment variables above. We are getting transactions and errors as expected. However, moving forward we are only interested in the agent capturing and sending error events; we do not want transactions sent.
My question is: how can we configure the agent to not send transactions to the APM Server?
This is not possible with agent configuration alone, so the simplest option here may be to use an ingest pipeline that drops the traces (or any other document types you are not interested in keeping): Parse data using ingest pipelines | APM User Guide [8.10] | Elastic
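As a minimal sketch, such a pipeline could use a `drop` processor conditioned on the `processor.event` field that APM Server sets on its documents (the pipeline name here is arbitrary, and you would still need to attach it to the relevant data streams):

```json
PUT _ingest/pipeline/drop-apm-traces
{
  "description": "Drop transaction and span documents, keep error documents",
  "processors": [
    {
      "drop": {
        "if": "ctx.processor?.event == 'transaction' || ctx.processor?.event == 'span'"
      }
    }
  ]
}
```

Dropped documents are discarded at ingest time, so they never reach the index, which reduces storage but (as noted below) leaves holes in the APM UI.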
While you can definitely do that and turn the APM product into a fancy "error collector" tool, it will very likely break things in the UI, and you are quite likely not to have a great experience with that.
Could you tell us a bit more about why (and if) your use case requires dropping tracing data? Usually, the more information we get on the application, the better we can understand and monitor it.
For our Flask apps we have ECS-formatted the logs and use Filebeat for INFO-level logs. However, for ERRORs we capture and send them as messages via the APM agent.
With Java we have also formatted the logs using the ECS formatter and included the APM identifiers (trace.id, transaction.id and span.id), so we're not dropping tracing data. We just don't want the agent to handle sending these logs to the APM server, as it is putting strain on our APIs. Filebeat can handle sending the logs with the APM identifiers so they show up in the APM dashboard.
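For reference, this is roughly the shape of that setup: a sketch assuming Logback with the `ecs-logging-java` encoder (the file paths and service name are placeholders; the Java agent's log correlation puts trace.id/transaction.id/span.id into the MDC, and the encoder writes them into the ECS JSON that Filebeat ships):

```xml
<!-- logback.xml: write ECS-formatted JSON logs to a file for Filebeat -->
<configuration>
  <appender name="ECS_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/var/log/app/app.ecs.json</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>/var/log/app/app.ecs.json.%d{yyyy-MM-dd}</fileNamePattern>
    </rollingPolicy>
    <!-- EcsEncoder emits ECS JSON; APM correlation IDs come along via the MDC -->
    <encoder class="co.elastic.logging.logback.EcsEncoder">
      <serviceName>my-service</serviceName>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="ECS_FILE"/>
  </root>
</configuration>
```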
We're looking at how to replicate this with the Java agent by turning off message sending for everything but errors, but we can't seem to get it working!
I think here it's important to remember that agents can capture and send 3 types of "signals":
- logs
- metrics
- traces
In your case you seem mostly interested in logs, but you don't want to capture traces.
Even if the logs contain the IDs, those won't allow rebuilding traces if needed; at best you'll be able to correlate the logs to a given transaction, which is already interesting but won't give you the best experience.
As an alternative, you can also use transaction sampling:
- head-based sampling: the APM agents randomly capture only a fraction of all transactions. You could even set this to zero, but then no transaction is created, and thus no correlation ID is generated. This is usually the simplest strategy to reduce storage and ingest costs. Also, when the monitored application has high traffic, you might not need every transaction captured in detail; having details on only a fraction of them is representative enough to monitor the whole.
- tail-based sampling: the APM agents send all the tracing data, and sampling is applied by apm-server based on transaction name or outcome (with different sampling rates per transaction).
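As a sketch, head-based sampling is a single agent setting, shown here as an environment variable (the `0.1` rate is just an example value):

```
# Capture only 10% of transactions (head-based sampling).
# A value of 0 effectively stops transaction capture, but then no
# trace/transaction IDs are generated for log correlation.
ELASTIC_APM_TRANSACTION_SAMPLE_RATE=0.1
```

The same option can be set as `transaction_sample_rate` in the agent's properties file, and it can be changed at runtime via central configuration.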
I did find and was playing around with transaction sampling, but wondered if there was another way to go about it. Your response is greatly appreciated.