I would like to ask what the failure rate in APM services stands for. I have read the documentation, which suggests that it is based on event.outcome, the status code, or the errors if there is no status code. I tried using an ingest pipeline to set event.outcome=failure, transaction.result=HTTP 5xx, and transaction.result=failure, but none of these increased the failure rate of the transactions. So how is the failure rate actually defined? And is there any way to turn a successful transaction into a failed one (without using the code-editing API)?
The failure rate is calculated from the outcome of each span or transaction (the event.outcome field on the span or transaction document).
The default outcome value depends on whether an error was triggered during invocation, but there are protocol-specific definitions. For example, with HTTP the behavior depends on the side of the request:
when receiving the HTTP request (on the server side), the transaction is a failure only if the status code is 5xx
when emitting the HTTP request (on the client side), the span is a failure if the status code is 4xx or 5xx
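To make the two HTTP rules above concrete, here is a minimal Python sketch. The function names are hypothetical and this is not the agent's actual code. The failure_rate helper also assumes that the rate is failures divided by the number of events with a known outcome (success or failure), which is my understanding of how the APM UI computes it rather than something stated in this thread.

```python
from typing import Iterable, Optional


def server_outcome(status_code: int) -> str:
    """Outcome of a transaction that RECEIVED an HTTP request (server side)."""
    # Only 5xx counts as a failure on the server side; 3xx and 4xx are successes.
    return "failure" if status_code >= 500 else "success"


def client_outcome(status_code: int) -> str:
    """Outcome of a span that EMITTED an HTTP request (client side)."""
    # Both 4xx and 5xx count as failures on the client side.
    return "failure" if status_code >= 400 else "success"


def failure_rate(outcomes: Iterable[str]) -> Optional[float]:
    """Assumed definition: failures / (successes + failures).

    Outcomes other than "success" or "failure" (for example "unknown")
    are ignored. Returns None when there is nothing to count.
    """
    known = [o for o in outcomes if o in ("success", "failure")]
    if not known:
        return None
    return sum(1 for o in known if o == "failure") / len(known)
```

Under these rules a 302 or 404 response seen on the server side is a success, so it does not raise the failure rate even if an error was captured during the transaction.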
I see. And yes, there are some transactions that were captured as 302 (redirect) but have a "1 Error" tag. That raises another question: how can I display this failure rate from the transaction page in a Kibana dashboard?
Hi, sorry for the late reply. Yes, the "1 Error" tag appears in the trace view. I currently don't have access to the environment, so I can't share any screenshots yet. From what I remember, the event.outcome of the transaction is success, and the metadata of the transaction document does not contain any error attribute, so I can't query it as a failure in any way. I would also like to display this error as part of the failure rate, either in a custom dashboard or, even better, in the APM services/transaction failure rate graph.
In your case, it might be that an exception is thrown and captured by the agent in one part of your application (for example, a high-level framework like Spring MVC might use an exception to indicate a missing resource), but this exception is then caught elsewhere in the application, in a low-level framework like Servlets, where it is mapped to the 404 HTTP status code.
When this happens, the transaction may be captured as a success even though an exception was captured while the transaction was active (shown as an error in the UI).
To deal with such cases, you could rely on the related error documents, which have a transaction.id attribute, instead of relying only on the transactions that have event.outcome = failure.
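As a rough illustration of that suggestion, the Python sketch below builds a query body that collects the transaction.id values found on error documents, then computes a failure rate that also counts transactions with a related error. Treat it as a sketch under assumptions: the helper names are hypothetical, the processor.event = error filter and the flattened dotted keys in the sample documents are simplifications, and the index or data stream to search depends on your setup.

```python
from typing import Dict, Iterable, Optional, Set


def error_transaction_ids_query(max_ids: int = 1000) -> Dict:
    """Query body: error documents that carry a transaction.id, grouped by that id.

    Assumes error documents are marked with processor.event = "error".
    """
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"processor.event": "error"}},
                    {"exists": {"field": "transaction.id"}},
                ]
            }
        },
        "aggs": {
            "tx_ids": {"terms": {"field": "transaction.id", "size": max_ids}}
        },
    }


def adjusted_failure_rate(
    transactions: Iterable[Dict[str, str]], error_tx_ids: Set[str]
) -> Optional[float]:
    """Share of transactions that failed OR have a related error document."""
    counted = 0
    failed = 0
    for tx in transactions:
        outcome = tx.get("event.outcome")
        has_error = tx.get("transaction.id") in error_tx_ids
        # Skip transactions with no known outcome and no related error.
        if outcome not in ("success", "failure") and not has_error:
            continue
        counted += 1
        if outcome == "failure" or has_error:
            failed += 1
    return failed / counted if counted else None
```

With four transactions where one has event.outcome = failure and another is a success with a related error document, this gives 0.5, whereas the outcome-only rate would be 0.25. The same idea could back a custom dashboard panel; it does not change what the built-in APM failure rate graph shows.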