I have distributed tracing enabled on the Elastic APM agents and also on Dynatrace. Everything seems to work together except for the waterfall visualization in Kibana. I dug into the problem and found this situation:
APP A wants to send a request to APP B via an API gateway (APIGW) that runs a Dynatrace agent.
APP A generates a traceparent header like 00-traceid-parentid001-01
Dynatrace on the APIGW receives the header and changes the parent ID to parentid002
APP B receives the traceparent header 00-traceid-parentid002-01
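To make the header rewrite concrete, here is a minimal sketch of the steps above. It assumes the W3C `traceparent` layout `version-traceid-parentid-flags`; the function names and the sample IDs are made up for illustration and are not taken from either agent.

```python
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(header: str) -> TraceParent:
    """Split a traceparent header into its four dash-separated fields."""
    parts = header.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {header!r}")
    return TraceParent(*parts)


def format_traceparent(tp: TraceParent) -> str:
    """Join the four fields back into a header value."""
    return "-".join(tp)


def gateway_forward(header: str, gateway_span_id: str) -> str:
    """Hypothetical gateway behaviour: keep the trace ID, replace the parent ID
    with the ID of the span the gateway itself created."""
    tp = parse_traceparent(header)
    return format_traceparent(tp._replace(parent_id=gateway_span_id))


# Sample values (illustrative only).
from_app_a = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
to_app_b = gateway_forward(from_app_a, "00f067aa0ba902b7")

incoming = parse_traceparent(from_app_a)
outgoing = parse_traceparent(to_app_b)
print(incoming.trace_id == outgoing.trace_id)    # trace ID is preserved
print(incoming.parent_id == outgoing.parent_id)  # parent ID is not
```

APP B only ever sees the second header, so the parent ID it records refers to a span that exists in Dynatrace but not in Elastic APM.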
I tried debugging the Kibana source code to see how it works. The problem is that when Dynatrace changes the parent ID, Elastic has no knowledge of the new ID, so the APM agent creates spans under the new parent ID with no connection to the previous transaction.
Technically I think everything works as expected and there is nothing wrong with the APM agent or with Dynatrace.
Does anybody have a clue how to keep both agents working together without breaking the tracing context?
The issue here is that the transactions/spans captured by Dynatrace are not known to Elastic APM which breaks the end-to-end trace. The transaction on App B has a parent ID that is unknown, and thus can't be linked back to the transactions on App A and APIGW.
One possible work-around would be to make Elastic and Dynatrace use different headers so they would not interfere with each other.
You can configure the Elastic APM Java agent to use another header. You also have to make sure to use a recent version of the agent so that the legacy header takes priority over the standard W3C header.
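As a rough illustration of why header priority matters, the sketch below picks a trace context from incoming request headers and prefers the legacy header over the W3C one. It is not the agent's actual code: the function name is hypothetical, and it assumes the legacy header is called `elastic-apm-traceparent` and passes through the gateway untouched.

```python
from typing import Mapping, Optional

# Assumed lookup order: legacy Elastic header first, W3C header second.
HEADER_PRIORITY = ("elastic-apm-traceparent", "traceparent")


def pick_trace_context(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first trace context header found, in priority order."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in HEADER_PRIORITY:
        if name in lowered:
            return lowered[name]
    return None


# The gateway rewrote `traceparent` but left the legacy header alone.
headers_at_app_b = {
    "Elastic-Apm-Traceparent": "00-traceid-parentid001-01",
    "traceparent": "00-traceid-parentid002-01",
}
print(pick_trace_context(headers_at_app_b))
```

With this ordering APP B would continue from parentid001, which Elastic APM knows about, while Dynatrace keeps working from the `traceparent` header it rewrote.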
Maybe some UI folks can chime in here as well. @sqren maybe?
It seems like it would be feasible to show the spans from both Service A and Service B in the same waterfall, even if there are some missing spans in the middle.
Agreed, that sounds like a good enhancement, although I don't know if it's possible. We have collapsing functionality to indicate parent/child relationships, so we need to connect events according to those relationships.
Example:
Span A -> Span B -> Span C (where A is parent of B and B is parent of C)
If Span B is dropped, I'm not sure how we can connect Span A to Span C.
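The difficulty can be sketched with a small parent/child grouping. This is only an illustration of the general idea, not how Kibana builds the waterfall; the event shape (`id`, `parent_id`) and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Event = Dict[str, Optional[str]]


def group_by_parent(events: List[Event]) -> Tuple[List[str], Dict[str, List[str]], List[str]]:
    """Split events into roots, parent -> children links, and orphans.

    An orphan is an event whose parent_id points at an event that is not
    part of the collected trace.
    """
    known_ids = {e["id"] for e in events}
    roots: List[str] = []
    orphans: List[str] = []
    children: Dict[str, List[str]] = defaultdict(list)
    for e in events:
        parent = e.get("parent_id")
        if parent is None:
            roots.append(e["id"])
        elif parent in known_ids:
            children[parent].append(e["id"])
        else:
            orphans.append(e["id"])
    return roots, dict(children), orphans


# Span B was never reported, so only A and C are available.
events = [
    {"id": "A", "parent_id": None},
    {"id": "C", "parent_id": "B"},
]
roots, children, orphans = group_by_parent(events)
print(roots, children, orphans)
```

Span C ends up as an orphan: nothing in the data says it belongs under Span A, so any placement in the waterfall would have to be inferred some other way, for example from the shared trace ID and timestamps.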
@lancer_enkor To reproduce this problem locally, I'd like to see the full trace that has this problem. As far as I understand, all events in the trace share the same trace.id, but parent.id is wrong for some of them (for the reasons you explained above).
I'd like you to get the full trace in Kibana Dev Tools by running this query: