APM agents together with Dynatrace breaks APM visualisation

I have the following situation:

I have distributed tracing enabled on the APM agents and also on Dynatrace. Everything seems to work together except for the waterfall visualization in Kibana. I dug into the problem and found the following:

  • APP A wants to send a request to APP B via an API gateway that has a Dynatrace agent.
  • APP A generates a traceparent header like 00-traceid-parentid001-01.
  • Dynatrace on the APIGW receives the header and changes parentid to parentid002.
  • APP B receives the traceparent header 00-traceid-parentid002-01.
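
The header rewrite in the steps above can be sketched in a few lines of Python. This is only an illustration of the W3C traceparent layout (version-traceid-parentid-flags); the function names are hypothetical and are not part of the Elastic or Dynatrace agents, and the placeholder values are the ones from the list above.

```python
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(header: str) -> TraceParent:
    """Split a traceparent header into its four dash-separated fields."""
    parts = header.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {header!r}")
    return TraceParent(*parts)


def gateway_forward(header: str, gateway_span_id: str) -> str:
    """Hypothetical gateway hop: keep the trace id, replace the parent id
    with the id of the span the gateway's own tracer created."""
    tp = parse_traceparent(header)
    return "-".join(tp._replace(parent_id=gateway_span_id))


incoming = "00-traceid-parentid001-01"               # sent by APP A
outgoing = gateway_forward(incoming, "parentid002")  # seen by APP B
print(outgoing)
```

The trace id survives the hop, but the parent id APP B sees refers to a span that only the gateway's tracer recorded.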

I tried debugging the Kibana source code to see how it works. The problem is that when Dynatrace changes the parentid, Elastic has no knowledge of the new ID, so APM generates spans with the new parentid that have no connection to the previous transaction.

Technically I think that everything works as expected and there is nothing wrong with the APM agent or with Dynatrace.

Does anybody have a clue how to keep both agents working together without breaking the tracing context?

Hi @lancer_enkor ,

The issue here is that the transactions/spans captured by Dynatrace are not known to Elastic APM, which breaks the end-to-end trace. The transaction on App B has a parent ID that is unknown to Elastic, and thus can't be linked back to the transactions on App A and APIGW.

One possible work-around would be to make Elastic and Dynatrace use different headers so they would not interfere with each other.

You can configure the Elastic APM Java agent to use another header through its configuration. You also have to use a recent version of the agent to ensure the legacy header takes priority over the standard W3C header.

It seems like a reasonable compromise. I will try it.

Maybe some UI folks can chime in here as well. @sqren maybe?

It seems like it would be feasible to show the spans from both Service A and Service B in the same waterfall, even if there are some missing spans in the middle.

Agreed, that sounds like a good enhancement, although I don't know if it's possible. We have collapsing functionality to indicate parent/child relationships, so we need to connect events according to those relationships.

Example:
Span A -> Span B -> Span C (where A is parent of B and B is parent of C)

If Span B is dropped, I'm not sure how we can connect Span A to Span C:

Span A -> Span C.
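
To make the missing-link problem concrete, here is a minimal Python sketch that links events by parent id and reports the ones whose parent is not in the trace. It is not how Kibana builds the waterfall; the event shape (`id`, `parent_id`) and the function name are assumptions made up for this illustration.

```python
from collections import defaultdict


def link_events(events):
    """Return (children_by_parent, orphans) for a list of trace events.

    An orphan is an event whose parent id does not match any event in
    the list, e.g. because the parent was recorded by another vendor.
    """
    known_ids = {e["id"] for e in events}
    children = defaultdict(list)
    orphans = []
    for e in events:
        parent = e.get("parent_id")
        if parent is None:
            continue  # root of the trace
        if parent in known_ids:
            children[parent].append(e["id"])
        else:
            orphans.append(e["id"])
    return dict(children), orphans


events = [
    {"id": "A", "parent_id": None},
    {"id": "C", "parent_id": "B"},  # B is missing from the trace
    {"id": "D", "parent_id": "C"},
]
children, orphans = link_events(events)
print(children, orphans)
```

C still has its own subtree (D), but nothing in the data says that A is its ancestor, so any placement of C under A would be a guess based on something other than parent ids, such as timestamps.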

Either way, I've created an issue to investigate: "[APM] Missing items in the trace waterfall shouldn't break it entirely" (Issue #120464, elastic/kibana on GitHub).

@lancer_enkor To reproduce this problem locally, I'm interested in seeing the full trace that has this problem. As far as I understand, all events in the trace share the same trace.id, but parent.id is wrong for some of them (for the reasons you explained above).

I'd like you to get the full trace in Kibana Dev Tools by running this query:

GET traces-apm*,apm-*/_search
{
  "query": {
    "term": {
      "trace.id": "<YOUR TRACE ID>"
    }
  }
}
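
Once you have the response, a short Python sketch like the one below can list the events whose parent.id does not match any event in the result. It assumes the hits carry nested `span.id`, `transaction.id` and `parent.id` fields in `_source`; the helper names and the sample response are made up for illustration.

```python
def event_id(source):
    """Own id of an event: span.id for spans, transaction.id otherwise."""
    span = source.get("span") or {}
    if "id" in span:
        return span["id"]
    return (source.get("transaction") or {}).get("id")


def unknown_parents(response):
    """Return (event id, parent id) pairs whose parent is not in the hits."""
    sources = [hit["_source"] for hit in response["hits"]["hits"]]
    known = {event_id(s) for s in sources}
    result = []
    for s in sources:
        parent = (s.get("parent") or {}).get("id")
        if parent is not None and parent not in known:
            result.append((event_id(s), parent))
    return result


# Hand-written sample in the shape of a search response (not real data).
sample = {"hits": {"hits": [
    {"_source": {"trace": {"id": "t1"}, "transaction": {"id": "tx-a"}}},
    {"_source": {"trace": {"id": "t1"}, "transaction": {"id": "tx-a"},
                 "span": {"id": "sp-1"}, "parent": {"id": "tx-a"}}},
    {"_source": {"trace": {"id": "t1"}, "transaction": {"id": "tx-b"},
                 "parent": {"id": "gw-9"}}},
]}}
missing = unknown_parents(sample)
print(missing)
```

Note that a search returns only 10 hits by default, so for larger traces a `size` value would need to be added to the query before the check is meaningful.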

You can find the trace.id in the waterfall UI by clicking the "Metadata" tab.
