APM trace view in Kibana displays the traced services as external services when they are called via the API gateway

Kibana version: 8.6.2

Elasticsearch version: 8.6.2

APM Server version: 8.6.2

APM Agent language and version: Java, 1.30+

Browser version: Chrome 126.0.6478.115

Hello everyone,

We use APM in our project to monitor our services in operation.
While analysing the services, we noticed in the APM trace view that the target service ('Service B') is displayed as an external component when the services communicate via the API gateway.

In these cases, the call always follows this pattern: Service A --> API Gateway --> Service B
Distributed tracing is enabled in both Service A and Service B. The API gateway is not traced by APM and is an external system from our project's point of view.

The trace correctly displays the calls of the various services in sequence.

However, the type shown for Service B is 'http'. I would expect Service B to be recognised here as a service, as is the case with Service A.

What criteria are used to set the service type in the trace? Shouldn't the call to 'Service B' via the API gateway be recognised as an internal call, since 'Service A' and 'Service B' are monitored in the same system via the APM agent?

Thank you for your feedback!

Best regards,
Thilo

It should be the service.name field in the span. If that is absent, it has to fall back to the type of request.

I think the spans you marked with ServiceA/... are client spans. The fact that they don't have corresponding Service B transaction spans as children means distributed tracing is likely not correctly working in your setup.

My guess here is that your API gateway does not preserve the W3C trace context headers, making it impossible for Elastic APM to correlate the transactions from Service B with the ones from Service A.

Hello Jonas,

Thank you for your feedback. Based on your reply, I realised that my image contained an error.

Please find attached a corrected version of the image. This image shows that the transfer of header information via the API gateway works in principle – the spans of Service B are displayed within the transaction.

Header Information Service A:

http.request.headers.Traceparent:  00-4caa3c1fcd95dc020e8e19a9c9ef80e1-f966b2eca38df91a-01
http.request.headers.Tracestate: es=s:1
http.request.headers.X-B3-Parentspanid: f966b2eca38df91a
http.request.headers.X-B3-Sampled: 1
http.request.headers.X-B3-Spanid: b142ed5182867fc5
http.request.headers.X-B3-Traceid: 4caa3c1fcd95dc020e8e19a9c9ef80e1

Header Information Service B:

http.request.headers.Traceparent: 00-4caa3c1fcd95dc020e8e19a9c9ef80e1-931e915089f36b75-01
http.request.headers.Tracestate: es=s:1
http.request.headers.X-B3-Parentspanid: 931e915089f36b75
http.request.headers.X-B3-Sampled: 1
http.request.headers.X-B3-Spanid: 2f8c704820fcbd2f
http.request.headers.X-B3-Traceid: 4caa3c1fcd95dc020e8e19a9c9ef80e1
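
To make the two Traceparent values above easier to compare, here is a minimal Python sketch that splits them into their W3C fields (version, trace-id, parent-id, flags). The helper name parse_traceparent is my own and not part of any Elastic or Kong API, and the parsing only covers the version 00 layout seen in these headers.

```python
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(value: str) -> TraceParent:
    """Split a version-00 W3C traceparent value into its four fields.

    Hypothetical helper for illustration only; it checks field lengths
    but does not perform full spec validation.
    """
    parts = value.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dash-separated fields, got {len(parts)}")
    version, trace_id, parent_id, flags = parts
    if (len(version), len(trace_id), len(parent_id), len(flags)) != (2, 32, 16, 2):
        raise ValueError("unexpected field length in traceparent value")
    return TraceParent(version, trace_id, parent_id, flags)


# Header values copied from the request headers listed above.
service_a = parse_traceparent("00-4caa3c1fcd95dc020e8e19a9c9ef80e1-f966b2eca38df91a-01")
service_b = parse_traceparent("00-4caa3c1fcd95dc020e8e19a9c9ef80e1-931e915089f36b75-01")

# Both requests belong to the same trace, but name different parents.
same_trace = service_a.trace_id == service_b.trace_id
same_parent = service_a.parent_id == service_b.parent_id
print(f"same trace-id: {same_trace}, same parent-id: {same_parent}")
```

Both values carry the same trace-id, so the trace itself is propagated through the gateway; only the parent-id field differs between the two requests.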

However, the information that Service B is also monitored by the same APM server is obviously lost.
If the call between Service A and Service B is made without using an API gateway, Service B is displayed as the type in the Kibana trace view.

I am confused about why the display in Kibana differs: the only difference between the two cases is the URL used to call the service.

Do you know where those X-B3 headers come from? The Elastic APM Agent does not set those.

It looks like some other tracing library may be interfering and creating a span in between Service A and Service B.
As a result, the parent of the Service B span is not known to Elastic, causing it to not be rendered in the waterfall. Maybe your API gateway has some built-in tracing causing this behaviour?
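
To illustrate what I mean by a parent that is "not known", here is a rough Python sketch. It is not how Kibana or APM Server actually builds the waterfall; the function find_orphans and all IDs in the sample data are made up for the example. The idea is simply that a span whose parent ID was never reported to the backend cannot be attached to the tree.

```python
def find_orphans(spans):
    """Return spans whose parent_id does not match any reported span id.

    Hypothetical helper: 'spans' is a list of dicts with 'id' and
    'parent_id' keys, where parent_id is None for the trace root.
    """
    known_ids = {span["id"] for span in spans}
    return [
        span
        for span in spans
        if span["parent_id"] is not None and span["parent_id"] not in known_ids
    ]


# Made-up sample: the gateway creates its own span ("gateway-span")
# but never reports it, so Service B points at an unknown parent.
spans = [
    {"id": "a-transaction", "parent_id": None, "name": "Service A transaction"},
    {"id": "a-exit", "parent_id": "a-transaction", "name": "Service A outgoing HTTP span"},
    {"id": "b-transaction", "parent_id": "gateway-span", "name": "Service B transaction"},
]

orphans = find_orphans(spans)
for span in orphans:
    print(f"{span['name']}: parent {span['parent_id']} was never reported")
```

In this sample only the Service B transaction is reported as an orphan, because the span it names as parent exists only inside the untraced gateway.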

This header information is inserted into the request by the API gateway (Kong with Zipkin for tracing).
The two span.id values in the X-B3 fields are not present in our ELK.
This means that your explanation is correct: there is a span that is not part of the APM trace.

I would expect APM to ignore the X-B3 fields for tracing, since the W3C-compliant header information is also available.

I also found this issue about it.

For me, this means that the solution must be implemented in the API gateway: it should use the traceparent / tracestate headers.

Thank you for your support!