Distributed Tracing: question about how to implement it

Hello,

I can't seem to get Distributed Tracing to work. I'm starting to wonder if I don't understand Distributed Tracing correctly.

In my frontend I am creating a Transaction and a few spans to cover some JS functions. One of the JS functions makes a request to a backend Python service. I am sending the traceID from the Transaction created in the frontend as part of the request. The Python service is calling begin_transaction and setting trace_parent to be the traceID passed in. Backend code calls a couple methods with spans, finally ending its transaction with a response back to the frontend. The frontend then ends its transaction.

Client Transaction AAA (traceId of XXX) -> Span BBB -> Ajax (traceId of XXX) -> Python begin_transaction CCC (using traceId of XXX) -> Python Span DDD -> End Span DDD -> End Transaction CCC -> End Span BBB -> End Transaction AAA

Hopefully that makes sense, but that is a rough chain of the calls being done and what I would expect from a distributed trace in Kibana.

Am I doing something wrong? Is there a good tutorial someone has that I should check out?

Thanks for the help!

EDIT:
I should mention I see each of these individual transactions and spans in Kibana... I just don't see a single distributed trace.

Hi @Ahchoo, welcome to the forum!

It sounds a bit like you're explicitly sending the trace ID from the frontend to the backend -- is that correct? The RUM JS agent automatically instruments AJAX requests, and injects trace context in the outgoing request headers, so you shouldn't need to do that.

Note that if your frontend and backend are served from different hosts, you'll need to set up CORS to allow the distributed tracing headers to be propagated. See https://www.elastic.co/guide/en/apm/agent/rum-js/current/distributed-tracing-guide.html for some guidance around this.
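As a rough illustration of the backend half of that CORS setup, here is a minimal sketch of a plain WSGI wrapper that lets cross-origin requests carry the tracing headers. The names allow_tracing_headers, TRACING_HEADERS and frontend_origin are hypothetical, and the header list is an assumption: older agents send elastic-apm-traceparent, while newer ones may also send the W3C traceparent and tracestate headers. The RUM agent additionally has to be told about the backend origin through its distributedTracingOrigins option, as the linked guide describes.

```python
# Hypothetical helper names; this is a sketch, not part of any Elastic agent.
# Assumed header list: adjust it to whatever your RUM agent version sends.
TRACING_HEADERS = "elastic-apm-traceparent, traceparent, tracestate"


def allow_tracing_headers(app, frontend_origin):
    """Wrap a WSGI app so cross-origin requests may carry tracing headers."""
    cors = [
        ("Access-Control-Allow-Origin", frontend_origin),
        ("Access-Control-Allow-Headers", TRACING_HEADERS),
    ]

    def wrapped(environ, start_response):
        if environ.get("REQUEST_METHOD") == "OPTIONS":
            # Answer the CORS preflight directly, without calling the app.
            start_response("204 No Content", cors + [("Content-Length", "0")])
            return [b""]

        def start_with_cors(status, headers, exc_info=None):
            # Append the CORS headers to whatever the app itself returns.
            return start_response(status, list(headers) + cors, exc_info)

        return app(environ, start_with_cors)

    return wrapped
```

A Bottle application object is itself a WSGI callable, so it could in principle be wrapped this way before being handed to the server. A real deployment may also need Access-Control-Allow-Methods, depending on the requests the frontend makes.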

Finally, how are you instrumenting the Python backend service? What framework are you using? If it's Django or Flask, the agent provides out of the box support for instrumenting them; see https://www.elastic.co/guide/en/apm/agent/python/current/index.html. If you use those, then the provided instrumentation will take care of extracting the distributed tracing headers.

Hey axw,

Yes, you are correct! I was explicitly sending the trace ID. I did not realize that there was a Parent Trace ID put into the outgoing AJAX request headers. I have made some changes to my code to use 'HTTP_ELASTIC_APM_TRACEPARENT', but still not sure I am doing everything right.

I am not using Flask or Django, we are using Bottle. So I changed my code to call begin_transaction('request', bottle.request.headers.environ.get('HTTP_ELASTIC_APM_TRACEPARENT')) before every route method is called and after the route method is finished I am calling end_transaction(bottle.request.path).
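For anyone wondering where the 'HTTP_ELASTIC_APM_TRACEPARENT' key comes from: WSGI servers store each request header in the environ under an upper-cased name with dashes turned into underscores and an HTTP_ prefix. A minimal sketch of that mapping follows; wsgi_environ_key is a hypothetical helper name, not a Bottle or agent function.

```python
def wsgi_environ_key(header_name):
    """Return the WSGI environ key under which a request header is stored."""
    key = header_name.upper().replace("-", "_")
    # Content-Type and Content-Length are the two headers kept without the prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


print(wsgi_environ_key("Elastic-Apm-Traceparent"))  # HTTP_ELASTIC_APM_TRACEPARENT
```

So looking up the header by its HTTP name and looking up the environ key directly should give the same string.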

I'm still not seeing a single distributed trace. I can see the frontend trace from the button click and several AJAX calls to the backend. I also see each of the backend calls as separate traces, but I don't see a single distributed trace. I would have expected it to be a part of the frontend trace, or somewhere else in Kibana, but I am not seeing it.

To continue a distributed trace with the Python agent, you'll want to use the trace_parent kwarg in the begin_transaction call. That value should be an elasticapm.utils.disttracing.TraceParent, so using your example:

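import elasticapm.utils.disttracing
header = bottle.request.headers.get('Elastic-Apm-Traceparent')  # None when the request carries no trace context
trace_parent = elasticapm.utils.disttracing.TraceParent.from_string(header) if header else None
begin_transaction('request', trace_parent=trace_parent)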
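To show why the raw header string cannot be passed straight through, here is a minimal standalone sketch of what the header value contains. It assumes the value follows the W3C-style layout version-traceid-parentid-flags; parse_traceparent and ParsedTraceparent are hypothetical names used only for illustration. This is not the agent's own parser, and real code should keep using TraceParent.from_string.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: 2 hex chars, 32 hex chars, 16 hex chars, 2 hex chars.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class ParsedTraceparent(NamedTuple):
    version: str
    trace_id: str   # shared by every transaction and span in the trace
    parent_id: str  # id of the frontend span that made the request
    sampled: bool   # lowest bit of the flags field


def parse_traceparent(value: Optional[str]) -> Optional[ParsedTraceparent]:
    """Split a traceparent header value into its fields, or return None."""
    if not value:
        return None
    match = _TRACEPARENT.match(value.strip().lower())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    return ParsedTraceparent(version, trace_id, parent_id, bool(int(flags, 16) & 1))


header = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
print(parse_traceparent(header).trace_id)  # 0af7651916cd43dd8448eb211c80319c
```

The point of the sketch is that the header carries both the trace ID and the ID of the calling span, and the backend transaction needs both to be attached under the right parent in the same trace. The sketch skips some checks a full parser would make, such as rejecting all-zero IDs.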

Thank you @gil and @axw I now seem to have Distributed tracing working at least for my initial page-load request.

I'm not seeing any distributed tracing on any custom frontend transactions I start. I'm trying to record from a button click (where I start a custom transaction) all the way through the frontend and backend services.

EDIT:
OK, so I think I might have partially solved my problem. I found some other posts talking about adding { managed: true } to my custom transaction on the frontend. That appears to work; distributed traces are showing up on my custom transactions now.

I now have a new issue though... The managed transaction is ending much sooner than I would like. Is there a way to control this? Or is there a way I can not use the managed flag, but still get the same results, maybe getting and setting the traceparent of a custom transaction? I would prefer to control the start and ending of transactions.

Hi James,

As you already figured out, custom transactions do not support automatic instrumentation (XHR, Fetch, Resources), which means the user would have to start spans and attach them to the transaction manually.

And yes, by setting { managed: true } in the startTransaction options, the transaction is managed automatically, which means it ends as soon as its current tasks (API calls) are done.

  1. However, you can keep the current managed transaction from closing by adding a task and removing that task when you want to end the transaction.
const tr = apm.startTransaction('custom', 'custom', { managed: true })

const id = tr.addTask();
// The call above ensures the transaction cannot end automatically until all scheduled tasks are done.

// Remove the task once you are done instrumenting.
tr.removeTask(id)

tr.end()

This way, you are in full control of the transaction and you also get the benefits of instrumenting other parts of the code.

  2. Since the above solution is not what we would like for a custom transaction, we plan to expose an API in a future version that lets users inject headers for custom transactions/spans, so that DT works as intended. I created an issue for this: https://github.com/elastic/apm-agent-rum-js/issues/468
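To make the task mechanism from the first option easier to picture, here is a toy model of a managed transaction that stays open while at least one task is pending. It is written in Python purely as an illustration; ToyManagedTransaction and its methods are hypothetical and are not the RUM agent's implementation.

```python
class ToyManagedTransaction:
    """Toy model: the transaction stays open while any task is pending."""

    def __init__(self, name):
        self.name = name
        self.ended = False
        self._pending = set()
        self._next_id = 0

    def add_task(self):
        """Register a pending task and return its id."""
        self._next_id += 1
        self._pending.add(self._next_id)
        return self._next_id

    def remove_task(self, task_id):
        """Mark a task as done; end the transaction if nothing is pending."""
        self._pending.discard(task_id)
        if not self._pending:
            self.ended = True


tr = ToyManagedTransaction("custom")
hold = tr.add_task()   # the task you add yourself to hold the transaction open
ajax = tr.add_task()   # stands in for an automatically tracked API call
tr.remove_task(ajax)   # the API call finishes, but the transaction stays open
print(tr.ended)        # False
tr.remove_task(hold)   # you decide the work is done
print(tr.ended)        # True
```

In this model the extra task acts as a hold: the automatically tracked calls can finish without closing the transaction, and it only ends once you remove your own task.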

Thanks for the feedback and detailed info.

Cheers,
Vignesh

Thank you @vigneshshanmugam that worked perfectly!