User-centric APM support

We have been using Elastic APM and the service map for a few months now, and they have been powerful and helpful. One thing we have been struggling to find, though, is a way to see a user's end-to-end interactions with our application from the APM window, so that we can analyse user experience alongside how our services perform. I understand that this can be partially achieved by filtering by user and date to see the interactions of a single user, but that is not exactly what we are looking for. Is there any way to customise the APM timeline to show a per-user breakdown of service interactions? I understand that the timeline works based on transaction_id. Is it possible to assign the same transaction ID to a user interaction regardless of which service it hits, so that the timeline spans multiple microservices for a single user request?

Imagine something like the following.

User A -> serviceA.createResource -> Kafka topic -> serviceB.consumeResource -> ...

Currently, the timeline ends at the Kafka topic, because that is where the second service takes over.

Hi and thanks for the question.

Did you try using our RUM agent?

If both sender and receiver services are Java and you installed the Java agent on them, then this is not expected - your traces should include both. Is that the case?

Hi, we haven't configured the RUM agent yet. We are planning to use RUM with our React app to see how it can instrument customer actions as well. However, I am not sure that RUM is the missing piece here. It will certainly enhance our APM view by bringing in a few new metrics. To be clear, we can already see users extracted by the APM agents on the backend. We use JWT tokens, and the Elastic APM agent seems to extract the username from the tokens accurately. What we are missing is that transactions seem to be captured independently, so the view feels disconnected.

If both sender and receiver services are Java and you installed the Java agent on them, then this is not expected - your traces should include both. Is that the case?

I can confirm that the transactions are being captured for all services, but they are not linked together in the APM timeline view. In the service map, I can see the relation: one service publishes a message to a Kafka topic and the other service reads it from the same topic. In the timeline view, however, they are disconnected. I am not sure if we are missing any specific configuration here.

The Distributed Tracing feature means that user actions traced by the RUM agent will be correlated with corresponding traces coming from backend agents. It means that each agent will create a transaction with a transaction ID, but since they will share the same trace ID, these transactions will be presented together on the timeline.
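To make the difference between a transaction ID and a trace ID concrete, here is a minimal sketch in plain Java. It only models the idea described above and is not how the agents or Kibana are implemented; the class names (`Tx`, `TimelineSketch`) and the sample IDs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one transaction as reported by one agent.
final class Tx {
    final String service;
    final String transactionId; // unique to this transaction
    final String traceId;       // shared by all transactions of one distributed trace

    Tx(String service, String transactionId, String traceId) {
        this.service = service;
        this.transactionId = transactionId;
        this.traceId = traceId;
    }
}

final class TimelineSketch {
    // Groups the reporting services by trace ID, keeping the input order.
    static Map<String, List<String>> servicesByTrace(List<Tx> txs) {
        Map<String, List<String>> byTrace = new LinkedHashMap<>();
        for (Tx tx : txs) {
            byTrace.computeIfAbsent(tx.traceId, k -> new ArrayList<>()).add(tx.service);
        }
        return byTrace;
    }

    public static void main(String[] args) {
        List<Tx> txs = List.of(
                new Tx("react-app (RUM)", "tx-1", "trace-A"),
                new Tx("serviceA", "tx-2", "trace-A"),
                new Tx("serviceB", "tx-3", "trace-A"),
                new Tx("serviceB", "tx-4", "trace-B"));
        // Every transaction has its own ID, but the three that share
        // "trace-A" end up in the same group, i.e. on the same timeline.
        System.out.println(servicesByTrace(txs));
    }
}
```

The point of the sketch is that there is no need to force the same transaction ID onto every service: the shared trace ID is what ties the transactions together.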

If your Kafka version is 0.11.0 or higher, this should apply to the producer and consumer transactions as well: they should be correlated by sharing the same trace ID, which is sent as a record header.
If you are indeed using version 0.11.0 or higher, I would like to find out why this is not working, so please let me know.

Thanks!

I think I need to give RUM a try and see how it changes the interactions, so no comment on that.

Regarding Kafka though, we are using Kafka client 2.0.4 and Kafka broker 2.4. We are using Spring Cloud Stream, which is based on the Spring Boot, Spring Kafka and Spring Integration frameworks. Is the agent smart enough to keep the same trace_id when the request goes through Kafka? How can it tell that a message produced in one service and consumed in another belongs to the same trace? Does it inject the trace_id into the Kafka message headers? Is there a way I can check whether this works properly, and possibly find out why it is not working as expected?

Exactly.

You can set log_level to DEBUG, run a few requests that involve sending and receiving records, and share the full logs (from startup until after the Kafka client activity has occurred). Maybe they will give us some hints.
In addition, check whether you can find an Elastic header on a record whose send action was traced.
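As a rough sketch of that header check: in a consumer you can read the headers from `ConsumerRecord.headers()` in the Kafka client and copy each `Header.key()` and `Header.value()` into a map. The plain-Java helper below works on such a map, so it runs without a broker. The name fragments it searches for (`traceparent`, `elastic`) and the sample header name are assumptions for illustration, so please verify them against what your agent version actually writes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class HeaderCheck {
    // ASSUMPTION: a trace-context header name contains one of these fragments.
    private static final List<String> FRAGMENTS = List.of("traceparent", "elastic");

    // Returns the header names that look like trace-context headers.
    static List<String> findTraceHeaders(Map<String, byte[]> headers) {
        List<String> found = new ArrayList<>();
        for (String key : headers.keySet()) {
            String lower = key.toLowerCase(Locale.ROOT);
            for (String fragment : FRAGMENTS) {
                if (lower.contains(fragment)) {
                    found.add(key);
                    break; // count each header once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for headers copied out of a consumed record.
        Map<String, byte[]> headers = new LinkedHashMap<>();
        headers.put("contentType", "application/json".getBytes(StandardCharsets.UTF_8));
        headers.put("elasticapm_traceparent", new byte[] {0}); // hypothetical sample header

        List<String> found = findTraceHeaders(headers);
        if (found.isEmpty()) {
            System.out.println("No trace-context header found on this record");
        } else {
            System.out.println("Possible trace-context headers: " + found);
        }
    }
}
```

If no such header shows up on the consumer side, the trace context is probably being lost between the producer and the consumer, and the DEBUG logs should help narrow down where.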
