APM - distributed tracing in microservice architectures

Hello everyone.

I really like APM and what you can do with it, but in my opinion something is missing when it comes to distributed tracing in a microservices architecture. The reason is that traces are not objects themselves, but just a grouping of transactions. When you have a system of distributed services, each contributing separate transactions to a trace, there is (to my knowledge) no way to monitor the complete end-to-end workflow. You can, for example, set alerts for individual services, but not for the whole set of services.

This is also discussed in this GitHub issue; I describe the same problem in the last post there. Since the issue has been closed for a while now, I wanted to open a discussion here as well. I would be really interested in what others think about this topic.

One thing you could do as a workaround is to take a timestamp in Service 1 and add it to the message that gets propagated to Service 2 and Service 3. Just before Service 3 finishes its work, it can calculate the total trace duration from that timestamp and set the value as a label on its transaction. You can then alert based on this custom label.
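
Here is a rough sketch of that workaround in Python, in case it helps. The field name `trace_start_ms`, the label name `total_trace_duration_ms`, and both helper functions are made up for illustration; the message is assumed to be a plain dict that each service forwards unchanged. The `now_ms` parameter only exists so the example can be run with fixed timestamps.

```python
import time

# Hypothetical field name carried inside the propagated message.
TRACE_START_KEY = "trace_start_ms"


def stamp_trace_start(message, now_ms=None):
    """Service 1: return a copy of the message carrying the workflow start time.

    An existing start time is kept, so re-stamping downstream is harmless.
    """
    stamped = dict(message)
    if TRACE_START_KEY not in stamped:
        stamped[TRACE_START_KEY] = (
            int(time.time() * 1000) if now_ms is None else now_ms
        )
    return stamped


def total_trace_duration_ms(message, now_ms=None):
    """Last service: wall-clock milliseconds since Service 1 stamped the message.

    Clamped at 0 because clocks on different hosts can disagree slightly.
    """
    end_ms = int(time.time() * 1000) if now_ms is None else now_ms
    return max(0, end_ms - message[TRACE_START_KEY])


# Service 1 stamps the outgoing message (fixed time used for the example).
msg = stamp_trace_start({"order_id": 42}, now_ms=1_000)

# Service 2 and Service 3 receive and forward msg unchanged.

# Just before Service 3 finishes, it computes the end-to-end duration.
duration = total_trace_duration_ms(msg, now_ms=1_750)

# This dict is what would be attached to the transaction as a label.
labels = {"total_trace_duration_ms": duration}
```

If you are on the Elastic APM Python agent, I would expect attaching the value to look like `elasticapm.label(**labels)` inside the active transaction, but treat that as an assumption and check the agent docs for your version. Note that the duration relies on the wall clocks of Service 1 and Service 3 being reasonably in sync, since the two timestamps are taken on different hosts.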

Thanks Felix, I was able to implement everything as you suggested. The alerting and reporting now work exactly like we need them to.

That's great to hear :tada: