I really like APM and what you can do with it, but in my opinion something is missing when it comes to distributed tracing in a microservices architecture. The reason is that traces are not objects themselves, but just a grouping of transactions. When you have a system of distributed services, each contributing separate transactions to a trace, there is (to my knowledge) no way to monitor the complete end-to-end workflow. You can, for example, set alerts for individual services, but not for the whole set of services.
This is also discussed in this GitHub issue, where I describe the same problem in the last post. Since the issue has been closed for a while now, I also wanted to open a discussion here. I would be really interested in what others think about this topic.
One workaround is to take a timestamp in Service 1 and add it to the message that gets propagated to Service 2 and Service 3. Just before Service 3 finishes its work, it can calculate the total trace duration from that timestamp and set the value as a label on its transaction. You can then alert on this custom label. Note that this relies on the clocks of Service 1 and Service 3 being reasonably well synchronized.
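To make the workaround more concrete, here is a minimal sketch in Python. The message format (JSON), the field name `trace_started_at`, the label name `trace_duration_ms`, and the three function names are all made up for illustration. The `set_label` argument stands in for whatever your agent offers for setting labels; with the Elastic APM Python agent I would expect `elasticapm.label` to fit, since it takes labels as keyword arguments, but please treat that as an assumption and check it against your agent version.

```python
import json
import time


def start_trace(payload, now=time.time):
    """Service 1: stamp the outgoing message with the trace start time."""
    message = dict(payload)
    # Hypothetical field name; wall-clock seconds since the epoch.
    message["trace_started_at"] = now()
    return json.dumps(message)


def forward(raw_message):
    """Service 2: do its own work and pass the start timestamp along unchanged."""
    message = json.loads(raw_message)
    message["handled_by_service_2"] = True
    return json.dumps(message)


def finish_trace(raw_message, set_label, now=time.time):
    """Service 3: compute the end-to-end duration and record it as a label."""
    message = json.loads(raw_message)
    duration_ms = (now() - message["trace_started_at"]) * 1000.0
    # set_label is a placeholder for the agent's label call (assumption).
    set_label(trace_duration_ms=duration_ms)
    return duration_ms
```

In Service 3 you would call something like `finish_trace(raw_message, set_label=elasticapm.label)` just before the transaction ends, and then build the alert on the `trace_duration_ms` label. The `now` parameter is only there so the clock can be replaced in tests.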