I'm currently experimenting with APM tooling to see how easily it could be applied to our systems. I like what I am seeing with Elastic APM, but I am a bit uncertain about the approach to transactions and spans.
There are a number of cases where a code path may need to create a transaction (because it is at the 'top level' for this system), or there may already be one (because it is a unit of work within another unit of work). In the latter case I'm not sure what it should do: perhaps nothing, or perhaps create a span rather than a transaction (because we do want to denote a boundary).
Once we start layering 'distributed' transactions on top of that, I'm unclear on how this is intended to fit together.
I.e. it appears that the intent is that there is one and only one transaction (that appears at the 'top' of the graph of spans) - but it's unclear to me that the (Java agent) API is the most useful for this (perhaps it needs a 'create a transaction, or use a span if there already is one' operation), and I'm not sure how this ties together when there are multiple systems with another transaction covering an 'end-to-end' scenario.
You can have many transactions in an app/service, just not nested within each other. Transactions are typically web requests or background jobs, and they contain spans.
If you use auto-instrumentation, there is nothing to worry about - the agent will get it right. If you want to instrument something yourself, you need to call the functions in the right order, for instance:
- create transaction 1
- create span 1
- end span 1
- create span 3
- end span 3
- end transaction 1
and maybe on a different thread:
- create transaction 2
- end transaction 2
And so on...
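To make the ordering concrete, here is a minimal Java sketch. It does not use the agent API: `CallLog`, `Unit` and `OrderingExample` are hypothetical names that only record calls, and try-with-resources is just one way to guarantee that each unit ends before its parent does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an agent: it only records the order of calls.
class CallLog {
    final List<String> events = new ArrayList<>();

    // Opens a unit (transaction or span) and returns a handle whose close() ends it.
    Unit open(String kind, int id) {
        events.add("create " + kind + " " + id);
        return new Unit(this, kind, id);
    }
}

// AutoCloseable, so try-with-resources ends units in reverse order of creation.
class Unit implements AutoCloseable {
    private final CallLog log;
    private final String kind;
    private final int id;

    Unit(CallLog log, String kind, int id) {
        this.log = log;
        this.kind = kind;
        this.id = id;
    }

    @Override
    public void close() {
        log.events.add("end " + kind + " " + id);
    }
}

class OrderingExample {
    // Reproduces the first sequence from the list above.
    static List<String> run() {
        CallLog log = new CallLog();
        try (Unit tx = log.open("transaction", 1)) {
            try (Unit s1 = log.open("span", 1)) {
                // work timed by span 1
            }
            try (Unit s3 = log.open("span", 3)) {
                // work timed by span 3
            }
        }
        return log.events;
    }
}
```

Because each block closes its own unit, a span can never outlive the transaction that contains it, even when the timed work throws.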
This document outlines how it all comes together: https://www.elastic.co/guide/en/kibana/7.0/transactions.html
Does this help?
OK, but what about transactions that themselves are children of transactions?
- I have a public API, with an endpoint (say GET /quotes). Here the 'transaction' boundary is perhaps clear (it's wrapped around the call).
- I have an orchestration. Here this system is just one component in a wider integration - an integration endpoint is processing a number of things on behalf of a systems client - within a "transaction". How are these tied together?
- A background services task in the same system maybe tracks a number of quotes every hour. Its transaction covers multiple calls to its own endpoint. The top level transaction ought to be "quoteupdater", but with sub-transactions for each of the quotes (in-process).
- Some web clients might track their UI actions, and the real transaction scope is from "user clicks button" through to "results rendered", with the quote API call underneath that.
This all feels very similar to database transaction wrapping, where the semantics cover a handful of different scenarios, the most common being "Get the current transaction if one exists, or create it if it does not". I.e. sometimes this code may be the top-level transaction, and sometimes it might not.
I suspect that most of the time the instrumented code just needs a 'thing it can log timings and metadata into', and largely doesn't care if it's an ISpan or an ITransaction.
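A rough sketch of what I mean, in plain Java rather than against the agent API. `UnitOfWork`, `Timed`, `begin` and `end` are names made up for illustration, and the per-thread stack is an assumption about how the "current" unit would be tracked.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical "thing to log timings and metadata into"; not an agent class.
class Timed {
    final String kind;   // "transaction" or "span"
    final String name;
    final Timed parent;  // null for a transaction

    Timed(String kind, String name, Timed parent) {
        this.kind = kind;
        this.name = name;
        this.parent = parent;
    }
}

// Hypothetical facade implementing "use the current one, or create it".
class UnitOfWork {
    // Per-thread stack of the units that are currently active.
    private static final ThreadLocal<Deque<Timed>> ACTIVE =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Starts a transaction when nothing is active on this thread, otherwise a child span.
    static Timed begin(String name) {
        Deque<Timed> stack = ACTIVE.get();
        Timed parent = stack.peek();
        Timed unit = (parent == null)
                ? new Timed("transaction", name, null)
                : new Timed("span", name, parent);
        stack.push(unit);
        return unit;
    }

    // Ends the most recently started unit on this thread.
    static void end(Timed unit) {
        Deque<Timed> stack = ACTIVE.get();
        if (stack.peek() != unit) {
            throw new IllegalStateException("units must be ended in reverse order of creation");
        }
        stack.pop();
    }
}
```

With something like this, the quote code would call `UnitOfWork.begin("GET /quotes")` and get a transaction when invoked directly, or a span when invoked from inside "quoteupdater", without knowing which.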
I also looked at the W3C candidate for trace context, and wonder how that fits into this model. I'm also looking at the OpenTracing API, and that seems to just think in terms of spans, rather than there being 'special' spans called transactions.
I don't know how much of this is a fundamental divergence in approaches (and I may be missing something), or whether it's just that requiring transactions is easier to implement in an Elasticsearch model - but I note that Elastic is listed among the "organizations using OpenTracing", so I'm a bit confused as to what the direction is.
Here this system is just one component in a wider integration
You mean a different app or service? If so, a transaction in one service might be nested inside a transaction in another service, just not inside one in the same service.
The top level transaction ought to be "quoteupdater", but with sub-transactions for each of the quotes
Right. Those sub-transactions are just spans then.
Some web clients might track their UI actions, and the real transaction scope is from "user clicks button" through to "results rendered", with the quote API call underneath that.
This is indeed how it works. Check https://www.elastic.co/guide/en/apm/get-started/current/distributed-tracing.html.
the instrumented code just needs a 'thing it can log timings and metadata into', and largely doesn't care if it's an ISpan or an ITransaction
I see your point. You can think of transactions and spans as a model with 2 constraints:
- A transaction may not contain a nested transaction within the same service.
- A span must be a child of a transaction or of another span.
You can then use this model in whatever way makes the most sense semantically to you.
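As an illustration only, the two constraints can be written down as a small check. `TraceNode` and `violation` are hypothetical and not part of any agent; the sketch assumes each node knows its service and its parent.

```java
// Hypothetical data model for illustration; not the agent's classes.
class TraceNode {
    final boolean transaction; // true = transaction, false = span
    final String service;
    final TraceNode parent;    // null when there is no parent

    TraceNode(boolean transaction, String service, TraceNode parent) {
        this.transaction = transaction;
        this.service = service;
        this.parent = parent;
    }

    // Returns null when this node satisfies both constraints, otherwise the reason it does not.
    String violation() {
        if (!transaction) {
            // Constraint 2: a span needs a transaction or another span as its parent.
            return parent == null ? "span has no parent" : null;
        }
        // Constraint 1: walk up the ancestors looking for a transaction in the same service.
        for (TraceNode ancestor = parent; ancestor != null; ancestor = ancestor.parent) {
            if (ancestor.transaction && ancestor.service.equals(service)) {
                return "transaction nested inside a transaction of the same service";
            }
        }
        return null;
    }
}
```

A transaction whose parent is a span in a different service passes the check, which is the distributed case; a transaction under another transaction of the same service does not.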
OpenTracing API, and that seems to just think in terms of spans
Correct. All the agents have an OpenTracing bridge API to make them compatible: https://www.elastic.co/guide/en/apm/get-started/current/opentracing.html
I don't know how much of this is a fundamental divergence in approaches
Keep in mind that transactions and spans have different attributes, spans being more detailed (for instance, spans include stacktraces).
Because of this you can for instance have longer retention periods for transactions and shorter for spans.
Also: transactions are always sent, even when the sampling rate is lower than 1; only the spans are discarded according to the sampling rate. So you will always have the histogram and high-level timings, but can store more or less detail as you see fit.
This separation generally allows you to balance the amount of information and storage cost.
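A minimal sketch of that trade-off, assuming the behaviour described above. `SamplingReporter` is a hypothetical class, and the sampling decision is passed in as a flag rather than computed from a rate, to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reporter for illustration; a real agent decides sampling internally.
class SamplingReporter {
    final List<String> storedTransactions = new ArrayList<>();
    final List<String> storedSpans = new ArrayList<>();

    // Records the transaction unconditionally; keeps its spans only when it was sampled.
    void report(String transaction, boolean sampled, List<String> spans) {
        storedTransactions.add(transaction);
        if (sampled) {
            storedSpans.addAll(spans);
        }
    }
}
```

Reporting one sampled and one unsampled transaction leaves both in `storedTransactions`, while `storedSpans` only holds the spans of the sampled one.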
OK - thank you for those details, that's very helpful.
I had a look at what was being stored "under the hood", and I can see the trace, transaction and span identifiers. I also had a look at the OpenTracing gateway.
I found one OpenTracing paragraph informative:
The first span of a service will be converted to an Elastic APM Transaction, subsequent spans are mapped to Elastic APM
I think what I'm feeling is that this is the right behaviour (and so the right API), because there are points (certainly in our code) where some code paths run with a transaction already created and others run without one (even in the non-distributed sense).
I.e. 'transaction' is a low-level implementation detail (one I understand the need for) that has been surfaced all the way up in the Agent API, where it's not a terribly useful distinction (vs 'span').
In a sense, it's identical to the Java database transaction propagation of "REQUIRED", which is
Support a current transaction, create a new one if none exists.
Either way, I'm now reassured that the back-end does 'the right thing', even if I don't like the agent APIs much - which I can solve by using OpenTracing as the interface instead.
Glad that the OpenTracing bridge helps.
Thanks for bringing in your thoughts, they are useful for us as well.
Have a great day,