Hi,
I'm experimenting with APM using the Java agent and exploring ways to reduce processing, network utilization, and storage size on the server side without missing useful details. One of the applications I'm looking to monitor is very chatty in terms of the number of SQL statements it executes, and it also has polling-based tasks that run every X seconds and execute a large number of SQL statements each time. The individual SQL statements are fast (microseconds, under 1 millisecond) and the overall transaction times are fine even with many of them (e.g. under a second), but this results in a lot of SQL spans being generated until the agent's default cap of 500 spans per transaction is hit.
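For reference, these are the agent-side knobs I've looked at so far in elasticapm.properties (treat the exact names/values as my reading of the docs, and the thresholds as placeholders):

```properties
# elasticapm.properties (placeholder values)
transaction_sample_rate=1.0   # head-based: decided before the transaction runs
transaction_max_spans=500     # the default cap the chatty transactions keep hitting
span_min_duration=0ms         # would drop fast spans, but unconditionally - including on slow/errored transactions
```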
I can reduce the sample_rate, but my concern there is not getting the nested spans for the actually interesting transactions. "Interesting" in this case would be:
Transactions that take over X time
Transactions that end in an error
I'm essentially looking to make a post-transaction decision on whether to send the captured spans or not. Looking into this, the sampling decision itself has to be made up front with minimal context, since it determines whether the agent will do the work of capturing the spans at all. I'm actually not too worried about agent overhead in this case, since for the most part it's just capturing the SQL statements and timing them, which relative to the actual SQL calls shouldn't be too bad. The reporting of a span, though, happens as soon as the span completes (well, it's placed in the reporting queue at least), so at that point there's no context about the outcome of the transaction. What I would like is some way to defer the reporting decision until the transaction completes, or maybe just for up to X seconds, to account for long-running transactions where it's not desirable to keep this data in memory longer than needed.
So I guess I'm asking whether it's possible to do some configuration like:
sample_rate = 1 (or something fairly high)
only_report_spans_if_above_ms = Xms (this configuration would report spans if the encompassing transaction is >= Xms OR if the transaction encountered an error)
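To illustrate what I mean (purely hypothetical - this isn't something the agent offers today, and the class/interface names below are made up, not the real agent internals), the deferral logic I'm imagining would look roughly like this:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of "only_report_spans_if_above_ms": buffer finished spans
// per transaction and only hand them to the reporter once the outcome is known.
class DeferredSpanReporter {
    private final long thresholdMs;
    private final Reporter reporter;
    private final Map<String, List<Span>> buffered = new ConcurrentHashMap<>();

    DeferredSpanReporter(long thresholdMs, Reporter reporter) {
        this.thresholdMs = thresholdMs;
        this.reporter = reporter;
    }

    // Called when a span finishes: hold it instead of reporting immediately.
    void onSpanEnd(String transactionId, Span span) {
        buffered.computeIfAbsent(transactionId, id -> new CopyOnWriteArrayList<>()).add(span);
    }

    // Called when the transaction finishes: now the duration and outcome are known.
    void onTransactionEnd(Transaction tx) {
        List<Span> spans = buffered.remove(tx.getId());
        if (spans == null) {
            return;
        }
        boolean interesting = tx.getDurationMs() >= thresholdMs || tx.hasError();
        if (interesting) {
            spans.forEach(reporter::report);   // flush the withheld spans
        }
        // otherwise drop them; the transaction document itself is still reported.
        // A real version would also need a timer that flushes or drops buffers
        // for transactions still running after X seconds.
    }

    interface Reporter { void report(Span span); }
    interface Span {}
    interface Transaction {
        String getId();
        long getDurationMs();
        boolean hasError();
    }
}
```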
Having something like the above would help me reduce storage size and processing pressure on Elasticsearch for the apm-spans index. I tried to see whether there's an option to just discard these kinds of spans within the APM Server or Elasticsearch, but I'd also like to avoid the network utilization by discarding them at the agent level. On the Elasticsearch side I looked at just dropping the "un-interesting" spans, but it seemed like that would require some application-side processing to first fetch the transactions that were slow or errored (via apm-transaction and apm-error) and then delete all of their associated spans.
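The closest thing I found on the Elasticsearch side is dropping documents at ingest time, e.g. an ingest pipeline with a drop processor keyed on span duration (field name assumed from the APM mapping, so treat this as a sketch). But that can only look at per-span attributes, so it can't express "keep the spans whose transaction turned out slow or errored":

```json
PUT _ingest/pipeline/drop-fast-spans
{
  "description": "Sketch: drop spans shorter than 1ms at ingest (per-span only, no transaction outcome)",
  "processors": [
    {
      "drop": {
        "if": "ctx.span?.duration?.us != null && ctx.span.duration.us < 1000"
      }
    }
  ]
}
```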
I started looking around and found that Jaeger is trying to approach this same problem, although the discussion there looks to be around having another local agent to emit to, which defers reporting until the transaction is complete - https://github.com/jaegertracing/jaeger/issues/425
Maybe something that could be incorporated into Beats? If possible, though, it would be nice not to require another process.
There's nothing very concrete to point at just yet, but this is something we're going to be looking into in the near future. The current thinking is this (but this may change):
APM Server will have an option to buffer events for up to a configurable amount of time (say 5 minutes), and only index them into Elasticsearch once their trace has been identified as one to keep. If after 5 minutes an event's trace has not been identified as one to keep, the event will be dropped from the buffer.
Traces would be kept based on the root transaction duration, or whether an error occurred somewhere in the trace.
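To make the idea concrete, here's a very rough sketch of that buffering logic (not a design commitment; all names are made up):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Sketch only: hold events per trace until the trace is marked "keep"
// (slow root transaction or an error seen), or until the TTL expires.
class TraceBuffer {
    private static final Duration TTL = Duration.ofMinutes(5);

    private static class Entry {
        final Instant firstSeen = Instant.now();
        final List<Object> events = new ArrayList<>();
        boolean keep;   // flipped once the trace is identified as one to keep
    }

    private final Map<String, Entry> byTraceId = new ConcurrentHashMap<>();

    void add(String traceId, Object event, boolean slowRootOrError) {
        Entry e = byTraceId.computeIfAbsent(traceId, id -> new Entry());
        e.events.add(event);
        if (slowRootOrError) {
            e.keep = true;
        }
    }

    // Run periodically: index kept traces, drop anything older than the TTL.
    void sweep(Consumer<Object> indexer) {
        Instant cutoff = Instant.now().minus(TTL);
        byTraceId.entrySet().removeIf(entry -> {
            Entry e = entry.getValue();
            if (e.keep) {
                e.events.forEach(indexer);
                return true;
            }
            return e.firstSeen.isBefore(cutoff);
        });
    }
}
```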
Thanks for the response! My only concern with the consolidation happening at the APM Server level is that a subset of our applications (Java based) are provided to our customers to run on-prem, while we want to offer the monitoring/APM piece as an option with minimal setup by hosting the Elastic APM side ourselves. We're looking at provisioning them an APM API key and just giving them the APM Server endpoint. So there's a desire to weed out uninteresting traces as close to the source as possible to reduce network activity while still providing the most value (interesting traces, plus always the top-level transactions/aggregates). This may be a somewhat less common scenario (on-prem applications with hosted monitoring), but there's probably still value in discarding non-interesting information on the same host if possible - distributed tracing probably complicates this, though. We don't actually leverage distributed tracing in our case; it's generally more of a traditional monolithic Java app + relational DB type of install.
I think we could have them run Beats, since there's a desire to pull in OS/network metrics that Beats would be useful for capturing anyway. I'm just not sure whether trace aggregation is a good fit for Beats, since the buffering might mean noticeably heavier memory usage? Our ideal flow is:
APM agent (on-prem) ==> APM server (hosted) => Elasticsearch (hosted)
The above requires the least amount of on-prem infrastructure (just the agent), but I understand that aggregating at the agent level may not be performant due to having to buffer the spans in memory until the overall transaction completes.
Alternatively, with a Beats agent as an aggregator:
APM agent (on-prem) => Beats (on-prem/installed per server?) ==> APM server (hosted) => Elasticsearch (hosted)
or if the APM server had some gateway mode where you could configure its output to be another APM server:
APM agent (on-prem) => APM Server (on-prem/installed per data center?) ==> APM server (hosted) => Elasticsearch (hosted)
That still has traffic flowing over the LAN that it would be nice to discard on the local host if possible.
This may be a somewhat less common scenario (on-prem applications with hosted monitoring), but there's probably still value in discarding non-interesting information on the same host if possible - distributed tracing probably complicates this, though.
One of the reasons we would do the tail-based sampling in the backend/APM Server is to enable consistent sampling of events across an entire distributed trace. In your case this isn't applicable, but in many cases it is. Moreover, there may be multiple APM Servers involved in a distributed trace, so there will need to be some coordination among them.
APM agent (on-prem) => Beats (on-prem/installed per server?) ==> APM server (hosted) => Elasticsearch (hosted)
It's a bit of an implementation detail, but APM Server is built on top of the Beats infrastructure. There's not much difference between running APM Server on a host and running another Beat such as Filebeat, Metricbeat, etc. So what you're describing here is, I think, essentially the same as the alternative:
APM agent (on-prem) => APM Server (on-prem/installed per data center?) ==> APM server (hosted) => Elasticsearch (hosted)
This could be an option, and is something that has come up in discussions, but it's not currently something we're planning on implementing.
Another option would be to create customer-specific Elasticsearch API Keys, and configure the on-prem APM Servers to talk directly to the hosted Elasticsearch.
Would that work for you? If you're planning to run other Beats on the application machines, then I think you would need to do something like that anyway.
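For illustration, the output section of the on-prem APM Server's apm-server.yml would look something like this (placeholder host and key; the API key uses the Elasticsearch "id:api_key" form):

```yaml
# apm-server.yml on the customer's host (placeholder values)
output.elasticsearch:
  hosts: ["https://hosted-es.example.com:9243"]
  api_key: "id:api_key_value"   # customer-specific Elasticsearch API key
```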