Reduce the amount of data for spans and transactions

Hi folks,
We're using ES APM on Ruby v3.0.0. Is it possible to set the ELASTIC_APM_TRANSACTION_SAMPLE_RATE parameter to a value lower than 0.1 (say 0.05)? The reason is that our services generate a LOT of transactions (it looks like around 100GB of index data per day), and while I've set up Elastic Cloud to handle this with no problem, we're being killed by AWS on cross-zone data transfer costs.

I'm already at 0.1 for the key high-volume services, so I'm trying to see how I can keep some APM value while reducing the volume of data sent for transactions and spans.



Hi again Dave! Putting Elastic APM to the test – I love it 🙂

Setting it to 0.05 will work just as expected. The check looks like this, so anything that converts into a float will work:

rand <= config.transaction_sample_rate
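
To get a feel for what that check does at different rates, here is a small simulation. It is only a sketch: `sampled?` and `sample_fraction` are made-up helper names, not part of the agent, the `Float(...)` conversion is my assumption about how a string value would be coerced, and the seeded `Random` is only there to make the run repeatable.

```ruby
# Hypothetical helper mirroring the check quoted above; not the agent's code.
# Float(rate) is an assumed conversion so string values such as "0.05" work.
def sampled?(rate, rng = Random.new)
  rng.rand <= Float(rate)
end

# Fraction of `count` simulated transactions that would be sampled.
def sample_fraction(rate, count: 100_000, seed: 42)
  rng = Random.new(seed)
  hits = count.times.count { sampled?(rate, rng) }
  hits.to_f / count
end

puts sample_fraction(0.1)     # roughly a tenth of transactions
puts sample_fraction("0.05")  # environment variables arrive as strings
puts sample_fraction(1.0)     # everything is sampled
```

Going from 0.1 to 0.05 should roughly halve the number of sampled transactions, which is where the spans and context data come from.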

You might be able to decrease the individual event size a bit too:

If you don't care (or don't care enough) about seeing source code in Kibana, there are perhaps a lot of bytes to be saved by skipping it with the options source_lines_error_app_frames and source_lines_span_app_frames. Both default to 5 lines, meaning the line in question plus the 2 immediately before and the 2 immediately after. Set them to 0 to disable source code lines entirely.

You could also disable capture_headers or capture_env but that's probably not many bytes after gzip.

And of course, if there are entire libraries whose spans you don't care about, you can disable their auto-instrumentation with disabled_instrumentations.
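
Pulling the options mentioned so far together, a leaner setup might look roughly like the sketch below. The option names are the ones from this thread; the values are only illustrative, `lean_apm_options` and `to_env` are hypothetical helpers, and I'm assuming every option follows the same ELASTIC_APM_<NAME> environment variable pattern as ELASTIC_APM_TRANSACTION_SAMPLE_RATE, with lists given comma-separated. Check the config docs for your agent version before relying on any of it.

```ruby
# Illustrative values only; tune them per service.
def lean_apm_options
  {
    transaction_sample_rate: 0.05,
    source_lines_error_app_frames: 0,    # no source lines on error frames
    source_lines_span_app_frames: 0,     # no source lines on span frames
    capture_headers: false,
    capture_env: false,
    disabled_instrumentations: %w[json]  # placeholder list, adjust to taste
  }
end

# Derive environment variable names, assuming the ELASTIC_APM_<NAME> pattern.
def to_env(options)
  options.to_h do |key, value|
    ["ELASTIC_APM_#{key.to_s.upcase}", Array(value).join(",")]
  end
end

to_env(lean_apm_options).each { |name, value| puts "#{name}=#{value}" }
```

The printed lines can be pasted into whatever sets the environment for the service, which makes it easy to apply the leaner settings only to the high-volume ones.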

All of these slightly worsen the overall APM experience, but if the choice is between slightly worse and nothing at all, they could be worth it.

Let me know how it goes!


A colleague suggested looking at span_frames_min_duration too. It lets you set the minimum duration a span must have for its stacktrace to be included. Or you can disable stacktraces altogether by setting it to 0.
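
As a rough mental model of that setting (a simplified sketch, not the agent's actual code): `include_stacktrace?` is a hypothetical helper, durations are assumed to be in milliseconds, and treating a span exactly at the threshold as included is my assumption.

```ruby
# Simplified model of the behaviour described above; not the agent's code.
# A minimum of 0 means "never include a stacktrace".
def include_stacktrace?(span_duration_ms, min_duration_ms)
  return false if min_duration_ms.zero?

  span_duration_ms >= min_duration_ms
end

spans_ms = [0.4, 3, 5, 12, 250]
[5, 50, 0].each do |min|
  kept = spans_ms.count { |duration| include_stacktrace?(duration, min) }
  puts "min=#{min}ms keeps stacktraces on #{kept} of #{spans_ms.size} spans"
end
```

Raising the threshold keeps stacktraces only on the slow spans, which are usually the ones worth investigating anyway.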


All of this is GOLD @mikker, thanks so much! I'll try tweaking these today, starting with lowering the transaction sample rate on the services that have high throughput. They handle between 20 and 80,000 transactions per minute, and almost all are "the same" transaction (or very close to it), so I feel there is a good data saving to be had here.

If we need more, I'll move on to pruning out the stack traces.

Thanks again!


