Is it possible to record only the sampled transactions, rather than all of them?

kenshin · June 1, 2019, 2:08am

Hi there,

We are trying to use Elastic APM for performance monitoring and tracing. The official documentation says:

By default, the agent will sample every transaction (e.g. request to your service). To reduce overhead and storage requirements, you can set the sample rate to a value between 0.0 and 1.0. We still record overall time and the result for unsampled transactions, but no context information, tags, or spans.

We have tested this in production. By my rough estimate, if all the unsampled transactions are recorded, the total is more than 100k QPS, and storing that volume of data in ES is too expensive.

I'd like to ask whether we can record data only when a transaction is sampled: not just the spans and tags, but also the root transaction itself. I understand the downside is that we can no longer directly get numbers like tpm via Kibana, but most features, such as latency calculation and tracing, would still work well. The benefit is large, since we would save cost and resources. I didn't find a configuration option for this in the APM documentation. Can anyone help?

Thanks

Kibana version: 7.1

Elasticsearch version: 7.1

APM Server version: 7.1

APM Agent language and version: Python/Java/Go
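
To make the trade-off in the question concrete, here is a minimal Python sketch of the sampling behaviour described in the quoted documentation. It is a toy model, not agent code: `simulate`, `estimate_throughput`, the event fields, and the latency range are all hypothetical names and values chosen for illustration. It also shows that if only sampled transactions are stored, a rough throughput figure can still be estimated by dividing the sampled count by the sample rate.

```python
import random


def simulate(n_transactions, sample_rate, seed=0):
    """Simulate sampling and return one event per transaction.

    Sampled transactions keep context and spans. Unsampled ones keep only
    duration and result, mirroring the documentation quoted in the question.
    All field names and values here are made up for illustration.
    """
    rng = random.Random(seed)
    events = []
    for _ in range(n_transactions):
        sampled = rng.random() < sample_rate
        event = {
            "duration_ms": rng.uniform(5, 500),  # hypothetical latency
            "result": "HTTP 2xx",
            "sampled": sampled,
        }
        if sampled:
            event["context"] = {"url": "/hypothetical"}
            event["spans"] = ["hypothetical-span"]
        events.append(event)
    return events


def estimate_throughput(sampled_count, sample_rate):
    """Estimate the total transaction count from the sampled ones alone."""
    return sampled_count / sample_rate


events = simulate(n_transactions=100_000, sample_rate=0.1)
kept = [e for e in events if e["sampled"]]

print(len(events), "events stored if unsampled transactions are kept")
print(len(kept), "events stored if only sampled transactions are kept")
print(round(estimate_throughput(len(kept), 0.1)), "estimated total transactions")
```

The estimate is only statistical, so it will drift from the true count when traffic is low or the sample rate is very small.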

axw · June 4, 2019, 5:06am

What you could do is add a drop_event processor to apm-server.yml, conditional on transaction.sampled: false. You can find more information about processors in the Filebeat docs, but they're relevant to the APM Server too: https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html.

Longer term, we're looking into options for both server-side rollups of raw data (giving you the ability to control retention), and agent-side rollups (sending pre-aggregated statistics to APM Server to reduce the ingestion rate in the first place).

kenshin · June 21, 2019, 6:17pm

Thanks for the advice. I tried the drop_event solution, but it doesn't work. Could you help me check my configuration?

apm-server:
  host: 0.0.0.0:8200
output:
  elasticsearch:
    bulk_max_size: 4096
    hosts:
    - my-es
    protocol: http
  file:
    enabled: false
processors:
  drop_event:
    when:
      equals:
        transaction.sampled: false
setup:
  template:
    overwrite: true
    settings:
      index:
        number_of_shards: 5

axw · July 2, 2019, 2:46am

Sorry for the delay in replying. There's a slight error in your syntax. Processors should be a YAML list; you're missing a "-" before drop_event. It should look like this:

processors:
 - drop_event:
     when:
       equals:
         transaction.sampled: false
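
For readers who want to see what the corrected configuration is meant to do, the following is a rough Python model of a `drop_event` processor with an `equals` condition. It is not how APM Server implements processors. `matches` and `apply_processors` are hypothetical helpers, and the events are made-up data. The sketch also shows the shape difference behind the syntax fix: with the leading "-", `processors` parses to a list of entries, whereas the earlier configuration parsed to a single mapping.

```python
def matches(event, condition):
    """Return True if every dotted field in an `equals` condition matches."""
    for dotted_key, expected in condition["equals"].items():
        value = event
        for part in dotted_key.split("."):
            if not isinstance(value, dict) or part not in value:
                return False
            value = value[part]
        if value != expected:
            return False
    return True


def apply_processors(events, processors):
    """Drop every event matched by a drop_event processor (toy model)."""
    if not isinstance(processors, list):
        # Assumption for this sketch: only the list form is accepted.
        raise TypeError("processors must be a list")
    kept = []
    for event in events:
        dropped = any(
            "drop_event" in p and matches(event, p["drop_event"]["when"])
            for p in processors
        )
        if not dropped:
            kept.append(event)
    return kept


# Shape of the corrected YAML once parsed: a list with one processor entry.
processors = [{"drop_event": {"when": {"equals": {"transaction.sampled": False}}}}]

# Shape of the earlier YAML (no "-"): a mapping, not a list.
processors_as_mapping = {"drop_event": {"when": {"equals": {"transaction.sampled": False}}}}

events = [
    {"transaction": {"id": "a", "sampled": True}},
    {"transaction": {"id": "b", "sampled": False}},
    {"transaction": {"id": "c", "sampled": True}},
]

kept_ids = [e["transaction"]["id"] for e in apply_processors(events, processors)]
print(kept_ids)  # ['a', 'c']
```

Only the unsampled transaction "b" is dropped. Events that lack the `transaction.sampled` field entirely do not match the condition in this model and are kept.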

system · July 22, 2019, 10:51pm

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.
