We are running the Elastic APM Node.js agent with 5% sampling, set through the following environment variable.
ELASTIC_APM_TRANSACTION_SAMPLE_RATE: 0.05
Storage Explorer shows 17% sampling, which is understandable, given that the agent honors the sampling decision of the parent trace on API calls from other microservices.
However, around 3.13k requests were made to this particular endpoint, but Kibana shows only 139 requests in the metrics (which is lower than the 147 traces it shows).
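To make the 17% versus 5% gap concrete, here is a minimal sketch of a simple mixing model. It assumes requests are either trace roots (sampled locally at the configured rate) or children that inherit an upstream decision. The function names and the upstream rate of 1.0 are hypothetical, chosen only for illustration; this is not how Kibana computes its figure.

```python
def effective_rate(root_share: float, root_rate: float, upstream_rate: float) -> float:
    """Blend locally decided and inherited sampling decisions.

    root_share    -- fraction of requests that start a new trace in this service
    root_rate     -- locally configured sample rate (0.05 in this thread)
    upstream_rate -- assumed rate at which calling services sample (hypothetical)
    """
    child_share = 1.0 - root_share
    return root_share * root_rate + child_share * upstream_rate


def child_share_for(effective: float, root_rate: float, upstream_rate: float) -> float:
    """Fraction of requests that must inherit an upstream decision to reach `effective`."""
    if upstream_rate == root_rate:
        raise ValueError("upstream and local rates are equal; the share is undetermined")
    return (effective - root_rate) / (upstream_rate - root_rate)


if __name__ == "__main__":
    # Assumption: callers sample every request (rate 1.0).
    share = child_share_for(effective=0.17, root_rate=0.05, upstream_rate=1.0)
    print(f"share of requests inheriting an upstream decision: {share:.1%}")
    print(f"check: {effective_rate(1.0 - share, 0.05, 1.0):.2f}")
```

Under that assumption, a fairly small share of fully sampled upstream callers would be enough to lift the effective rate well above the configured 5%.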
I don't know the code that makes some of the calculations (for example the effective "Sampling rate"), but I'll make some guesses.
Were those 3.13k requests made in "the last 15 minutes"? That is the time window in your screenshot.
I wonder if, over short time ranges, some of the calculated values can be significantly off. E.g. does that effective "Sampling rate" change significantly if you slide the window to other 15-minute time ranges? Or if you look at longer time ranges?
My guess is that the Latency distribution | 139 total transactions is a figure derived from a metric and factoring in the effective sampling rate. My understanding is that the Trace sample |< < 1 of 147 > >| value is a search of actual stored transactions (capped to a limit of 500). I agree that those having different values is odd, but perhaps not surprising that there can be variance given a calculated effective sampling rate and time-bucketed metrics.
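As a rough sanity check on the figures quoted in this thread, the arithmetic can be sketched as below. It assumes the 3.13k requests and the 147 stored traces cover the same time window, and it treats 3.13k as exactly 3130, so the results are approximate. The helper names are hypothetical, and this does not reproduce whatever calculation Kibana performs.

```python
# Figures quoted in this thread (3.13k is rounded, so treat results as approximate).
TOTAL_REQUESTS = 3130
STORED_TRACES = 147
METRIC_TRANSACTIONS = 139
CONFIGURED_RATE = 0.05
REPORTED_RATE = 0.17


def observed_rate(sampled: int, total: int) -> float:
    """Fraction of requests that ended up as stored (sampled) transactions."""
    return sampled / total


def expected_sampled(total: int, rate: float) -> float:
    """Number of sampled transactions expected at a given sample rate."""
    return total * rate


if __name__ == "__main__":
    print(f"observed rate from stored traces: {observed_rate(STORED_TRACES, TOTAL_REQUESTS):.1%}")
    print(f"expected sampled at configured rate: {expected_sampled(TOTAL_REQUESTS, CONFIGURED_RATE):.0f}")
    print(f"expected sampled at reported rate: {expected_sampled(TOTAL_REQUESTS, REPORTED_RATE):.0f}")
    print(f"gap between traces and metric count: {STORED_TRACES - METRIC_TRANSACTIONS}")
```

If the time windows do line up, the stored trace count sits much closer to what the configured 5% would produce than to what 17% would, which is one more reason to compare the numbers over other and longer time ranges.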
I don't know the Kibana APM app code, so apologies if my answer doesn't satisfy.