Original install method (e.g. download page, yum, deb, from source, etc.) and version:
Provided by Elastic Cloud subscription
Fresh install or upgraded from another version?
Upgraded from 7.1.1
We are using the Elastic Cloud services, so we are not running our stack locally.
Every day I receive a lot of transactions that check /ping or /healthcheck on a URL that doesn't exist anymore, and I just want to find a way to get rid of them.
Another question I have: if I reduce the transaction sample rate, does APM store the transactions somewhere and then send them all together later, or do I lose access to those transactions? We currently monitor our website with APM so we can better understand our customers' behavior and issues, and my concern is that I might lose access to this information.
Our plan is to reduce the data that we receive from APM and only gather the data that is useful to us. This should reduce the amount of data we store, which ultimately reduces our shard count and makes the cluster easier to maintain.
I appreciate any feedback or resources that you might have. Feel free to ask for more information if you think there is missing detail in this post.
The transactions_ignore_patterns config should suit your needs nicely. Add patterns that match the transactions in question and they will be dropped silently.
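As a minimal sketch of what this could look like in a Flask app: the Python agent documents the option as transactions_ignore_patterns (TRANSACTIONS_IGNORE_PATTERNS inside the Flask ELASTIC_APM dict), a list of regular expressions checked against the transaction name. The service name, the patterns, and the would_be_ignored helper below are illustrative assumptions. The helper only approximates the agent's matching with re.search so the patterns can be checked locally, and it assumes the unwanted transactions are named like "GET /ping".

```python
import re

# Hypothetical Flask config for the Elastic APM Python agent.
# "my-flask-app" and the patterns are placeholders; adjust them to the
# transaction names actually shown in the APM UI.
ELASTIC_APM = {
    "SERVICE_NAME": "my-flask-app",
    "TRANSACTIONS_IGNORE_PATTERNS": [
        r"^GET /ping",
        r"^GET /healthcheck",
    ],
}


def would_be_ignored(transaction_name, config=ELASTIC_APM):
    """Approximate the agent's check: a transaction is dropped if any
    pattern matches somewhere in its name (regex search)."""
    patterns = config.get("TRANSACTIONS_IGNORE_PATTERNS", [])
    return any(re.search(p, transaction_name) for p in patterns)


if __name__ == "__main__":
    for name in ["GET /ping", "GET /healthcheck", "GET /checkout"]:
        print(name, "->", "ignored" if would_be_ignored(name) else "kept")
```

In the app itself, the dict would typically be assigned to app.config["ELASTIC_APM"] before the agent is initialized with ElasticAPM(app).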
Transaction sampling decisions happen at the agent level, and changes to this config do not affect data already in Elasticsearch. You won't lose any historical data, but future data will be sampled at the lower rate.
Hi Colton, thanks for your suggestion/clarification.
Could you please elaborate on what you mean by adding the transactions to the questions? Do you mean the ignore filter?
For transactions_ignore_patterns, I believe I have to add it to the Flask APM configuration in my Python code, right? I don't think it belongs in apm-server.yml.
My question is: if I lower transaction.sample.rate from 1.0 to 0.5 for one of my services, do I lose half of my future data (not the data already in Elasticsearch), or is it queued and sent to me later at a slower rate?
You will lose the detail for half of future transactions. Note that even unsampled transactions are recorded, but they only have the name of the transaction, the overall transaction time, and the result. They won't have any spans or additional data.
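To make the effect of a lower sample rate concrete, here is a toy simulation of the behavior described above. It is not the agent's implementation: the record and simulate functions, the field names, and the example transaction are all assumptions made for illustration. TRANSACTION_SAMPLE_RATE is the Flask-style name of the Python agent's transaction_sample_rate option.

```python
import random

# Hypothetical agent setting: keep full detail for roughly half of all
# transactions.
ELASTIC_APM = {"TRANSACTION_SAMPLE_RATE": 0.5}


def record(transaction, sample_rate, rng):
    """Return what would be stored for one transaction in this toy model."""
    sampled = rng.random() < sample_rate
    stored = {
        # Every transaction keeps its name, duration, and result.
        "name": transaction["name"],
        "duration_ms": transaction["duration_ms"],
        "result": transaction["result"],
        "sampled": sampled,
    }
    # Spans (and other detail) survive only for sampled transactions.
    stored["spans"] = list(transaction["spans"]) if sampled else []
    return stored


def simulate(transactions, sample_rate, seed=0):
    """Apply the sampling decision independently to each transaction."""
    rng = random.Random(seed)
    return [record(t, sample_rate, rng) for t in transactions]


if __name__ == "__main__":
    incoming = [
        {"name": "GET /checkout", "duration_ms": 120, "result": "HTTP 2xx",
         "spans": ["SELECT FROM orders", "GET payment-api"]}
        for _ in range(10)
    ]
    stored = simulate(incoming, ELASTIC_APM["TRANSACTION_SAMPLE_RATE"])
    with_spans = sum(1 for s in stored if s["spans"])
    print(f"{len(stored)} transactions stored, {with_spans} with spans")
```

Every incoming transaction still produces a stored record, so counts and durations remain usable. Only the span-level detail is lost for the unsampled share, and nothing is queued for later delivery.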
Makes sense. Is there any way to drop the transactions that are marked as not sampled? When I set transaction.sample.count to 0, transaction.sample was set to false instead, and I don't want to log these transactions anymore. I followed these steps (link) and added the lines to apm-server.yml in my Elastic Cloud deployment configuration, but when I save, the cluster doesn't accept the changes.
I also figured it would be a good idea to disable the APM metrics for CPU and memory, since we don't monitor them, but it didn't work. Here is the documentation I used:
Yes, I tried this, even with a comma-separated string, and it still sends CPU metrics. I also tried setting metrics_interval to 0 to completely stop APM from sending metrics to Elasticsearch, and it still sent data.
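For reference, the settings described here would look roughly like the sketch below in a Flask config. DISABLE_METRICS and METRICS_INTERVAL are the Flask-style names of the Python agent's documented disable_metrics and metrics_interval options. The service name, the pattern list, and the is_disabled helper are assumptions, and the helper only approximates the agent's wildcard matching with fnmatch so a pattern list can be sanity-checked locally. It says nothing about whether the agent honors the setting, which is the open problem in this thread.

```python
from fnmatch import fnmatchcase

# Hypothetical Flask config for the Elastic APM Python agent.
ELASTIC_APM = {
    "SERVICE_NAME": "my-flask-app",  # placeholder
    # Comma-separated wildcard patterns for metrics to suppress.
    "DISABLE_METRICS": "*.cpu.*,system.memory.total",
    # Alternative: "0s" is documented as disabling metrics collection
    # entirely; normally only one of the two settings would be needed.
    "METRICS_INTERVAL": "0s",
}


def is_disabled(metric_name, config=ELASTIC_APM):
    """Approximate check: does any DISABLE_METRICS pattern match the name?"""
    raw = config.get("DISABLE_METRICS", "")
    patterns = [p.strip() for p in raw.split(",") if p.strip()]
    return any(fnmatchcase(metric_name, p) for p in patterns)


if __name__ == "__main__":
    for metric in [
        "system.cpu.total.norm.pct",
        "system.process.cpu.total.norm.pct",
        "system.memory.total",
        "system.memory.actual.free",
    ]:
        print(metric, "->", "disabled" if is_disabled(metric) else "sent")
```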
I also saw some Git issues opened about disabling APM metrics, as well as this post: APM Disabling Metrics Issue. I'm not sure to what extent the issue is resolved, but it doesn't work for Python Flask web apps. Could you please test it manually as well to confirm the issue?
Sorry for the delay here; we haven't been able to reproduce this properly yet. As a workaround until we have a real solution, you could remove the CPU metrics from the METRICS_SETS setting. This setting is undocumented, as it can be a bit finicky. You can set it to