How to reduce APM transactions and metrics?

EZprogramming · January 15, 2020, 8:52pm

Kibana version:

7.3.2

Elasticsearch version:

7.3.2

APM Server version:

APM 7.3.2

APM Agent language and version:

Python Flask 4.2.2

Original install method (e.g. download page, yum, deb, from source, etc.) and version:

Provided by Elastic Cloud subscription

Fresh install or upgraded from another version?

Upgraded from 7.1.1

Additional information:

We are using the Elastic Cloud services, so we are not running our stack locally.

Problem:

I receive a lot of transactions for /ping or /healthcheck checks against a URL that no longer exists. They arrive every day, and I just want a way to get rid of them.
Another question: if I reduce the transaction sample rate, does APM store the transactions somewhere and send them all together later, or do I lose access to those transactions? We currently monitor our website with APM to better understand our customers' behavior and issues, and my concern is that I might lose access to this information.

Objective:

Our plan is to reduce the data that we receive from APM and gather only the data that is useful to us. This would reduce the volume of stored data, which ultimately reduces our shard count and makes the cluster easier to maintain.

I appreciate any feedback or resources that you might have. Feel free to ask for more information if you think this post is missing detail.

basepi · January 16, 2020, 2:34am

The transactions_ignore_patterns config should suit your needs nicely. Add patterns that match the transactions in question and they will be dropped silently.
Transaction sampling decisions happen at the agent level, and changes to this config do not affect data already in Elasticsearch. You won't lose any historical data, but future data will be sampled at the lower rate.

Hope that helps!
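For illustration, here is a minimal sketch of what such an ignore configuration could look like in a Flask app's agent settings. It assumes the patterns are regular expressions searched against the transaction name, and that names look like "GET /ping"; the helper `would_be_ignored` is hypothetical and only approximates the check, it is not part of the agent.

```python
import re

# Hypothetical agent settings for a Flask app. The patterns are
# illustrative; adjust them to the transaction names shown in Kibana.
ELASTIC_APM = {
    "SERVICE_NAME": "FLASK-APP-PROJECT",
    # Assumption: each entry is a regular expression searched against
    # the transaction name, e.g. "GET /ping".
    "TRANSACTIONS_IGNORE_PATTERNS": [r"^GET /ping", r"^GET /healthcheck"],
}


def would_be_ignored(transaction_name, patterns):
    """Approximate the ignore check: drop if any pattern matches."""
    return any(re.search(p, transaction_name) for p in patterns)


patterns = ELASTIC_APM["TRANSACTIONS_IGNORE_PATTERNS"]
print(would_be_ignored("GET /ping", patterns))
print(would_be_ignored("POST /orders", patterns))
```

In the real app the dictionary would be assigned to app.config['ELASTIC_APM'] before creating ElasticAPM(app).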

EZprogramming · January 16, 2020, 3:17am

Hi Colton, thanks for your suggestion/clarification.

Could you please elaborate on what you mean by adding the transactions in question? Do you mean the ignore filter?

For transactions_ignore_patterns, I believe I have to add it to the Flask APM configuration in my Python code, right? I don't think it belongs in apm-server.yml.

My question is: if I lower transaction.sample.rate from 1.0 to 0.5 for one of my services, do I lose half of my future data (not the data already in Elasticsearch), or is it queued and sent to me later at a slower rate?

basepi · January 16, 2020, 6:31am

Yes, here is the flask-specific documentation for ignoring the routes: Flask support | APM Python Agent Reference [6.x] | Elastic

You will lose half of future data. Note that even unsampled transactions are recorded, but they only have the name of the transaction, the overall transaction time, and the result. They won't have any spans or additional data.
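To make the difference between sampled and unsampled transactions concrete, here is a rough model of the behavior described above. It is a sketch only, not the agent's actual code: the function `record_transaction` and its field names are hypothetical, and the random draw stands in for whatever sampling decision the agent makes internally.

```python
import random


def record_transaction(name, duration_ms, result, spans, sample_rate, rng=random):
    """Illustrative model of sampling; not the agent's real implementation."""
    # Assumption: a transaction is sampled with probability sample_rate.
    sampled = rng.random() < sample_rate
    # Every transaction keeps its name, overall duration and result.
    event = {
        "name": name,
        "duration_ms": duration_ms,
        "result": result,
        "sampled": sampled,
    }
    # Only sampled transactions carry spans and additional detail.
    if sampled:
        event["spans"] = list(spans)
    return event


spans = ["SELECT FROM orders", "GET api.example.com"]
full = record_transaction("GET /orders", 120, "HTTP 2xx", spans, sample_rate=1.0)
bare = record_transaction("GET /orders", 120, "HTTP 2xx", spans, sample_rate=0.0)
print(full)
print(bare)
```

With a rate of 0.5, roughly half of future transactions would look like `full` and the other half like `bare`; nothing is queued for later delivery.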

EZprogramming · January 16, 2020, 11:45pm

It worked!

Makes sense. Is there any way to drop transaction samples that are set to false? When I set transaction.sample.count to 0, transaction.sample got set to false instead and I don't want to log these transactions anymore. I followed these steps (link) and added the lines in my apm-server.yml in Elastic Cloud Deployment configuration, but once I save, my cluster doesn't accept those changes.

I also figured it would be a good idea to disable the APM metrics for CPU and memory, since we don't monitor them, but it didn't work. Here is the documentation I used:

Here is my APM configuration in Python code:

from flask import Flask
from elasticapm.contrib.flask import ElasticAPM

app = Flask(__name__)
app.config['ELASTIC_APM'] = {
  'SERVICE_NAME': 'FLASK-APP-PROJECT',
  'SECRET_TOKEN': '***',
  'SERVER_URL': '***',
  'DISABLE_METRICS': ["*.cpu.*", "system.memory.*", "system.process.memory.*"]
}

apm = ElasticAPM(app)

Do you know what might be the issue?

basepi · January 17, 2020, 4:08pm

Hmm, you might have to show your apm-server.yml, perhaps you have a typo?

For DISABLE_METRICS, the config needs to be a comma-separated string, not a list:

app.config['ELASTIC_APM'] = {
  'SERVICE_NAME': 'FLASK-APP-PROJECT',
  'SECRET_TOKEN': '***',
  'SERVER_URL': '***',
  'DISABLE_METRICS': '*.cpu.*,system.memory.*,system.process.memory.*'
}

I think that should fix it.
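As a rough illustration of the intent behind the comma-separated form, the sketch below splits such a string into wildcard patterns and checks a few metric names against them. The helpers `parse_patterns` and `is_disabled` are hypothetical, and `fnmatchcase` is only a stand-in for the agent's own wildcard matching, which may differ in detail.

```python
from fnmatch import fnmatchcase


def parse_patterns(value):
    """Split a comma-separated setting into a list of wildcard patterns."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_disabled(metric_name, patterns):
    """Assumption: a metric is dropped if any wildcard pattern matches it."""
    return any(fnmatchcase(metric_name, p) for p in patterns)


patterns = parse_patterns("*.cpu.*,system.memory.*,system.process.memory.*")
for metric in (
    "system.cpu.total.norm.pct",
    "system.process.memory.size",
    "transaction.duration.sum.us",
):
    print(metric, is_disabled(metric, patterns))
```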

EZprogramming · January 20, 2020, 9:19pm

I think I'll post a new question for this.

basepi:

For the DISABLE_METRICS, the config needs to be a comma separated string, not a list:
app.config['ELASTIC_APM'] = {
  'SERVICE_NAME': 'FLASK-APP-PROJECT',
  'SECRET_TOKEN': '***',
  'SERVER_URL': '***',
  'DISABLE_METRICS': '*.cpu.*,system.memory.*,system.process.memory.*'
}

Yes, I tried this. Even with a comma-separated string, it still sends CPU metrics. I also tried setting metrics_interval to 0 to stop the agent from sending metrics to Elasticsearch entirely, and it still sent data.

I also saw some GitHub issues opened about disabling APM metrics, as well as this post: APM Disabling Metrics Issue. I'm not sure to what extent this issue is resolved, but for Python Flask web apps it doesn't work. Could you please test it manually as well, just to confirm the issue?

basepi · January 21, 2020, 4:24pm

Thanks for your patience, it does sound like something we need to check.

Can you please open an issue so that we can better track this?

EZprogramming · January 21, 2020, 7:36pm

No worries, glad I'm helping you guys. Eventually I need to solve this issue as well.

I opened a GitHub issue for this and tagged you: https://github.com/elastic/apm-agent-python/issues/696#issue-553080916

Let me know if you think something is missing.

EZprogramming · January 27, 2020, 4:23pm

Hey @basepi , any feedback/report so far? Let me know if you need additional details for your investigation.

beniwohli · January 27, 2020, 5:38pm

Hi @EZprogramming

Sorry for the delay here, we haven't been able to reproduce this properly yet. As a workaround until we have a real solution, you could remove the CPU metrics from the METRICS_SETS setting. This setting is undocumented, as it can be a bit finicky. You can set it to

app.config['ELASTIC_APM']['METRICS_SETS'] = ["elasticapm.metrics.sets.transactions.TransactionsMetricSet",]

The TransactionsMetricSet is required to get the overall metrics on how much time your app spends in database queries, HTTP requests, etc., so I recommend leaving it in.

Cheers
Beni
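Put together with the rest of the agent settings from this thread, the workaround might look like the sketch below. The helper `build_apm_config` is hypothetical and exists only to keep the example self-contained; the credentials are placeholders, and METRICS_SETS is the undocumented setting mentioned above, so its behavior may change between agent versions.

```python
def build_apm_config(service_name, secret_token, server_url):
    """Build agent settings that keep only the transactions metric set."""
    return {
        "SERVICE_NAME": service_name,
        "SECRET_TOKEN": secret_token,
        "SERVER_URL": server_url,
        # Leaving out the CPU metric set means no CPU/memory metrics are
        # collected; the transactions set is kept for time-spent breakdowns.
        "METRICS_SETS": [
            "elasticapm.metrics.sets.transactions.TransactionsMetricSet",
        ],
    }


config = build_apm_config("FLASK-APP-PROJECT", "***", "***")
print(config["METRICS_SETS"])

# In the Flask app this would be wired up roughly as:
#   app.config["ELASTIC_APM"] = config
#   apm = ElasticAPM(app)
```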

EZprogramming · February 5, 2020, 9:51pm

Yes that was the solution @beniwohli!

I really appreciate your answer, I tested it locally and it worked.

app.config['ELASTIC_APM'] = {
  'SERVICE_NAME': 'FLASK-APP-PROJECT',
  'SECRET_TOKEN': '...',
  'SERVER_URL': '...',
  'METRICS_SETS': "elasticapm.metrics.sets.transactions.TransactionsMetricSet"
}

system · February 26, 2020, 5:51pm

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.
