I hope this message finds you well. I’m currently exploring the implementation of custom metrics using the Elastic APM Python agent, as outlined in the documentation here.
While the existing documentation provides a good overview, I would appreciate a more detailed guide or additional examples on how to effectively implement and use custom metrics in Python applications. Specifically, insights into which metric type is best for different use cases—such as when to use gauge metrics versus counters, histograms, etc.—and whether specific metric types need to be plotted in certain ways would be extremely helpful. Additionally, more detailed documentation on the methods available for each metric type would greatly enhance my understanding and implementation.
Thank you for your assistance, and I look forward to your guidance!
Thank you so much for reaching out, @Divyanshu_Sharma. This sounds like it could be really great content for us to create in the future. I found this video a helpful resource in the past. I'd like to hear a bit more from you about what problems you faced and if there were any errors along the way.
I reviewed the video, and while it is quite useful, it unfortunately doesn't cover the topic of implementing custom metrics.
What I want to achieve with the Elastic Python APM Agent is to send a few custom metrics beyond the ones Elastic collects on its own (such as CPU and memory) and to visualize them on a dashboard. Specifically, I’m interested in custom metrics like:
Processing time for a request, which would be a float value (e.g., 60.0s).
HTTP status code for a request, represented as an integer (e.g., 2XX, 5XX).
A value of "1" emitted for each request, which can be summed over a time range to show the total number of requests processed in that period.
There are additional scenario-specific metrics I’d like to track as well.
The Elastic documentation mentions using the Prometheus client or MetricSet for this implementation. I tried using MetricSet, but I'm having difficulty understanding how to use its various data structures, such as gauge, counter, timer, and histogram. Could you provide a code example for one of the custom metrics I listed above?
Additionally, an API reference for the Elastic metrics API and perhaps an article demonstrating custom metric implementation with Elastic APM would also be very helpful.
Thanks for your follow-up and helpful feedback, @Divyanshu_Sharma. I might have confused it with another video I watched. Sorry about that. If I find the video I'm thinking of, I'll add it here. Have you seen our API reference?
To capture processing times, could you manually start and end transactions to capture custom processing times?
For HTTP status code, would something like this work:
elasticapm.label(http_status_code=200)
For the number of requests in a period, could you use this method to create a counter?
@jessgarson Yes I have been through the API reference. But I want to use Custom Metrics as defined here.
I don't want transactions and spans; I want metrics that I can plot on a graph. Currently I achieve this by sending extra parameters in a log call, indexing them, and then plotting them: logger.info("Example message!", extra={"processing_time": 30.0})
Please find the code below as an example of what I am trying to do. I want to use histogram from MetricSet to send the value 10 one hundred times whenever that API endpoint is hit.
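For readers unfamiliar with the logging workaround described above, here is a minimal sketch of how the `extra` argument attaches a numeric field to a log record using only the standard library. The `ListHandler` class and the `metrics_demo` logger name are hypothetical, chosen for illustration; how the field is then shipped to and indexed in Elasticsearch is outside this sketch.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler that keeps every record it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("metrics_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

# Keys passed through `extra` become attributes on the LogRecord.
logger.info("Example message!", extra={"processing_time": 30.0})

record = handler.records[0]
processing_time = record.processing_time
print(record.getMessage(), processing_time)
```

A log shipper or formatter that serializes record attributes can then expose `processing_time` as a numeric field to plot.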
from fastapi import FastAPI
from elasticapm.contrib.starlette import make_apm_client, ElasticAPM
from elasticapm.metrics.base_metrics import MetricSet

apm = make_apm_client({
    'SERVICE_NAME': 'pw-ds-test',
    'SERVER_URL': 'http://localhost:8200',
})
app = FastAPI()
app.add_middleware(ElasticAPM, client=apm)

metricset = apm.metrics.register(MetricSet)

@app.get("/health")
async def health_check():
    for i in range(0, 100):
        metricset.histogram("test_histogram").update(10)
    return {"status": "ok"}
But as you'll see from the screenshot attached below, it shows a value of 8.5 with a count of 100, as opposed to a value of 10 with a count of 100.
Thanks for your patience, @Divyanshu_Sharma. I've played with custom metrics but am still new to APM. After chatting with a coworker about this issue, I have a follow-up question. Are you using Linux? If not, you need to install psutil for metric set.
Thank you for your efforts and insights on this, @jessgarson. I wanted to clarify that I’m using macOS. But as per the documentation, it seems that psutil is only required for the CPU/Memory MetricSet when not using Linux, not for a custom MetricSet. Also, I am able to send data points using the data structures of MetricSet as defined in base_metrics.py, but as the code snippet and screenshot above show, the Elastic UI is not displaying the correct value: 8.5 instead of the 10 that I sent. What I want to understand is how best to use these data structures, ideally through some code examples and other documentation, and why there is that discrepancy in values.
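One possible explanation for a gap like this, offered as an assumption and not confirmed anywhere in this thread: a bucketed histogram does not keep the raw values it receives. It only counts how many values fell into each bucket, and a reporter can then emit one representative value per bucket, such as its midpoint. The sketch below is a plain-Python stand-in, not the agent's code. `BucketedHistogram` and the bucket bounds are hypothetical, so it illustrates the effect without reproducing the exact 8.5 from the screenshot.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds; the agent's real defaults may differ.
BOUNDS = [1.0, 2.5, 5.0, 7.5, 10.0, float("inf")]


class BucketedHistogram:
    """Stand-in histogram that stores per-bucket counts, not raw values."""

    def __init__(self, bounds):
        self.bounds = list(bounds)
        self.counts = [0] * len(self.bounds)

    def update(self, value, count=1):
        # Pick the first bucket whose upper bound is >= value.
        self.counts[bisect_left(self.bounds, value)] += count

    def report(self):
        """Return (representative value, count) for each non-empty bucket."""
        out = []
        for i, upper in enumerate(self.bounds):
            lower = self.bounds[i - 1] if i else 0.0
            # Open-ended last bucket: fall back to its lower bound.
            rep = lower if upper == float("inf") else lower + (upper - lower) / 2
            if self.counts[i]:
                out.append((rep, self.counts[i]))
        return out


hist = BucketedHistogram(BOUNDS)
for _ in range(100):
    hist.update(10)

print(hist.report())  # [(8.75, 100)]
```

With these assumed bounds, one hundred updates of 10 come back as a single bucket with a midpoint below 10 and a count of 100, which has the same shape as the reported result. If exact values matter, a type that stores the value itself avoids the bucketing step.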
Thanks for all your follow up, @Divyanshu_Sharma. I shared this post in an internal channel, and I'm doing some further testing here. I'll be back in touch shortly.
Thanks again for all your patience, @Divyanshu_Sharma. I chatted with another coworker about this issue, and they suggested using counter or gauge instead of histogram for metrics.
@Divyanshu_Sharma To provide more context here, you will want to use counter if the value only goes up (to calculate as a rate) and gauge if it goes up and down (to track the current value).
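To make the counter versus gauge distinction concrete, here is a small plain-Python sketch. The `Counter` and `Gauge` classes and the `rate_per_second` helper are hypothetical stand-ins written for illustration; they are not the agent's MetricSet API, and the 30-second collection interval is an assumed value.

```python
class Counter:
    """Stand-in counter: the value only ever increases."""

    def __init__(self):
        self.value = 0

    def inc(self, delta=1):
        if delta < 0:
            raise ValueError("a counter cannot decrease")
        self.value += delta


class Gauge:
    """Stand-in gauge: holds the latest observed value, up or down."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


def rate_per_second(previous, current, interval_seconds):
    """Turn two counter samples into a rate over the sampling interval."""
    return (current - previous) / interval_seconds


requests_total = Counter()   # "emit 1 per request" use case
in_flight = Gauge()          # a value that rises and falls

for _ in range(30):          # 30 requests before the first sample
    requests_total.inc()
in_flight.set(4)
first_sample = requests_total.value

for _ in range(60):          # 60 more requests before the second sample
    requests_total.inc()
in_flight.set(2)
second_sample = requests_total.value

rate = rate_per_second(first_sample, second_sample, interval_seconds=30)
print(rate, in_flight.value)  # 2.0 2
```

The counter is only meaningful as a difference between samples (here 60 requests over 30 seconds), while the gauge is read directly as the current value.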