Kibana version: 7.16.2
Elasticsearch version: 7.16.2
APM Server version: 7.16.1
APM Agent language and version: .NET Standard 2.0
Browser version: NA
Original install method (e.g. download page, yum, deb, from source, etc.) and version: Fleet integration for APM server (agent installed through deb)
Steps to reproduce:
I hit quite a few "Queue is full" errors when deploying an app with APM instrumentation.
I found this article suggesting that I increase the queue.mem.events value in apm-server.yml.
However, I'm unable to find any docs on how I could set this value through the APM integration in Fleet. Any guidance would be much appreciated; this is fairly urgent for us.
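For reference, this is roughly what the article shows for a standalone (non-Fleet) apm-server.yml; the values here are illustrative, not recommendations:

```yaml
# Standalone apm-server.yml only — these keys are not exposed by the
# Fleet-managed APM integration. Values below are illustrative.
queue.mem.events: 8192           # default is 4096 events
queue.mem.flush.min_events: 2048
queue.mem.flush.timeout: 1s
```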
@Nicolas_Ruflin @Nima_Rezainia @mostlyjason
(Sorry for tagging if it's not appropriate; I've learnt a lot from your videos and thought you might be able to point me in the right direction.)
Unfortunately, it's not possible to tune these settings when using the APM integration anymore. Starting with version 8.0.0, APM Server uses a new Elasticsearch output by default that requires less manual tuning and offers better performance out of the box.
That same output is available from 7.16.0 behind an experimental flag and can be turned on via the Kibana Fleet UI.
To turn it on, navigate to your Kibana instance > Fleet > Fleet Settings (top right-hand corner of the screen) and add experimental: true in the Elasticsearch output text box (leave all the other settings untouched).
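If it helps, the Elasticsearch output box would then contain just this single line (everything else stays at its defaults):

```yaml
# Entered in Fleet > Fleet Settings > Elasticsearch output configuration box
experimental: true
```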
Thank you for the reply!
I've enabled this for now and will monitor how it improves things. Are you sure this will alleviate the "queue is full" issue?
It was pretty prominent even in our test environment, which is why I was hesitant to move this to prod without raising the queue.mem.events limit well above the default of 4096 events.
Also, do you have any other recommendations for tuning the performance of APM Server or ES? We'll be receiving a lot of events, and I don't want APM or ES to become a bottleneck if they can't keep up. Are any of the things mentioned in the article required?
Yes, the change will eliminate the "queue is full" error messages and should provide a higher throughput than previously observed. Please do monitor the available metrics to ensure that the change is behaving as expected.
Thanks a lot, Marc!
Yes, I've gone through all these docs already. I applied the ES data ingestion recommendations, and transaction sampling is already set to 0.1 for all services. We're also using the RUM agent's active property to randomly sample only 10% of page loads.
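For anyone else reading: the per-page-load sampling mentioned above can be done by feeding a random decision into the RUM agent's `active` boolean config option. `shouldSample` is a helper introduced here for illustration, and the `elasticApm.init` usage below is a sketch, not copied from our setup:

```javascript
// Decide per page load whether the RUM agent should be active, so only
// roughly `rate` of visits produce RUM events.
function shouldSample(rate) {
  // Math.random() is uniform in [0, 1), so this is true with probability `rate`.
  return Math.random() < rate;
}

// Usage sketch with the RUM agent (service name and URL are placeholders):
// elasticApm.init({
//   serviceName: 'my-frontend',
//   serverUrl: 'https://apm.example.com',
//   active: shouldSample(0.1),   // agent disabled for ~90% of page loads
// });
```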
Will keep monitoring for a few days and reply here if there are any anomalies. Thanks again! The Elastic community is really the best out there.