How do I modify queue.mem.events property when the APM server is managed through Fleet integration?

Kibana version: 7.16.2
Elasticsearch version: 7.16.2
APM Server version: 7.16.1
APM Agent language and version: .NET Standard 2.0
Browser version: N/A
Original install method (e.g. download page, yum, deb, from source, etc.) and version: Fleet integration for APM Server (agent installed through deb)

Steps to reproduce:
I ran into quite a few "Queue is full" errors when deploying an app instrumented with APM.

Found this article suggesting that I should increase queue.mem.events value in apm-server.yml
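For reference, in a standalone apm-server.yml (one not managed by Fleet) that setting would look roughly like the fragment below. The value 8192 is purely an illustrative assumption, not a recommendation taken from the article.

```yaml
# apm-server.yml (standalone APM Server only)
# 8192 is an illustrative value; the 7.x default is 4096 events
queue.mem.events: 8192
```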

I'm unable to find any docs on how to set this value through the APM integration in Fleet. Please guide me ASAP, I need to change this as a priority.

@Nicolas_Ruflin @Nima_Rezainia @mostlyjason
(Sorry for tagging if not appropriate. I have learnt a lot from your videos and thought you could help guide me in the right direction.)

Welcome to our community!

Unfortunately, it's not possible to tune these settings when using the APM Integration anymore. Starting with version 8.0.0, the APM Server will use a new Elasticsearch output by default that requires less manual tuning and offers better performance out of the box.
That same output is available from 7.16.0 behind an experimental flag and can be turned on via the Kibana Fleet UI.

To turn it on, navigate to your Kibana instance > Fleet > Fleet Settings (top right-hand corner of the screen), and add experimental: true to the Elasticsearch output text box (ignore all the other settings).
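As a rough sketch, the addition to that text box is a single YAML line. The comment is only an annotation added here, and any other keys already present in the box are assumed to stay as they are.

```yaml
# Kibana > Fleet > Fleet Settings > Elasticsearch output text box
experimental: true
```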

Doing so will restart the APM Server using the new output. The APM Server will log a message containing "using experimental model indexer".

Let me know if that helps.

Thank you for the reply!
I've enabled this for now and will monitor how it improves things. Are you sure this will alleviate the "queue is full" issue?
It was pretty prominent even in the test environment, which is why I was hesitant about moving to prod without increasing the queue.mem.events limit to much more than the default of 4096 events.

Also, any other recommendations for tuning the performance of APM server or ES? We will be receiving a lot of events, I don't want APM or ES to be a bottleneck if they can't keep up. Are any of the things mentioned in the article required?

Yes, the change will eliminate the "queue is full" error messages and should provide higher throughput than previously observed. Please do monitor the available metrics to ensure that the change is behaving as expected.

I would also recommend reading our docs: "Processing and performance" (APM User Guide 7.16), which should provide an estimate for APM Server sizing, and "Tune Elasticsearch for data ingestion" (APM User Guide 7.16) for Elasticsearch.

Last, it may be worth configuring "Transaction sampling" (APM User Guide 7.16) to reduce the amount of data while keeping a percentage of relevant samples. The sampling rate can be configured centrally in Kibana using "APM Agent central configuration" (Kibana Guide 7.16), eliminating the need to reconfigure and restart your traced applications.

I hope this helps.

Thanks a lot, Marc!
Yes, I've gone through all these docs already. Applied the ES data ingestion recommendations and transaction sampling is already set to 0.1 for all services. Also using RUM's active property to randomly sample only 10% of page loads.
I will keep monitoring for a few days and reply here if there are any anomalies. Thanks again! The Elastic community is really the best out there.

Hi Marc, it's been 2 weeks and the service has been running flawlessly. Thanks again for your suggestion!
