APM service keeps failing on ELK server

Hi Team,

Despite resolving the queue full issue on the ELK server, the APM service continues to fail intermittently. Our ELK server, hosted on a DigitalOcean droplet, is responsible for monitoring logs from the AWS environment.

Initially, the problem surfaced as a queue full error caused by the file system reaching capacity, which made the Kibana console unavailable. To mitigate this, I manually cleared some indices on the ELK server and further managed them via the console.
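For reference, here is a minimal sketch of how disk usage on the droplet could be compared against Elasticsearch's default disk watermarks (85% low, 90% high, 95% flood stage). The function names are hypothetical, the thresholds assume the defaults have not been overridden in the cluster settings, and used/total is only an approximation of how Elasticsearch itself measures free space.

```python
import shutil

# Elasticsearch default disk watermarks (assumption: not overridden on this cluster).
LOW, HIGH, FLOOD = 0.85, 0.90, 0.95


def watermark_status(used_fraction: float) -> str:
    """Classify a disk usage fraction (0.0 to 1.0) against the default watermarks."""
    if used_fraction >= FLOOD:
        return "flood_stage"  # indices on the node get a read-only block
    if used_fraction >= HIGH:
        return "high"         # shards are relocated away from the node
    if used_fraction >= LOW:
        return "low"          # no new shards are allocated to the node
    return "ok"


def disk_used_fraction(path: str = ".") -> float:
    """Return the used fraction of the file system that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total
```

Calling `watermark_status(disk_used_fraction("/var/lib/elasticsearch"))` would show whether the node is still above a watermark after the cleanup; the data path here is a placeholder and depends on the installation.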

Subsequently, I restarted both the ELK and APM services to ensure proper functionality. However, the APM service still fails after running for a short time. Additionally, I noticed that the APM logs had already stopped updating before this incident.

Upon inspecting the Elasticsearch logs, I observed the following:

[2024-03-06T22:08:27,023][INFO ][o.e.x.s.a.AuthenticationService] [node-1] Authentication of [apm_system] was terminated by realm [reserved] - failed to authenticate user [apm_system]
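The log line above points at the built-in `apm_system` user in the `reserved` realm. As a rough sketch of how that could be narrowed down, the snippet below extracts the user and realm from such a line and builds a request to Elasticsearch's `_security/_authenticate` endpoint using HTTP basic auth. The helper names, base URL, and password are hypothetical placeholders, not values from this setup.

```python
import base64
import re
import urllib.request

# Matches the AuthenticationService message format shown in the log line above.
LOG_PATTERN = re.compile(
    r"Authentication of \[(?P<user>[^\]]+)\] was terminated by realm \[(?P<realm>[^\]]+)\]"
)


def parse_auth_failure(line: str):
    """Return (user, realm) from an authentication failure log line, or None."""
    match = LOG_PATTERN.search(line)
    return (match.group("user"), match.group("realm")) if match else None


def build_authenticate_request(base_url: str, user: str, password: str):
    """Build (but do not send) a basic-auth GET request to _security/_authenticate."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/_security/_authenticate",
        headers={"Authorization": f"Basic {token}"},
    )
```

Sending the request with `urllib.request.urlopen(...)` should return the user's details if the password configured for `apm_system` is still valid, and an HTTP 401 error if it is not, which would match the failure in the log.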

Your prompt attention to this matter is greatly appreciated. Please let me know if you require any further details.

Kindly suggest a way to debug and fix this.
