Agents periodically disconnecting from Fleet

Version: 8.9.0

Hi there! I work on a SOC team where we manage Elastic Agents for a large number of local endpoints on our network. I'm pretty new to the stack, so I'm hoping someone with more experience with agent issues can help out. An issue we've been working around for a while is that periodically (maybe a day or so after successfully enrolling and sending logs to our stack), agents go offline and never come back unless they are reinstalled. After a reinstall they work fine again until the next time they go offline. I logged into two endpoints (one enrolled and working, one enrolled but not working), ran .\elastic-agent.exe status from the C:\Program Files\Elastic\Agent directory, and got the following outputs:

The healthy/working agent outputs:
┌─ fleet
│  └─ status: (HEALTHY) Connected
└─ elastic-agent
   └─ status: (HEALTHY) Running

The enrolled but broken/offline agent outputs:
Error: failed to communicate with Elastic Agent daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing open \\\\.\\pipe\\elastic-agent-system: The system cannot find the file specified." For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.9/fleet-troubleshooting.html
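
To check many endpoints without reading each output by hand, the two results above can be told apart with a small script. This is only a rough sketch, not an official Elastic tool: classify_status and run_status are hypothetical helper names, the labels they return are made up for illustration, and the install path is simply the one from this post. The "daemon_unreachable" case matches the error text shown above, which likely means the agent's background service was not running on that host when the command was executed.

```python
import re
import subprocess

# Install path taken from the post; adjust if the agent lives elsewhere.
AGENT_EXE = r"C:\Program Files\Elastic\Agent\elastic-agent.exe"

# Matches lines such as "status: (HEALTHY) Connected" and captures the state.
STATUS_RE = re.compile(r"status:\s*\((\w+)\)")


def classify_status(output: str) -> str:
    """Hypothetical helper: label the text printed by `elastic-agent status`."""
    # The CLI could not reach the local agent daemon at all.
    if "failed to communicate with Elastic Agent daemon" in output:
        return "daemon_unreachable"
    states = STATUS_RE.findall(output)
    # Healthy only if at least one status line exists and all say HEALTHY.
    if states and all(state == "HEALTHY" for state in states):
        return "healthy"
    # Anything else (other states, empty output) needs a human to look at it.
    return "needs_review"


def run_status(exe: str = AGENT_EXE) -> str:
    """Run the same command as in the post and return stdout plus stderr."""
    result = subprocess.run([exe, "status"], capture_output=True, text=True)
    return result.stdout + result.stderr
```

On an endpoint, print(classify_status(run_status())) would print one of the three labels. Keeping the text check separate from the command call means the classifier can also be run against saved outputs collected from several machines.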

Hi @squatchulator,
I'm facing a similar problem. My Fleet Server status is showing as unhealthy, and I can't work out what is causing it; it's quite difficult to find the actual issue. I've attached my error log here.
