Hello everyone,
For a long time, I was running a Fleet server on a virtual machine.
Recently, I switched to running the Fleet server in a container using the Elastic image: elastic/elastic-agent.
Since the switch, agents that previously communicated properly intermittently lose their connection to the Fleet server and go into Offline status. They occasionally reconnect, but then the issue recurs.
The agent log shows this error:
Cannot checking in with fleet-server, retrying
The only error that appears on the Fleet server is the following, and I'm not sure whether it's related:
error retrieving resource lock proj/elastic-agent-cluster-leader: leases.coordination.k8s.io "elastic-agent-cluster-leader" is forbidden User: "system:serviceaccount:proj:fleetserver-sa" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "proj"
Both setups were running in an insecure configuration, using the same ports and firewall settings.
What could be the issue?