Hi All,
I was recently troubleshooting a degraded Elastic Agent install, and I noticed that restarting the elastic-agent doesn't actually restart the Elastic Endpoint service, which leaves the agent in a degraded state. I was wondering if anyone else has come across this issue and found a solution for it.
Elastic Agent version: 7.17.0
Install Method: .tar
OS:
:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Elastic Agent Initial Status: sudo /opt/Elastic/Agent/elastic-agent status
Status: DEGRADED
Message: (no message)
Applications:
* osquerybeat (HEALTHY)
Running
* endpoint-security (DEGRADED)
Protecting with policy {3d5dda65-29b2-4abc-bbb3-11a39e5e443d}
* filebeat (HEALTHY)
Running
* metricbeat (HEALTHY)
Running
* filebeat_monitoring (HEALTHY)
Running
* metricbeat_monitoring (HEALTHY)
Running
Restart Elastic Agent: sudo /opt/Elastic/Agent/elastic-agent restart
Elastic Agent status (post restart): sudo /opt/Elastic/Agent/elastic-agent status
Status: DEGRADED
Message: (no message)
Applications:
* osquerybeat (HEALTHY)
Running
* endpoint-security (DEGRADED)
Protecting with policy {3d5dda65-29b2-4abc-bbb3-11a39e5e443d}
* filebeat (HEALTHY)
Running
* filebeat_monitoring (HEALTHY)
Running
* metricbeat_monitoring (HEALTHY)
Running
* metricbeat (HEALTHY)
Running
Elastic Agent systemctl status: sudo systemctl status elastic-agent
● elastic-agent.service - Elastic Agent is a unified agent to observe, monitor and protect your system.
Loaded: loaded (/etc/systemd/system/elastic-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2022-04-11 10:42:26 EDT; 14min ago
Main PID: 1573592 (elastic-agent)
Tasks: 87 (limit: 9443)
Memory: 310.5M
CGroup: /system.slice/elastic-agent.service
├─1573592 /opt/Elastic/Agent/elastic-agent
Elastic Endpoint systemctl status: sudo systemctl status ElasticEndpoint
● ElasticEndpoint.service - ElasticEndpoint
Loaded: loaded (/etc/systemd/system/ElasticEndpoint.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-02-13 10:44:45 EST; 1 months 26 days ago
Main PID: 3123021 (elastic-endpoin)
Tasks: 41 (limit: 9443)
Memory: 427.5M
CGroup: /system.slice/ElasticEndpoint.service
└─3123021 /opt/Elastic/Endpoint/elastic-endpoint run
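Note that ElasticEndpoint has been running since 2022-02-13, so it was not restarted along with the agent. Because the ElasticEndpoint service never gets restarted, the Elastic Agent never returns to a healthy state.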
Manually restarting the ElasticEndpoint service appears to fix the issue: sudo systemctl restart ElasticEndpoint
Elastic Agent status (after restarting ElasticEndpoint): sudo /opt/Elastic/Agent/elastic-agent status
Status: HEALTHY
Message: (no message)
Applications:
* metricbeat (HEALTHY)
Running
* osquerybeat (HEALTHY)
Running
* endpoint-security (HEALTHY)
Protecting with policy {3d5dda65-29b2-4abc-bbb3-11a39e5e443d}
* filebeat (HEALTHY)
Running
* filebeat_monitoring (HEALTHY)
Running
* metricbeat_monitoring (HEALTHY)
Running
Given that the ElasticEndpoint service is installed as part of the Elastic Agent, I would've expected restarting the Elastic Agent to also restart the ElasticEndpoint service.
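In the meantime, the manual workaround could be scripted as a stopgap. This is only a rough sketch, not an official fix: it assumes the status output keeps the text format pasted above, and the function names and the `--run` argument are my own, made up for illustration. The binary path and unit name are the ones from this install.

```shell
#!/bin/sh
# Stopgap sketch: restart ElasticEndpoint when the agent reports
# endpoint-security as DEGRADED. Assumes the plain-text `elastic-agent status`
# format shown in this post (7.17.0); other versions may print it differently.

AGENT_BIN="${AGENT_BIN:-/opt/Elastic/Agent/elastic-agent}"
ENDPOINT_UNIT="${ENDPOINT_UNIT:-ElasticEndpoint}"

# Hypothetical helper: reads status text on stdin and succeeds (exit 0)
# only if a line lists endpoint-security as DEGRADED.
endpoint_is_degraded() {
  grep -Eq '^[[:space:]]*\*[[:space:]]*endpoint-security[[:space:]]*\(DEGRADED\)'
}

# Hypothetical helper: query the agent and restart the endpoint unit if needed.
restart_endpoint_if_degraded() {
  if "$AGENT_BIN" status 2>&1 | endpoint_is_degraded; then
    echo "endpoint-security is DEGRADED, restarting $ENDPOINT_UNIT"
    systemctl restart "$ENDPOINT_UNIT"
  else
    echo "endpoint-security is not DEGRADED, nothing to do"
  fi
}

# Only act when explicitly asked to, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
  restart_endpoint_if_degraded
fi
```

Saved as, say, restart-endpoint.sh, it would be run as root with `sudo sh restart-endpoint.sh --run`. It only treats the symptom; it doesn't explain why the endpoint went degraded in the first place.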