Fleet - agent unhealthy

Hello,

I am running an Elastic cluster on version 8.15.3. I am trying to deploy a Fleet Server on one of the nodes, and I go through the Fleet Server setup as advised by Kibana. The CLI installation of the Fleet Server is done via these commands:

curl -L -O https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.15.3-linux-x86_64.tar.gz
tar xzvf elastic-agent-8.15.3-linux-x86_64.tar.gz
cd elastic-agent-8.15.3-linux-x86_64
sudo ./elastic-agent install \
  --fleet-server-es=https://10.x.y.z:9200 \
  --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE3MzEwNjk2Nzk1NjQ6TzJSNXRQR05RVE9hWk9ld00wc2Zldw \
  --fleet-server-policy=fleet-server-policy \
  --fleet-server-es-ca-trusted-fingerprint=2DFA26BD931CA5E4FA29C63BDA41F0E6EC45CED1CF05AF6DDCA9D9FE01AA3E53 \
  --fleet-server-port=8220 --base-path /elk/fleet_server

The Fleet Server installs itself, and I can see the Fleet Server with the "fleet-server-policy" among the agents. After installation (and after any subsequent elastic-agent service restart), the agent representing the Fleet Server stays healthy for a few minutes before going to the "unhealthy" state again. I can't find what is unhealthy or why. Any guidance will be appreciated; I have tried everything I could think of with no tangible result.
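One thing that might help narrow it down is probing the Fleet Server process directly on its listen port (8220 here). This is only a sketch, not a definitive check: it assumes the /api/status endpoint behaves as in recent 8.x releases, and it skips certificate verification because the setup uses a self-signed cert.

```python
import json
import ssl
import urllib.request

def fleet_status(base_url: str) -> str:
    """Fetch <base_url>/api/status and return the reported status string."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False      # self-signed certificate in this setup
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(base_url + "/api/status",
                                context=ctx, timeout=5) as resp:
        return json.loads(resp.read())["status"]

if __name__ == "__main__":
    try:
        # Placeholder host; run this on the Fleet Server node itself.
        print(fleet_status("https://127.0.0.1:8220"))
    except OSError as exc:
        print(f"Fleet Server not reachable: {exc}")
```

If this reports HEALTHY while Kibana still shows the agent as unhealthy, the problem is more likely in the agent's output to Elasticsearch than in the Fleet Server process itself.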

Hello,

I'm not much of an expert, but I recently had similar problems, although I don't quite remember how I solved them.

One of the possible causes for the agent not being healthy is that it cannot reach the https://artifacts.elastic.co site.

Also check the log details of the unhealthy agent.

Make sure that the necessary ports are open: 8220 for the Fleet Server and 9200 for Elasticsearch.
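The port check can be scripted with just the standard library; here is a minimal sketch (127.0.0.1 is a placeholder for the actual Fleet Server and Elasticsearch hosts):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host: check from the machine where the agent runs.
for port in (8220, 9200):
    print(port, "open" if port_open("127.0.0.1", port) else "closed")
```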

As I said, I am not an expert, but this is what I can share from my experience.

In my case I use self-signed security certificates, so I must add --insecure at the end of the command for the installation and enrollment of the agent. This certificate issue is the most difficult one for me.

curl -L -O https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.15.3-linux-x86_64.tar.gz
tar xzvf elastic-agent-8.15.3-linux-x86_64.tar.gz
cd elastic-agent-8.15.3-linux-x86_64
sudo ./elastic-agent install \
  --fleet-server-es=https://10.x.y.z:9200 \
  --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE3MzEwNjk2Nzk1NjQ6TzJSNXRQR05RVE9hWk9ld00wc2Zldw \
  --fleet-server-policy=fleet-server-policy \
  --fleet-server-es-ca-trusted-fingerprint=2DFA26BD931CA5E4FA29C63BDA41F0E6EC45CED1CF05AF6DDCA9D9FE01AA3E53 \
  --fleet-server-port=8220 --base-path /elk/fleet_server --insecure

Hi, @juancamiloll,

thanks a lot for your response. Unfortunately, the log details of the unhealthy agent (checked as per your screenshot instructions) are empty.

Telnet to localhost on both 8220 (Fleet Server) and 9200 (Elasticsearch) works well. The Elasticsearch server will also host the Fleet Server (if I ever manage to configure it).

What is most puzzling to me is that the Fleet Server logs are full of these:

{"log.level":"info","@timestamp":"2024-11-14T10:19:36.555Z","message":"Running on policy with Fleet Server integration: fleet-server-policy","component":{"binary":"fleet-server","dataset":"elastic_agent.fleet_server","id":"fleet-server-default","type":"fleet-server"},"log":{"source":"fleet-server-default"},"ecs.version":"1.6.0","service.name":"fleet-server","service.type":"fleet-server","state":"HEALTHY","ecs.version":"1.6.0"}

Healthy? Really?
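For what it's worth, the "state":"HEALTHY" in that line seems to describe the fleet-server component only, while the agent's overall status (as I understand it) aggregates the states of all its components, so a healthy component does not contradict an unhealthy agent. A small sketch for pulling each component's reported state out of such JSON log lines (the abbreviated sample is mine, shaped like the line quoted above):

```python
import json

def component_states(log_lines):
    """Map component id -> last reported state from JSON agent log lines."""
    states = {}
    for raw in log_lines:
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        comp = rec.get("component", {})
        if "state" in rec and "id" in comp:
            states[comp["id"]] = rec["state"]
    return states

# Abbreviated sample in the shape of the line quoted above.
sample = ('{"message":"Running on policy with Fleet Server integration:'
          ' fleet-server-policy",'
          '"component":{"id":"fleet-server-default","type":"fleet-server"},'
          '"state":"HEALTHY"}')
print(component_states([sample]))  # {'fleet-server-default': 'HEALTHY'}
```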

The problem seems to be fixed. In Fleet->Settings->Output, there's a URL pointing to Elastic (i.e. https://<ip_addr>:9200).

I am not sure what caused the issue in the first place, but this is what worked for me:

  • I created a new output and made it the default.
  • ES has a self-signed certificate on 9200, so I added its fingerprint.
  • The type of the output was: elasticsearch.
  • The IP address of ES was the one on the NIC (from the 10.0.0.0/8 subnet).

I have changed the ES's URL to https://127.0.0.1:9200

I find that quite odd, to be honest, as I have this configuration in elasticsearch.yml:

... nevertheless, since implementing the change stated above, the agent has been green.
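On the fingerprint point: as far as I know, the value expected here is the hex-encoded SHA-256 of the CA certificate's DER bytes (what openssl x509 -fingerprint -sha256 prints, minus the colons). A sketch for computing it from a PEM string, in case someone wants to double-check theirs (the file path in the comment is a placeholder):

```python
import base64
import hashlib
import re

def pem_fingerprint(pem_text: str) -> str:
    """SHA-256 fingerprint (hex) of the first certificate in a PEM string."""
    match = re.search(
        r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
        pem_text, re.S)
    if not match:
        raise ValueError("no certificate found in PEM input")
    der = base64.b64decode(match.group(1))  # body back to DER bytes
    return hashlib.sha256(der).hexdigest().upper()

# Example (placeholder path to the ES HTTP CA certificate):
# print(pem_fingerprint(open("/etc/elasticsearch/certs/http_ca.crt").read()))
```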
