Fleet - agent unhealthy

Hello,

I am running an Elastic cluster on version 8.15.3. I am trying to deploy a Fleet Server on one of the nodes, and I go through the Fleet Server setup as advised by Kibana. The CLI installation of the Fleet Server is done via these commands:

curl -L -O https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.15.3-linux-x86_64.tar.gz
tar xzvf elastic-agent-8.15.3-linux-x86_64.tar.gz
cd elastic-agent-8.15.3-linux-x86_64
sudo ./elastic-agent install \
  --fleet-server-es=https://10.x.y.z:9200 \
  --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE3MzEwNjk2Nzk1NjQ6TzJSNXRQR05RVE9hWk9ld00wc2Zldw \
  --fleet-server-policy=fleet-server-policy \
  --fleet-server-es-ca-trusted-fingerprint=2DFA26BD931CA5E4FA29C63BDA41F0E6EC45CED1CF05AF6DDCA9D9FE01AA3E53 \
  --fleet-server-port=8220 --base-path /elk/fleet_server

The Fleet Server installs itself, and I can see the Fleet Server with the "fleet-server-policy" among the agents. After installation (and after any subsequent elastic-agent service restart), the agent representing the Fleet Server stays healthy for a few minutes before going to the "unhealthy" state again. I can't find what is unhealthy or why. Any guidance will be appreciated; I have tried everything I could think of with no tangible result.
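One thing that might help narrow it down is probing the Fleet Server process directly on its listen port (8220 here). This is only a sketch, not a definitive check: it assumes the /api/status endpoint behaves as in recent 8.x releases, and it skips certificate verification because the setup uses a self-signed cert.

```python
import json
import ssl
import urllib.request

def fleet_status(base_url: str) -> str:
    """Fetch <base_url>/api/status and return the reported status string."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False      # self-signed certificate in this setup
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(base_url + "/api/status",
                                context=ctx, timeout=5) as resp:
        return json.loads(resp.read())["status"]

if __name__ == "__main__":
    try:
        # Placeholder host; run this on the Fleet Server node itself.
        print(fleet_status("https://127.0.0.1:8220"))
    except OSError as exc:
        print(f"Fleet Server not reachable: {exc}")
```

If this reports HEALTHY while Kibana still shows the agent as unhealthy, the problem is more likely in the agent's output to Elasticsearch than in the Fleet Server process itself.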

Hello,

I'm not much of an expert, but I recently had similar problems, although I don't quite remember how I solved them.

One of the possible causes for the agent not being healthy is that it cannot reach the https://artifacts.elastic.co site.

Also check the log details of the unhealthy agent.

Make sure that the necessary ports are open: 8220 for the Fleet Server and 9200 for Elasticsearch.
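The port check can be scripted with just the standard library; here is a minimal sketch (127.0.0.1 is a placeholder for the actual Fleet Server and Elasticsearch hosts):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host: check from the machine where the agent runs.
for port in (8220, 9200):
    print(port, "open" if port_open("127.0.0.1", port) else "closed")
```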

As I said, I am not an expert, but this is what I can share from my experience.

In my case I use self-signed security certificates, so I must add --insecure at the end of the command for the installation and enrollment of the agent. This certificate issue is the most difficult one for me.

curl -L -O https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.15.3-linux-x86_64.tar.gz
tar xzvf elastic-agent-8.15.3-linux-x86_64.tar.gz
cd elastic-agent-8.15.3-linux-x86_64
sudo ./elastic-agent install \
  --fleet-server-es=https://10.x.y.z:9200 \
  --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE3MzEwNjk2Nzk1NjQ6TzJSNXRQR05RVE9hWk9ld00wc2Zldw \
  --fleet-server-policy=fleet-server-policy \
  --fleet-server-es-ca-trusted-fingerprint=2DFA26BD931CA5E4FA29C63BDA41F0E6EC45CED1CF05AF6DDCA9D9FE01AA3E53 \
  --fleet-server-port=8220 --base-path /elk/fleet_server --insecure

Hi, @juancamiloll,

thanks a lot for your response. Unfortunately, the log details of the unhealthy agent (checked as per your screenshot instructions) are empty.

Telnet to localhost on both 8220 (Fleet Server) and 9200 (Elasticsearch) works well. The Elasticsearch server will also host the Fleet Server (if I ever manage to configure it).

What is most puzzling to me is that the Fleet Server logs are full of these:

{"log.level":"info","@timestamp":"2024-11-14T10:19:36.555Z","message":"Running on policy with Fleet Server integration: fleet-server-policy","component":{"binary":"fleet-server","dataset":"elastic_agent.fleet_server","id":"fleet-server-default","type":"fleet-server"},"log":{"source":"fleet-server-default"},"ecs.version":"1.6.0","service.name":"fleet-server","service.type":"fleet-server","state":"HEALTHY","ecs.version":"1.6.0"}

Healthy? Really?
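For what it's worth, the "state":"HEALTHY" in that line seems to describe the fleet-server component only, while the agent's overall status (as I understand it) aggregates the states of all its components, so a healthy component does not contradict an unhealthy agent. A small sketch for pulling each component's reported state out of such JSON log lines (the abbreviated sample is mine, shaped like the line quoted above):

```python
import json

def component_states(log_lines):
    """Map component id -> last reported state from JSON agent log lines."""
    states = {}
    for raw in log_lines:
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        comp = rec.get("component", {})
        if "state" in rec and "id" in comp:
            states[comp["id"]] = rec["state"]
    return states

# Abbreviated sample in the shape of the line quoted above.
sample = ('{"message":"Running on policy with Fleet Server integration:'
          ' fleet-server-policy",'
          '"component":{"id":"fleet-server-default","type":"fleet-server"},'
          '"state":"HEALTHY"}')
print(component_states([sample]))  # {'fleet-server-default': 'HEALTHY'}
```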

The problem seems to be fixed. In Fleet->Settings->Output, there's a URL pointing to Elastic (i.e. https://<ip_addr>:9200).

I am not sure what caused the issue in the first place, but this is what worked for me:

  • I created a new output and made it the default.
  • ES has a self-signed certificate on 9200, so I added its fingerprint.
  • The type of the output was: elasticsearch.
  • The IP address of ES was the one on the NIC (from the 10.0.0.0/8 subnet).

I have changed the ES's URL to https://127.0.0.1:9200

I find that quite odd, to be honest, as I have this configuration in elasticsearch.yml:

... nevertheless, since implementing the change stated above, the agent has been green.
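On the fingerprint point: as far as I know, the value expected here is the hex-encoded SHA-256 of the CA certificate's DER bytes (what openssl x509 -fingerprint -sha256 prints, minus the colons). A sketch for computing it from a PEM string, in case someone wants to double-check theirs (the file path in the comment is a placeholder):

```python
import base64
import hashlib
import re

def pem_fingerprint(pem_text: str) -> str:
    """SHA-256 fingerprint (hex) of the first certificate in a PEM string."""
    match = re.search(
        r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
        pem_text, re.S)
    if not match:
        raise ValueError("no certificate found in PEM input")
    der = base64.b64decode(match.group(1))  # body back to DER bytes
    return hashlib.sha256(der).hexdigest().upper()

# Example (placeholder path to the ES HTTP CA certificate):
# print(pem_fingerprint(open("/etc/elasticsearch/certs/http_ca.crt").read()))
```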
