Fleet of agents healthy but not sending data

I have a fleet of Elastic Agents running on a fresh k8s cluster (running the latest ECK). I added a couple of integrations via Kibana, such as Kubernetes metrics, but the Kubernetes dashboards are all empty and nothing shows up in Metrics.

The agents all report healthy in Fleet Management. I can't spot any clues in the logs of either the fleet-server or the elastic-agent pods. Any help would be greatly appreciated.

===

Here are some more details of my debugging journey:

Initially, I saw that the official quickstart has a couple of typos. The one relevant to this issue: https://fleet-server-agent-http.default.svc:8220 should be https://fleet-server-quickstart-agent-http.default.svc:8220. After fixing and applying this, the elastic-agent logs still kept saying:

2021-08-25T04:54:11.727Z WARN [transport] transport/tcp.go:52 DNS lookup failure "fleet-server-agent-http.default.svc": lookup fleet-server-agent-http.default.svc on 10.100.0.10:53: no such host
2021-08-25T04:54:11.728Z ERROR fleet/fleet_gateway.go:180 failed to dispatch actions, error: fail to communicate with updated API client hosts: Get "https://fleet-server-agent-http.default.svc:8220/api/status?": lookup fleet-server-agent-http.default.svc on 10.100.0.10:53: no such host

I force-restarted the Elastic Agents and Kibana, but that didn't help. I had to go into the Kibana UI and find a Fleet setting that still said fleet-server-agent..., which I had to change to fleet-server-quickstart-agent... .
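
For anyone hitting the same thing, here is a minimal Python sketch for tallying which hostnames the agent is still failing to resolve. It assumes the plain-text log format shown above; the function name `failed_lookups` is just something made up for illustration, not part of any Elastic tooling.

```python
import re
from collections import Counter

# Matches resolver errors of the form
# "lookup <host> on <dns-server>: no such host", as seen in the log lines above.
LOOKUP_FAILURE = re.compile(r"lookup (\S+) on \S+: no such host")


def failed_lookups(lines):
    """Count DNS lookup failures per hostname across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        # findall returns the captured hostname for every match on the line
        counts.update(LOOKUP_FAILURE.findall(line))
    return counts
```

Feeding it the output of `kubectl logs` for an agent pod (for example, by reading the lines from stdin) shows at a glance whether the old fleet-server-agent-http name is still in use somewhere.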

All of that takes me to my current situation.

Bump

If you aren't seeing data, make sure that the Agent can connect to Elasticsearch correctly. I ran into a similar issue. The agent reports healthy as long as it can talk to the Fleet Server; the status doesn't take into account whether the agent can talk to Elasticsearch.

There can be a few issues:

  • The URL to Elasticsearch could contain a typo, so the agent tries to reach the wrong host
  • The Agent could be missing the root CA file (if your Elasticsearch cluster is protected by a private CA or a self-signed cert)
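
To tell these cases apart, something like the following Python sketch could be run from inside the agent pod (assuming Python is available there, which it may not be in the stock image). It only checks DNS, TCP, and the TLS handshake; it does not authenticate or send any request to Elasticsearch. The `diagnose` function and its status labels are made up for illustration, and the sketch falls back to the scheme's default port if the URL has none, so pass the port explicitly.

```python
import socket
import ssl
from urllib.parse import urlsplit


def diagnose(url, ca_file=None, timeout=5.0):
    """Classify why an Elasticsearch URL might be unreachable.

    Returns (status, detail); status is one of
    "bad-url", "dns", "tls", "connect", or "ok".
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return "bad-url", f"cannot parse scheme/host from {url!r}"
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError as exc:  # non-numeric or out-of-range port
        return "bad-url", str(exc)

    ctx = None
    if parts.scheme == "https":
        try:
            # cafile=None falls back to the system trust store
            ctx = ssl.create_default_context(cafile=ca_file)
        except OSError as exc:  # missing or unreadable CA file
            return "tls", f"cannot load CA file: {exc}"

    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout) as sock:
            if ctx is not None:
                # The handshake fails here if the server cert is not trusted
                with ctx.wrap_socket(sock, server_hostname=parts.hostname):
                    pass
    except socket.gaierror as exc:  # hostname does not resolve
        return "dns", str(exc)
    except ssl.SSLError as exc:  # includes certificate verification errors
        return "tls", str(exc)
    except OSError as exc:  # refused, timed out, unreachable
        return "connect", str(exc)
    return "ok", f"{parts.hostname}:{port} reachable"
```

A "dns" result points at a typo in the URL, while a "tls" result with a certificate verification message suggests the root CA is missing; rerunning with `ca_file` set to your CA's path should then return "ok".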

Also, when running Agent on Kubernetes, make sure you read the known limitation about the Agent needing to be in the same namespace as the Elasticsearch cluster.

Hey @yzpls, thanks for your question and for finding the mistake in the quickstart. It should be fixed soon.

Regarding the need to update those settings via UI, that's something we are looking at as well.

As for your issue, I'd start by looking at the logs of the individual Beats. You can do this by running kubectl exec -it $POD_NAME -- bash to get a shell in the Elastic Agent Pod and then looking at the files in /usr/share/elastic-agent/state/data/logs/default. Those logs should give a better indication of what problem the Beats are having.
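
If there are many log files, a small Python sketch like this could pull out just the warnings and errors. It assumes the plain-text format seen earlier in this thread, with the level as one of the first few whitespace-separated fields; JSON-formatted logs would need different matching. `find_problem_lines` is a hypothetical helper name, and it only looks at files directly inside the given directory.

```python
from pathlib import Path

LEVELS = ("ERROR", "WARN")


def find_problem_lines(log_dir, levels=LEVELS):
    """Return (file name, line) pairs for log lines whose level is in `levels`."""
    wanted = set(levels)
    hits = []
    for path in sorted(Path(log_dir).glob("*")):
        if not path.is_file():
            continue
        with path.open(errors="replace") as fh:
            for line in fh:
                # The level is expected among the first fields, after the timestamp
                if wanted & set(line.split()[:3]):
                    hits.append((path.name, line.rstrip("\n")))
    return hits
```

Calling `find_problem_lines("/usr/share/elastic-agent/state/data/logs/default")` inside the pod, or on a copy of that directory fetched with kubectl cp, returns the matching lines grouped by file name.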

You can also try applying any of our other configuration examples. Depending on which integrations are enabled and how they are configured, different RBAC rules or a different Pod template might be needed.

Let me know if you have any more questions; I'll be glad to help.

Thanks,
David