Fleet Server Agent not listening

I'm trying to add a fleet server agent to my stack so I can get rid of the one living on old hardware.

I've re-installed the fleet server agent (after running into elastic/elastic-agent issue #2170, "Adding a Fleet Server integration to an agent that was not bootstrapped with a Fleet Server panics") and that agent's status says all is good.

But none of my normal agents can connect. They have a connection refused message in their status, and I see logs like this in my normal agents' logs:

{"log.level":"error","@timestamp":"2023-03-23T11:43:44.771-0700","log.origin":{"file.name":"fleet/fleet_gateway.go","file.line":194},"message":"Cannot checkin in with fleet-server, retrying","log":{"source":"elastic-agent"},"error":{"message":"fail to checkin to fleet-server: all hosts failed: 2 errors occurred:\n\t* requester 0/2 to host https://< new broken fleet >.example.org:8220/ errored: Post \"https://< new broken fleet >.example.org:8220/api/fleet/agents/bfeccd8d-6289-4c9c-89a9-cf66edbadb3b/checkin?\": dial tcp < internal ip >:8220: connect: connection refused\n\t* requester 1/2 to host https://ws-prod-sql-01.example.org:8220/ errored: Post \"https://< turned off working fleet >.example.org:8220/api/fleet/agents/bfeccd8d-6289-4c9c-89a9-cf66edbadb3b/checkin?\": dial tcp < turned off working fleet internal ip >:8220: connect: connection refused\n\n"},"request_duration_ns":4164942,"failed_checkins":3,"retry_after_ns":310915958531,"ecs.version":"1.6.0"}

I've confirmed the firewall has 8220 open. But nmap from a normal host to the fleet host says the port is closed (not filtered, so the firewall isn't blocking it). On the VM with the fleet server agent, netstat shows something listening on localhost:8220, but nothing on the actual VM IP. That's with the Fleet Server Host setting configured as "", so I'd think it would be listening on all the NICs assigned to the host VM.
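(For anyone following along: the closed-vs-filtered distinction nmap reports can be reproduced with a few lines of Python. port_state is just my own throwaway helper, not anything from the Elastic stack.)

```python
import socket

def port_state(host, port, timeout=2.0):
    """Rough equivalent of nmap's open/closed/filtered verdict for one TCP port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"      # something accepted the connection
    except ConnectionRefusedError:
        return "closed"    # host reachable but nothing listening there -- my symptom
    except socket.timeout:
        return "filtered"  # no answer at all, usually a firewall dropping packets
    finally:
        s.close()

# e.g. port_state("fleet.example.org", 8220) kept coming back "closed" for me
```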

This is all on 8.6.2 for ES/Kibana/Agent. The VMs are running Ubuntu 22.04. ES is installed via the apt repos.

Any ideas on what might be going on?

Er, anyone?

So, oddly, on the server with a working fleet-server integration, it seems to be listening on IPv6. I'm not exactly sure why that works when we don't have IPv6 configured on our network, unless a tcp6 wildcard socket on Linux also accepts IPv4 connections by default (via IPv4-mapped addresses), which would explain it...

Working node:

# netstat -pln | grep 8220
tcp6       0      0 :::8220                 :::*                    LISTEN      30586/fleet-server  

Not working node:

# netstat -pln | grep 8220
tcp        0      0          *               LISTEN      510388/fleet-server

I also tried setting the Host field in the integration settings to the specific ip address of the nic I want fleet-server on. It did not make any difference.

Just to double check: the value in the Local Address column means that fleet-server is only listening for connections on localhost, right? And the * in the Foreign Address column means it will accept connections from any address and port, right?

So what I want to see is in the Local Address column that currently has Correct?

I pulled down the output of elastic-agent diagnostics. Diffing the components.yaml, computed-config.yaml, pre-config.yaml, state.yaml, and variables.yaml files showed no differences that didn't boil down to different hosts or the order things were output in.

I also tried enabling ipv6 on the non-working vm. Didn't help.

Any other ideas?

Apparently (per https://github.com/elastic/elastic-agent/pull/2198#discussion_r1093441143) the Host field on the Kibana Fleet Server integration settings page does nothing.

And because I'm using insecure mode, Agent defaults to binding only on localhost, which you need to override with the --fleet-server-host flag at install time.

I never found any documentation stating that you need to use the --fleet-server-host flag when installing insecurely. A nice person on Slack happened to know what was going on and clued me in.
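For anyone else who lands here, the re-install that finally worked looked roughly like this. This is a sketch from memory: the ES URL, service token, and policy id are placeholders for my environment, so confirm the exact flags against elastic-agent install --help on your version:

```shell
# Re-install the Fleet Server agent so it binds to all interfaces instead of
# the localhost-only default that insecure mode gives you.
sudo elastic-agent install \
  --fleet-server-es=https://es.example.org:9200 \
  --fleet-server-service-token=<service-token> \
  --fleet-server-policy=<fleet-server-policy-id> \
  --fleet-server-insecure-http \
  --fleet-server-host=
```

The key part is --fleet-server-host=; everything else is the usual bootstrap boilerplate.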

Just... Sheesh.
