Can't connect hosts to the Fleet Server

Hi all!
I need your help!
I have a self-hosted ELK (not cloud)
I've enrolled the Fleet Server for SIEM using this manual for a self-managed server with default parameters in elastic-agent.yml
I need to connect hosts to my elk using elk agents.

It looks like Fleet Server works well (netstat and kibana screenshots)
It is healthy and has a connection to elastic.

But when I'm trying to connect another ELK Agent, the Fleet server refuse it

Iptables is ok (not using it because of cloud infra), the port is open
My infra hosted in the GCP
The firewall has been configured to allow connections inside the local network from target hosts to the fleet server.
If I make curl -k https://FLEET_SERVER:8220/api/status from the target machine to fleet
I’ll get {"name":"fleet-server","status":"HEALTHY"}
So it has a connect to the fleet, I think, am I?
what is the problem? or maybe you have any advice?
Because now I have no idea at all.
Also I've made tcpdump from two different host. maybe it'll be useful (can't attach the dump file)

Thank you for any help!

Hi @Dmitriy_Esin

Have you tried https and / or enrolling with the --insecure flag ?

Also why are you running it standalone?

Hi @zx8086!
Yes, I've used --insecure flag with this command:

sudo ./elastic-agent install -f --url=http://FLEET_SERVER_HOST --enrollment-token=TOKEN --insecure

It's how I've installed Fleet Server
sudo ./elastic-agent install -f --url=http://FLEET_SERVER_HOST:8220 --fleet-server-es=https://ELK_SERVER_HOST:9200 --fleet-server-es-ca=/path_to/elastic.pem --fleet-server-service-token=TOKEN

Why standalone? Hmm, it's enrolled in Kubernetes cluster with terraform before I came. It's not the cloud version provided by Elastic, I know.

I now see self-managed and thought it was standalone, misread.

What error are you getting ?

@zx8086 As I said, I got this error when trying to enroll ELK Agent with Endpoint Security and System integrations.

But If I make curl -k https://FLEET_SERVER:8220/api/status from the target machine to Fleet Server I’ll get {"name":"fleet-server","status":"HEALTHY"}.
Fleet doesn't reject/reset my connection.

You are not enrolling with the correct protocol.

Use https instead of http.

Same as you are doing with the api check. HTTPS

@zx8086 Oh, God!
Thank you so much!
My bad.
Now it works!

1 Like

@zx8086 sorry, maybe you know why the elastic agent has gone offline after being successfully installed?

I see this in logs

{"log.level":"error","@timestamp":"2021-10-12T10:22:39.984Z","log.origin":{"":"fleet/fleet_gateway.go","file.line":205},"message":"Could not communicate with fleet-server Checking API will retry, error: fail to read original error: read tcp> read: connection reset by peer","ecs.version":"1.6.0"}

But when I make curl -k I still get OK status {"name":"fleet-server","status":"HEALTHY"}

Maybe communication between the Server and Host, did they all initialize and go Healthy before ?

@zx8086 Yes, it was healthy before.

so, I've unenrolled all agents and the fleet server and want to start it from the beginning.

Earlier I've used this command to install a fleet server (with HTTP), which was successful.

sudo ./elastic-agent install -f --url= --fleet-server-es=https://ELK_HOWT:9200 --fleet-server-es-ca=/root/ca/elastic.pem --fleet-server-service-token=TOKEN

but now I use this command (with HTTPS)

sudo ./elastic-agent install -f --url= --fleet-server-es=https://ELK_HOWT:9200 --fleet-server-es-ca=/root/ca/elastic.pem --fleet-server-service-token=TOKEN

and got this message

2021-10-12T10:47:29.604Z	INFO	cmd/enroll_cmd.go:354	Generating self-signed certificate for Fleet Server
2021-10-12T10:47:32.036Z	INFO	cmd/enroll_cmd.go:668	Waiting for Elastic Agent to start Fleet Server
2021-10-12T10:47:33.037Z	INFO	cmd/enroll_cmd.go:651	Waiting for Elastic Agent to start
2021-10-12T10:47:35.101Z	INFO	cmd/enroll_cmd.go:701	Fleet Server - Starting
2021-10-12T10:47:37.104Z	INFO	cmd/enroll_cmd.go:682	Fleet Server - Running on default policy with Fleet Server integration; missing config (expected during bootstrap process)
2021-10-12T10:47:37.690Z	INFO	cmd/enroll_cmd.go:414	Starting enrollment to URL: https://jumphost:8220/
Error: fail to enroll: fail to execute request to fleet-server: fail to decode enrollment response: context canceled
Error: enroll command failed with exit code: 1

if use installation command with HTTP, it'll be successfully installed and being healthy
I think maybe my previous way of installation (with HTTP) was wrong and it is why agents have no API connection to the fleet.

I would start from scratch for the server and the agents all on https and make sure your fleet settings is correct as well, to the correct protocol HTTPS

@zx8086 yes, all is correct

Could be your DNS ? Is jumphost a reachable DNS address from the agents ?

@zx8086 okay, my bad again, added it in /etc/hosts to agent instances.
but installation on the target instance still fail (fleet server)

i would recommend using the same IP Address or DNS for the Fleet setting, the enrollment URL and connecting to the hosts. They are mismatched right now and therefore you bring in the element of having to troubleshoot your own network layers.

Use the same url everywhere.

@zx8086 okay, will try!
Thank you for your help!
You are awesome!