Elastic Endpoint cannot connect to agent

Hi,

I have enrolled Elastic Endpoint via Fleet, but I cannot seem to get it working and there is very little troubleshooting information available.

I am receiving the errors below when the Endpoint starts, and I cannot find any explanation of the error messages. Can anyone point me in the right direction?

Logs retrieved from /opt/Elastic/Endpoint/state/log/abcdefg.log:

{"@timestamp":"2024-06-25T09:12:32.0370974Z","agent":{"id":"","type":"endpoint"},"ecs":{"version":"8.10.0"},"log":{"level":"warning","origin":{"file":{"line":180,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:180 Failed to established stage 1 connection to agent","process":{"pid":48269,"thread":{"id":48295}}}
{"@timestamp":"2024-06-25T09:12:32.0371631Z","agent":{"id":"","type":"endpoint"},"ecs":{"version":"8.10.0"},"log":{"level":"error","origin":{"file":{"line":959,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:959 Unable to retrieve connection info from Agent(Timeout occurred)","process":{"pid":48269,"thread":{"id":48295}}}
root@server:/opt/Elastic/Endpoint# elastic-agent status
┌─ fleet
│  └─ status: (HEALTHY) Connected
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a failed state
   └─ endpoint-default
      ├─ status: (FAILED) Failed: endpoint service missed 3 check-ins
      ├─ endpoint-default
      │  └─ status: (FAILED) Failed: endpoint service missed 3 check-ins
      └─ endpoint-default-fc4e451a-8346-46d5-bb74-fbac2ad9960a
         └─ status: (FAILED) Failed: endpoint service missed 3 check-ins

Please remember to provide the version when asking questions. The products are constantly evolving, so it adds a lot to your question and helps future readers of this thread.

/opt/Elastic/Agent/elastic-agent version
/opt/Elastic/Endpoint/elastic-endpoint version

Elastic Endpoint communicates with Elastic Agent via a localhost TCP connection. In your case that connection is broken. This can be caused by a third-party security product intercepting every TCP connection, or by something as trivial as a misconfigured hosts file.

You can run ping -4 localhost to check whether localhost resolves and responds on the machine.
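To rule out the hosts file specifically, something like the following sketch could help. It lists the addresses that a hosts file maps to the name localhost. The function name localhost_entries is made up for this example, and /etc/hosts is assumed to be the file in use; normally you would expect to see 127.0.0.1 (and often ::1) in the output.

```shell
#!/usr/bin/env bash
# Print every address that a hosts file maps to the name "localhost".
# The file path is a parameter (default: /etc/hosts) so the check can
# also be run against a copy of the file.
localhost_entries() {
    local hosts_file="${1:-/etc/hosts}"
    # Skip comment lines, stop at an inline "#", and print the address
    # (first field) of any line that lists "localhost" as a name.
    awk '!/^[[:space:]]*#/ {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break
            if ($i == "localhost") print $1
        }
    }' "$hosts_file"
}

if [ -r /etc/hosts ]; then
    localhost_entries /etc/hosts
fi
```

If 127.0.0.1 is missing from the output, or localhost points somewhere unexpected, that would be worth fixing before looking further. On Linux, ping -4 -c 3 localhost limits the ping to three packets.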

Further, you can generate a diagnostic bundle from Endpoint:

sudo /opt/Elastic/Endpoint/elastic-endpoint diagnostics

The bundle will contain a file, analysis.txt, with information about the TCP connection.

PS. I can see the version in the log, 8.10.0, thanks

Hi,

Thank you for your response!

I'm actually running 8.13.4, so I'm not sure why it says 8.10.0 in the logs.

However, I will try to do the diagnostics and then also check the networking for connecting to the agent.

Thanks!

Hi again,

Below is the output of analysis.txt, but it doesn't really say much about what went wrong. I will try to look at the rest of the diagnostic material as well.

One detail is that it is not possible to telnet to localhost:6788.
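For anyone without telnet installed, the same check can be sketched with bash alone. This assumes a bash built with /dev/tcp support and the coreutils timeout command; probe_tcp is just a name chosen for this example. It also separates a timeout from an immediate failure, which helps tell dropped packets apart from nothing listening on the port.

```shell
#!/usr/bin/env bash
# Try to open a TCP connection to host:port, giving up after a few seconds.
# Exit status: 0 if the connection opened, 124 if timeout(1) had to stop
# the attempt, anything else if the connection failed right away.
probe_tcp() {
    local host="$1" port="$2" wait_secs="${3:-3}"
    # Inside bash -c, $0 is the host and $1 is the port.
    timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

rc=0
probe_tcp 127.0.0.1 6788 || rc=$?
case "$rc" in
    0)   echo "6788: connection opened" ;;
    124) echo "6788: timed out (packets are probably being dropped)" ;;
    *)   echo "6788: connection failed (refused or other error)" ;;
esac
```

A timeout while the port is in LISTEN state usually points at filtering rather than at the listening process itself.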

Output from netstat shows that elastic-agent is listening on three ports, but the Endpoint connection being stuck in state "SYN_SENT" indicates that the agent does not respond.


root@server:/tmp/diag# netstat -nepal | grep elastic-agent
tcp        0      0 127.0.0.1:6788          0.0.0.0:*               LISTEN      0          1816627    53420/elastic-agent
tcp        0      0 127.0.0.1:6789          0.0.0.0:*               LISTEN      0          1815557    53420/elastic-agent
tcp        0      0 127.0.0.1:6791          0.0.0.0:*               LISTEN      0          1815572    53420/elastic-agent

...

tcp        0      1 127.0.0.1:46507         127.0.0.1:6788          SYN_SENT    0          2032393    53696/elastic-endpo

== Agent<->Endpoint connection analysis ==
Agent and Endpoint are not actively connected

Active connections that are for or may conflict with Agent and Endpoint:
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'LISTEN (ipv4,tcp) 127.0.0.1:6788 <-> 0.0.0.0:0'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'LISTEN (ipv4,tcp) 127.0.0.1:6789 <-> 0.0.0.0:0'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52072'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/metricbeat (pid 53576) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52114 <-> 127.0.0.1:6789'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/metricbeat (pid 53486) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52034 <-> 127.0.0.1:6789'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/metricbeat (pid 53493) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52052 <-> 127.0.0.1:6789'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52114'
* /opt/Elastic/Endpoint/elastic-endpoint (pid 53696) is root/admin and using or recently used connection 'SYN_SENT (ipv4,tcp) 127.0.0.1:36705 <-> 127.0.0.1:6788'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52052'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/filebeat (pid 53479) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52016 <-> 127.0.0.1:6789'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52016'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52094'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/elastic-agent (pid 53420) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:6789 <-> 127.0.0.1:52034'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/filebeat (pid 53516) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52072 <-> 127.0.0.1:6789'
* /opt/Elastic/Agent/data/elastic-agent-8.13.4-a2e31a/components/metricbeat (pid 53562) is root/admin and using or recently used connection 'ESTABLISHED (ipv4,tcp) 127.0.0.1:52094 <-> 127.0.0.1:6789'

Potential problems:
* There is no active connection between Endpoint and Agent

Yes, this is the culprit: Endpoint and Agent cannot talk to each other.

OK - I found the solution.

The problem was that the communication between 127.0.0.1 and 127.0.0.1 was not permitted properly by iptables.

I may lack some understanding of iptables, but before my original post I had already allowed TCP traffic on ports 6787-6789 on INPUT and all established or related connections on OUTPUT.

However, the only way to resolve it was to allow all traffic between 127.0.0.1 and 127.0.0.1 on OUTPUT, in addition to allowing ports 6787-6789 on INPUT.

INPUT:
If protocol is TCP and destination is 127.0.0.1 and destination ports are 6787,6788,6789
OUTPUT:
If source is 127.0.0.1 and destination is 127.0.0.1
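
Written as iptables commands, the two rules above would look roughly like the sketch below. It assumes the default filter table, root privileges, and that the INPUT chain already accepts established/related traffic so that replies to the ephemeral port get through. The rules are appended with -A, so they only take effect if no earlier rule in the chain already drops or rejects the packet; in that case -I would be needed to insert them higher up.

```shell
# INPUT: accept TCP to 127.0.0.1 on the Agent's local ports.
iptables -A INPUT -p tcp -d 127.0.0.1 -m multiport --dports 6787,6788,6789 -j ACCEPT

# OUTPUT: accept anything going from 127.0.0.1 to 127.0.0.1.
iptables -A OUTPUT -s 127.0.0.1 -d 127.0.0.1 -j ACCEPT
```

This probably also explains why the earlier rules were not enough: a loopback packet passes through OUTPUT first and only then through INPUT, and the first SYN of a new connection is in state NEW, so an OUTPUT chain that only accepts established or related connections drops it before it ever reaches the INPUT rule. That matches the SYN_SENT state seen in netstat.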

Thank you for your efforts, lesio!