Checking runner ip connectivity... FAILED Can't connect $RUNNER_HOST_IP [172.xx.x.xx:22000]: ConnectionError

vjam · July 16, 2021, 6:58pm

Hello All - We are trying to install Elastic (cloud-enterprise-version 2.8.1) in the following environment.

AWS, RHEL-8
Docker (docker-ce-19.03.15)

but seem to be constantly getting the error below
Checking runner ip connectivity... FAILED Can't connect $RUNNER_HOST_IP [172.xx.x.xx:22000]: ConnectionError

when we run the command below
bash <(curl -fsSL https://download.elastic.co/cloud/elastic-cloud-enterprise.sh) install --cloud-enterprise-version 2.8.1 --availability-zone ElasticAZ1 --roles "director,coordinator" --memory-settings '{"runner":{"xms":"1G","xmx":"1G"},"zookeeper":{"xms":"4G","xmx":"4G"},"director":{"xms":"1G","xmx":"1G"},"constructor":{"xms":"4G","xmx":"4G"},"admin-console":{"xms":"4G","xmx":"4G"}}'

It looks like the box is trying to connect to itself on port 22000.

We have checked all the security group settings and opened all the ports for the box, but we are still getting the error.

Any thoughts or ideas will really help

Alex_Piggott · July 16, 2021, 7:26pm

That command spins up a mini HTTP server on 22000, opens up a docker container and from within that container tries to connect to 22000. So if you're getting an error, either iptables or your docker config must be blocking it

vjam · July 16, 2021, 7:38pm

Thanks Alex - Appreciate the quick response. How do I look at iptables in Docker?

Alex_Piggott · July 16, 2021, 8:10pm

Can you check if you are running firewalld - that could cause this I think: Limitations and known problems | Elastic Cloud Enterprise Reference [2.10] | Elastic

Otherwise run iptables --list and check that the default policy of each chain is ACCEPT. If not, you might need to do something like iptables -A INPUT -i docker0 -j ACCEPT
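For the policy check, a minimal sketch: iptables -S prints each built-in chain's default policy as a line starting with -P, so filtering that output shows any chain that does not default to ACCEPT. The function name non_accept_policies is made up for illustration, and the real command needs root.

```shell
# List built-in chains whose default policy is not ACCEPT.
# Reads `iptables -S` output on stdin; policy lines look like "-P INPUT ACCEPT".
non_accept_policies() {
  awk '$1 == "-P" && $3 != "ACCEPT" { print $2, $3 }'
}

# Typical use (as root):
#   iptables -S | non_accept_policies
# No output means every built-in chain in the filter table defaults to ACCEPT.
```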

vjam · July 16, 2021, 8:25pm

Hi Alex - we don't have firewalld. I ran the "iptables -A INPUT -i docker0 -j ACCEPT" - I possibly need to modify it.

I have attached screenshot of the iptables --list.

I don't see the default action. I still see the same error as before.

Alex_Piggott · July 16, 2021, 9:04pm

Thanks for posting, and sorry you are running into issues. I'm asking around since I can't immediately see anything wrong. (Are you running selinux, in what mode if so?)

vjam · July 17, 2021, 5:02pm

Thank you - We are running SELinux in permissive mode. Here are the settings in the file:

=====================================
/etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=permissive  #<[CHANGED FROM ENFORCING TO PERMISSIVE]>
# SELINUXTYPE= can take one of these three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

Configure kernel boot arguments.
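Note that /etc/selinux/config only sets the mode applied at boot, while getenforce reports the mode currently in effect. As a rough sketch for comparing the two, the helper below (the name selinux_config_mode is hypothetical) pulls the configured value out of the file and ignores trailing comments like the one on the SELINUX= line above.

```shell
# Print the SELINUX= value from a config file (default: /etc/selinux/config).
# Comment lines and any trailing "# ..." text on the setting line are ignored.
selinux_config_mode() {
  sed -n 's/^SELINUX=\([a-z][a-z]*\).*/\1/p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Compare with the running mode (getenforce prints Enforcing, Permissive or Disabled):
#   selinux_config_mode
#   getenforce
```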

Alex_Piggott · July 19, 2021, 2:22pm

Your iptables looks the same as mine, except my INPUT chain is empty, though ACCEPT everywhere should be equivalent

Here's what I just did to check connectivity:

Locally (use your IP):

bash> nc -lvp 22000 &
bash> echo test | nc -q 0 192.168.44.10 22000
Connection from [192.168.44.10] port 22000 [tcp/*] accepted (family 2, sport 60890)

Then repeat in a docker container (use your IP; any docker container running linux with nc installed should work):

bash> nc -lvp 22000 &
bash> docker run --network=host -it docker.elastic.co/cloud-ci/elastic-cloud-enterprise:2.8.1 bash -c "echo test | nc -q 0 192.168.44.10 22000 && exit"
Connection from [192.168.44.10] port 22000 [tcp/*] accepted (family 2, sport 46164)
(some random stuff)

I believe this is what the install test is doing, so it should fail in the same way and give a faster feedback loop
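If the installed nc variant does not accept the same flags, bash's built-in /dev/tcp redirection can make the same TCP connection attempt. This is only a sketch: can_connect is a made-up helper name, it assumes bash and the coreutils timeout command are available, and it only shows that a TCP connection can be opened, not that the installer's own check will pass.

```shell
# Return 0 if a TCP connection to host:port can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so no nc is needed on the connecting side.
can_connect() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# With a listener running (for example `nc -lvp 22000 &` as above), using your own IP:
#   can_connect 192.168.44.10 22000 && echo reachable || echo blocked
```

Running the same function once on the host and once inside a container started with --network=host would narrow down which side is being blocked, assuming the container image ships bash and timeout.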

You could also try turning iptables off completely and disabling SELinux to isolate the problem.

vjam · July 19, 2021, 3:46pm

Hi Alex - Disregard the last comment - it wasn't actually working.

Here are the details of the nc command:
nc -lvp 22000
Ncat: Version 7.70 ( Ncat - Netcat for the 21st Century )
Ncat: Listening on :::22000
Ncat: Listening on 0.0.0.0:22000

Alex_Piggott · July 20, 2021, 12:34pm

The nc -lvp sets up the listener, can you now try connecting to it with the ...echo test... commands as per the snippets above?

vjam · July 21, 2021, 2:38pm

Hi Alex - Sorry it got crazy busy yesterday.

This worked for us - we had to add the --host-ip parameter and specify the public IP.

Do you think it's OK to use the public host IP - will we run into issues later? (Please let us know your thoughts.)
Otherwise this issue is resolved. Thanks

Here is the command.

bash <(curl -fsSL https://download.elastic.co/cloud/elastic-cloud-enterprise.sh) install \
  --cloud-enterprise-version 2.8.1 \
  --availability-zone ElasticAZ1 \
  --roles "director,coordinator,proxy,allocator" \
  --memory-settings '{"runner":{"xms":"1G","xmx":"1G"},"zookeeper":{"xms":"4G","xmx":"4G"},"director":{"xms":"1G","xmx":"1G"},"constructor":{"xms":"4G","xmx":"4G"},"admin-console":{"xms":"4G","xmx":"4G"}}' \
  --host-ip PUBLICIP

vjam · July 22, 2021, 1:45pm

Alex - Do you have someone who has installed Elastic in AWS? I think we may have to install it with the public IP (instead of the private IP) in the --host-ip parameter, but we want to confirm whether this is the right approach for installing in AWS.

bash <(curl -fsSL https://download.elastic.co/cloud/elastic-cloud-enterprise.sh) install \
  --cloud-enterprise-version 2.8.1 \
  --availability-zone ElasticAZ1 \
  --roles "director,coordinator,proxy,allocator" \
  --memory-settings '{"runner":{"xms":"1G","xmx":"1G"},"zookeeper":{"xms":"4G","xmx":"4G"},"director":{"xms":"1G","xmx":"1G"},"constructor":{"xms":"4G","xmx":"4G"},"admin-console":{"xms":"4G","xmx":"4G"}}' \
  --host-ip PUBLICIP

Alex_Piggott · July 22, 2021, 3:43pm

If you have to use the public IP, you have something somewhere that is blocking "docker -> internal IP" comms, which is a bit concerning.

I run ECE on AWS and I do not use the public IP for --host, I leave it at its default (and AWS is one of the most common platforms ECE is run on in general)

Looking on my ec2 setup, it appears that attempting to route to the public address goes offbox:

elastic@ip-192-168-44-10:~$ tracepath 54.(redacted)
 1?: [LOCALHOST]                                         pmtu 9001
 1:  ip-192-168-44-1.ec2.internal                          0.133ms pmtu 1500
 1:  no reply
 2:  no reply
...

(It leaves via 44.1 and I think can't get back in again because of my security groups rules, hence the no reply)

If that is the case for you, I'd be hesitant to run it that way. As an example, docker containers route some traffic in the clear to the host on the assumption that this traffic will not go off box. There could also be AWS billing implications, depending on how far off box it goes

vjam · July 22, 2021, 6:48pm

tracepath for both the internal and external IPs seems to make it back.

Alex_Piggott · July 22, 2021, 7:30pm

I think it's probably fine to run that way for anything other than production deployments

(I don't think it will be difficult to fix the IP filtering issues at that point, I'm sure there's just a 1-line change in some OS config somewhere that is blocking it, eg net.ipv4.ip_forward = 1 in /etc/sysctl.conf?)
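To check that particular setting without editing anything, the kernel exposes the live value under /proc. A minimal sketch, with ip_forward_enabled as an illustrative helper name; it takes an optional file path so it can be pointed at a copy for testing.

```shell
# Return 0 if IPv4 forwarding is on. Reads the live kernel value by default.
ip_forward_enabled() {
  [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

# Check, then enable at runtime if needed (as root); a line in /etc/sysctl.conf
# such as net.ipv4.ip_forward = 1 makes it persistent across reboots:
#   ip_forward_enabled || sysctl -w net.ipv4.ip_forward=1
```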

vjam · July 26, 2021, 8:09pm

We ran this and it is working successfully, but it is still not installing

vjam · July 27, 2021, 7:55pm

Alex - Thanks for helping us. Another question: you mentioned that installing Elastic in AWS is standard, but have you had Elastic installed on RHEL8 + AWS? We are thinking RHEL8 and Docker don't work well together, since RHEL8 wants to create its own containers and they might have disabled some Docker container libraries (hard to tell, but that's what it seems like).

Alex_Piggott · July 27, 2021, 8:09pm

We definitely have customers who have used ECE on RHEL8

I don't know if we have customers on RHEL8 and AWS

vjam · July 27, 2021, 8:12pm

Thanks Alex -
Also, we have this (below) in /etc/sysctl.conf - should we not have this in the file?

mung · July 27, 2021, 10:16pm

Hi Alex,
I work with Venkat and wanted to share some tests I've done. We were able to successfully verify that we could transfer files on port 22000 through ncat. I've attempted several fresh installations of ECE on this specific AMI, RHEL-8.4.0_HVM-20210504-x86_64-2-Hourly2-GP2 in AWS and receive a 'ConnectionError' message. I've disabled selinux and firewalld as well as trying a force install option to no avail.
I then tried the same install on an Ubuntu instance using 'ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210430'. It worked no issues.
Could you possibly attempt to replicate our issue in doing a quick install on a RHEL 8 AMI in AWS?
