Multi-host docker-compose setup woes

bobus · December 7, 2022, 3:09pm

On three hosts A, B, and C, where a v5 cluster is already running, I need to set up a new v7 cluster so that at some point I can migrate to v7 and drop v5.

This means I have to map ports 9200 & 9300 to different ports on the Docker host, so I have chosen 9207 & 9307 on all three hosts.

Then I get this inscrutable error message:

es72-01 | {"type": "server", "timestamp": "2022-12-07T14:37:58,625Z", "level": "WARN", "component": "o.e.d.HandshakingTransportAddressConnector", "cluster.name": "es-cluster-test-72", "node.name": "es72-01:9307", "message": "[connectToRemoteMasterNode[172.16.0.149:9307]] completed handshake with [{es72-02:9307}{pSAcmx1sQbeOIwz0ZDO08A}{mur64cg4QUiwSotQ5DlteQ}{10.40.4.2}{10.40.4.2:9300}{cdfhilmrstw}{ml.machine_memory=4122955776, ml.max_open_jobs=512, xpack.installed=true, ml.max_jvm_size=536870912, transform.node=true}] but followup connection failed"

es72-01 is on host A, es72-02 is on host B. 10.40.4.2:9300 is the container address on all 3 hosts.

Is there a place where I can find a docker-compose file (for each host) that is known to work in my setup?

Not three nodes on the SAME host, that I can do easily already. Three nodes on THREE separate hosts.

I'll be happy to post my setup for any kind soul willing to assist, but a ready, known-to-work setup would be fantastic, and benefit others in the future with similar problems.

DavidTurner · December 7, 2022, 3:29pm

The message is improved (slightly) in 8.2 by elastic/elasticsearch pull request #85107, "Remove unnecessary fork in HandshakingTAConnector". More generally, 8.x is better than 7.x in very many ways, so I'd recommend going to 8.x if at all possible.

To break the message down, this means we connected to a node at 172.16.0.149:9307 and the node reported that its canonical "publish" address was 10.40.4.2:9300, but we could not connect to this node at 10.40.4.2:9300.
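The two addresses can also be pulled out of such a message mechanically. Below is a minimal Python sketch, assuming the 7.x message layout quoted in this thread; the function name `handshake_addresses` and the regular expressions are illustrative and not part of Elasticsearch.

```python
import re

def handshake_addresses(message):
    """Return (probed_address, publish_address) from a handshake warning.

    Assumes the layout of the message quoted in this thread: the probed
    address sits inside connectToRemoteMasterNode[...], and the publish
    address is the first {ip:port} field of the node description.
    """
    probed = re.search(r"connectToRemoteMasterNode\[([^\]]+)\]", message)
    publish = re.search(r"\{(\d{1,3}(?:\.\d{1,3}){3}:\d+)\}", message)
    if not probed or not publish:
        return None
    return probed.group(1), publish.group(1)

# The message from the original post, with the attribute map omitted for brevity.
message = (
    "[connectToRemoteMasterNode[172.16.0.149:9307]] completed handshake with "
    "[{es72-02:9307}{pSAcmx1sQbeOIwz0ZDO08A}{mur64cg4QUiwSotQ5DlteQ}"
    "{10.40.4.2}{10.40.4.2:9300}{cdfhilmrstw}] but followup connection failed"
)

probed, publish = handshake_addresses(message)
if probed != publish:
    print(f"reached the node at {probed}, but it publishes {publish}")
```

When the two values differ, as they do here, other nodes are told to reconnect to an address that may only be reachable from inside the remote host.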

There could be many explanations for this, but the most common one is that you're trying to use a Docker bridge network across hosts. The Docker docs say:

Bridge networks apply to containers running on the same Docker daemon host. For communication among containers running on different Docker daemon hosts, you can either manage routing at the OS level, or you can use an overlay network.

If it's not that, you'll need to dig deeper into your Docker networking setup. These Elasticsearch docs might help too.

bobus · December 7, 2022, 4:16pm

Thanks David,

That cannot be the answer, because the already running v5 cluster uses bridge networking as well. Unless something has changed from v5 to v7.

Additionally, I have network.publish_host=<HOST_IP_ADDRESS> set on both nodes.

DavidTurner · December 7, 2022, 4:56pm

Elasticsearch is perhaps not seeing this configuration? The node mentioned in this message definitely has a publish address of 10.40.4.2:9300 at which it's not accessible. If its publish address were 172.16.0.149:9307 then I think it'd work.
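One way to check which publish address a node is actually using is the nodes info API (`GET /_nodes/transport`), which reports each node's transport publish address. Below is a rough Python sketch, assuming the HTTP port is reachable without authentication or TLS; the helper names and the sample response (trimmed, and built from the values in the log line above) are illustrative only.

```python
import json
from urllib.request import urlopen

def publish_addresses(nodes_info):
    """Map node name -> transport publish address from a nodes-info response."""
    return {
        node["name"]: node["transport"]["publish_address"]
        for node in nodes_info["nodes"].values()
    }

def fetch_nodes_info(base_url):
    """Fetch GET /_nodes/transport from a node's HTTP port."""
    with urlopen(f"{base_url}/_nodes/transport", timeout=5) as response:
        return json.load(response)

# Illustrative response shape, trimmed to the fields used here.
sample = {
    "nodes": {
        "pSAcmx1sQbeOIwz0ZDO08A": {
            "name": "es72-02:9307",
            "transport": {"publish_address": "10.40.4.2:9300"},
        }
    }
}
print(publish_addresses(sample))

# Against a live node (assuming HTTP is mapped to 9207 on the host):
# print(publish_addresses(fetch_nodes_info("http://172.16.0.149:9207")))
```

If the reported address is still the container-internal one, the publish settings are probably not reaching Elasticsearch. Note that `network.publish_host` only sets the host part; the published port is controlled separately by `transport.publish_port`, which would need to be 9307 in this setup.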

