Multi-host docker-compose setup woes

On three hosts A, B, and C, where a v5 cluster is already running, I need to set up a new v7 cluster so that at some point I can migrate to v7 and drop the v5 cluster.

Because the v5 cluster already uses ports 9200 and 9300, I have to map the v7 ports to different ports on the Docker host. I have chosen 9207 and 9307 on all three hosts.

Then I get this inscrutable error message:

es72-01 | {"type": "server", "timestamp": "2022-12-07T14:37:58,625Z", "level": "WARN", "component": "o.e.d.HandshakingTransportAddressConnector", "cluster.name": "es-cluster-test-72", "node.name": "es72-01:9307", "message": "[connectToRemoteMasterNode[172.16.0.149:9307]] completed handshake with [{es72-02:9307}{pSAcmx1sQbeOIwz0ZDO08A}{mur64cg4QUiwSotQ5DlteQ}{10.40.4.2}{10.40.4.2:9300}{cdfhilmrstw}{ml.machine_memory=4122955776, ml.max_open_jobs=512, xpack.installed=true, ml.max_jvm_size=536870912, transform.node=true}] but followup connection failed"

es72-01 is on host A, es72-02 is on host B. 10.40.4.2:9300 is the container-internal address, and it is the same on all three hosts.

Is there a place where I can find a docker-compose file (for each host) that is known to work in my setup?

Not three nodes on the SAME host; that I can already do easily. Three nodes on THREE separate hosts.

I'll be happy to post my setup for any kind soul willing to assist, but a ready, known-to-work setup would be fantastic, and benefit others in the future with similar problems.

The message is improved (slightly) in 8.2 by the pull request "Remove unnecessary fork in HandshakingTAConnector" (elastic/elasticsearch #85107). More generally, 8.x is better than 7.x in very many ways, so I'd recommend going to 8.x if at all possible.

To break the message down, this means we connected to a node at 172.16.0.149:9307 and the node reported that its canonical "publish" address was 10.40.4.2:9300, but we could not connect to this node at 10.40.4.2:9300.

There could be many explanations for this, but the most common one is that you're trying to use a Docker bridge network across hosts. The Docker docs say:

Bridge networks apply to containers running on the same Docker daemon host. For communication among containers running on different Docker daemon hosts, you can either manage routing at the OS level, or you can use an overlay network.

If it's not that, you'll need to dig deeper into your Docker networking setup. These Elasticsearch docs might help too.

Thanks David,

That cannot be the answer, because the already running v5 cluster uses bridge networking as well. Unless something has changed from v5 to v7.

Additionally, I have network.publish_host=<HOST_IP_ADDRESS> in both nodes.

Elasticsearch is perhaps not seeing this configuration? The node mentioned in this message definitely has a publish address of 10.40.4.2:9300 at which it's not accessible. If its publish address were 172.16.0.149:9307 then I think it'd work.
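
For anyone landing here later, this is a minimal sketch of what the host B compose file might look like if the goal is a publish address of 172.16.0.149:9307. It is not a known-to-work setup from this thread. The service name, the node name es72-02, the ES_VERSION variable, and the <HOST_A_IP> and <HOST_C_IP> placeholders are assumptions for illustration. The point is that network.publish_host only changes the advertised IP, so the advertised port probably also has to be set to the host-side mapped port via transport.publish_port.

```yaml
# docker-compose.yml on host B (illustrative sketch, adjust per host)
services:
  es72-02:
    # ES_VERSION is a placeholder for whichever 7.x tag you run
    image: docker.elastic.co/elasticsearch/elasticsearch:${ES_VERSION}
    environment:
      - cluster.name=es-cluster-test-72
      - node.name=es72-02
      # Listen on all interfaces inside the container
      - network.bind_host=0.0.0.0
      # Advertise the Docker host's address, not the container's 10.40.4.2
      - network.publish_host=172.16.0.149
      # Advertise the host-side mapped ports instead of 9300 and 9200
      - transport.publish_port=9307
      - http.publish_port=9207
      # Other nodes are reached through their hosts' mapped transport ports
      - discovery.seed_hosts=<HOST_A_IP>:9307,<HOST_C_IP>:9307
    ports:
      - "9207:9200"
      - "9307:9300"
```

Bootstrap settings such as cluster.initial_master_nodes, heap size, and volumes are left out here. After starting the node, its startup log should report the publish address, which is a quick way to confirm that the settings were actually picked up.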
