[5.6.x] Cluster stability above Docker Swarm

Daniel_Rogatchevsky · July 31, 2018, 7:38am

Hello all,

We have an ES cluster running in our pre-production environment, using official Docker images and Swarm.
The cluster contains 3 nodes that each act as both data and master nodes, plus a single agent node. The stack YAML file is attached: link
The Docker configuration is attached as well: link

Environment:
• Single bare metal server 256GB RAM, 72 cores
• 4.15.0-29-generic #31~16.04.1-Ubuntu
• Docker 18.06.0-ce
• docker-compose version 1.21.0

We are facing one-directional connectivity loss between services (ES nodes report "Node not connected").
That is, ES node 01 can reach node 02, but node 02 cannot reach node 01.
This causes cluster stability issues: master re-election, unassigned shards, etc.

Log samples are attached below: link-1 link-2 link-3

The issue occurs both when no data is being streamed (except monitoring) and when we are streaming a significant amount of data (10-15K pps).
The first two log samples are from when the cluster was idle (almost no data ingested, only a couple of MB stored), while the third is from when the cluster was populated with 1.1TB.
Once the cluster is populated, each glitch causes a painful recovery.
The Docker log is free of errors.

Note: the same behavior was observed when the swarm was running across a number of VMs, and with ES versions 5.6.4, 5.6.9 and 5.6.10.

Any thoughts on how to proceed with troubleshooting?

Thanks,
Daniel

Daniel_Rogatchevsky · August 1, 2018, 8:58am

The solution was found here:

endpoint_mode: dnsrr
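
For readers landing here: `endpoint_mode` is set under a service's `deploy` key in a version 3 stack file. Below is a minimal sketch of where it goes. The service name `elasticsearch01` and the network name `esnet` are placeholders chosen for illustration, not values from the stack file linked above, and the rest of the service definition is omitted.

```yaml
version: "3.3"

services:
  elasticsearch01:            # hypothetical service name
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.10
    networks:
      - esnet                 # hypothetical overlay network
    deploy:
      replicas: 1
      # Resolve the service name straight to task IPs via DNS round-robin
      # instead of the default virtual IP (vip) mode.
      endpoint_mode: dnsrr

networks:
  esnet:
    driver: overlay
```

With `dnsrr`, node-to-node transport connections go directly to the container IPs rather than through the Swarm virtual IP load balancer. One caveat: a service in `dnsrr` mode cannot publish ports through the ingress routing mesh, so any published ports need `mode: host`.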

system · August 29, 2018, 8:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.