Cluster removed timeout Coordinating node

chengdihua · January 16, 2023, 2:49am

Hello, everyone!
We have a 15-node ES cluster: 3 master nodes, 9 data nodes, and 3 coordinating (client) nodes.
The cluster runs on 3 physical hosts, each with 46 CPUs and 512 GB of RAM. Each host runs 5 ES nodes: 1 master, 1 coordinating (client), and 3 data.
Every day, the 3 client nodes leave the cluster at random times and rejoin automatically after about 10 minutes. Queries and writes were running when the problem occurred, but the request volume was low and the hosts had plenty of spare resources, so this was not a resource shortage.
We have been running ping continuously and the network looks fine, with no packet loss.
Has anyone encountered a similar problem?

This is the coordinating node log:

[2022-12-29T01:28:50,400][INFO ][o.e.d.z.ZenDiscovery ] [xxxx-001-kzx_client] master_left [{xxxx-003-kzx_master}{Qxixi5PtQbOVI9lUOz94nA}{RozHaYueRsCPj1X_4DxEuA}{xxxx.40}{xxxx.40:9300}{xpack.installed=true}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2022-12-29T01:28:50,401][WARN ][o.e.d.z.ZenDiscovery ] [xxxx-001-kzx_client] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout)
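
To line these events up across the three client nodes, the `master_left` lines can be pulled out of each log with a small script. The following is only a minimal sketch, assuming the line format shown above; the function name `parse_master_left` and the `window_s` field are made up for illustration. It also multiplies the retry count by the per-ping timeout, which gives the upper bound on failed-ping time implied by the message (3 × 30s = 90s here).

```python
import re
from datetime import datetime

# Matches the ZenDiscovery INFO "master_left" line in the format shown above.
# Assumption: other ES versions or log layouts may need a different pattern.
MASTER_LEFT = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[INFO\s*\]\[o\.e\.d\.z\.ZenDiscovery\s*\] "
    r"\[(?P<node>[^\]]+)\] master_left \[\{(?P<master>[^}]+)\}.*"
    r"tried \[(?P<retries>\d+)\] times, "
    r"each with maximum \[(?P<timeout>\d+)s\] timeout"
)


def parse_master_left(line):
    """Return a dict describing a master_left event, or None if no match."""
    m = MASTER_LEFT.search(line)
    if m is None:
        return None
    retries = int(m.group("retries"))
    timeout_s = int(m.group("timeout"))
    return {
        # ES 6.x logs use a comma before the milliseconds.
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S,%f"),
        "node": m.group("node"),
        "master": m.group("master"),
        "retries": retries,
        "timeout_s": timeout_s,
        # Upper bound on time spent in consecutive failed pings.
        "window_s": retries * timeout_s,
    }


if __name__ == "__main__":
    sample = (
        "[2022-12-29T01:28:50,400][INFO ][o.e.d.z.ZenDiscovery ] "
        "[xxxx-001-kzx_client] master_left [{xxxx-003-kzx_master}"
        "{Qxixi5PtQbOVI9lUOz94nA}{RozHaYueRsCPj1X_4DxEuA}{xxxx.40}"
        "{xxxx.40:9300}{xpack.installed=true}], reason [failed to ping, "
        "tried [3] times, each with maximum [30s] timeout]"
    )
    print(parse_master_left(sample))
```

Running this over the logs of all three client nodes and sorting by `time` would show whether they lose the master at the same moment or independently.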

This is the master node log

DavidTurner · January 16, 2023, 6:45am

Hi there @chengdihua and welcome!

You're using version 6.8.5 which became unsupported years ago. It definitely had bugs which could lead to these symptoms. You should upgrade to a supported version ASAP.

