ES nodes disconnect intermittently from the cluster


(Neeraj Gupta) #1

I have a 12-node Elasticsearch cluster (version 2.4), with all machines running on EC2 instances. Intermittently I get errors such as master_left, after which the number of unassigned shards increases and the cluster health turns red.

[2018-01-11 07:48:44,828][INFO ][discovery.zen ] [esnode21] master_left [{esnode12}{-YkCOqMRSoSR8CiQwZG_cw}{x.x.x.x}{x.x.x.x:9300}], reason [failed to ping, tried [3] times, each with maximum [1h] timeout]
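
To see how often this happens and which nodes report it, the master_left line above can be pulled apart with a regular expression. This is only a minimal sketch: the pattern is written against the single log line quoted here, and the function name `parse_master_left` is hypothetical, so other log layouts may need adjustments.

```python
import re

# Pattern modelled on the one master_left line quoted above (an assumption:
# other Elasticsearch versions or log layouts may format this differently).
MASTER_LEFT = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\[INFO\s*\]\[discovery\.zen\s*\]\s+"
    r"\[(?P<node>[^\]]+)\]\s+master_left\s+"
    r"\[\{(?P<master>[^}]+)\}.*?\],\s+"
    r"reason \[failed to ping, tried \[(?P<tries>\d+)\] times, "
    r"each with maximum \[(?P<timeout>[^\]]+)\] timeout\]"
)


def parse_master_left(line):
    """Return the fields of a master_left log line, or None if it does not match."""
    match = MASTER_LEFT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["tries"] = int(fields["tries"])  # number of failed pings reported
    return fields
```

Running this over the logs of every node and grouping by `node` and `master` would show whether the disconnects cluster around one machine or are spread evenly.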

Before these log entries there is only normal GC activity, no errors.
We have tested the network configuration and even ran a test for network drops; everything seems fine.

I increased the fault-detection settings to the following:
discovery.zen.fd.ping_interval: 1800s
discovery.zen.fd.ping_timeout: 3600s
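
For a sense of scale, here is a rough calculation of how long a node might wait before giving up on the master with the settings above. It is a sketch under assumptions: it takes the 3 retries reported in the log line, assumes the retries run back to back and each waits out the full timeout, and ignores the ping interval, so the real scheduling inside Elasticsearch may differ. The helper names are hypothetical.

```python
def parse_seconds(value):
    """Convert a simple time value such as '3600s', '30m' or '1h' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


def worst_case_wait(ping_timeout, ping_retries=3):
    """Rough upper bound in seconds: every retry waits out the full timeout.

    Assumption: retries are sequential; the ping interval is not counted.
    """
    return ping_retries * parse_seconds(ping_timeout)


# With ping_timeout 3600s and 3 retries this gives 10800 seconds (3 hours).
print(worst_case_wait("3600s"))
```

So even with a per-ping timeout of one hour, the master is still reported as unreachable, which suggests the timeout length itself is not what is being hit.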

After going through various blogs I came across the topic "Nodes randomly disconnected from the ES cluster". I have also turned off scatter-gather using ethtool, as mentioned in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317811.

Still these errors are occurring intermittently. Please suggest something.

Thanks

