Data nodes leaving the cluster randomly


(Ajesh) #1

Hi,
We have a cluster of 5 data nodes holding around 10 indices, some of which are large (around 800 GB). Every so often a data node leaves the cluster, seemingly at random, and we have been unable to figure out why. Each node has 28 GB of RAM, of which 14 GB is allocated to the JVM heap.
One more point: we have not disabled swapping (that is, we have not set bootstrap.mlockall: true in elasticsearch.yml). Is it possible that Lucene is using a large share of the remaining memory, so that the Elasticsearch process is being swapped out by the OS? Please let me know if anyone else has faced this issue and how it can be resolved.

Thanks & Regards
Ajesh


(Harlin) #2

How many shards are in your cluster per node?


(Ajesh) #3

We have around 60 shards per node.


(Harlin) #4

With only 60 shards per node, that shouldn't be the issue. Could this be a network problem? What is your discovery.zen.ping_timeout set to?
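
On the swapping question raised earlier in the thread, one way to confirm whether memory locking is actually in effect is to look at the mlockall flag that the nodes info API (GET _nodes/process) reports for each node. Below is a minimal Python sketch, assuming the JSON response has already been fetched and parsed into a dict. The function name and the sample data are hypothetical and only illustrate the response shape; they are not output from this cluster.

```python
def nodes_without_memory_lock(nodes_process_response):
    """Return the sorted names of nodes whose process info reports mlockall as false.

    The argument is assumed to be the parsed JSON body of GET _nodes/process.
    A node with no mlockall field is treated as not locked.
    """
    unlocked = []
    for node_id, info in nodes_process_response.get("nodes", {}).items():
        process = info.get("process", {})
        if not process.get("mlockall", False):
            # Fall back to the node id if the node has no name field.
            unlocked.append(info.get("name", node_id))
    return sorted(unlocked)


# Hypothetical sample shaped like the nodes info response; not real cluster data.
sample = {
    "nodes": {
        "id-a": {"name": "data-1", "process": {"mlockall": True}},
        "id-b": {"name": "data-2", "process": {"mlockall": False}},
    }
}

print(nodes_without_memory_lock(sample))  # prints ['data-2']
```

Any node this lists can still be swapped out by the OS, which would be consistent with nodes becoming unresponsive and dropping out of the cluster, though it does not rule out a network cause.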

