Intermittent No node available exception

I need some pointers on an intermittent "no node available" exception that I am facing.
We have a few VMs running web services and a 6-node ES cluster. We hit the cluster from these web services.
Recently we have been observing intermittent "no node available" exceptions.
What's common among these errors is that they all come from just one server, and they are all centred around the same time (a few seconds in a day).
The rest of the time the entire setup works fine.
nproc and nofile have been set to sufficiently high numbers in limits.conf:

* soft nproc 256000
* hard nproc 256000
* hard nofile 1048576
* soft nofile 1048576

So I don't think this is a case of running out of file descriptors. I am using the Elasticsearch transport client with sniff set to false.
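One caveat on the file-descriptor point: limits.conf is applied through PAM at login, so a service started by an init system may be running with different limits than the ones configured there. A rough way to check what the running web service process actually has is to read /proc/<pid>/limits and count its open descriptors. The sketch below is Linux-only and assumes the PID of the web service process is passed as the first argument; the function names are made up for illustration.

```python
import os
import re
import sys


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def count_open_fds(pid):
    """Count the file descriptors currently open in the given process."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = sys.argv[1]  # assumption: PID of the web service process
    with open(f"/proc/{pid}/limits") as fh:
        limits = parse_limits(fh.read())
    print("Max open files (soft, hard):", limits.get("Max open files"))
    print("Max processes  (soft, hard):", limits.get("Max processes"))
    print("Open descriptors right now: ", count_open_fds(pid))
```

If the soft limit printed here is much lower than the value in limits.conf, or the open descriptor count climbs towards it around the time of the errors, file descriptors would be back on the list of suspects.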
How can I debug this?
Is it possible that, because of high load, the server cannot make connections to ES?
How can this be confirmed?
This seems like an open-ended question, but any help is appreciated.
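One possible way to narrow down the connectivity question is to run a small probe on the affected server that opens a TCP connection to each node's transport port once a second and logs the result with a timestamp. If the probe also fails during the same few seconds, the problem is likely at the network or OS level on that server rather than in the client. The sketch below assumes the default transport port 9300 and takes the nodes as host:port arguments; the host names in the usage line are placeholders, not the real cluster.

```python
import socket
import sys
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Attempt one TCP connection; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers refused, timed out, DNS failure
        return False, time.monotonic() - start, repr(exc)


def probe_forever(nodes, interval=1.0):
    """Probe every node in turn, once per interval, and log each result."""
    while True:
        for host, port in nodes:
            ok, elapsed, err = probe(host, port)
            stamp = datetime.now().isoformat()
            print(f"{stamp} {host}:{port} ok={ok} {elapsed * 1000:.1f}ms {err}",
                  flush=True)
        time.sleep(interval)


if __name__ == "__main__":
    # Each argument is host:port, e.g. es-node-1:9300 (placeholder name).
    nodes = [(a.rsplit(":", 1)[0], int(a.rsplit(":", 1)[1]))
             for a in sys.argv[1:]]
    if nodes:
        probe_forever(nodes)
```

Usage would be something like `python probe.py es-node-1:9300 es-node-2:9300 > probe.log`, left running across the window in which the errors normally appear, and then compared against the timestamps of the exceptions.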

Is there anything in the Elasticsearch logs around the time you are seeing these issues? Are you distributing requests evenly across all nodes in the cluster? Which version are you using? What is the specification of the nodes in the cluster?
