NoNodeAvailableException, unrecoverable


I've searched the myriad threads here on the topic of NoNodeAvailableException.

I wanted to check that the information gathered is up to date.

We're using Elasticsearch 2.4.4 and the Java TransportClient. We periodically see this exception in bursts that I think are ephemeral and can be fixed with better configuration options.

However, yesterday (April 2nd), we experienced an outage that our servers didn't recover from: tons of requests failing with NoNodeAvailableException. This was eventually fixed by just refreshing (replacing) the instances of our service.

This leads me to believe we had a cached address that was no longer relevant.

There are a lot of posts on this. Some suggest setting operating system-specific timeout settings (which isn't very viable or clean). Others suggest writing a daemon that updates the addresses attached to the TransportClient.

This link has some suggestions for altering some ping and DNS settings, which make sense and may help.

The real clarification I'm seeking is this: if I pass a hostname to my TransportClient, is it resolved immediately and then never updated? Or is this a case of a stale address external to the client, fixable within the constraints of the JVM's caching of domain names?
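For anyone reading along, the JVM-side DNS caching referred to here is governed by the `networkaddress.cache.ttl` security property. Below is a minimal sketch of setting it programmatically; the class and method names are hypothetical, and it assumes the call happens early in startup, before the JVM performs its first lookup. Note that this only bounds the JVM's own lookup cache: an `InetAddress` object that has already been resolved keeps its IP regardless of this setting.

```java
import java.security.Security;

// Hypothetical helper, not part of Elasticsearch or the original post.
public class DnsCacheConfig {

    /**
     * Limits how long the JVM caches successful DNS lookups.
     * Assumption: this is called before any hostname is resolved,
     * since the JVM reads the property when its cache policy is initialised.
     */
    public static void limitDnsCaching(int ttlSeconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(ttlSeconds));
    }
}
```

A call such as `DnsCacheConfig.limitDnsCaching(60)` at the top of `main` would be one way to apply it; the same property can also be set in the JRE's `java.security` file.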

We plan to at least perform the tuning in the link above, but would ideally like confirmation if we have to go the full route of manually writing code that will refresh the client when the IP address behind the domain changes.

Ok, looks like this was "solved" here:

We don't want to upgrade to 5 yet, so I guess we'll write code to refresh our addresses.
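A rough sketch of what such refresh code might look like, using only the JDK so it stays independent of the client version. Everything here (`AddressRefresher`, `Resolver`, `Listener`) is a hypothetical name made up for illustration. The idea is to re-resolve the hostname on a timer and report which IPs appeared or disappeared; the callbacks are where one would presumably call the 2.x TransportClient's `addTransportAddress` and `removeTransportAddress`, which is an assumption to verify against the client version in use.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: tracks the IPs behind one hostname and reports changes.
public class AddressRefresher {

    /** Hypothetical callback; wire these to the client's add/remove calls. */
    public interface Listener {
        void added(InetAddress address);
        void removed(InetAddress address);
    }

    /** Hypothetical lookup abstraction; in production it could wrap InetAddress.getAllByName. */
    public interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final Resolver resolver;
    private final Listener listener;
    private Set<InetAddress> current = new HashSet<InetAddress>();

    public AddressRefresher(String host, Resolver resolver, Listener listener) {
        this.host = host;
        this.resolver = resolver;
        this.listener = listener;
    }

    /** Re-resolves the host and notifies the listener of any differences. */
    public synchronized void refresh() {
        Set<InetAddress> latest;
        try {
            latest = new HashSet<InetAddress>(Arrays.asList(resolver.resolve(host)));
        } catch (UnknownHostException e) {
            return; // lookup failed: keep the addresses we already have
        }
        // Report new addresses first so the client is never left with none.
        for (InetAddress address : latest) {
            if (!current.contains(address)) {
                listener.added(address);
            }
        }
        for (InetAddress address : current) {
            if (!latest.contains(address)) {
                listener.removed(address);
            }
        }
        current = latest;
    }
}
```

`refresh()` could be scheduled with a `ScheduledExecutorService` at an interval no shorter than the JVM's DNS cache TTL, since otherwise each lookup would just return the cached answer.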

Hi Brian,

All access to cloud clusters goes through ELBs which do rotate IPs from time to time.

I've moved this to the Elasticsearch forum. How to configure DNS refresh for 2.4.4 is mainly a client implementation question, so it should get a better answer there.
