Possible causes for 'transport disconnected' errors in node discovery?

Thanks very much for your reply. I'm quickly learning more about TCP and
keepalives.

Yes, it is very likely there were no writes to those nodes for a long time,
though there were reads.

I will try reducing the TCP keepalive time on the nodes, though it seems a
shame to have to change these machine-wide settings to keep the ES cluster
healthy during quiet periods - I would expect a product like ES to be more
resilient than this by default. We also have the added restriction that a
SAN connected to one of the servers requires specific TCP keepalive
settings.
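
From what I've read so far, keepalive can also be tuned per socket rather
than machine-wide, at least on Linux. A minimal sketch in Python (this is
not ES code; `enable_keepalive` and the 600/60/5 values are only my
illustration, and I don't know whether ES exposes anything equivalent):

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, probes=5):
    """Turn on TCP keepalive for one socket, leaving system defaults alone."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The option names below are the Linux ones; on platforms that do not
    # define them, only SO_KEEPALIVE is set and the OS-wide timers apply.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

If something like this were available, the SAN's machine-wide settings
could stay as they are.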

I'm still struggling to understand exactly what's going on here. ES is
sending a ping message every second, and doing it via the same transport
object on which the disconnect occurs (if my understanding of the code is
correct), and hence over the same TCP connection. The ping messages are
effectively data packets going across that connection, which should remove
the need for TCP keepalives to maintain it. So I'm confused as to why the
keepalive is important. Looking closer at the code, when the transport to
a given node disconnects, it attempts to establish the connection again
once, which is what the "with verified connect" part of the log message
seems to refer to.

On Tuesday, September 17, 2013 8:49:15 PM UTC+1, Jörg Prante wrote:

Is it possible you had no traffic between the locations for some hours? If
so, ES needs TCP keepalive messages on its long-lived connections to keep
them open. Check your underlying OS TCP keepalive timeout (the default on
Linux is 7200 seconds); it should be lower, e.g. 600 seconds. That is the
idle time after which the first TCP keepalive message is sent. Also
consider a lower interval between the keepalive messages.
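
As a rough sketch of what to check, assuming a Linux /proc layout (the
helper names are made up for illustration), the following Python reads the
three kernel settings and estimates how long an idle connection to a dead
peer survives. With the stock Linux values (7200 s idle, 75 s interval,
9 probes) that is roughly 7200 + 75 * 9 = 7875 seconds.

```python
from pathlib import Path

# Assumption: standard Linux location of the IPv4 TCP sysctls.
PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_keepalive_settings(base=PROC_IPV4):
    """Return the kernel-wide keepalive idle time, interval and probe count."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return {name: int((base / name).read_text()) for name in names}


def seconds_until_dead_peer_detected(idle_s, interval_s, probes):
    """Approximate time before an idle connection to a dead peer is dropped."""
    return idle_s + interval_s * probes
```

For example, `seconds_until_dead_peer_detected(600, 60, 5)` gives 900, so
a dead idle connection would be noticed after about 15 minutes.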

I found this hint at

Redirecting to Google Groups

More info

Using TCP keepalive under Linux

Jörg
