Sporadic NodeDisconnectedException and NoNodeAvailableException Failures

Hi,

We have been seeing sporadic NodeDisconnectedException and
NoNodeAvailableException in our ES cluster (0.90.7).

Our cluster is made up of 2 data nodes: one holds the single primary shard
and the other holds its single replica. We connect to it using the Java
TransportClient configured with both hosts.

We're able to connect, index, and query about 98% of the time. I have
experimented with client.transport.ping_timeout, and that seems to address
our NoNodeAvailableException.

However, we haven't been able to figure out the NodeDisconnectedException.

2014-03-21 12:53:18,322 DEBUG [ I/O worker #9}] [APP] [.elasticsearch.transport.netty] [Pisces] disconnected from [[#transport#-1][inet[prod-elasticsearch.domain/127.0.0.1:9300]]], channel closed event
2014-03-21 12:53:22,402 DEBUG [[generic][T#57]] [APP] [.elasticsearch.transport.netty] [Pisces] connected to node [[#transport#-1][inet[prod-elasticsearch.domain/127.0.0.1:9300]]]

which is then immediately followed by:

Caused by: org.elasticsearch.transport.NodeDisconnectedException: [inet[prod-elasticsearch.domain/127.0.0.1:9300]][index] disconnected

These logs are all generated on the client side and there is nothing that
sticks out in the logs on either of the nodes.

I've seen in other posts that the cause might be network issues or
insufficient resources (CPU and/or memory).

Does anyone have experience with these errors or know where I might want to
be looking?
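
One way to test the network hypothesis mentioned above is to probe the transport port from the client machine and watch for slow or failed connects. The following is only a minimal sketch using plain JDK sockets, not the Elasticsearch client. The class name TransportPortProbe, the 2-second timeout, and the 1-second interval are assumptions for illustration; the host name and port 9300 are taken from the log lines above. A successful TCP connect only shows that the port is reachable and says nothing about node health.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Elasticsearch): times plain TCP connects
// to the transport port so intermittent network trouble becomes visible.
public class TransportPortProbe {

    /** Returns the connect time in milliseconds, or -1 if the connect failed or timed out. */
    public static long connectMillis(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return -1;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Pass both node host names as arguments; the default is the host seen in the logs.
        String[] hosts = args.length > 0 ? args : new String[] {"prod-elasticsearch.domain"};
        int port = 9300;        // transport port from the log lines
        int timeoutMs = 2000;   // assumed value, tune as needed
        for (int round = 0; round < 10; round++) {
            for (String host : hosts) {
                long ms = connectMillis(host, port, timeoutMs);
                System.out.println(host + ":" + port + " -> "
                        + (ms < 0 ? "FAILED" : ms + " ms"));
            }
            Thread.sleep(1000); // assumed probe interval
        }
    }
}
```

Running this from the application host while the exceptions occur, and comparing the FAILED lines against the exception timestamps, may help separate network problems from node-side resource problems.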

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4494bbe9-e3f8-4069-a093-daa103e8f980%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

You should update to the latest 0.90.x release or to 1.0.1, although that probably won't solve your "network" issue.

I suppose you don't have anything in the node logs?
How much heap did you give to your nodes?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



Upgrading to 1.0.1 is in the works.

From what I can tell, there isn't any useful information in the node logs.
I do see some TRACE entries like these:

[2014-03-22 14:08:20,732][TRACE][transport.netty ] [prod-elasticsearch.domain] channel closed: [id: 0x2d18782f, /127.0.0.1:53420 => /127.0.0.1:9300]
[2014-03-22 14:08:21,016][TRACE][transport.netty ] [prod-elasticsearch.domain] channel closed: [id: 0xf843b82b, /127.0.0.1:53433 => /127.0.0.1:9300]
[2014-03-22 14:08:25,675][TRACE][transport.netty ] [prod-elasticsearch.domain] channel opened: [id: 0xc6c0eef1, /127.0.0.1:41107 => /127.0.0.1:9300]
[2014-03-22 14:08:25,676][TRACE][transport.netty ] [prod-elasticsearch.domain] channel opened: [id: 0x4838c71f, /127.0.0.1:41106 => /127.0.0.1:9300]

There appear to be more of the "channel closed" entries when the exception occurs.
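
To check that impression with numbers, the TRACE entries can be counted per minute and lined up against the exception timestamps. Below is a rough sketch, assuming each log entry sits on a single line in the format shown above. The class name ChannelEventCounter and the per-minute bucketing are choices made for illustration, not anything provided by Elasticsearch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts "channel opened" / "channel closed" TRACE
// entries per minute in a node log.
public class ChannelEventCounter {

    // Captures the timestamp up to the minute and the event type.
    private static final Pattern EVENT = Pattern.compile(
            "^\\[(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}):\\d{2},\\d{3}\\]\\[TRACE\\]"
            + "\\s*\\[transport\\.netty\\s*\\].*channel (opened|closed):");

    /** Returns minute -> {opened count, closed count}, in first-seen order. */
    public static Map<String, int[]> countPerMinute(List<String> lines) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = EVENT.matcher(line);
            if (!m.find()) {
                continue; // not a channel event, skip
            }
            int[] c = counts.computeIfAbsent(m.group(1), k -> new int[2]);
            if (m.group(2).equals("opened")) {
                c[0]++;
            } else {
                c[1]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four entries quoted above, one per line.
        List<String> sample = Arrays.asList(
            "[2014-03-22 14:08:20,732][TRACE][transport.netty ] [prod-elasticsearch.domain] channel closed: [id: 0x2d18782f, /127.0.0.1:53420 => /127.0.0.1:9300]",
            "[2014-03-22 14:08:21,016][TRACE][transport.netty ] [prod-elasticsearch.domain] channel closed: [id: 0xf843b82b, /127.0.0.1:53433 => /127.0.0.1:9300]",
            "[2014-03-22 14:08:25,675][TRACE][transport.netty ] [prod-elasticsearch.domain] channel opened: [id: 0xc6c0eef1, /127.0.0.1:41107 => /127.0.0.1:9300]",
            "[2014-03-22 14:08:25,676][TRACE][transport.netty ] [prod-elasticsearch.domain] channel opened: [id: 0x4838c71f, /127.0.0.1:41106 => /127.0.0.1:9300]");
        // Prints: 2014-03-22 14:08 opened=2 closed=2
        countPerMinute(sample).forEach((minute, c) ->
                System.out.println(minute + " opened=" + c[0] + " closed=" + c[1]));
    }
}
```

For a real log file, the lines could be read with java.nio.file.Files.readAllLines and passed to countPerMinute. Minutes where closes clearly outnumber opens would be the ones to compare against the client-side exceptions.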

The heap size on each node is 6 GB, and each node has 8 GB of memory in
total.

